Senior Software Engineer – Site Reliability Engineering
The Home Depot
Senior Software Engineer for Site Reliability Engineering at The Home Depot, building and operating internal platforms for the reliability and observability of store systems.
Tech Stack
Tools & technologies
BigQuery, Cloud, Go, Google Cloud Platform, JavaScript, Kubernetes, Python, Selenium, Spinnaker, Terraform, TypeScript
About the role
Key responsibilities & impact
- Develops, tests, deploys, and maintains software for internal platforms
- Designs, develops, and maintains tools for reliability engineering teams
- Extends internal reliability tools using Kubernetes and Terraform on Google Cloud Platform
- Deploys and maintains production logging, tracing, and profiling systems
- Identifies and automates repetitive operational tasks
- Maintains and extends SLO and Critical User Journey platforms
- Participates in on-call rotation and contributes to incident response
Requirements
What you’ll need
- 3-5 years of experience in Site Reliability Engineering, Platform Engineering, DevOps, or Infrastructure Engineering
- Hands-on experience with Google Cloud Platform (GCP), including GKE, GCS, BigQuery, Cloud Pub/Sub, Cloud Logging, IAM, and Workload Identity
- Strong Kubernetes experience: deploying and managing workloads on GKE or similar managed Kubernetes services, writing and debugging Helm charts, managing namespaces, RBAC, and service accounts, and troubleshooting issues
- Experience with infrastructure-as-code tools, particularly Terraform for cloud resource management
- Proficiency in one or more of: Go, Python, JavaScript/TypeScript, YAML
- Experience with observability platforms: deploying, configuring, or operating log aggregation, distributed tracing, metrics, dashboarding, or continuous profiling
- Practical understanding of SLOs, SLIs, and error budgets
- Experience with synthetic monitoring or performance testing frameworks (k6, Playwright, Selenium, Locust, or similar)
- Familiarity with incident management and on-call practices: blameless post-mortems, runbook development, and incident communication
- Experience with CI/CD pipelines using GitHub Actions, Spinnaker, ArgoCD, or similar
- Understanding of deployment strategies (blue/green, canary, rolling)
Benefits
Comp & perks
- Health insurance
- 401(k) matching
- Flexible work hours
- Paid time off
- Remote work options
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Kubernetes, Terraform, Google Cloud Platform, Go, Python, JavaScript, TypeScript, YAML, observability platforms, synthetic monitoring
Soft Skills
incident response, on-call practices, communication, blameless post-mortems, runbook development