
Staff Site Reliability Engineer
Dave
full-time
Location Type: Remote
Location: United States
Salary
💰 $208,000 - $330,000 per year
About the role
- Lead architecture and automation across our GCP environment, ensuring reliability, scalability, security, and thoughtful cost management.
- Define and improve SLIs, SLOs, and error budgets using Cloud Monitoring and Datadog — connecting reliability goals to real business outcomes.
- Shape our multi-region, disaster recovery, and capacity planning strategies so the platform holds up as we grow.
- Design and optimize cloud networking, including VPC architecture, ingress/egress, Cloud Armor, VPN, and DNS to support internal systems, partner integrations, and member-facing services.
- Drive infrastructure-as-code and GitOps practices using Terraform, Kubernetes, Helm, and ArgoCD to make deployments predictable and repeatable.
- Mentor SREs and infrastructure engineers through design reviews, incident retros, and hands-on collaboration — strengthening technical depth across the team.
- Explore practical LLM-driven automation where it meaningfully reduces operational toil and shortens incident resolution time.
Requirements
- 8+ years in software, infrastructure, or site reliability engineering.
- 5+ years of hands-on experience operating production systems in GCP (compute, networking, storage, IAM, observability).
- Deep experience with Kubernetes (GKE), Helm, containerization, Terraform (IaC), and ArgoCD.
- Strong programming skills in Python, Go, or TypeScript/JavaScript for automation and internal tooling.
- Experience defining and operating against SLIs, SLOs, and error budgets.
- Strong knowledge of relational and distributed databases (e.g., MySQL, Cloud SQL, Cloud Spanner, Redis), including performance tuning and HA strategies.
- Experience leading incident response, root cause analysis, and systemic remediation.
Benefits
- Opportunity to tackle tough challenges, learn from and grow alongside top talent, and help millions of people reach their personal financial goals
- Flexible hours and a virtual-first work culture with a home office stipend
- Premium Medical, Dental, and Vision Insurance plans
- Generous paid parental and caregiver leave
- 401(k) savings plan with matching contributions
- Financial advisor and financial wellness support
- Flexible PTO and generous company holidays, including Juneteenth and Winter Break
- All-company in-person events once or twice a year, plus virtual events throughout the year, to connect with your team members and leadership team
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
GCP, Kubernetes, Terraform, Helm, ArgoCD, Python, Go, TypeScript, JavaScript, SLIs
Soft Skills
mentoring, collaboration, incident response, root cause analysis, systemic remediation