Senior Site Reliability Engineer
Mozilla
A Senior Site Reliability Engineer role establishing infrastructure and operational systems for Thunderbird's open-source email applications, with a focus on reliability improvements and collaboration with distributed teams.
Posted 6/8/2026 • Full-time • Remote • California • 🇺🇸 United States • Senior • 💰 $123,000 - $144,000 per year
Tech Stack
Tools & technologies
AWS, Grafana, Kubernetes, Terraform
About the role
Key responsibilities & impact
- Operate and evolve our EKS-based Kubernetes platform, supporting service migrations, platform improvements, and reliability initiatives.
- Design and develop CI/CD systems supporting websites, services, and Thunderbird desktop releases, contributing to pipeline reliability and OIDC-based authentication across GitHub Actions workflows.
- Write and maintain infrastructure in Pulumi and/or Terraform/OpenTofu across multiple AWS accounts.
- Operate and evolve our observability stack (VictoriaMetrics, VictoriaLogs, Grafana, Vector) and partner with engineering teams to incorporate instrumentation and monitoring into service design.
- Apply security-conscious infrastructure practices, including least-privilege IAM, secrets management via AWS Secrets Manager and External Secrets Operator, and network segmentation.
- Diagnose and debug production incidents; drive root-cause analysis and post-incident improvements to prevent recurring problems.
- Participate in on-call rotation and collaborate with SDEs and fellow SREs to ship, maintain, and monitor new builds and support service onboarding.
- Contribute to runbooks, architecture documentation, and team processes.
Requirements
What you’ll need
- 7+ years of experience in infrastructure, platform engineering, or site reliability roles, including hands-on production Kubernetes experience in workload operations, troubleshooting, and cluster management.
- Hands-on experience with infrastructure-as-code on AWS using Terraform, OpenTofu, or Pulumi.
- Security awareness in day-to-day infrastructure work: identity, least privilege, secrets hygiene, and network controls.
- Demonstrated ownership mindset with the ability to proactively identify issues, drive work to completion, and communicate risks early.
- Excellent async written communication skills; comfortable working with a geographically distributed team.
- Ability to collaborate effectively with software engineers and non-engineering stakeholders to improve platform reliability and operational efficiency.
- Ability to learn, evaluate, and responsibly use emerging technologies, including AI-enabled tools, to improve work processes.
Benefits
Comp & perks
- Fully remote work & schedule flexibility
- Company-provided laptop
- Annual bonus program
- Monthly remote work stipend
- Annual professional development stipend
- Industry conferences
- Company all-hands and team gatherings
- 24 days PTO per year (prorated)
- Your birthday
- Year-end company shutdown
- 9 wellbeing days
- Public holidays
- Other paid leave
- Quarterly wellbeing stipend for personal / family activities
- 401(k) / RRSP contributions
- Health, dental, & vision insurance
- Disability insurance
- Life insurance
- Employee assistance program
- Paid parental leave
- Paid sick days
ATS (Applicant Tracking System) Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Kubernetes, CI/CD, Pulumi, Terraform, AWS, OIDC, VictoriaMetrics, Grafana, Vector, secrets management
Soft Skills
ownership mindset, communication, collaboration, problem-solving, proactive identification of issues, async written communication, ability to learn, evaluation of technologies