Lambda

Senior Site Reliability Engineer – Observability

Lambda is hiring a Senior Site Reliability Engineer to deploy observability platforms for its AI cloud infrastructure. The role collaborates with engineering teams to improve product reliability and system monitoring.

Posted 5/9/2026 • full-time • San Francisco • California, Washington • 🇺🇸 United States • Senior • 💰 $240,000 - $401,000 per year

Tech Stack

Tools & technologies
Go • Kubernetes

About the role

Key responsibilities & impact
  • Deploy and operate observability platforms for logging, metrics, and distributed tracing.
  • Automate the deployment and operation of these observability systems.
  • Set up monitoring for modern AI/HPC cluster infrastructure.
  • Develop platform software to make observability adoptable and improve product reliability.
  • Lead members of other engineering teams in developing solutions to their monitoring challenges.

Requirements

What you’ll need
  • Have 8+ years of experience in software engineering, with 3+ years in Go
  • Have 5+ years of experience in Site Reliability Engineering practices
  • Possess proven understanding of Observability tools and practices
  • Have experience with application deployment and monitoring using Kubernetes
  • Have strong experience with modern DevOps practices
  • Expect quality and reliability from the solutions you build
  • Enjoy collaborating across team boundaries to help our engineering teams meet their observability needs

Benefits

Comp & perks
  • Health, dental, and vision coverage for you and your dependents
  • Wellness and commuter stipends for select roles
  • 401(k) plan with 2% company match (USA employees)
  • Flexible paid time off plan that we all actually use

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Go • Site Reliability Engineering • Observability tools • Kubernetes • DevOps practices • Software development • Monitoring • Automation • Distributed tracing • Logging
Soft Skills
Collaboration • Leadership • Problem-solving • Quality assurance • Reliability focus