Vapi

Member of Technical Staff – Site Reliability Engineer

A Member of Technical Staff role focused on Site Reliability Engineering at a voice AI platform company. You will drive call completion rates and build a reliability culture.

Posted 6/3/2026 • full-time • San Francisco • California • 🇺🇸 United States • Lead • 💰 $200,000 - $270,000 per year

Tech Stack

Tools & technologies
  • Go
  • Grafana
  • Kubernetes
  • Prometheus
  • TypeScript

About the role

Key responsibilities & impact
  • 30 Day: Join the oncall rotation. Walk the 15 stability-gap incidents and turn the patterns into a prioritized reliability backlog. Define the first set of SLOs for the call-completion path.
  • 60 Day: Stand up error budgets and SLO-based alerting in Chronosphere/Prometheus for the highest-impact services. Run the first proper load test against provider rate limits and per-org concurrency. Tune autoscaling for wscaler / workerpool-cron-scaler.
  • 90 Day: Ship a real platform service — capacity forecaster, auto-remediation, or oncall tooling — in Go or TypeScript. Own the postmortem process. Drive a measurable improvement in p99 call completion or MTTR.

Requirements

What you’ll need
  • You’ve run incident command and practiced postmortem discipline at scale on a real oncall rotation.
  • You’ve operated SLOs and error budgets in Chronosphere, Prometheus, Grafana, or Datadog.
  • You’ve done capacity planning and load testing for production systems with real users.
  • You’re fluent in Kubernetes production ops: pod crash diagnosis, HPA/VPA tuning, PodDisruptionBudgets, graceful shutdown.
  • You know backpressure and autoscaling patterns — KEDA, custom metrics scaling.
  • Nice-to-haves: You ship code, not just scripts. You can build platform services in Go or TypeScript.

Benefits

Comp & perks
  • Comprehensive health coverage: medical, dental, and vision plans
  • Real stake: competitive salary and excellent equity ownership
  • Team love: We love hanging out, and we do quarterly off-sites
  • Flexible time off: take what you need
  • More: catered meals, transportation, gym, and a $10k annual L&D budget

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • Go
  • TypeScript
  • Kubernetes
  • load testing
  • capacity planning
  • autoscaling
  • error budgets
  • SLOs
  • backpressure
  • pod crash diagnosis
Soft Skills
  • incident command
  • postmortem discipline
  • ownership
  • measurable improvement