Cursor

Software Engineer, Reliability

full-time

Posted on:

Location Type: Office

Location: San Francisco, California, United States

About the role

  • Own reliability work end-to-end, from user-facing symptoms (crashes, latency, streaming failures) to root causes in services, infrastructure, or vendor dependencies.
  • Design and implement resilience patterns for upstream dependency failures (for example model providers): fallbacks, routing strategies, and degraded-mode designs.
  • Build and maintain reliability guardrails that make teams faster and safer: deployment safety, rollbacks, operational playbooks, automated checks, and standards for production readiness.
  • Improve observability (metrics, logs, traces, and client telemetry) so engineers can quickly answer 'Is it up?' and 'What changed?'.
  • Reduce operational toil through automation and better tooling.
  • Partner with product and infrastructure engineering teams as a drop-in reliability multiplier: embed on the highest-impact problems and drive them to a durable technical outcome.
  • Participate in an on-call rotation and help improve incident response practices over time (severity definitions, runbooks, retrospectives, and clear ownership of follow-up fixes).
  • Own a small set of high-leverage reliability 'themes' at a time (for example client crash rate, streaming reliability, deploy safety), and drive each one end-to-end until the reliability bar measurably moves.

Requirements

  • Strong experience owning reliability for production systems, including both incident response and long-term engineering fixes.
  • Expert-level experience in at least one of: Go, Node/TypeScript, or Python.
  • Deep practical knowledge of cloud infrastructure (AWS) and modern deployment/orchestration patterns (Kubernetes and/or ECS).
  • Experience with observability systems and practices (metrics, logs, traces, and alerting).
  • Clear communication and cross-team leadership.

Benefits

  • Health insurance
  • 401(k) matching
  • Paid time off
  • Flexible work arrangements

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Go, Node.js, TypeScript, Python, cloud infrastructure, Kubernetes, ECS, observability systems, metrics, alerting

Soft Skills

clear communication, cross-team leadership