Pylon

Senior Site Reliability Engineer

full-time

Location Type: Hybrid

Location: Palo Alto, California, United States

Salary

💰 $140,000 - $220,000 per year

About the role

  • Own reliability and operational excellence for Pylon's production systems.
  • Design and implement monitoring, alerting, and incident response processes that scale as we grow.
  • Build tooling that makes the entire engineering team more effective.
  • Establish on-call rotations and runbooks.
  • Ensure our platform can handle the demands of a regulated, high-stakes financial product.
  • Spend 50%+ of your time writing code: building infrastructure and productivity tooling, automating away operational burden, and making reliability improvements.

Requirements

  • 4+ years of experience in SRE, infrastructure, or platform engineering roles
  • Experience working on a team of SREs at a company with mature SRE practices (not solo SRE roles)
  • Real on-call experience at scale in a large production environment (you've carried the pager and lived through incidents)
  • Deep AWS expertise (ECS, RDS, networking, security)
  • Strong experience with declarative infrastructure (Terraform, CDK, or similar)
  • Nix experience (we use it and want to expand its adoption)
  • Track record of building reliability tooling and automation
  • Ability to design and implement monitoring, alerting, and observability systems from first principles
  • Comfortable working in a regulated environment where "breaking things" is not an option

Benefits

  • Equity
  • Benefits