
Senior Site Reliability Engineer
UJET
full-time
Location Type: Remote
Location: Texas • United States
Salary
💰 $100,000 - $120,000 per year
About the role
- Lead efforts to improve system reliability, scalability, and performance across critical services
- Define and implement SLIs/SLOs and error budgets, and use them to guide engineering priorities
- Design and develop observability systems (metrics, logging, tracing, alerting) that produce actionable alerts and data with minimal noise
- Lead complex incident response, acting as incident commander when needed
- Conduct postmortems focused on systemic causes rather than individual fault, and ensure corrective actions from those reviews are completed
- Identify and eliminate toil through automation, tooling, and improved workflows
- Partner with product and platform teams on architecture decisions, production readiness, and designing systems that recover from failure
- Build reusable systems and “paved roads” that make it easier for teams to operate their services reliably
- Mentor other engineers and raise the overall operational maturity of the organization
Requirements
- 6 to 10+ years of experience in SRE, infrastructure, or backend systems engineering
- Demonstrated experience owning reliability outcomes for complex, distributed systems
- Strong experience with cloud infrastructure (AWS, GCP, or Azure) and production-scale systems
- Deep understanding of observability, incident management, and system performance
- Proficiency in at least one programming language (e.g., Go, Python, Java) with a focus on automation and tooling
- Able to change how other teams work without having managerial authority over them
- Able to make clear decisions during incidents by following a defined process rather than reacting emotionally
Benefits
- Medical
- Dental
- Vision
- 401(k) plan
- Commuter benefits
- Comprehensive Benefits
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
SRE, infrastructure engineering, backend systems engineering, cloud infrastructure, AWS, GCP, Azure, observability, incident management, programming language
Soft Skills
mentoring, decision making, incident response, collaboration, leadership, communication, problem solving, process adherence, organizational maturity, change management