Senior Site Reliability Engineer
UJET
Senior Site Reliability Engineer role improving system reliability and establishing best practices at AI-driven contact center UJET. Leads incident response and mentors engineers to raise operational maturity.
Posted 4/20/2026 • Full-time • Remote • Texas • 🇺🇸 United States • Senior • 💰 $100,000 - $120,000 per year
Tech Stack
Tools & technologies
- AWS
- Azure
- Cloud
- Distributed Systems
- Go
- Google Cloud Platform
- Java
- Python
About the role
Key responsibilities & impact
- Lead efforts to improve system reliability, scalability, and performance across critical services
- Define and implement SLIs/SLOs and error budgets, and use them to guide engineering priorities
- Design and develop observability systems (metrics, logging, tracing, alerting) that produce actionable alerts and data with minimal noise
- Lead complex incident response, acting as incident commander when needed
- Conduct postmortems focused on systemic causes rather than individual fault, and ensure corrective actions from those reviews are completed
- Identify and eliminate toil through automation, tooling, and improved workflows
- Partner with product and platform teams on architecture decisions, production readiness, and designing systems that recover from failure
- Build reusable systems and “paved roads” that make it easier for teams to operate their services reliably
- Mentor other engineers and raise the overall operational maturity of the organization
Requirements
What you’ll need
- 6 to 10+ years of experience in SRE, infrastructure, or backend systems engineering
- Demonstrated experience owning reliability outcomes for complex, distributed systems
- Strong experience with cloud infrastructure (AWS, GCP, or Azure) and production-scale systems
- Deep understanding of observability, incident management, and system performance
- Proficiency in at least one programming language (e.g., Go, Python, Java) with a focus on automation and tooling
- Able to change how other teams work without having managerial authority over them
- Able to make clear decisions during incidents by following a defined process rather than reacting emotionally
Benefits
Comp & perks
- Medical
- Dental
- Vision
- 401(k) plan
- Commuter benefits
- Comprehensive Benefits
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- SRE
- infrastructure engineering
- backend systems engineering
- cloud infrastructure
- AWS
- GCP
- Azure
- observability
- incident management
- programming language
Soft Skills
- mentoring
- decision making
- incident response
- collaboration
- leadership
- communication
- problem solving
- process adherence
- organizational maturity
- change management