Senior Site Reliability Engineer

Employer Direct Healthcare

Senior Site Reliability Engineer managing the Azure-based healthcare platform for Lantern. Defining SRE practices and ensuring system reliability and compliance.

Posted 4/23/2026 • full-time • Dallas • Texas • 🇺🇸 United States • Senior

Tech Stack

Tools & technologies

AWS, Azure, Google Cloud Platform, Grafana, Prometheus, Python, Terraform

About the role

Key responsibilities & impact

Define and track SLOs/SLIs/error budgets for critical healthcare services
Build and maintain observability platforms (monitoring, logging, alerting, tracing) using Datadog and Azure Monitor
Lead incident management processes using Rootly, including on-call rotations, runbooks, and post-incident reviews
Automate operational toil through Infrastructure-as-Code (Terraform) and custom tooling
Design and implement disaster recovery and business continuity strategies
Collaborate with development teams to improve service reliability through architecture reviews and chaos engineering
Optimize system performance, capacity planning, and cost efficiency for Azure infrastructure
Ensure production systems meet HIPAA, SOC 2, and other regulatory requirements
Maintain and improve CI/CD pipelines to support safe, rapid deployments
Mentor junior engineers and foster a culture of reliability and operational excellence

Requirements

What you’ll need

Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field, or equivalent practical experience
4+ years in SRE, DevOps, or production operations roles
3+ years with Microsoft Azure (AWS/GCP a plus)
Strong experience with observability tools (Datadog, Azure Monitor, Prometheus, Grafana, or similar)
Experience defining and managing SLOs/SLIs and error budgets
Proven incident management and on-call experience (Rootly or similar incident management platforms)
Hands-on with Infrastructure as Code (Terraform) and CI/CD (Azure DevOps, GitHub Actions)
Experience in regulated environments (healthcare/HIPAA preferred)
Strong scripting skills (Python, Bash, PowerShell)
Excellent communication and collaboration skills
If you don’t meet every requirement listed, we still encourage you to apply.

Benefits

Comp & perks

Medical Insurance
Dental Insurance
Vision Insurance
Short & Long Term Disability
Life Insurance
401k with company match
Flexible Time Off
Paid Parental Leave

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

SRE, DevOps, Microsoft Azure, Terraform, CI/CD, Python, Bash, PowerShell, observability, chaos engineering

Soft Skills

communication, collaboration, mentoring, incident management, operational excellence