Manager, Site Reliability Engineering

RELX

full-time

Posted on: 12/9/2025

Location Type: Hybrid

Location: Raleigh • North Carolina, Pennsylvania • 🇺🇸 United States

Salary

💰 $133,400 - $247,800 per year

Job Level

Mid-Level, Senior

Tech Stack

Ansible, AWS, Azure, Cloud, Docker, EC2, Java, Kubernetes, .NET, Python, React, Splunk, SQL, Swift, Terraform, TypeScript, Vault

About the role

Hire, mentor, and lead a high-performing, globally distributed team of SRE and DevOps engineers.
Foster a culture of reliability, blameless postmortems, and continuous improvement.
Build and sustain a global SRE community of practice that aligns reliability standards across business units.
Drive cross-functional initiatives and influence enterprise-wide engineering practices.
Define and implement SRE best practices to improve reliability, scalability, and performance.
Establish and monitor key performance indicators (uptime, MTTR, SLO/SLI compliance).
Serve as an escalation point for major incidents, ensuring swift resolution and actionable post-incident reviews.
Partner with Product, Cloud Infrastructure, Security, and Architecture teams to ensure alignment with enterprise objectives.
Collaborate with Cloud Engineering and Architecture to build robust monitoring, alerting, and observability systems.
Lead modernization initiatives, including cloud migrations, IaC automation (Terraform, Kubernetes), and CI/CD pipeline improvements.
Drive cloud cost efficiency and governance (FinOps).
Ensure compliance with ISO 27001, NIST 800-53, and similar security frameworks.
Define and implement SLOs, SLIs, and SLAs for AI/ML pipelines, APIs, and model training systems.
Partner with AI/ML and Cloud teams to ensure the reliability, observability, and performance of AI workloads.
Lead reliability engineering for MLOps: orchestration, IaC, monitoring, and automated scaling.
Champion security, compliance, and fault tolerance across emerging AI platforms.
Provide clear direction, feedback, and professional growth opportunities for team members.
Encourage innovation, continuous learning, and adoption of new reliability and automation techniques.
Lead with a global mindset, balancing local autonomy with enterprise alignment.

Requirements

Bachelor’s degree in Computer Science, Engineering, or a related field (advanced degree preferred).
Experience as a senior SRE, platform engineer, or DevOps engineer, including several years in a global leadership role.
Proven experience leading distributed technical teams and aligning cross-functional stakeholders.
Strong expertise in Azure and/or AWS, Kubernetes (EKS/AKS), Terraform, and CI/CD tooling.
Background in observability, automation, incident management, and service reliability.
Experience with AI/ML infrastructure (Databricks, MLflow, MLOps).
Cloud & Infrastructure: AWS (EKS, EC2, S3, RDS, Lambda), Azure (VMs, Functions)
Infrastructure as Code: Terraform (modules, workspaces, policies), Ansible, ARM/Bicep/HCL, Spacelift
Containers & Orchestration: Docker, Kubernetes, Helm, ArgoCD
Monitoring & Observability: Datadog, Splunk, Coralogix, CloudWatch, Azure Monitor
Automation & Scripting: Python, Bash, PowerShell, TypeScript
Security & Networking: Azure Key Vault, HashiCorp Vault, cloud security best practices
Programming Familiarity: Java, .NET/C#, SQL, React environments
Empathetic and motivational leader who develops technical talent and fosters collaboration.
Excellent communicator capable of engaging both technical and business stakeholders.
Deep commitment to transparency, reliability culture, and continuous improvement.

Benefits

Comprehensive, multi-carrier health plan benefits
Disability insurance
Dependent care and commuter spending accounts
Life and accident insurance
Retirement benefits (salary investment plan/employer stock purchase plan)
Modern family benefits, including adoption and surrogacy

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills

SRE, DevOps, cloud migrations, IaC, monitoring, observability, automation, incident management, AI/ML infrastructure, service reliability

Soft skills

leadership, mentoring, collaboration, communication, innovation, continuous improvement, empathy, motivation, transparency, team development

Certifications

Bachelor's degree, advanced degree preferred, ISO 27001, NIST 800-53