RELX

Manager, Site Reliability Engineering

Full-time

Location Type: Hybrid

Location: Raleigh • North Carolina, Pennsylvania • 🇺🇸 United States

Salary

💰 $133,400 - $247,800 per year

Job Level

Mid-Level, Senior

Tech Stack

Ansible, AWS, Azure, Cloud, Docker, EC2, Java, Kubernetes, .NET, Python, React, Splunk, SQL, Swift, Terraform, TypeScript, Vault

About the role

  • Hire, mentor, and lead a high-performing, globally distributed team of SRE and DevOps engineers.
  • Foster a culture of reliability, blameless postmortems, and continuous improvement.
  • Build and sustain a global SRE community of practice that aligns reliability standards across business units.
  • Drive cross-functional initiatives and influence enterprise-wide engineering practices.
  • Define and implement SRE best practices to improve reliability, scalability, and performance.
  • Establish and monitor key performance indicators (uptime, MTTR, SLO/SLI compliance).
  • Serve as an escalation point for major incidents, ensuring swift resolution and actionable post-incident reviews.
  • Partner with Product, Cloud Infrastructure, Security, and Architecture teams to ensure alignment with enterprise objectives.
  • Collaborate with Cloud Engineering and Architecture to build robust monitoring, alerting, and observability systems.
  • Lead modernization initiatives, including cloud migrations, IaC automation (Terraform, Kubernetes), and CI/CD pipeline improvements.
  • Drive cloud cost efficiency and governance (FinOps).
  • Ensure compliance with ISO 27001, NIST 800-53, and similar security frameworks.
  • Define and implement SLOs, SLIs, and SLAs for AI/ML pipelines, APIs, and model training systems.
  • Partner with AI/ML and Cloud teams to ensure the reliability, observability, and performance of AI workloads.
  • Lead reliability engineering for MLOps — orchestration, IaC, monitoring, and automated scaling.
  • Champion security, compliance, and fault tolerance across emerging AI platforms.
  • Provide clear direction, feedback, and professional growth opportunities for team members.
  • Encourage innovation, continuous learning, and adoption of new reliability and automation techniques.
  • Lead with a global mindset, balancing local autonomy with enterprise alignment.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field (advanced degree preferred).
  • Experience as a senior SRE, platform engineer, or DevOps engineer, including several years in a global leadership role.
  • Proven experience leading distributed technical teams and aligning cross-functional stakeholders.
  • Strong expertise in Azure and/or AWS, Kubernetes (EKS/AKS), Terraform, and CI/CD tooling.
  • Background in observability, automation, incident management, and service reliability.
  • Experience with AI/ML infrastructure (Databricks, MLflow, MLOps).
  • Cloud & Infrastructure: Azure (VMs, Functions), AWS (EKS, EC2, S3, RDS, Lambda)
  • Infrastructure as Code: Terraform (modules, workspaces, policies), Ansible, ARM/Bicep/HCL, Spacelift
  • Containers & Orchestration: Docker, Kubernetes, Helm, ArgoCD
  • Monitoring & Observability: Datadog, Splunk, Coralogix, CloudWatch, Azure Monitor
  • Automation & Scripting: Python, Bash, PowerShell, TypeScript
  • Security & Networking: Azure Key Vault, HashiCorp Vault, cloud security best practices
  • Programming Familiarity: Java, .NET/C#, SQL, React environments
  • Empathetic and motivational leader who develops technical talent and fosters collaboration.
  • Excellent communicator capable of engaging both technical and business stakeholders.
  • Deep commitment to transparency, reliability culture, and continuous improvement.

Benefits

  • Comprehensive, multi-carrier health plan benefits
  • Disability insurance
  • Dependent care and commuter spending accounts
  • Life and accident insurance
  • Retirement benefits (salary investment plan/employer stock purchase plan)
  • Modern family benefits, including adoption and surrogacy
