WEX

Senior SRE Manager

full-time

Location Type: Hybrid

Location: Portland; California; Illinois; United States

Salary

💰 $175,600 - $204,300 per year

About the role

  • architect and oversee the implementation of mission-critical systems
  • define and enforce SRE best practices and operational standards
  • lead cross-functional initiatives to enhance system reliability and performance
  • serve as a technical advisor for engineering leadership
  • develop capacity planning and load testing strategies
  • design self-healing and auto-recovery mechanisms
  • drive cloud cost optimization and budgeting initiatives
  • lead one or more SRE teams responsible for a major platform or domain
  • partner with Engineering, Product, and Program stakeholders to align team delivery with business priorities

Requirements

  • 8+ years of experience with a focus on large-scale system reliability
  • expertise in system architecture, cloud platforms, and automation frameworks
  • deep knowledge of Kubernetes, service meshes, and distributed tracing
  • experience with monitoring and logging (Grafana, ELK stack, Splunk, etc.)
  • knowledge of containerization and orchestration (Docker, Kubernetes)
  • experience designing high-availability, fault-tolerant architectures
  • strong understanding of database reliability engineering (MySQL, PostgreSQL, NoSQL)
  • knowledge of networking, databases, and storage architectures
  • excellent incident command and crisis management skills
  • experience setting team OKRs and aligning reliability goals with product and platform engineering strategies

Benefits

  • health, dental, and vision insurance
  • retirement savings plan
  • paid time off
  • health savings account
  • flexible spending accounts
  • life insurance
  • disability insurance
  • tuition reimbursement

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
system architecture, cloud platforms, automation frameworks, Kubernetes, service meshes, distributed tracing, monitoring, logging, high-availability architecture, database reliability engineering
Soft Skills
leadership, cross-functional collaboration, technical advising, incident command, crisis management, capacity planning, load testing, cost optimization, team alignment, goal setting