MSD

Senior Staff Reliability Engineer, Software Engineering

A senior reliability engineering role at a global health care leader that is transforming its reliability engineering practices. The role oversees Site Reliability Engineering and enhances systems across the organization.

Posted 5/29/2026 • Full-time • New Jersey, Pennsylvania • 🇺🇸 United States • Senior • 💰 $142,400 - $224,100 per year

Tech Stack

Tools & technologies
  • Cloud
  • ITSM
  • Open Source
  • SDLC

About the role

Key responsibilities & impact
  • Build relationships across the broader IT organization to increase adoption and maturity of SRE, Observability, and Resilience practices
  • Define and evolve the strategic vision for enterprise reliability engineering and ensure alignment across product, platform, and ITSM teams
  • Establish and enforce standards for Service Level Objectives, observability frameworks, and resilience engineering practices
  • Collaborate with engineering teams to ensure reliability is embedded into architecture, design, and delivery processes
  • Drive adoption of Service Level Objectives using Nobl9 as the system of record for reliability governance
  • Lead evaluation and introduction of new technologies that improve reliability outcomes while integrating with existing platforms
  • Apply AI capabilities to enhance reliability practices, including incident triage, diagnostics, and automation, in a governed and controlled manner
  • Collaborate on efforts to standardize observability across logs, metrics, traces, and events to improve system visibility and decision-making
  • Consult on and promote resilience patterns, including fault isolation, failover strategies, and recovery mechanisms
  • Guide improvements to incident lifecycle effectiveness, including detection, response, root cause analysis, and continuous improvement
  • Lead and mentor a community of reliability practitioners to grow organizational capability and maturity
  • Represent reliability engineering practice in architecture reviews, governance forums, and key IT initiatives
  • Drive continuous improvement of reliability practices through research, innovation, and feedback from engineering teams

Requirements

What you’ll need
  • Bachelor's degree in IT, Engineering, Computer Science, or a related field
  • Minimum of 7 years of experience in site reliability engineering
  • Expertise in capacity management, system integration, software development, release management, network design, configuration management (CM), software development life cycle (SDLC), system administration, change controls, and solution architecture
  • Proficiency in designing, managing, developing, and maintaining technological products, particularly in the animal health domain
  • Strong expertise in hardware, mechanics, artificial intelligence, and software development
  • Experience in program management, including product definition, development, testing, maintenance, and tier 4 support
  • Ability to conduct technological and product research and drive innovation
  • Skilled in developing and managing CI/CD pipelines for product development cycles
  • Knowledge of performance optimization and server software management
  • Experience with application deployment to both cloud and on-premises production environments
  • Understanding of product security, company development policies, and open source usage
  • Strong leadership skills including strategic planning, entrepreneurship, innovation, and business savviness
  • Proven track record in coaching and development, talent growth, and execution excellence
  • Strong commitment to inclusion, with the ability to influence and motivate others
  • Excellent emotional intelligence, decision-making skills, and a strong sense of ownership and accountability

Benefits

Comp & perks
  • Medical, dental, and vision healthcare and other insurance benefits (for employee and family)
  • Retirement benefits, including 401(k)
  • Paid holidays
  • Vacation
  • Compassionate and sick days

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • site reliability engineering
  • capacity management
  • system integration
  • software development
  • release management
  • network design
  • configuration management
  • software development life cycle
  • system administration
  • solution architecture
Soft Skills
  • leadership
  • strategic planning
  • innovation
  • coaching
  • talent growth
  • emotional intelligence
  • decision-making
  • influence
  • motivation
  • accountability
Certifications
  • Bachelor's degree in IT
  • Bachelor's degree in Engineering
  • Bachelor's degree in Computer Science