Senior Staff Reliability Engineer, Software Engineering

MSD

Senior Principal Reliability Engineer at a global healthcare company, leading reliability engineering practices and collaborating across IT teams to enhance system resilience and performance.

Posted 5/15/2026 • full-time • Rahway • New Jersey, Pennsylvania • 🇺🇸 United States • Senior • 💰 $142,400 - $224,100 per year

Tech Stack

Tools & technologies
  • Cloud
  • ITSM
  • Open Source
  • SDLC

About the role

Key responsibilities & impact
  • Build relationships across the broader IT organization to increase adoption and maturity of SRE, Observability, and Resilience practices
  • Define and evolve the strategic vision for enterprise reliability engineering and ensure alignment across product, platform, and ITSM teams
  • Establish and enforce standards for Service Level Objectives, observability frameworks, and resilience engineering practices
  • Collaborate with engineering teams to ensure reliability is embedded into architecture, design, and delivery processes
  • Drive adoption of Service Level Objectives using Nobl9 as the system of record for reliability governance
  • Lead evaluation and introduction of new technologies that improve reliability outcomes while integrating with existing platforms
  • Apply AI capabilities to enhance reliability practices, including incident triage, diagnostics, and automation, in a governed and controlled manner
  • Collaborate on efforts to standardize observability across logs, metrics, traces, and events to improve system visibility and decision-making
  • Consult on and promote resilience patterns including fault isolation, failover strategies, and recovery mechanisms
  • Guide improvements to incident lifecycle effectiveness, including detection, response, root cause analysis, and continuous improvement
  • Lead and mentor a community of reliability practitioners to grow organizational capability and maturity
  • Represent reliability engineering practice in architecture reviews, governance forums, and key IT initiatives
  • Drive continuous improvement of reliability practices through research, innovation, and feedback from engineering teams

Requirements

What you’ll need
  • Bachelor’s degree in IT, Engineering, Computer Science, or related field
  • Minimum 7 years of experience in site reliability engineering
  • Expertise in capacity management, system integration, software development, release management, network design, configuration management (CM), software development life cycle (SDLC), system administration, change controls, and solution architecture
  • Proficiency in designing, managing, developing, and maintaining technological products, particularly in the animal health domain
  • Strong expertise in hardware, mechanics, artificial intelligence, and software development
  • Experience in program management, including product definition, development, testing, maintenance, and tier 4 support
  • Ability to conduct technological and product research and drive innovation
  • Skilled in developing and managing CI/CD pipelines for product development cycles
  • Knowledge of performance optimization and server software management
  • Experience with application deployment to both cloud and on-premises production environments
  • Understanding of product security, company development policies, and open source usage
  • Strong leadership skills including strategic planning, entrepreneurship, innovation, and business savviness
  • Proven track record in coaching and development, talent growth, and execution excellence
  • Strong commitment to inclusion, with the ability to influence and motivate others
  • Excellent emotional intelligence, decision-making skills, and a strong sense of ownership and accountability

Benefits

Comp & perks
  • medical, dental, vision healthcare and other insurance benefits (for employee and family)
  • retirement benefits, including 401(k)
  • paid holidays
  • vacation
  • compassionate and sick days
