PayPal

Senior Manager, Site Reliability Engineering

full-time

Location Type: Hybrid

Location: Scottsdale, Arizona, California, United States

Salary

💰 $152,500 - $226,600 per year

About the role

  • Manage and mentor a team of site reliability engineers, setting performance objectives, providing technical guidance, and ensuring alignment with business goals.
  • Oversee the execution of reliability initiatives, ensuring critical systems maintain high availability, resilience, and performance at scale.
  • Work with engineering, operations, and product teams to ensure seamless integration of reliability best practices into the development, deployment, and operational processes.
  • Lead incident management activities, including coordination of response efforts, root cause analysis, and implementing solutions to prevent future incidents.
  • Define and track key performance indicators (KPIs) related to system reliability, availability, and performance, reporting results to leadership regularly.
  • Promote and drive automation within the site reliability engineering team, ensuring processes are streamlined and systems operate with minimal manual intervention.
  • Manage capacity planning efforts, ensuring the scalability of systems and the ability to handle increasing traffic and resource demands effectively.
  • Ensure the development and testing of disaster recovery plans and procedures, minimizing downtime in the event of a failure.
  • Lead career development and mentorship efforts for team members, ensuring engineers have the tools and opportunities to grow their skills and advance their careers.

Requirements

  • 8+ years of relevant experience and a Bachelor’s degree, or any equivalent combination of education and experience.
  • Experience leading others.
  • Bachelor’s degree in Computer Science, Information Technology, or a related field; Master’s preferred.
  • 8+ years of experience in infrastructure management, with at least 3 years in a leadership role.
  • Extensive experience with multiple cloud platforms (AWS, Azure, GCP) and on-premises infrastructure management.
  • Demonstrated experience building or scaling AI/ML-based automation for operations, including AIOps platforms, alert noise reduction, auto-remediation, and intelligent runbooks.
  • Strong background in incident management, ITIL frameworks, and operational best practices.
  • Experience with monitoring tools, automation platforms, and infrastructure-as-code technologies.

Benefits

  • Generous paid time off
  • Healthcare coverage for you and your family
  • Resources to create financial security and support your mental health

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
site reliability engineering, infrastructure management, cloud platforms, AI/ML-based automation, incident management, ITIL frameworks, monitoring tools, automation platforms, infrastructure-as-code, disaster recovery

Soft Skills
team management, mentorship, technical guidance, performance objectives, communication, leadership, capacity planning, problem-solving, collaboration, career development

Certifications
Bachelor’s degree in computer science, Bachelor’s degree in Information Technology, Master’s degree (preferred)