Lead end-to-end program management for major SRE programs, including incident management, release management, observability, automation, and capacity planning
Coordinate and orchestrate work across large, distributed teams of software engineers, SREs, DevOps teams, and stakeholders
Prepare and deliver executive-level presentations, dashboards, and reports that highlight project status, milestones, challenges, and outcomes
Partner with Incident Commanders to oversee post-incident reviews, drive root cause analysis, and implement preventive measures
Communicate program status, risks, and outcomes to senior leadership and stakeholders
Requirements
15+ years of professional experience in the high-tech industry
12+ years of experience in technical program management or site reliability engineering (SRE)
Proven track record of managing large-scale, technically complex programs involving 50+ team members
Demonstrated ability to work effectively with executives, including presenting strategic plans and program updates to senior leadership
Experience with SRE principles, including observability, incident response, and infrastructure automation
Experience with distributed systems, cloud platforms (e.g., AWS, Azure, GCP), and container orchestration (e.g., Kubernetes)
Experience with CI/CD pipelines, infrastructure as code (e.g., Terraform, Ansible), and monitoring tools (e.g., Prometheus, Grafana)
Experience building dashboards and applying a data-driven approach to projects
Experience with project management tools (e.g., Jira, Asana, Microsoft Project) and agile/scrum frameworks
Bachelor’s degree in Computer Science, Engineering, or a related technical field; advanced degree or equivalent experience
Benefits
Paid Time Off: earned time off, as well as paid company holidays based on region
Paid Parental Leave: take up to six months off with your child after birth, adoption or foster care placement
Full Health Benefits Plans: options for 100% employer-paid and minimum employee contribution health plans from day one of employment
Retirement Plans: select retirement and pension programs with potential for employer contributions
Learning and Development: options for coaching, online courses and education reimbursements
Compassionate Care Leave: paid time off following the loss of a loved one and other life-changing events
Applicant Tracking System Keywords
Hard skills
program management, site reliability engineering, incident management, release management, observability, automation, capacity planning, infrastructure automation, CI/CD pipelines, infrastructure as code