
Associate Manager, Site Reliability Engineering
Red Hat
full-time
Location Type: Remote
Location: Australia
About the role
- Lead and grow a team of SREs maintaining the overall health of OpenShift hosted properties
- Own the health, reliability and availability of OpenShift hosted properties
- Provide coaching, oversight and escalation support to the regional team of SREs
- Ensure that incidents are managed and resolved quickly, and that retrospectives and root-cause analyses are completed within expected timelines
- Oversee the creation and maintenance of knowledge articles and standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in the environment
- Manage regional shift schedules, ensuring 24x7 resource availability
- Participate in sprint planning and release cycles of SRE tooling
- Schedule maintenance windows, considering customer and SRE resource requirements
- Coordinate with teams across the organization to reduce operational friction and automate wherever possible
- Resolve customer issues in cooperation with Red Hat's global customer support team
- Identify and advocate for resources (e.g., training, licenses for new tools, dedicated time for exploration) to support the team's ongoing AI literacy and adoption.
- Ensure your team understands and applies guidelines for the ethical use of AI, addressing concerns such as data privacy, bias mitigation, intellectual property, and responsible disclosure.
- Foster a safe environment for experimentation and learning with AI technologies by supporting projects and experiments that encourage efficiency and simplicity, such as automating repetitive tasks, analyzing code metrics, or improving development processes. Support the team in testing and implementing ideas quickly and in recovering from failures.
Requirements
- 1+ years of experience managing engineering teams
- Must be comfortable managing distributed, remote staff
- Ability to understand and discuss deep technical issues with engineers
- Demonstrated experience with contemporary project management methodologies such as Agile, Kanban, and/or Scrum
- 1+ years of experience with cloud providers such as Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure
- 1+ year(s) of experience with Kubernetes is a plus
- 1+ year(s) of experience with Docker-based containers is a plus
Benefits
- Flexible working hours
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
OpenShift, Kubernetes, Docker, AI literacy, root-cause analysis, Agile, Kanban, Scrum, cloud computing, incident management
Soft Skills
leadership, coaching, communication, problem-solving, team management, collaboration, adaptability, critical thinking, mentoring, organizational skills