
Site Reliability Engineer
Red Hat
full-time
Location Type: Hybrid
Location: Raleigh • Massachusetts, North Carolina • 🇺🇸 United States
Salary
💰 $94,550 - $151,170 per year
Job Level
Mid-Level, Senior
Tech Stack
Cloud, DNS, Go, Grafana, Jenkins, Kubernetes, Linux, OpenShift, Prometheus, Python, TCP/IP, Vault
About the role
- Design, write, and maintain software (primarily in Python and Golang) that automates the deployment, monitoring, and maintenance of Red Hat managed services
- Onboard new services onto our OpenShift-based platform
- Adhere to cloud-native design principles and best practices to ensure reliability, scalability, and security
- Contribute to documents, like standard operating procedures (SOPs) and playbooks, that assist in issue resolution and new-service onboarding
- Proactively utilize AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code) for code generation, auto-completion, and intelligent suggestions to accelerate development cycles and enhance code quality
- Participate in an Agile Scrum team that scopes, prioritizes, and allocates work items
- Participate in an on-call rotation that is responsible for responding to service incidents
Requirements
- Background writing object-oriented automation software in Python; experience with Golang is a plus
- Background administering production cloud-native services, preferably containerized and deployed via a container-orchestration system like Kubernetes or OpenShift
- Experience diagnosing service failures and carrying out incident response procedures
- Familiarity with the Linux operating system and its configuration
- Ability to effectively work in a globally distributed team
- Understanding of computer networking and protocols, including TCP/IP and DNS
- Understanding of computer security and cryptography basics, including certificates, TLS, and credential-storage systems like Vault, is a plus
- Familiarity with CI/CD pipeline concepts and systems, like Jenkins and Tekton/Argo, is a plus
- Familiarity with observability tools like Prometheus and Grafana, and with defining metrics that measure service health and reliability, is a plus
Benefits
- Comprehensive medical, dental, and vision coverage
- Flexible Spending Account - healthcare and dependent care
- Health Savings Account - high deductible medical plan
- Retirement 401(k) with employer match
- Paid time off and holidays
- Paid parental leave plans for all new parents
- Leave benefits including disability, paid family medical leave, and paid military leave
- Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!
Applicant Tracking System Keywords
Hard skills
Python, Golang, object-oriented programming, cloud-native services, Kubernetes, OpenShift, incident response, Linux, computer networking, computer security
Soft skills
collaboration, communication, problem-solving, adaptability, teamwork