Red Hat

Site Reliability Engineer

full-time

Location Type: Hybrid

Location: Raleigh • Massachusetts, North Carolina • 🇺🇸 United States

Salary

💰 $94,550 - $151,170 per year

Job Level

Mid-Level, Senior

Tech Stack

Cloud, DNS, Go, Grafana, Jenkins, Kubernetes, Linux, OpenShift, Prometheus, Python, TCP/IP, Vault

About the role

  • Design, write, and maintain software (primarily in Python and Golang) that automates the deployment, monitoring, and maintenance of Red Hat managed services
  • Onboard new services onto our OpenShift-based platform
  • Adhere to cloud-native design principles and best practices to ensure reliability, scalability, and security
  • Contribute to documents, like standard operating procedures (SOPs) and playbooks, that assist in issue resolution and new-service onboarding
  • Proactively utilize AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code) for code generation, auto-completion, and intelligent suggestions to accelerate development cycles and enhance code quality
  • Participate in an Agile Scrum team that scopes, prioritizes, and allocates work items
  • Participate in an on-call rotation that is responsible for responding to service incidents

Requirements

  • Background writing object-oriented automation software in Python; experience with Golang is a plus
  • Background administering production cloud-native services, preferably containerized and deployed via a container-orchestration system like Kubernetes or OpenShift
  • Experience diagnosing service failures and carrying out incident response procedures
  • Familiarity with the Linux operating system and its configuration
  • Ability to effectively work in a globally distributed team
  • Understanding of computer networking and protocols, including TCP/IP and DNS
  • Understanding of computer security and cryptography basics, including certificates, TLS, and credential-storage systems like Vault, is a plus
  • Familiarity with CI/CD pipeline concepts and systems, like Jenkins and Tekton/Argo, is a plus
  • Familiarity with observability tools like Prometheus and Grafana, and with how to define metrics that measure service health and reliability, is a plus

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Python, Golang, object-oriented programming, cloud-native services, Kubernetes, OpenShift, incident response, Linux, computer networking, computer security
Soft skills
collaboration, communication, problem-solving, adaptability, teamwork