Red Hat

Senior Site Reliability Engineer

full-time

Origin: 🇺🇸 United States • Colorado


Salary

💰 $111,260 - $183,580 per year

Job Level

Senior

Tech Stack

Ansible, AWS, Azure, Chef, Cloud, Distributed Systems, DNS, Docker, Go, Google Cloud Platform, Java, Kubernetes, Linux, OpenShift, Open Source, Prometheus, Puppet, Python, TCP/IP, Unix

About the role

  • Develop, scale, and operate OpenShift managed cloud services
  • Contribute code to increase the scalability and reliability of the service
  • Contribute software tests and participate in peer review to increase the quality of our codebase
  • Help and develop peers’ capabilities through knowledge sharing, mentoring, and collaboration
  • Participate in a regular on-call schedule, including occasional paid weekends and holidays
  • Practice sustainable incident response and blameless postmortems
  • Resolve customer issues escalated from the Red Hat Global Support team
  • Work within a small agile team to develop and improve SRE software, support your peers, plan work, and keep improving
  • Enable customer self-service, make monitoring more sustainable, and eliminate work through automation

Requirements

  • Bachelor's degree in Computer Science or a related technical field required (hands-on experience may be considered in lieu of degree)
  • Experience programming in at least one language: Python, Golang, C, or another object-oriented language
  • Experience working with public clouds such as AWS, GCP, or Azure
  • Ability to collaboratively troubleshoot and solve problems in a team setting
  • Experience troubleshooting an as-a-service offering (SaaS, PaaS) and working with complex distributed systems (preferred)
  • Basic understanding of Unix/Linux operating systems
  • 5+ years of experience managing Linux servers running RHEL, CentOS, or Fedora hosted at cloud providers such as AWS, GCE, or Azure [desired]
  • 3+ years of experience with enterprise systems monitoring; knowledge of Prometheus is a plus [desired]
  • 3+ years of experience with enterprise configuration management software like Ansible, Puppet, or Chef [desired]
  • 2+ years of experience programming with at least one object-oriented language; Golang, Java, or Python preferred [desired]
  • 2+ years of experience delivering a hosted service [desired]
  • Demonstrated ability to quickly and accurately troubleshoot system issues
  • Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP
  • Solid communication skills and experience working directly with and presenting to customers
  • 1+ year(s) of experience with Kubernetes is a plus
  • 1+ year(s) of experience with Docker-based containers is a plus
NVIDIA

Senior Site Reliability Engineer

NVIDIA
Senior · full-time · $208k–$334k / year · California · 🇺🇸 United States
Posted: 19 days ago · Source: nvidia.wd5.myworkdayjobs.com
Ansible, AWS, Azure, Chef, Cloud, Distributed Systems, DNS, Go, Google Cloud Platform, Grafana, Kubernetes, Linux, +7 more
Content Conspiracy

Senior Site Reliability Engineer – Operational Platforms

Content Conspiracy
Senior · full-time · $121k–$141k / year · Colorado · 🇺🇸 United States
Posted: 12 days ago · Source: boards.greenhouse.io
Ansible, AWS, Azure, Chef, Cloud, Cyber Security, Docker, Go, Google Cloud Platform, Jenkins, Kubernetes, Linux, +7 more
NVIDIA

Senior Site Reliability Engineer

NVIDIA
Senior · full-time · 🇮🇳 India
Posted: 15 days ago · Source: nvidia.wd5.myworkdayjobs.com
Ansible, AWS, Azure, Chef, Cloud, Go, Google Cloud Platform, Grafana, Kubernetes, Linux, Microservices, Prometheus, +5 more
CCR GROUP

System Engineer

CCR GROUP
Mid · Senior · full-time · Iowa · 🇺🇸 United States
Posted: 5 days ago · Source: recruiting.paylocity.com
Ansible, AWS, Azure, Cloud, Cyber Security, DNS, Docker, Firewalls, Google Cloud Platform, JavaScript, Kubernetes, Linux, +8 more
Coalfire

Cloud Infrastructure Administrator II

Coalfire
Junior · full-time · $64k–$112k / year · 🇺🇸 United States
Posted: 14 days ago · Source: jobs.lever.co
Ansible, AWS, Azure, Cloud, Cyber Security, DNS, Docker, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, +4 more