Coinbase

Senior Site Reliability Engineer, Core AI Infrastructure

Senior Site Reliability Engineer managing AI infrastructure at Coinbase, driving automation, reliability, and observability in critical AI operations.

Posted 6/9/2026 • Full-time • Remote • California • 🇺🇸 United States • Senior • 💰 $186,065 - $218,900 per year

Tech Stack

Tools & technologies
AWS, Cloud, Docker, Go, Kubernetes, Python, Ruby

About the role

Key responsibilities & impact
  • Own the reliability, monitoring, and incident response lifecycle for AI infrastructure services, including on-call support for AWS deployment pipelines, root cause analysis, and blameless retros.
  • Build automation and tooling to streamline operational IT workflows, eliminate manual tasks, and improve deployment velocity across CI/CD frameworks and Kubernetes environments.
  • Partner with the Coinbase Infrastructure team to extend CI/CD frameworks supporting IT services and enterprise network platforms, and with Security and Compliance to integrate surveillance tooling into deployment pipelines.
  • Strengthen observability and documentation standards across IT engineering by defining metrics, implementing monitoring solutions, and maintaining technical documentation that sets a standard of excellence.
  • Develop full-stack applications in Go or Python that power internal AI products and infrastructure.

Requirements

What you’ll need
  • 5+ years of experience automating and supporting cloud infrastructure (AWS) and network environments
  • Proven experience deploying, managing, and troubleshooting containerized workloads using Docker and Kubernetes in production environments
  • Proficiency in at least one scripting or programming language (Python, Bash, Ruby, or Go) and version control workflows using Git-based CI/CD pipelines
  • Track record of leading incident response in environments with strict SLAs, including root cause analysis, blameless retros, and measurable reliability improvements
  • Responsible use of generative AI, maintaining human oversight to deliver business-ready outputs and drive measurable improvements in workflow efficiency, cost, and quality

Benefits

Comp & perks
  • Medical
  • Dental
  • Vision
  • 401(k)

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
AWS, Kubernetes, Docker, Python, Go, Bash, Ruby, CI/CD, version control, automation
Soft Skills
incident response, root cause analysis, blameless retrospectives, collaboration, documentation, observability, leadership, workflow efficiency, communication, problem-solving