Tech Lead – Deployment & Operations, Custom Infrastructure

OpenAI

Lead a team responsible for deployment and operations of OpenAI’s custom silicon and systems in data center environments.

Posted 5/16/2026 • Full-time • San Francisco, California, 🇺🇸 United States • Senior • 💰 $342,000 - $445,000 per year

About the role

Key responsibilities & impact
  • Lead a team responsible for deployment and operations of OpenAI’s custom silicon and systems in data center environments
  • Own the path from hardware bring-up and validation through production deployment, operational readiness, and sustained fleet support
  • Partner closely with silicon, systems, software, infrastructure, networking, data center, supply chain, and external partner teams to ensure successful deployment at scale
  • Define deployment processes, operational playbooks, technical readiness criteria, escalation paths, and reliability practices for new hardware platforms
  • Drive cross-functional execution across lab bring-up, rack/system integration, data center deployment, fleet monitoring, debugging, and issue resolution
  • Stay hands-on technically through architecture reviews, deployment planning, failure analysis, operational debugging, and critical system-level decision-making
  • Identify gaps in tooling, observability, automation, validation coverage, and operational processes, and build plans to close them
  • Establish clear metrics for deployment readiness, reliability, performance, maintainability, and operational health
  • Build a strong engineering culture grounded in ownership, technical rigor, operational excellence, and high-velocity execution
  • Ensure OpenAI’s custom hardware platforms can be deployed and operated reliably, repeatably, and safely at scale
  • Be a contributor and technical driver for the architecture and design of future ML systems

Requirements

What you’ll need
  • 8+ years of engineering experience in hardware systems, infrastructure, data center deployment, production operations, systems engineering, silicon bring-up, or related technical domains
  • Strong technical depth in one or more of: hardware deployment, data center operations, rack-scale systems, silicon bring-up, systems validation, fleet operations, reliability engineering, infrastructure automation, or hardware/software integration
  • Experience bringing complex hardware systems from development or validation into production environments
  • Experience working closely with silicon, systems, software, infrastructure, networking, or data center teams
  • Experience with deployment planning, operational readiness, incident response, debugging, and root-cause analysis for production systems
  • Experience building tooling, automation, observability, or operational processes that improve deployment quality and fleet reliability
  • Demonstrated ability to hire, develop, and lead senior technical talent
  • Ability to move fluidly between people leadership, technical strategy, and hands-on operational problem solving
  • Strong written and verbal communication skills, especially in high-urgency, cross-functional technical environments
  • Experience working in fast-moving environments

Benefits

Comp & perks
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
hardware systems, infrastructure, data center deployment, production operations, systems engineering, silicon bring-up, reliability engineering, infrastructure automation, hardware/software integration, deployment planning
Soft Skills
leadership, technical strategy, problem solving, communication, cross-functional collaboration, team development, operational excellence, ownership, technical rigor, high-velocity execution