OpenAI

Datacenter Hardware Operations Technician Lead – Industrial Compute

The Datacenter Hardware Technician Lead oversees hardware reliability for the AI infrastructure at OpenAI, driving improvements and coordinating with vendors at flagship AI campuses in Texas.

Posted 6/20/2026 • Full-time • Abilene, Texas, 🇺🇸 United States • Senior • 💰 $86,400 - $228,000 per year

Tech Stack

Tools & technologies
Oracle

About the role

Key responsibilities & impact
  • Serve as OpenAI’s senior on-site hardware operations lead for server, GPU, storage, and rack-level infrastructure.
  • Drive technical triage and resolution of complex hardware failures impacting production systems.
  • Partner with Fleet Health Engineering to investigate recurring hardware issues, identify failure patterns, and improve fleet reliability.
  • Lead root cause analysis (RCA) efforts for critical hardware incidents and develop corrective and preventive action plans.
  • Collaborate with Oracle operations teams and OEM vendors to coordinate repairs, replacements, upgrades, and hardware lifecycle activities.
  • Establish and continuously improve hardware maintenance procedures, operational runbooks, and troubleshooting standards.
  • Analyze hardware failure trends and operational metrics to identify reliability risks and improvement opportunities.
  • Support new hardware introductions, validation activities, and production readiness reviews.
  • Coordinate spare parts strategy and inventory planning with supply chain and operations teams.
  • Partner with Hardware Engineering, Manufacturing, and Infrastructure teams to provide field feedback that improves future platform designs.
  • Develop scalable operational standards and best practices that can be deployed across future Stargate campuses.
  • Mentor technicians and partner teams on advanced troubleshooting methodologies and hardware operational excellence.

Requirements

What you’ll need
  • 8+ years of experience supporting large-scale datacenter hardware infrastructure, including time in a senior technician, sustaining engineering, or hardware operations leadership role.
  • Deep expertise with server platforms, GPU systems, storage infrastructure, rack integration, and datacenter hardware architecture.
  • Strong experience diagnosing complex hardware failures and leading repair efforts in production environments.
  • Experience conducting root cause analysis and driving long-term corrective actions.
  • Strong understanding of hardware reliability engineering principles and fleet-health management.
  • Proven ability to partner effectively across engineering, operations, manufacturing, and vendor organizations.
  • Comfortable operating independently in high-priority production environments with significant operational responsibility.
  • Excellent written and verbal communication skills with the ability to influence technical and operational decisions.
  • Experience developing operational processes, maintenance standards, and technical documentation.
  • Ability to travel occasionally to support new campus deployments and operational readiness activities.

Benefits

Comp & perks
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • server platforms
  • GPU systems
  • storage infrastructure
  • rack integration
  • datacenter hardware architecture
  • hardware reliability engineering
  • root cause analysis
  • troubleshooting methodologies
  • operational processes
  • technical documentation
Soft Skills
  • communication skills
  • influence
  • mentoring
  • collaboration
  • independent operation
  • problem-solving
  • leadership
  • analytical skills
  • organizational skills
  • adaptability