Principal Operations Engineer – Hardware, Data Center Operations
FluidStack
Principal Operations Engineer overseeing the operational hardware fleet across hyperscale AI data centers, ensuring the reliability and continuous improvement of GPU systems and supporting hardware at scale.
Tech Stack
Tools & technologies
- Cloud
About the role
Key responsibilities & impact
- Serve as the most senior technical authority for the operational hardware fleet across our hyperscale AI data center portfolio.
- Ensure that the GPU systems, servers, and supporting hardware we deploy at scale are operated, maintained, and continuously improved.
- Lead site assessments and operational audits.
- Drive the technical readiness of teams ahead of site activation.
- Review hardware platforms and integration designs from an operational lens.
- Feed operational learnings back into the hardware engineering, deployment, and supply chain organizations as we shift toward a productized, repeatable build model.
Requirements
What you’ll need
- 10+ years of hands-on experience operating mission-critical hardware infrastructure, with at least 5 years as the senior technical voice on a site, campus, or fleet.
- Data center operations experience strongly preferred; hyperscale, large HPC, cloud, or other mission-critical compute infrastructure experience considered.
- Deep working command of GPU systems, server platforms, storage infrastructure, firmware lifecycle management, and hardware diagnostics — earned in the field, not from a textbook.
- Demonstrated ability to author, approve, and execute high-risk MOPs and change records in live production environments.
- A track record of leading root cause analysis on significant hardware events and driving corrective actions to closure.
- A track record of holding OEMs, ODMs, service vendors, and deployment partners accountable — you know how to enforce a standard without burning the relationship.
- Strong written communication: operational health assessments, RCAs, procedure reviews, and design review feedback are second nature.
- Comfort operating as the senior technical voice across operations, hardware engineering, network, facilities, supply chain, and customer-facing teams.
- Willingness to travel extensively across the fleet (50-75%).
Benefits
Comp & perks
- Competitive total compensation package (salary + equity).
- Retirement or pension plan, in line with local norms.
- Health, dental, and vision insurance.
- Generous PTO policy, in line with local norms.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to improve ATS matching.
Hard Skills & Tools
GPU systems, server platforms, storage infrastructure, firmware lifecycle management, hardware diagnostics, mission-critical hardware infrastructure, root cause analysis, high-risk MOPs, change records, operational audits
Soft Skills
leadership, written communication, accountability, collaboration, problem-solving, technical readiness, operational assessments, feedback delivery, relationship management, adaptability