
Senior Data Center Operations Engineer
FluidStack
full-time
Location: New York • 🇺🇸 United States
Job Level
Senior
Tech Stack
Cloud
About the role
- Own regional data center operations end-to-end. Manage power, cooling, and rack infrastructure across multiple sites.
- Develop high-efficiency rack designs. Maximize space and power utilization to save millions in energy and infrastructure costs.
- Build automation tools that eliminate routine tasks. Every manual process is an opportunity to code a solution.
- Configure and manage global PDU infrastructure. Integrate with monitoring systems to calculate PUE, generate alerts, and create reports.
- Lead DCIM implementation and adoption. Track assets across multiple locations, reduce asset retrieval time, and maintain 99.99%+ accuracy.
- Drive automation initiatives. Integrate tracking systems with DCIM for real-time asset lifecycle updates.
- Design reports and dashboards. Identify rack and power density improvements to drive efficiency and capacity optimization.
- Maintain policy and procedure documents. Update SOPs and MOPs for compliance and efficiency.
- Use ticketing and knowledge base applications. Leverage Jira and Confluence to manage workflows and documentation.
- Document everything. Write clear procedures that enable others to execute flawlessly.
Requirements
- 5+ years managing data center operations at scale. Experience with hardware integration, capacity planning, and infrastructure optimization.
- Proven track record of achieving 99.99%+ accuracy in physical audits across multiple regions and countries.
- Experience managing DCIM implementations. You've tracked 100K+ assets and reduced retrieval times by 75%.
- Strong vendor management skills. Experience with ITAD relationships, hardware disposals, and generating revenue from e-waste.
- Expertise in power infrastructure. Knowledge of PDU configuration, PUE calculations, and energy optimization.
- Experience leading large-scale migrations. You've executed 10+ full-cage relocations while maintaining continuous service.
- Automation mindset. You've integrated RFID tracking with DCIM and improved data accuracy from 80% to 99.999%.
- Excellent vendor management skills. You negotiate effectively and hold partners accountable.
- Strong technical documentation skills. Experience creating and maintaining SOPs, MOPs, and training materials.
- Data-driven approach. You create dashboards and reports that drive infrastructure decisions.
- Extreme ownership mentality. You see problems through from identification to resolution.
- Experience with GPU infrastructure and high-performance computing environments (Nice to have).
- Familiarity with AI/ML workloads and their infrastructure requirements (Nice to have).
- Knowledge of liquid cooling systems for high-density compute (Nice to have).
- Experience building custom monitoring and automation tools (Nice to have).
- Background in hyperscale or cloud data center operations (Nice to have).