FluidStack

Data Center Operations Manager

full-time

Origin: 🇺🇸 United States • New York

Job Level

Senior

Tech Stack

Cloud

About the role

  • Lead and manage all datacenter operations during the assigned shift, ensuring 24/7 reliability of the GPU supercomputing infrastructure
  • Oversee a team of technicians and provide hands-on leadership
  • Train and mentor junior technicians to enhance skills and team performance
  • Oversee incident response and troubleshooting, providing technical guidance and escalation support
  • Develop and implement operational procedures and best practices to improve efficiency and reduce downtime
  • Coordinate with other shift leads to ensure seamless handovers and consistent operations across shifts
  • Ensure the highest levels of reliability and performance for the GPU supercomputing infrastructure

Requirements

  • 5+ years of experience in datacenter operations
  • At least 2 years in a leadership or supervisory role
  • Strong technical background in datacenter infrastructure (servers, networking, power, cooling)
  • Proven track record of leading and developing technical teams
  • Excellent communication and interpersonal skills
  • Experience with incident management, root cause analysis, and implementing corrective actions
  • Ability to work in a 24/7 shift-based operation, including shift lead responsibilities

Nice to have

  • Experience with GPU clusters and high-performance computing environments
  • Familiarity with DCIM tools and automation platforms
  • Experience working with hyperscale or colocation datacenter environments
  • Previous experience in a 24/7 shift-based operation