Backend Engineer

• Define, architect, and implement Hydra Host’s first production storage platform tailored for bare-metal GPU clusters and AI/HPC workloads.
• Lead all technical decisions around storage stack design, from hardware infrastructure to parallel file system orchestration and performance tuning.
• Select, build, and maintain storage solutions spanning both the block (NVMe, SAN, Ceph, etc.) and object (S3-compatible, custom, or Ceph Object Gateway) storage layers.
• Design for high-throughput, low-latency access, supporting large datasets, rapid checkpointing, and parallel access for distributed AI training workloads.
• Integrate and optimize parallel file systems such as Lustre, BeeGFS, Spectrum Scale, WekaIO, or CephFS, ensuring maximum performance and fault tolerance.
• Ensure compatibility across Hydra’s diverse GPU/OEM ecosystem, accounting for unique firmware, BMC/Redfish APIs, and hardware configurations.
• Develop automation, observability, and management tooling for storage, focusing on reliability, scalability, and efficiency.
• Act as both builder and architect: stay deeply hands-on in deployment, troubleshooting, and optimization while guiding the long-term storage roadmap.
• Collaborate cross-functionally with GPU, HPC, and platform engineering teams to integrate storage with compute and network layers.
• Interface with customers and product leadership to define feature priorities, performance benchmarks, and future enhancements.

Storage Engineer
