Apply

Ready to go for it?

AI Apply speeds things up—apply directly if you prefer.

FREE ACCESS
5,000–10,000 jobs/day
JobTailor Logo

See all jobs on JobTailor

Search thousands of fresh jobs every day.

Discover
  • Fresh listings
  • Fast filters
  • No subscription required
Create a free account and start exploring right away.
Hydra Host

HPC Solutions Engineer

Hydra Host

HPC Solutions Engineer designing and managing high-performance GPU computing environments for premium clients at Hydra Host. Collaborating with teams and optimizing systems for large-scale environments.

Posted 7/2/2026full-timeRemote • 🇺🇸 United StatesSeniorLeadWebsite

Tech Stack

Tools & technologies
AnsiblePythonTerraform

About the role

Key responsibilities & impact
  • Work with customers in technical discovery to help define requirements and deliverables for their use cases and help them to effectively utilize distributed GPU computing resources.
  • Identify and recommend the best tools for each customer, while building boilerplate and reference implementations that can be reused for subsequent customers.
  • Manage GPU clusters and coordinate with IT to ensure efficient operation of NVIDIA GPUs, utilizing technologies such as InfiniBand for high-speed networking.
  • Oversee the deployment and maintenance of machine learning environments using virtual storage solutions and distributed computing (HPC) tools such as SLURM.
  • Automate infrastructure provisioning and management using tools such as Ansible and Terraform.
  • Collaborate with data scientists and engineers to ensure seamless integration of ML models into production environments.
  • Conduct performance tuning and optimization of systems to maximize throughput and reduce latency.
  • Stay current with the latest industry trends in machine learning technologies and HPC to ensure the use of best practices in infrastructure setup and model development.
  • Document and maintain operational procedures and system configurations.

Requirements

What you’ll need
  • 7+ years of high performance compute, distributed machine learning, GPU computing, and/or system architecture experience
  • Proficient in managing NVIDIA GPU environments and a familiarity with GPU computing frameworks and libraries
  • Strong experience with high-speed networking technologies, specifically InfiniBand
  • Experience with HPC job schedulers, preferably SLURM
  • Expertise in automating environment setup and maintenance using Ansible and Terraform
  • Demonstrated ability in deploying and managing virtual storage solutions
  • Strong coding skills in Python and familiarity with machine learning libraries and frameworks
  • Excellent problem-solving, communication, and teamwork skills.

Benefits

Comp & perks
  • You will work with the most diverse hardware configurations and locations available to anyone in the industry as well as cutting edge GPU use cases
  • This role is fully remote with a high accountability and high agency culture
  • This role offers a competitive salary, equity and benefits
  • This role offers flexible PTO

ATS Keywords

✓ Tailor your resume
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
High Performance ComputingDistributed Machine LearningGPU ComputingSystem ArchitectureVirtual Storage SolutionsPerformance TuningMachine Learning LibrariesDistributed ComputingTechnical DiscoveryOperational Procedures Documentation
Soft Skills
Problem-SolvingCommunicationTeamwork