Metsi Technologies

Principal Engineer – Service Delivery

full-time

Location Type: Remote

Location: India

About the role

  • Provide world-class delivery support to our customers
  • Deploy, configure, and validate GPU‑accelerated compute clusters for AI, ML, and HPC with NVIDIA Base Command Manager
  • Perform benchmarking with HPL GPU, HPL MxP, STREAM, NCCL, RCCL, and related tools
  • Produce as-built documentation and performance reports, and share best practices with the team
  • Configure and secure RHEL, Ubuntu, and Rocky Linux for GenAI or HPC workloads
  • Continuously learn and work with the latest GenAI platforms and infrastructure

Requirements

  • 7+ years of experience with HPC or GenAI clusters, GPU-based systems, AI infrastructure, or related fields
  • Deep hands‑on experience with GPU deployment, configuration, and multi-node testing
  • Proficiency with benchmarking tools: HPL, HPL-MxP, STREAM, NCCL, RCCL
  • Red Hat certification (RHCSA/RHCE) or 7+ years of relevant experience with Red Hat-based distributions
  • Experience with GenAI/HPC networking (InfiniBand and/or RoCE)
  • Bachelor’s degree (desirable)
  • Proven ability to lead sub-teams (desirable)

Benefits

  • Health insurance
  • Flexible working hours
  • Professional development