NVIDIA

Senior HPC Scheduler Engineer

full-time

Location Type: Hybrid

Location: Santa Clara, California, Utah, United States

Salary

💰 $224,000 - $356,500 per year

About the role

  • Provide engineering solutions and prototypes to enable efficient resource management and job scheduling for large-scale clusters
  • Drive next-generation requirements and features for schedulers in at-scale clusters
  • Build and maintain technical relationships with internal and external engineering teams
  • Assist system architects and machine learning/deep learning engineers in building creative solutions based on NVIDIA technology
  • Be an internal reference for scheduling and resource management concepts and methodologies among the NVIDIA technical community
  • Test, evaluate, and benchmark new technologies and products, and work with vendors, partners, and peers to improve functionality and optimize performance

Requirements

  • BS, MS, or PhD in Engineering, Mathematics, Physics, Computer Science, or equivalent experience
  • 12+ years of experience designing and running scheduling and resource management systems in large datacenter/AI/HPC solutions
  • Knowledge of and experience with resource management and scheduling code bases: SLURM preferred, or other implementations (LSF, SGE, Torque...)
  • Proven understanding of performance clusters, infrastructure and workload patterns
  • Experience using and installing Linux-based server platforms
  • C/Python/Bash/Lua programming/scripting experience
  • Experience working with an engineering or academic research community supporting HPC or deep learning
  • Strong teamwork and both verbal and written communication skills

Benefits

  • equity
  • benefits