
Senior HPC Scheduler Engineer
NVIDIA
full-time
Location Type: Hybrid
Location: Santa Clara • California • Utah • United States
Salary
💰 $224,000 - $356,500 per year
About the role
- Provide engineering solutions and prototypes to enable efficient resource management and job scheduling for large-scale clusters
- Drive next-generation requirements and features for schedulers in at-scale clusters
- Build and maintain technical relationships with internal and external engineering teams
- Assist system architects and machine learning/deep learning engineers in building creative solutions based on NVIDIA technology
- Serve as an internal reference for scheduling and resource management concepts and methodologies within the NVIDIA technical community
- Test, evaluate, and benchmark new technologies and products and work with vendors, partners and peers to improve functionality and optimize performance
Requirements
- BS, MS, or PhD in Engineering, Mathematics, Physics, Computer Science, or equivalent experience
- 12+ years of experience designing and running scheduling and resource management systems in large datacenter/AI/HPC solutions
- Knowledge of and experience with resource management and scheduling code bases: SLURM preferred, or other implementations (LSF, SGE, Torque...)
- Proven understanding of performance clusters, infrastructure and workload patterns
- Experience using and installing Linux-based server platforms
- C/Python/Bash/Lua programming/scripting experience
- Experience working with the engineering or academic research community supporting HPC or deep learning
- Strong teamwork skills and strong verbal and written communication skills
Benefits
- equity
- benefits