
Senior Deep Learning Software Engineer
NVIDIA
full-time
Location Type: Hybrid
Location: Santa Clara • California • Washington • United States
Salary
💰 $224,000 - $356,500 per year
About the role
- Play a pivotal role in defining a modular, scalable platform that seamlessly bridges training and deployment workflows, enabling tight integration of deployment tooling with training frameworks such as Megatron and NeMo
- Leverage and build upon the PyTorch 2.0 ecosystem (TorchDynamo, torch.export, torch.compile, etc.) to analyze and extract a standardized model graph representation from arbitrary PyTorch models for our automated deployment solution
- Develop support for inference optimization techniques such as speculative decoding and LoRA
- Collaborate with teams across NVIDIA to use performant kernel implementations within the automated deployment solution
- Analyze and profile GPU kernel-level performance to identify hardware and software optimization opportunities
- Continuously innovate on inference performance to ensure NVIDIA's inference software solutions (TRT, TRT-LLM, TRT Model Optimizer) maintain and increase their leadership in the market
Requirements
- Master's, PhD, or equivalent experience in Computer Science, AI, Applied Math, or a related field
- 8+ years of relevant work or research experience in Deep Learning
- Excellent software design skills, including debugging, performance analysis, and test design
- Strong proficiency in Python, PyTorch, and related ML tools
- Strong algorithms and programming fundamentals
- Good written and verbal communication skills and the ability to work both independently and collaboratively in a fast-paced environment
Benefits
- equity
- benefits
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Deep Learning, Python, PyTorch, TorchDynamo, torch.export, torch.compile, inference optimization, speculative decoding, LoRA, performance analysis
Soft Skills
software design, debugging, test design, written communication, verbal communication, independent work, collaborative work, fast-paced environment
Certifications
Master's, PhD