
Senior System Software Engineer – Dynamo Tools
NVIDIA
full-time
Location Type: Office
Location: Santa Clara • California • Texas • United States
Salary
💰 $184,000 - $356,500 per year
About the role
- Lead the design, development, and roadmap of AI-Perf, defining benchmarking methodologies, performance metrics, and reproducible experimental workflows.
- Build scalable and high-performance features to measure latency, throughput, and efficiency across AI models and distributed systems.
- Partner with AI researchers, platform teams, and engineers to translate experimental challenges into robust, user-friendly performance tooling.
- Integrate AI-Perf with the Dynamo Inference Stack, other NVIDIA inference stacks, and open-source inference frameworks, delivering end-to-end performance insights for researchers and production users.
Requirements
- Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.
- 8+ years of experience in systems software, distributed performance engineering, or AI infrastructure research.
- Expert-level Python skills, including profiling, optimization, automation, and debugging of complex systems.
- Deep knowledge of distributed systems concepts, including scalability, concurrency, fault tolerance, and performance trade-offs.
- Experience designing or maintaining performance benchmarking frameworks or tooling for AI/ML systems.
- Hands-on experience with LLMs and deep learning frameworks such as PyTorch, TensorFlow, TensorRT, or ONNX Runtime.
- Contributions to open-source or research projects in AI performance, infrastructure, or distributed systems.
- Experience running large-scale inference experiments across cloud and on-prem environments (AWS, Azure, GCP, bare metal).
Benefits
- equity
- comprehensive benefits package
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, performance benchmarking, profiling, optimization, automation, debugging, distributed systems, AI/ML systems, LLMs, deep learning frameworks