Senior Deep Learning Engineer – Autonomous Vehicles

NVIDIA

Senior Deep Learning Systems Engineer building and scaling training libraries for autonomous driving at NVIDIA. Collaborating with research and platform teams on high-performance distributed systems.

Posted 6/29/2026 | Full-time | Santa Clara • California, Colorado • 🇺🇸 United States | Senior | 💰 $224,000 - $356,500 per year

Tech Stack

Tools & technologies

Distributed Systems, Kubernetes, Python, PyTorch

About the role

Key responsibilities & impact

Crafting, scaling, and hardening deep learning infrastructure libraries and frameworks for training on multi-thousand GPU clusters.
Improving efficiency throughout the training stack: data loaders, distributed training, scheduling, and performance monitoring.
Building robust training pipelines and libraries to handle massive video datasets and enable rapid experimentation.
Collaborating with researchers, model engineers, and internal platform teams to enhance efficiency, minimize stalls, and improve training availability.
Owning core infrastructure components such as orchestration libraries, distributed training frameworks, and fault-resilient training systems.
Partnering with leadership to ensure infrastructure scales with growing GPU capacity and dataset size while maintaining developer efficiency and stability.

Requirements

What you’ll need

BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, or a related field, or equivalent experience.
12+ years of professional experience building and scaling high-performance distributed systems, ideally in ML, HPC, or large-scale data infrastructure.
Extensive knowledge of deep learning frameworks (PyTorch preferred), large-scale training (DDP/FSDP, NCCL, tensor/pipeline parallelism), and performance profiling.
Strong systems background: datacenter networking (RoCE, IB), parallel filesystems (Lustre), storage systems, schedulers (Slurm, Kubernetes, etc.).
Proficiency in Python and C++, with experience writing production-grade libraries, orchestration layers, and automation tools.
Ability to work closely with multi-functional teams (ML researchers, infra engineers, product leads) and translate requirements into robust systems.

Benefits

Comp & perks

Equity
Benefits

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Deep Learning, Distributed Training, Performance Profiling, Production-Grade Libraries, Orchestration Layers, Automation Tools, Large Scale Training, Fault-Resilient Training Systems, Data Loaders, Scheduling

Soft Skills

Collaboration, Communication