NVIDIA

Senior Deep Learning Engineer – Autonomous Vehicles

full-time

Location Type: Remote

Location: Remote • California, Colorado, New York, Texas, Washington • 🇺🇸 United States

Salary

💰 $224,000 - $356,500 per year

Job Level

Senior

Tech Stack

Distributed Systems, Kubernetes, Python, PyTorch

About the role

  • Crafting, scaling, and hardening deep learning infrastructure libraries and frameworks for training on multi-thousand GPU clusters.
  • Improving efficiency throughout the training stack: data loaders, distributed training, scheduling, and performance monitoring.
  • Building robust training pipelines and libraries to handle massive video datasets and enable rapid experimentation.
  • Collaborating with researchers, model engineers, and internal platform teams to enhance efficiency, minimize stalls, and improve training availability.
  • Owning core infrastructure components such as orchestration libraries, distributed training frameworks, and fault-resilient training systems.
  • Partnering with leadership to ensure infrastructure scales with growing GPU capacity and dataset size while maintaining developer efficiency and stability.

Requirements

  • BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, or a related field, or equivalent experience.
  • 12+ years of professional experience building and scaling high-performance distributed systems, ideally in ML, HPC, or large-scale data infrastructure.
  • Extensive knowledge of deep learning frameworks (PyTorch preferred), large-scale training (DDP/FSDP, NCCL, tensor/pipeline parallelism), and performance profiling.
  • Strong systems background: datacenter networking (RoCE, IB), parallel filesystems (Lustre), storage systems, schedulers (Slurm, Kubernetes, etc.).
  • Proficiency in Python and C++, with experience writing production-grade libraries, orchestration layers, and automation tools.
  • Ability to work closely with multi-functional teams (ML researchers, infra engineers, product leads) and translate requirements into robust systems.

Benefits

  • equity
  • benefits
