
Senior Software Engineer – Deep Learning Compiler Verification, Infrastructure
NVIDIA
full-time
Location Type: Remote
Location: California • Oregon • United States
Salary
💰 $140,000 - $224,250 per year
About the role
- Drive CI and infrastructure capabilities that make deep learning compiler development fast, reliable, and scalable.
- This includes improving signal-to-noise (flake reduction, reproducibility, and richer diagnostics), accelerating iteration cycles, scaling capacity and coverage across models/hardware/software configurations, and building strong observability (metrics, logging, tracing, dashboards) so failures are easy to understand and fix.
- Explore practical uses of AI to enhance CI workflows—such as smarter test selection, automated triage/summarization, and faster issue isolation—ultimately increasing the quality and speed of deep learning compiler development, testing, and release.
Requirements
- BS, MS, or PhD (or equivalent experience) in Computer Science, Computer/Electrical Engineering, Mathematics, or related field
- 3+ years of professional experience designing and scaling CI/CD, build/release, or developer productivity infrastructure for DL/GPU software environments
- Strong software engineering skills (Python required) with ability to architect, implement, and debug complex systems end-to-end
- Hands-on experience building CI/MLOps platform capabilities—pipeline orchestration, artifact/package management, and production-grade observability (logs/metrics/dashboards)—with strong reliability and maintainability
- Experience with deep learning frameworks/runtime stacks (e.g., PyTorch, JAX, vLLM, SGLang, TensorRT, NeMo) and running real workloads in production-like environments
- Working knowledge of Linux-based development and debugging across complex software/hardware stacks (drivers, CUDA libraries, containers, cluster schedulers, etc.)
Benefits
- equity
- benefits
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, CI/CD, MLOps, pipeline orchestration, artifact management, observability, deep learning frameworks, CUDA, debugging, scalable systems
Soft Skills
problem-solving, communication, collaboration, analytical thinking, attention to detail