Manager, Distinguished Engineer – DGX Systems Software
NVIDIA
Posted 4/21/2026 • Full-time • Santa Clara, California, United States • Senior, Lead • $320,000 - $488,750 per year
About the role
Key responsibilities & impact
- End-to-End Stack Readiness: Ensure every DGX platform is ready for the full NVIDIA software stack (firmware, DGX OS, GPU drivers, CUDA toolkit, DCGM, DOCA/OFED, and management tools) as a validated, production-quality product.
- Platform Firmware Development: Lead development of the manageability firmware stack (BMC, BIOS) for all DGX platforms.
- Validation Strategy: Define the validation strategy that proves each DGX platform is production-ready, with end-to-end system validation covering firmware regression, NVQual certification, DL workload performance, OS/CUDA stack testing, multi-user scenarios, power/thermal validation, and field upgrade reliability.
- Platform Bring-Up & Architecture: Drive platform bring-up for each new DGX system—coordinating first boot across new silicon (CPU, GPU), board design, and firmware teams.
- Customer Deployment & Enablement: Ensure firmware release flows meet CSP and enterprise deployment requirements.
- Product Delivery Lifecycle: Own the complete DGX delivery lifecycle—system architecture, firmware development, integration, full-stack validation, GA release, and customer deployment—for every DGX product.
- Cross-Org Alignment: Serve as single point of accountability for DGX platform readiness across NVIDIA—aligning GPU, CPU, networking, security, OS, and AI software teams to deliver on schedule.
- Quality & Vendor Management: Own RCCA processes for field issues.
- Team Leadership: Build and lead a world-class engineering organization.
Requirements
What you’ll need
- BS or MS in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
- 12+ years overall in systems firmware/software engineering, with 5+ years in engineering leadership.
- Deep expertise in the server system stack, including SBIOS, BMC, OS, and applications, and in system-level integration of complex multi-component products.
- Proven track record delivering multi-generation server or data center platforms from architecture through customer deployment.
- Experience managing engineering organizations across multiple geographies in a matrix environment.
- Strong understanding of server hardware: CPU, GPU, interconnect, memory, PCIe, power delivery.
- Experience owning end-to-end product quality—from firmware validation through full-stack system testing to field deployment.
Benefits
Comp & perks
- equity
- benefits