Principal Software Engineer, GPU Firmware, GPU System Software
NVIDIA
Principal Software Engineer driving GPU firmware and software work streams on the CSP Engagements team at NVIDIA, collaborating with engineers to optimize GPU operations for hyperscale customers.
Posted 6/27/2026 • Full-time • Santa Clara • California, Oregon, Texas • 🇺🇸 United States • Lead • 💰 $272,000 - $431,250 per year
About the role
Key responsibilities & impact
- Drive GPU firmware & software work streams with CSP engineering teams, ensuring they understand GPU firmware architecture (VBIOS, InfoROM, microcontroller firmware), update sequencing, recovery procedures, and GPU power management
- Gather and synthesize CSP feedback on GPU firmware/software — covering manageability, observability, security requirements (e.g., multi-tenancy isolation, secure boot, attestation), and performance — and champion those priorities into NVIDIA's GPU firmware/software feature roadmap and delivery plan
- Drive GPU firmware update orchestration for large-scale deployments — multi-GPU update sequencing, rollback strategy, failure handling, and validation across hundreds of GPUs per rack
- Serve as the technical focal point between NVIDIA and CSP firmware/software engineering — ensuring GPU behaviors (error recovery flows, thermal protection, power state transitions) are well-documented and accessible for customer integration
- Identify cross-CSP GPU SW/FW issue patterns — common update failures, recovery gaps, and configuration problems — and drive documentation, tooling, and test strategy improvements
Requirements
What you’ll need
- 15+ years of experience in GPU system software, GPU firmware, or accelerator platform engineering
- BS or MS in Computer Science, Electrical Engineering, or related field (or equivalent experience)
- Deep understanding of GPU architecture internals: streaming multiprocessors, GEMM execution, compute kernels, memory hierarchy, and how firmware/driver decisions impact GPU compute performance
- Understanding of multi-GPU fabric architectures (NVLink or similar) and how firmware coordinates across multiple GPUs in a rack-scale system
- Understanding of GPU firmware architecture: VBIOS, GPU microcontroller firmware, InfoROM, and their interaction with the GPU driver stack
- Experience with firmware update lifecycle management at scale: multi-device update sequencing, A/B updates, rollback, staged rollout, emergency recovery
- Understanding of GPU error handling and recovery flows — how firmware-level errors propagate through the driver stack to application-visible failures
- Experience with GPU health monitoring and telemetry: Xid errors, thermal events, power events, ECC counters, and their significance for firmware/software teams
- Customer obsession: genuine passion for simplifying GPU firmware integration for fleet-scale customers
- Proven success influencing engineering teams to improve quality and fleet manageability
Benefits
Comp & perks
- Equity
- Benefits