Principal Software Engineer – E2E Performance, Goodput
NVIDIA
Principal Software Engineer focused on end-to-end performance for NVIDIA's CSP Engagements team, collaborating with engineering teams to achieve performance targets on NVIDIA platforms.
Posted 6/27/2026 • full-time • Santa Clara • California, Oregon, Texas • 🇺🇸 United States • Lead • 💰 $272,000 - $431,250 per year
Tech Stack
Tools & technologies
- Pandas
- Python
About the role
Key responsibilities & impact
- Drive performance characterization work streams with engineering teams of key CSP/hyperscale customers, ensuring they understand platform performance expectations, profiling methodology, and tuning options for their specific workloads
- Gather and synthesize CSP performance feedback — identify gaps between expected and actual throughput, and champion optimization priorities back into NVIDIA's CUDA, NCCL, driver, and firmware teams
- Ensure key open-source performance and stress tools (e.g., STREAM, GPU Burn, GPU BLAST) are updated and validated for the latest NVIDIA rack-scale systems, GPU architectures, and CPU platforms — so customers and internal teams have reliable baseline measurements from day one
- Work closely with CSPs to ensure their own performance and validation tooling reflects the latest GPU capabilities, memory hierarchy changes, and platform-specific tuning parameters
- Conduct cross-CSP performance comparison and pattern analysis — identify configuration, software, or workload differences that explain performance gaps between deployments
- Collaborate with CSPs to ensure performance-related integration work (profiling infrastructure, benchmark harnesses, config validation) is ready ahead of deployment milestones
- Define test strategies and tooling requirements for performance validation — both for NVIDIA internal certification and customer acceptance
Requirements
What you’ll need
- 15+ years of experience in systems performance engineering, ideally in GPU/HPC/ML infrastructure
- BS or MS in Computer Science, Computer Engineering, or related field (or equivalent experience)
- Proficiency in GPU workload profiling: Nsight Systems, Nsight Compute, DCGM metrics, or equivalent instrumentation
- Understanding of distributed training performance dynamics: computation/communication overlap, pipeline bubbles, memory bandwidth utilization, collective efficiency
- Statistical methods for performance analysis: regression detection, confidence intervals, A/B comparison at scale
- Understanding of how the full software stack impacts performance: driver overhead, collective algorithm selection, memory allocation, scheduling, firmware power management
- Strong data analysis and visualization skills (Python, pandas, dashboards)
- Customer obsession — genuine passion for understanding why customers aren't achieving expected performance and driving solutions
- Ability to communicate performance findings to both deep technical audiences and executive leadership
- Demonstrated success influencing multiple engineering teams to prioritize performance improvements
Benefits
Comp & perks
- equity
- benefits
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- Performance Characterization
- GPU Workload Profiling
- Statistical Methods for Performance Analysis
- Understanding of Distributed Training Dynamics
- Data Analysis
- Performance Validation
- Test Strategy Definition
- Performance Optimization
- Benchmarking
- Configuration Validation
Soft Skills
- Customer Obsession
- Communication Skills
- Influencing Skills
- Collaboration
- Problem-Solving
- Analytical Thinking
- Leadership
- Passion for Performance Improvement