NVIDIA

Senior Systems Software Engineer, Kubernetes Scale - DGX Cloud

Senior Systems Software Engineer role optimizing AI infrastructure at NVIDIA, driving performance and scalability for distributed systems using Kubernetes and cloud technologies.

  • Posted 6/11/2026
  • full-time
  • Santa Clara • California, Washington • 🇺🇸 United States
  • Senior
  • 💰 $184,000 - $356,500 per year

Tech Stack

Tools & technologies
AWS, Azure, Cloud, Distributed Systems, Go, Google Cloud Platform, Kubernetes, Python

About the role

Key responsibilities & impact
  • Drive end-to-end performance and scale characterization for the NVIDIA DGX Cloud software stack
  • Collaborate with AI researchers, developers and customers to develop innovative, automated tests
  • Deep dive into performance and scale issues in complex distributed systems
  • Design and develop monitoring, reporting and analysis tools for performance and scale testing
  • Triage, debug and root cause issues related to operating Kubernetes clusters at ultra-large scale
  • Build and maintain a high-velocity framework that enables continuous, always-on performance and scale testing
  • Document research, methodologies and results clearly
  • Engage efficiently with upstream communities

Requirements

What you’ll need
  • 8+ years of experience
  • Bachelor's/Master's in Engineering (preferably Electrical Engineering, Computer Engineering, or Computer Science) or equivalent experience
  • Expertise in Kubernetes and familiarity with related CNCF projects
  • Background in working with large scale parallel and distributed accelerator-based systems
  • Expertise optimizing performance and AI workloads on large scale systems
  • Experience with performance modeling and benchmarking at scale
  • Proficiency in Golang/Python
  • Background with the NVIDIA software ecosystem in both training and inference domains
  • Expertise with at least one public CSP infrastructure (for example GCP, AWS, Azure, or OCI)

Benefits

Comp & perks
  • equity
  • benefits

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Kubernetes, Golang, Python, performance modeling, benchmarking, distributed systems, AI workloads optimization, large scale systems, NVIDIA software ecosystem, CNCF projects
Soft Skills
collaboration, problem-solving, documentation, communication, engagement
Certifications
Bachelor's in Engineering, Master's in Engineering