NVIDIA

Distinguished Engineer – GPU Fleet Operations Automation

full-time

Location Type: Remote

Location: Remote • California, Colorado, Illinois, New York, Texas • 🇺🇸 United States

Salary

💰 $308,000 - $471,500 per year

Job Level

Senior, Lead

Tech Stack

Cloud, Kubernetes

About the role

  • Various Architectural Work: Define and drive the technical implementation of the DGX Cloud operations practice for the GPU fleet lifecycle.
  • Collaborate Across Domains and Disciplines: Drive technical strategy and awareness of best practices, and bring technical capabilities into DGX Cloud engineering practices.
  • Accelerate Integration: Guide technical delivery into DGX Cloud across all delivery environments: enterprise, public cloud, and high-security, isolated, or sovereign environments.
  • Engage Stakeholders: Collaborate with customers, infrastructure providers, and partners to ensure NVIDIA’s solutions set the industry standard for operational excellence.
  • Full Software and System Lifecycle: Lead all technical aspects of planning and continuous evolution for a large technical scope, from ideation through architecture, design, development, deployment, operations, and full lifecycle management.

Requirements

  • 15-18+ years overall in technical roles, with a focus on operations and automation for cloud infrastructure, platforms, and applications.
  • 5-10+ years of lead experience.
  • BS/MS or higher, or equivalent experience, in systems/software engineering or related engineering fields.
  • Technical proficiency in multi-tenant data center and cloud-native architectures, spanning bare metal, virtualization, containerization, higher-level abstractions (IaaS, Kubernetes, Slurm), and AI/ML platforms and applications.
  • Demonstrated success delivering high-impact, technically complex solutions that achieve high levels of transparency into resource utilization, performance, and operational insights.
  • Technical Leadership: Ability to synthesize multi-functional needs into architecture and design while guiding internal execution across complementary teams.
  • Communication and Partnership: Strong collaboration and influence skills, capable of leading engineering engagement, presenting with peers and partners, and working with high-performance accelerated computing customers.

Benefits

  • equity
  • benefits

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
cloud infrastructure, automation, multi-tenant data center architecture, cloud-native architecture, bare metal, virtualization, containerization, IaaS, Kubernetes, AI/ML platforms
Soft skills
technical leadership, collaboration, influence, communication
Certifications
BS in systems engineering, MS in software engineering