NVIDIA

Senior Systems Software Engineer – Kubernetes Node Lifecycle

Senior Systems Software Engineer role specializing in Kubernetes node engineering for NVIDIA's DGX Cloud. The position manages node lifecycle and ensures scalability for AI workloads, and calls for deep technical expertise.

Posted 6/11/2026 | full-time | Santa Clara • California, Washington | 🇺🇸 United States | Senior | 💰 $184,000 - $356,500 per year

Tech Stack

Tools & technologies
AWS, Azure, Bootstrap, Cloud, Go, Google Cloud Platform, Kubernetes, Node.js, Packer, Python

About the role

Key responsibilities & impact
  • Direct the building and refinement of CAPI providers for NVIDIA Kubernetes Engine, maintaining steady, consistent, and scalable node provisioning across DGX Cloud and NCP environments
  • Develop and maintain bring-your-own-node workflows that allow customers to integrate different NVIDIA hardware into NKE clusters while ensuring high operational consistency
  • Coordinate OS image generation, packaging, deployment, and update processes for NKE nodes
  • Ensure images are fine-tuned for NVIDIA GPU workloads and satisfy enterprise- and cloud-grade security and compliance criteria
  • Develop and sustain node image hardening pipelines, incorporating CIS benchmarks, automated CVE remediation, and promotion gates connected to security posture
  • Develop and maintain automated test suites for node images
  • Handle nodepool lifecycle at scale, including provisioning, upgrades, drain and cordon workflows, and seamless node replacement across very large clusters with diverse NVIDIA hardware
  • Examine, resolve, and determine underlying causes of node-layer faults in production NKE clusters
  • Communicate your progress and findings at internal and external gatherings such as KubeCon and GTC

Requirements

What you’ll need
  • 8 years of experience in systems software, cloud infrastructure, or Kubernetes node engineering
  • Bachelor’s or Master’s degree in Engineering (Electrical, Computer Engineering, Computer Science) or equivalent experience
  • Deep expertise in Cluster API (CAPI), including provider development and full machine lifecycle from provisioning to deletion
  • Extensive experience with OS image build pipelines, node image packaging, and delivery systems for Kubernetes nodes (for example image-builder, containerd, cloud-init, Packer)
  • Practical experience with bring-your-own-node models and integrating diverse hardware into live Kubernetes environments, including large-scale nodepool lifecycle management and upgrades
  • Strong understanding of kubelet configuration, node bootstrap, and the Kubernetes node registration lifecycle
  • Experience with node image security, including vulnerability scanning, patch automation, and compliance gating as part of image build pipelines
  • Proficiency in Golang and/or Python, and hands-on experience with at least one major public cloud provider (GCP, AWS, Azure, OCI or equivalent)

Benefits

Comp & perks
  • equity
  • health insurance
  • professional development

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Cluster API (CAPI), Kubernetes, OS image build pipelines, node image packaging, Golang, Python, vulnerability scanning, patch automation, cloud-init, containerd
Soft Skills
communication, problem-solving, coordination