Senior Systems Software Engineer – Kubernetes Node Lifecycle
NVIDIA
Senior Systems Software Engineer specializing in Kubernetes node engineering for NVIDIA's DGX Cloud. The role manages node lifecycle and ensures scalability for AI workloads, and calls for deep technical expertise.
Posted 6/11/2026
full-time
Santa Clara • California, Washington • 🇺🇸 United States
Senior
💰 $184,000 - $356,500 per year
Tech Stack
Tools & technologies
AWS, Azure, Bootstrap, Cloud, Go, Google Cloud Platform, Kubernetes, Node.js, Packer, Python
About the role
Key responsibilities & impact
- Direct the building and refinement of CAPI providers for NVIDIA Kubernetes Engine, maintaining steady, consistent, and scalable node provisioning across DGX Cloud and NCP environments
- Develop and maintain bring-your-own-node workflows that allow customers to integrate different NVIDIA hardware into NKE clusters while ensuring high operational consistency
- Coordinate OS image generation, packaging, deployment, and update processes for NKE nodes
- Ensure images are fine-tuned for NVIDIA GPU workloads and satisfy enterprise- and cloud-grade security and compliance criteria
- Develop and sustain node image hardening pipelines, incorporating CIS benchmarks, automated CVE remediation, and promotion gates connected to security posture
- Develop and maintain automated test suites for node images
- Handle nodepool lifecycle at scale, including provisioning, upgrades, drain and cordon workflows, and seamless node replacement across very large clusters with diverse NVIDIA hardware
- Examine, resolve, and determine underlying causes of node-layer faults in production NKE clusters
- Communicate your progress and findings at internal and external gatherings such as KubeCon and GTC
Requirements
What you’ll need
- 8 years of experience with a background in systems software, cloud infrastructure, or Kubernetes node engineering
- Bachelor’s or Master’s degree in Engineering (Electrical, Computer Engineering, Computer Science) or equivalent experience
- Deep expertise in Cluster API (CAPI), including provider development and full machine lifecycle from provisioning to deletion
- Extensive experience with OS image build pipelines, node image packaging, and delivery systems for Kubernetes nodes (for example image-builder, containerd, cloud-init, packer)
- Practical experience with bring-your-own-node models and integrating diverse hardware into live Kubernetes environments, including large-scale nodepool lifecycle management and upgrades
- Strong understanding of kubelet configuration, node bootstrap, and the Kubernetes node registration lifecycle
- Experience with node image security, including vulnerability scanning, patch automation, and compliance gating as part of image build pipelines
- Proficiency in Golang and/or Python, and hands-on experience with at least one major public cloud provider (GCP, AWS, Azure, OCI or equivalent)
Benefits
Comp & perks
- equity
- health insurance
- professional development
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Cluster API (CAPI), Kubernetes, OS image build pipelines, node image packaging, Golang, Python, vulnerability scanning, patch automation, cloud-init, containerd
Soft Skills
communication, problem-solving, coordination