NVIDIA

Senior Software Engineer – Container and Cloud Infrastructure

full-time

Origin: 🇺🇸 United States • California

Salary

💰 $184,000 - $356,500 per year

Job Level

Senior

Tech Stack

Cloud, Docker, Kubernetes, Python

About the role

  • Design, build, and harden containers for NIM runtimes and inference backends; enable reproducible, multi-arch, CUDA-optimized builds.
  • Develop Python tooling and services for build orchestration, CI/CD integrations, Helm/Operator automation, and test harnesses; enforce quality with typing, linting, and unit/integration tests.
  • Help design and evolve Kubernetes deployment patterns for NIMs, including GPU scheduling, autoscaling, and multi-cluster rollouts.
  • Optimize container performance: layer layout, startup time, build caching, runtime memory/IO, network, and GPU utilization; instrument with metrics and tracing.
  • Evolve the base image strategy, dependency management, and artifact/registry topology.
  • Collaborate across research, backend, SRE, and product teams to ensure day-0 availability of new models.
  • Mentor teammates; set high engineering standards for container quality, security, and operability.
  • Build enterprise-grade software and tooling for container build, packaging, and deployment; improve reliability, performance, and scale across thousands of GPUs.
  • Support disaggregated LLM inference and emerging deployment patterns.

Requirements

  • 10+ years building production software with a strong focus on containers and Kubernetes.
  • Strong Python skills building production-grade tooling and services.
  • Experience with Python SDKs and clients for Kubernetes and cloud services.
  • Expert knowledge of Docker/BuildKit, containerd/OCI, image layering, multi-stage builds, and registry workflows.
  • Deep experience operating workloads on Kubernetes.
  • Strong understanding of LLM inference features, including structured output, KV cache, and LoRA adapters.
  • Hands-on experience building and running GPU workloads in k8s, including NVIDIA device plugin, MIG, CUDA drivers/runtime, and resource isolation.
  • Excellent collaboration and communication skills; ability to influence cross-functional design.
  • A degree in Computer Science, Computer Engineering, or a related field (BS or MS) or equivalent experience.
  • Expertise with Helm chart design systems, Operators, and platform APIs serving many teams (preferred).
  • Experience with the OpenAI and Hugging Face APIs, and an understanding of different inference backends (vLLM, SGLang, TRT-LLM) (preferred).
  • Background in benchmarking and optimizing inference container performance and startup latency at scale (preferred).
  • Prior experience designing multi-tenant, multi-cluster, or edge/air-gapped container delivery (preferred).
  • Contributions to open-source container, k8s, or GPU ecosystems (preferred).