Salary
💰 $184,000 - $356,500 per year
Tech Stack
AWS, Azure, Cloud, Distributed Systems, Google Cloud Platform, Kubernetes, Microservices, Python, Rust
About the role
- NVIDIA is seeking a Senior System Software Engineer to build user-facing tools for the Dynamo Inference Server
- Build and maintain distributed model management systems, including Rust-based runtime components for large-scale AI inference workloads
- Implement inference scheduling and deployment solutions on Kubernetes and Slurm; drive advances in scaling, orchestration, and resource management
- Collaborate with infrastructure engineers and researchers to develop scalable APIs, services, and end-to-end inference workflows
- Create monitoring, benchmarking, automation, and documentation processes to ensure low-latency, robust, production-ready inference systems on GPU clusters
- Work in a remote-friendly, fast-paced team focused on GPU-accelerated deep learning software
Requirements
- Bachelor’s, Master’s, or PhD in Computer Science, ECE, or related field (or equivalent experience)
- 6+ years of professional systems software development experience
- Strong programming expertise in Rust (C++ and Python are a plus)
- Deep knowledge of distributed systems, runtime orchestration, and cluster-scale services
- Hands-on experience with Kubernetes, container-based microservices, and integration with Slurm
- Proven ability to excel in fast-paced R&D environments and collaborate across functions
- (Nice-to-have) Experience with Dynamo Inference Server, TensorRT, ONNX Runtime, and LLM inference pipelines at scale
- (Nice-to-have) Contributions to large-scale, low-latency distributed systems and GPU inference performance tuning (CUDA, cloud-native/hybrid environments)