Tech Stack
Cloud, Distributed Systems, Kubernetes, Linux, Python, PyTorch, Ray, TensorFlow
About the role
- Part of the team that helps productize the software stack for the AI compute engine
- Develop, enhance, and maintain next-generation AI deployment software
- Work across all aspects of the full-stack toolchain and optimize hardware-software co-design
- Build and scale software deliverables under tight development windows
- Build out deployment infrastructure and collaborate closely with ML, compiler, and hardware experts
- Contribute system software expertise to design distributed, high-performance deployment systems
Requirements
- BS in Computer Science, Engineering, Math, Physics, or a related field with 5+ years of industry software development experience
- MS in Computer Science, Engineering, Math, Physics, or a related field with 4+ years of experience preferred
- Strong grasp of system software, data structures, computer architecture, and machine learning fundamentals
- Proficient in C, C++, and Python development in a Linux environment, using standard development tools
- Experience with distributed, high-performance software design and implementation
- Self-motivated team player with a strong sense of ownership and leadership
- Preferred: MS or PhD in Computer Science, Electrical Engineering, or related fields
- Preferred: Experience with inference servers/model serving frameworks (TensorRT-LLM, vLLM, SGLang, etc.)
- Preferred: Experience with deep learning frameworks (PyTorch, TensorFlow)
- Preferred: Experience with deep learning runtimes (ONNX Runtime, TensorRT)
- Preferred: Experience with collective communication libraries such as NCCL and Open MPI
- Preferred: Solid grasp of software testing fundamentals
- Preferred: Experience deploying ML workloads (LLMs, VLMs, NLP, etc.) on distributed systems
- Preferred: Experience with Kubernetes, Ray or other MLOps tools and techniques
- Preferred: Prior startup, small-team, or incubation experience
- Preferred: Work experience at a cloud provider or AI compute/subsystem company