
Software Engineer, ML Platform – ML Serving
Zoox
full-time
Location Type: Hybrid
Location: Foster City • California • United States
Salary
💰 $189,000 - $258,000 per year
About the role
- Build the off-vehicle inference service powering our foundation models (LLMs and VLMs) and the models that improve our rider experience.
- Lead the design, implementation, and operation of robust, efficient infrastructure for serving and monitoring ML models.
- Collaborate closely with cross-functional teams, including ML researchers, software engineers, and data engineers, to define requirements and align on architectural decisions.
- Mentor junior engineers on the team, providing technical guidance to help them grow their careers.
Requirements
- 4+ years of ML model serving infrastructure experience
- Experience building large-scale model serving systems on GPUs and/or for high-QPS, low-latency use cases.
- Experience with GPU-accelerated inference using Ray Serve, vLLM, TensorRT, NVIDIA Triton, or PyTorch.
- Experience with cloud providers such as AWS and with Kubernetes (K8s).
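The high-QPS, low-latency serving experience asked for above usually involves dynamic micro-batching: amortizing each GPU call over several queued requests while capping how long any request waits. A minimal sketch of that pattern, in plain Python with illustrative names and parameters (none of this is from the job description; frameworks like Ray Serve and vLLM implement production versions of it):

```python
import queue
import threading
from concurrent.futures import Future

MAX_BATCH = 8         # cap batch size to bound tail latency
BATCH_WAIT_S = 0.005  # wait up to 5 ms to fill a batch

def batching_loop(request_q, run_model):
    """Collect requests into micro-batches and run one model call per batch."""
    while True:
        item = request_q.get()
        if item is None:              # shutdown sentinel
            return
        batch = [item]
        # Opportunistically gather more requests, but never stall too long.
        while len(batch) < MAX_BATCH:
            try:
                nxt = request_q.get(timeout=BATCH_WAIT_S)
            except queue.Empty:
                break
            if nxt is None:
                request_q.put(None)   # re-post sentinel for the outer loop
                break
            batch.append(nxt)
        # One model call amortized over the whole batch.
        outputs = run_model([inp for inp, _ in batch])
        for (_, fut), out in zip(batch, outputs):
            fut.set_result(out)

def submit(request_q, x):
    """Enqueue one request; the caller awaits the returned Future."""
    fut = Future()
    request_q.put((x, fut))
    return fut
```

With a stand-in model (here, doubling each input in place of a GPU forward pass), callers submit individually but the worker batches transparently:

```python
q = queue.Queue()
worker = threading.Thread(
    target=batching_loop, args=(q, lambda xs: [x * 2 for x in xs]))
worker.start()
futs = [submit(q, i) for i in range(5)]
results = [f.result(timeout=2) for f in futs]
q.put(None)
worker.join()
```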
Benefits
- Health insurance
- Long-term care insurance
- Long-term and short-term disability insurance
- Life insurance
- Paid time off (e.g. sick leave, vacation, bereavement)
- Unpaid time off
- Zoox Stock Appreciation Rights
- Amazon Restricted Stock Units (RSUs)
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
ML model serving infrastructure, GPU-accelerated inference, Ray Serve, vLLM, TensorRT, NVIDIA Triton, PyTorch, high QPS serving, low latency serving, cloud computing
Soft Skills
technical guidance, mentorship, collaboration, cross-functional teamwork, architectural decision-making