
Staff / Principal Machine Learning Engineer
Inworld AI
Full-time
Location Type: Hybrid
Location: Mountain View • California • United States
Salary
💰 $270,000 - $500,000 per year
About the role
- Clarify ambiguous problems through design and prototyping.
- Treat performance, latency, and reliability as product features.
- Engage in in-person collaboration to solve complex problems and foster team culture.
- Support sharing work and open-source contributions to advance the field.
Requirements
- Deep understanding of modern serving frameworks, such as vLLM or TRT-LLM, and the techniques they use.
- Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
- Proficiency in C++, CUDA, Rust, or highly optimized Python.
- Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.
- Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
- Full-cycle ownership of model deployment from research to production.
- PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.
Benefits
- Relocation assistance
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
C++, CUDA, Rust, Python, quantization, distillation, caching strategies, continuous batching, paged attention, speculative decoding
Soft Skills
problem-solving, collaboration, team culture
Certifications
PhD in CS, PhD in Physics, PhD in Math