Inworld AI

Staff / Principal Machine Learning Engineer

full-time

Location Type: Hybrid

Location: Mountain View, California, United States

Salary

💰 $270,000 - $500,000 per year

About the role

  • Make unclear problems clear through design and prototyping.
  • Treat performance, latency, and reliability as product features.
  • Engage in in-person collaboration to solve complex problems and foster team culture.
  • Support sharing work and open-source contributions to advance the field.

Requirements

  • Deep understanding of modern serving frameworks such as vLLM or TensorRT-LLM (TRT-LLM) and the techniques they implement.
  • Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
  • Proficiency in C++, CUDA, Rust, or highly optimized Python.
  • Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.
  • Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
  • Full-cycle ownership of model deployment from research to production.
  • PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.

Benefits

  • Relocation assistance