
Staff / Principal Machine Learning Engineer
Inworld AI
Full-time
Location Type: Remote
Location: Switzerland
About the role
- Develop best-in-class real-time multimodal models and the orchestration platform optimized for thousands of queries per second.
- Tackle ambiguous problems and deliver solutions that treat performance, latency, and reliability as core product features.
- Collaborate with global teams to design benchmarks and prototypes that surface detailed insights about projects.
- Ensure all engineering output is stable, and ship products that meet market needs.
Requirements
- Inference Optimization: Deep understanding of modern serving frameworks, such as vLLM or TRT-LLM, and the techniques they rely on.
- Model Acceleration: Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
- High-Performance Systems: Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs.
- Distributed Systems & Scaling: Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.
- Public work: Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
- Full-cycle ownership: You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.
- Background: PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.
- Professional fluency in English (written and spoken) is required, as you will be collaborating daily with our US-based leadership and engineering teams.
Benefits
- Health insurance
- 401(k) matching
- Flexible work hours
- Paid time off
- Remote work options
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
inference optimization, model acceleration, quantization, distillation, caching strategies, C++, CUDA, Rust, Python, Kubernetes
Soft Skills
problem-solving, collaboration, communication, performance optimization, reliability assurance
Certifications
PhD in CS, PhD in Physics, PhD in Math