
Machine Learning Engineer – Distillation
Featherless AI
Employment Type: Full-time
Location Type: Remote
Location: Anywhere in the World
About the role
- Design and implement knowledge distillation pipelines (teacher–student, self-distillation, multi-teacher, etc.; see the sketch after this list)
- Distill large foundation models into smaller, faster, and cheaper models for inference
- Run and analyze large-scale training experiments to evaluate quality, latency, and cost tradeoffs
- Collaborate with research to translate new distillation ideas into production-ready code
- Optimize training and inference performance (memory, throughput, latency)
- Contribute to internal tooling, evaluation frameworks, and experiment tracking
- (Optional) Contribute back to open-source models, tooling, or research
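To make the first bullet concrete, here is a minimal teacher–student distillation sketch in PyTorch. It is illustrative only: the toy models, temperature, and mixing weight are assumptions chosen for the example, not details of Featherless AI's actual pipeline.

```python
# Minimal teacher-student distillation sketch (illustrative only; model
# sizes and hyperparameters are assumptions, not the company's pipeline).
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target KL divergence (scaled by T^2) with hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy models standing in for a large teacher and a small student.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10)).eval()
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-3)

x = torch.randn(32, 128)            # stand-in batch of features
labels = torch.randint(0, 10, (32,))

with torch.no_grad():               # teacher is frozen; only the student trains
    teacher_logits = teacher(x)

optimizer.zero_grad()
loss = distillation_loss(student(x), teacher_logits, labels)
loss.backward()
optimizer.step()
```

The soft-target term is scaled by T² so its gradient magnitude stays comparable to the hard-label term as the temperature changes, the standard convention from the original knowledge distillation formulation.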
Requirements
- Strong background in machine learning or deep learning
- Hands-on experience with model distillation (LLMs or other neural networks)
- Solid understanding of training dynamics, loss functions, and optimization
- Experience with PyTorch (or JAX) and modern ML tooling
- Comfort running experiments on multi-GPU or distributed setups
- Ability to reason about model quality vs. performance tradeoffs
- Pragmatic mindset: you care about shipping, not just papers
Benefits
- Competitive compensation
- Meaningful equity
- Remote-friendly, async-first environment
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
knowledge distillation, machine learning, deep learning, model distillation, training dynamics, loss functions, optimization, multi-GPU setups, distributed setups, PyTorch
Soft Skills
collaboration, reasoning, pragmatic mindset