Senior ML Engineer, Token Factory
Nebius Group
Senior ML Engineer working on a platform that makes foundation models fast, reliable, and effortless to train and deploy. The role involves improving fine-tuning and inference optimization methodologies in Nebius Cloud's GPU environment.
Tech Stack
Tools & technologies: Cloud, Python
About the role
Key responsibilities & impact
- Token Factory is part of Nebius Cloud, one of the world’s largest GPU clouds, running tens of thousands of GPUs.
- We are building an inference & fine-tuning platform that makes every kind of foundation model — text, vision, audio, and emerging multimodal architectures — fast, reliable, and effortless to train & deploy at massive scale.
- Enhancing fine-tuning methodologies - both LoRA-based and full-parameter - for cutting-edge LLMs (e.g., GPT-OSS, Kimi K2.5, DeepSeek V3.1/V3.2, GLM-4.7), focusing on both model quality and training efficiency.
- Identifying LLM inference bottlenecks to drive production speedups.
- Building model training and evaluation pipelines in JAX for speculative decoding, experimenting with architectures (dense/MoE, auto-regressive/parallel), and deriving scaling laws to guide resource allocation.
- Investigating low-precision (FP8, NVFP4/MXFP4) methodologies for supervised fine-tuning and reinforcement learning, spanning both inference and training and optimized for modern hardware.
Requirements
What you’ll need
- A profound understanding of theoretical foundations of machine learning and reinforcement learning.
- Deep expertise in modern deep learning for language processing and generation
- Experience with training large models on multiple computational nodes
- Reasonable understanding of the performance aspects of large neural network training (sharding strategies, custom kernels, hardware features, etc.)
- Strong software engineering skills (we mostly use Python)
- Deep experience with modern deep learning frameworks (we use JAX)
- Proficiency in contemporary software engineering approaches, including CI/CD, version control and unit testing
- Strong communication and leadership abilities
Benefits
Comp & perks
- Competitive compensation
- Career growth and learning opportunities
- Flexibility and ownership
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Machine Learning, Reinforcement Learning, Deep Learning, Model Training, Model Evaluation, Performance Optimization, CI/CD, Version Control, Unit Testing, Low-Precision Methodologies
Soft Skills
Strong Communication, Leadership Abilities