Nebius Group

Senior ML Engineer, Token Factory

Senior ML Engineer role on an inference and fine-tuning platform that makes foundation models fast, reliable, and effortless to train and deploy. The work involves improving fine-tuning methods and inference optimization on Nebius Cloud's GPU infrastructure.

Posted 7/1/2026 • Full-time • Remote • 🇩🇪 Germany • Senior

Tech Stack

Tools & technologies
Cloud, Python

About the role

Key responsibilities & impact
  • Token Factory is a part of Nebius Cloud, one of the world’s largest GPU clouds, running tens of thousands of GPUs.
  • We are building an inference & fine-tuning platform that makes every kind of foundation model — text, vision, audio, and emerging multimodal architectures — fast, reliable, and effortless to train & deploy at massive scale.
  • Enhancing fine-tuning methodologies, both LoRA-based and full-parameter, for cutting-edge LLMs (e.g., GPT-OSS, Kimi K2.5, DeepSeek V3.1/V3.2, GLM-4.7), with a focus on both model quality and training efficiency.
  • Identifying LLM inference bottlenecks to drive production speedups.
  • Building model training and evaluation pipelines in JAX for speculative decoding, experimenting with architectures (dense/MoE, auto-regressive/parallel), and deriving scaling laws to guide resource allocation.
  • Investigating low-precision (FP8, NVFP4/MXFP4) methodologies for supervised fine-tuning and reinforcement learning, spanning both inference and training and optimized for modern hardware.

Requirements

What you’ll need
  • A profound understanding of the theoretical foundations of machine learning and reinforcement learning
  • Deep expertise in modern deep learning for language processing and generation
  • Experience with training large models on multiple computational nodes
  • Reasonable understanding of the performance aspects of large neural network training (sharding strategies, custom kernels, hardware features, etc.)
  • Strong software engineering skills (we mostly use Python)
  • Deep experience with modern deep learning frameworks (we use JAX)
  • Proficiency in contemporary software engineering approaches, including CI/CD, version control and unit testing
  • Strong communication and leadership abilities

Benefits

Comp & perks
  • Competitive compensation
  • Career growth and learning opportunities
  • Flexibility and ownership
  • Collaborative and innovative culture
  • Opportunity to work on impactful AI projects
  • International environment and talented teams

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Machine Learning, Reinforcement Learning, Deep Learning, Model Training, Model Evaluation, Performance Optimization, CI/CD, Version Control, Unit Testing, Low-Precision Methodologies
Soft Skills
Strong Communication, Leadership Abilities