ML Research Engineer

White Circle

White Circle, an AI safety company, is hiring an ML Research Engineer to train and evaluate LLMs. The role involves collaborating on safety and moderation tasks and deploying models to production quickly, in a hybrid work environment.

Posted 7/2/2026 • Full-time • Paris, 🇫🇷 France • Mid-Level / Senior • 💰 $120,000 - $250,000 per year

Tech Stack

Tools & technologies

Python, PyTorch, SQL

About the role

Key responsibilities & impact

Train and post-train LLMs for safety and moderation tasks: SFT, RLHF, DPO, and related alignment methods
Build and train reward models from human and synthetic preference data
Design and run high-throughput data pipelines: collection, synthetic generation, filtering, deduplication, and quality control at very large scale
Run distributed training on multi-GPU clusters and debug what goes wrong when it does
Build evaluation systems and benchmarks that actually measure model behavior, and use them to drive training decisions
Optimize models for production inference: quantization, speculative decoding, serving with vLLM/TensorRT or similar
Move fast from experiment to production – your models ship, and you see their effect on real traffic

Requirements

What you’ll need

Have hands-on experience with modern LLM post-training – SFT, RLHF, DPO, or related methods – on models you trained yourself
Have worked with data at genuinely large scale: building pipelines for training corpora, preference data, or synthetic data generation
Have trained models on distributed multi-GPU setups and are comfortable in PyTorch or JAX
Have built or worked with reward models and preference data
Understand evaluation deeply: you know why benchmarks lie, and how to build ones that don't
Have experience optimizing inference: quantization, speculative decoding, vLLM, TensorRT, Triton, or similar
Are strong in Python and comfortable with SQL-like data tooling for large-scale data work
Have a strong ownership mindset: you can take an ambiguous modeling problem, make it concrete, ship a working model, and improve it from real feedback

Benefits

Comp & perks

Paid time off in line with your local regulations, no matter where you work from
Comprehensive medical insurance for our France-based team
All the hardware, tools, and services you need
Covered subscriptions for AI agents and IDEs
Team off-sites twice a year: we've recently been to the Alps and to Saint-Tropez

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

SFT, RLHF, DPO, Data Pipeline Design, Model Evaluation, Quantization, Speculative Decoding, Multi-GPU Clusters, PyTorch, JAX

Soft Skills

Ownership Mindset