ML Infrastructure Engineer

White Circle

The ML Infrastructure Engineer at White Circle develops scalable infrastructure for RL and LLM workflows, collaborating closely with researchers to influence AI safety and product quality.

Posted 7/2/2026 • Full-time • Paris, 🇫🇷 France • Mid-Level, Senior • 💰 $180,000 - $350,000 per year

Tech Stack

Tools & technologies

Python, PyTorch

About the role

Key responsibilities & impact

Build robust, flexible, and scalable RL and post-training pipelines, including smoke tuning runs for quality testing and approach ablations
Design data control systems that govern what the model sees, when it sees it, and how training data flows through rollouts, replay, filtering, evaluation, and policy updates
Tune training and inference end-to-end for high throughput across the systems that matter: networking, memory, compute scheduling, data loading, storage, checkpointing, and I/O
Investigate how infrastructure choices affect learning dynamics, eval quality, model behavior, and training stability – staying close to the state of the art in LLMs, RL, and post-training
Build infrastructure for model iteration: experiment runs, artifacts, evals, dashboards, failure inspection, reproducibility, and cost visibility
Work on inference infrastructure where it affects post-training and evaluation loops
Build and improve agentic development environments: coding-agent harnesses, browser/tool integrations, terminal/runtime sandboxes, repo-aware workflows, and multi-agent orchestration
Work closely with the team: plan future steps, discuss tradeoffs, share context early, and stay in touch while building

Requirements

What you’ll need

Have designed, built, or maintained distributed RL/post-training systems at scale and are fluent in their moving parts: rollouts, replay buffers, reward signals, data filtering, policy updates, evaluation loops, and failure analysis
Are familiar with deep learning frameworks such as PyTorch or JAX
Are proficient in Python, including concurrency, asynchronous programming, multiprocessing, and performance optimization
Can debug distributed GPU workloads across CUDA runtime, container runtime, driver versions, NCCL or equivalent communication layers, networking, storage, scheduling, and checkpointing
Have experience with profiling tools across the stack, for example py-spy, PyTorch profiler, Nsight, perf, tracing, metrics, logs, or custom instrumentation
Have experience with inference stacks such as vLLM, SGLang, TensorRT-LLM, Dynamo, or custom serving infrastructure
Can reason from system metrics back to model behavior: when latency, queueing, sampling, data order, rollout throughput, or infrastructure failures affect learning
Have a strong ownership mindset: you can take an ambiguous infrastructure problem, make it concrete, ship a working system, and improve it from real feedback

Benefits

Comp & perks

Paid time off in line with your local regulations, no matter where you work from.
Comprehensive medical insurance for our France-based team.
All the hardware, tools, and services you need.
Covered subscriptions for AI agents and IDEs.
Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Reinforcement Learning, Post-Training Pipelines, Concurrency, Asynchronous Programming, Performance Optimization, Profiling Tools (py-spy, Nsight), Data Filtering, Policy Updates, Failure Analysis, Model Evaluation

Soft Skills

Strong Ownership Mindset, Collaboration, Problem-Solving