ML Infrastructure Engineer
White Circle
ML Infrastructure Engineer at White Circle, developing scalable infrastructure for RL and LLM workflows and collaborating closely with researchers to influence AI safety and product quality.
Tech Stack
Tools & technologies
- Python
- PyTorch
About the role
Key responsibilities & impact
- Build robust, flexible, and scalable RL and post-training pipelines, including smoke tuning runs for quality testing and approach ablations
- Design data control systems that govern what the model sees, when it sees it, and how training data flows through rollouts, replay, filtering, evaluation, and policy updates
- Tune training and inference end-to-end for high throughput across the systems that matter: networking, memory, compute scheduling, data loading, storage, checkpointing, and I/O
- Investigate how infrastructure choices affect learning dynamics, eval quality, model behavior, and training stability – staying close to the state of the art in LLMs, RL, and post-training
- Build infrastructure for model iteration: experiment runs, artifacts, evals, dashboards, failure inspection, reproducibility, and cost visibility
- Work on inference infrastructure where it affects post-training and evaluation loops
- Build and improve agentic development environments: coding-agent harnesses, browser/tool integrations, terminal/runtime sandboxes, repo-aware workflows, and multi-agent orchestration
- Work closely with the team: plan future steps, discuss tradeoffs, share context early, and stay in touch while building
Requirements
What you’ll need
- Have designed, built, or maintained distributed RL/post-training systems at scale and are fluent in their moving parts: rollouts, replay buffers, reward signals, data filtering, policy updates, evaluation loops, and failure analysis
- Are familiar with deep learning frameworks such as PyTorch or JAX
- Are proficient in Python, including concurrency, asynchronous programming, multiprocessing, and performance optimization
- Can debug distributed GPU workloads across CUDA runtime, container runtime, driver versions, NCCL or equivalent communication layers, networking, storage, scheduling, and checkpointing
- Have experience with profiling tools across the stack, for example py-spy, PyTorch profiler, Nsight, perf, tracing, metrics, logs, or custom instrumentation
- Have experience with inference stacks such as vLLM, SGLang, TensorRT-LLM, Dynamo, or custom serving infrastructure
- Can reason from system metrics back to model behavior: when latency, queueing, sampling, data order, rollout throughput, or infrastructure failures affect learning
- Have a strong ownership mindset: you can take an ambiguous infrastructure problem, make it concrete, ship a working system, and improve it from real feedback.
Benefits
Comp & perks
- Paid time off in line with your local regulations, no matter where you work from.
- Comprehensive medical insurance for our France-based team.
- All the hardware, tools, and services you need.
- Covered subscriptions for AI agents and IDEs.
- Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez.