Research Scientist – AI Behaviours

White Circle

Research Scientist studying how LLM agents fail and contributing to AI safety research at White Circle. Own research projects and develop automated audit agents in a hybrid environment.

Posted 6/29/2026 • Full-time • Paris • 🇫🇷 France • Mid-Level, Senior • 💰 $150,000 - $250,000 per year

About the role

Key responsibilities & impact

Own research projects end to end — from an unclear concern ("how do we even define sloppy research outputs?") to a falsifiable experiment, clean baselines, and a result you can defend.
Develop automated audit agents that discover and characterise suspect model behaviour at scale.
Study how misalignment and bias actually show up when real users interact with agents, and turn what you find into evals our products can ship.
Pressure-test frontier agents in realistic, high-stakes scenarios to find where they break before our customers do.
Run white-box and black-box investigations to understand how AI models fail.
Publish what you learn as public blog posts and conference papers, and feed the rest back into our internal guardrails.

Requirements

What you’ll need

A track record of empirical research in agent behaviour, model evaluation, alignment, or a closely adjacent area.
Strong ML engineering. You can independently build a research MVP involving fine-tuning, agent inference, and evals, without waiting on a platform team.
Evidenced skills in experimental design under real conditions: isolating agent failure modes, calibrating judges and baselines, and distinguishing genuine signal from artifact.
You can take a vague behavioural question and define the experiment that answers it, even when there's no playbook, then run it fast and iterate.
An AI power-user — fluent with frontier models and coding agents in your daily work.
Published research at A* venues (NeurIPS / ICML / ICLR / ACL and similar).
Interpretability depth: familiarity with modern interp tooling and concepts (NLAs, SAEs, persona vectors, etc.) and the ability to run white-box investigations on our internal and open-source models.
An MSc or PhD in machine learning, computer science, cognitive science, computational neuroscience, physics, or a related quantitative field.
AI safety fellowship (MATS, ASTRA, Anthropic Fellows, etc.), or a comparable self-directed research record.

Benefits

Comp & perks

Paid time off in line with your local regulations, no matter where you work from.
Work from Paris (hybrid) with a relocation package available, or work from London (note: we are currently unable to provide relocation support or medical insurance for London-based roles).
Comprehensive medical insurance for our France-based team.
All the hardware, tools, and services you need.
Covered subscriptions for AI agents and IDEs.
Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Model Evaluation, Agent Inference, Fine-Tuning, Experimental Design, Calibrating Judges, Distinguishing Signal From Artifact, White-Box Investigations, Automated Audit Agents, Defining Experiments, Running Iterative Experiments

Soft Skills

Problem Solving, Critical Thinking, Communication

Certifications

MSc in Machine Learning, PhD in Computer Science, AI Safety Fellowship