Senior Applied AI Scientist
Ro
Senior Applied AI Scientist evaluating and optimizing AI systems for Ro healthcare. Designing frameworks for LLM-powered products across patient experience and operations.
Posted 6/27/2026
Full-time
New York City • New York • 🇺🇸 United States
Senior
💰 $182,300 - $220,000 per year
Tech Stack
Tools & technologies: Python, SQL
About the role
Key responsibilities & impact
- Design and own evaluation frameworks for production LLM features, including LLM-as-a-judge evaluations, regression suites, synthetic datasets, golden datasets, and human review workflows.
- Analyze production behavior to identify quality issues, hallucinations, latency bottlenecks, cost regressions, and emerging failure modes.
- Design and run experiments, including prompt variations, workflow changes, retrieval improvements, and model comparisons, and quantify their impact on quality, operational metrics, and user outcomes.
- Define the metrics that matter and build dashboards that make AI performance visible across the organization.
- Partner with engineering to determine which optimizations should be productionized and how to measure ongoing success.
- Mentor teammates on experimental design, statistical rigor, evaluation methodology, and measurement best practices.
Requirements
What you’ll need
- 5+ years of experience in data science, applied machine learning, experimentation, or a closely related field, with at least the last year focused on applied LLMs or AI evaluation.
- Strong Python and SQL skills with experience working on production data pipelines and experimentation.
- You have experience designing reproducible evaluation frameworks rather than relying on manual spot checks or qualitative assessments.
- You have strong statistical intuition: you think in terms of distributions, confidence intervals, variance, and sample sizes rather than anecdotes.
- You’re comfortable working closely with engineers and product teams to translate experimental findings into production improvements.
- Bonus: Experience with evaluation platforms (e.g. Braintrust, LangSmith, OpenAI Evals), experimentation platforms, causal inference, healthcare, or operations-heavy environments.
Benefits
Comp & perks
- Competitive equity and benefits package
ATS Keywords
Hard Skills & Tools
Python, SQL, data science, applied machine learning, experimental design, statistical rigor, evaluation methodology, production data pipelines, evaluation frameworks, causal inference
Soft Skills
mentoring, collaboration, communication, analytical thinking, problem-solving