Member of Technical Staff – Evals
Magic
Member of Technical Staff on Evals, developing evaluation systems for AI models at Magic. The role builds trustworthy evaluations that inform research and product decisions and provides critical infrastructure.
Posted 6/27/2026 • Full-time • San Francisco • California • 🇺🇸 United States • Lead • 💰 $200,000 - $550,000 per year
About the role
Key responsibilities & impact
- Build and maintain the internal evals platform used across Magic
- Design, implement, and validate eval tasks for pre-training, post-training, reinforcement learning, inference, and product systems
- Develop infrastructure for running large-scale evaluations
- Build systems to measure dataset quality and identify opportunities to improve training data
- Improve evaluation correctness, reproducibility, and reliability
- Audit and improve upon public benchmarks, evaluation methodologies, and open-source implementations
- Partner with research, data, inference, and product teams to define metrics that accurately reflect model quality
- Build tooling and frameworks that enable teams across Magic to make decisions based on trustworthy measurements
Requirements
What you’ll need
- Experience building production systems, internal platforms, or developer infrastructure
- Experience working with machine learning systems, evaluation frameworks, data infrastructure, or research tooling
- Track record of owning technical projects end-to-end
- Skepticism toward results that cannot be reproduced, validated, or explained
- Ability to reason critically about benchmarks, metrics, and experimental methodology
- Experience designing, implementing, or operating systems that run at scale
- Comfort navigating ambiguity and determining whether a measurement actually captures the behavior it claims to measure
- Excitement about helping researchers and engineers make better decisions through trustworthy measurements
Benefits
Comp & perks
- Equity is a significant part of total compensation, in addition to salary
- 401(k) plan with 6% salary matching
- Generous health, dental, and vision insurance for you and your dependents
- Unlimited paid time off
- Visa sponsorship and relocation support for candidates moving to San Francisco
- A small, fast-moving, highly collaborative team working on frontier AI systems
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
machine learning systems, evaluation frameworks, data infrastructure, production systems, internal platforms, developer infrastructure, evaluation methodologies, open-source implementations, large-scale evaluations, experimental methodology
Soft Skills
critical reasoning, skepticism, decision-making, navigating ambiguity, project ownership, collaboration, trustworthiness, measurement accuracy, problem-solving, communication