Member of Technical Staff – Evals

Magic

As a Member of Technical Staff on Evals at Magic, you will develop evaluation systems for AI models. The role involves building trustworthy evaluations that inform research and product decisions, and providing critical evaluation infrastructure.

Posted 6/27/2026 • Full-time • San Francisco, California, 🇺🇸 United States • Lead • 💰 $200,000 - $550,000 per year

About the role

Key responsibilities & impact
  • Build and maintain the internal evals platform used across Magic
  • Design, implement, and validate eval tasks for pre-training, post-training, reinforcement learning, inference, and product systems
  • Develop infrastructure for running large-scale evaluations
  • Build systems to measure dataset quality and identify opportunities to improve training data
  • Improve evaluation correctness, reproducibility, and reliability
  • Audit and improve upon public benchmarks, evaluation methodologies, and open-source implementations
  • Partner with research, data, inference, and product teams to define metrics that accurately reflect model quality
  • Build tooling and frameworks that enable teams across Magic to make decisions based on trustworthy measurements

Requirements

What you’ll need
  • Experience building production systems, internal platforms, or developer infrastructure
  • Experience working with machine learning systems, evaluation frameworks, data infrastructure, or research tooling
  • Track record of owning technical projects end-to-end
  • Skepticism toward results that cannot be reproduced, validated, or explained
  • Ability to reason critically about benchmarks, metrics, and experimental methodology
  • Experience designing, implementing, or operating systems that run at scale
  • Comfort navigating ambiguity and determining whether a measurement actually captures the behavior it claims to measure
  • Excitement about helping researchers and engineers make better decisions through trustworthy measurements

Benefits

Comp & perks
  • Equity is a significant part of total compensation, in addition to salary
  • 401(k) plan with 6% salary matching
  • Generous health, dental, and vision insurance for you and your dependents
  • Unlimited paid time off
  • Visa sponsorship and relocation support for candidates moving to San Francisco
  • A small, fast-moving, highly collaborative team working on frontier AI systems

ATS Keywords
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
machine learning systems, evaluation frameworks, data infrastructure, production systems, internal platforms, developer infrastructure, evaluation methodologies, open-source implementations, large-scale evaluations, experimental methodology
Soft Skills
critical reasoning, skepticism, decision-making, navigating ambiguity, project ownership, collaboration, trustworthiness, measurement accuracy, problem-solving, communication