EvolutionIQ

Senior Data Scientist – LLM Evaluation

full-time

Location Type: Remote

Location: New York, United States

Salary

💰 $200,000 - $240,000 per year

About the role

  • Design and implement comprehensive scorecards and benchmarking suites for LLM-based extraction, summarization, and chat interfaces
  • Act as the technical lead in working with Subject Matter Experts to codify their expertise into evaluation datasets and "ground truth" labels
  • Design the statistical guardrails to scale both our human and automated labeling efforts
  • Provide clear, data-driven "Go/No-Go" recommendations for model deployment

Requirements

  • 5+ years of experience in Data Science with a strong background in traditional statistics
  • 2+ years of focused experience working with LLMs, specifically in evaluation, benchmarking, and prompt auditing
  • Master’s or PhD in Statistics, Mathematics, or a related quantitative field
  • Proficient in Python (Pandas, Scikit-learn, Statsmodels) and SQL
  • Familiarity with LLM evaluation frameworks is a major plus
  • Proven ability to work with non-technical SMEs to translate their qualitative feedback into quantitative metrics

Benefits

  • Health insurance
  • 401k matching
  • Paid time off
  • 100% paid parental leave
  • Flexible schedule for new parents returning to work
  • $1,000/year for each employee for professional development
  • Tuition reimbursement
  • Catered lunches
  • Happy hours
  • Pet-friendly spaces
  • Monthly technology stipend

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Data Science, traditional statistics, LLMs, evaluation, benchmarking, prompt auditing, Python, Pandas, Scikit-learn, SQL
Soft skills
technical lead, communication, collaboration, data-driven decision making, translating qualitative feedback
Certifications
Master’s in Statistics, PhD in Statistics, PhD in Mathematics