Netflix

Engineering Manager, Machine Learning, Model Evaluations, Data Curation

full-time

Location Type: Remote

Location: Remote • 🇺🇸 United States

Salary

💰 $190,000 - $920,000 per year

Job Level

Senior · Lead

About the role

  • Partner with downstream AI application teams to define shared evaluations that codify application expectations of LLMs and other foundation models, ensuring progress can be transparently tracked against real-world needs.
  • Design rigorous benchmarks and evaluation methodologies across ranking & recommendations, content understanding, and language/text generation — grounded in a deep technical understanding of LLMs, their strengths, limitations, and failure modes.
  • Lead the development of evaluators and strong baselines to ensure in-house LLMs and other foundation models demonstrate clear advantages over off-the-shelf alternatives.
  • Build scalable, reproducible data and evaluation systems that make dataset creation and evaluation design as nimble and experiment-friendly as model development itself.
  • Hire, grow, and nurture a world-class team, fostering an inclusive, high-performing culture that balances research innovation with engineering excellence.
  • Work closely with the teams developing Netflix’s foundation models (including our core LLM) to ensure evaluation and data insights are folded back into the cadence of model development.
  • Proactively influence the ML Platform and Data Engineering teams at key interfaces.

Requirements

  • 8+ years of overall experience, including 3+ years in engineering management.
  • Experience with large-scale ML systems and foundation models, especially LLMs.
  • Strong technical expertise in LLMs, their evaluation, and practical methods for ensuring robustness, reproducibility, and quality.
  • Broad knowledge of machine learning fundamentals and evaluation methodologies, including benchmark design, model-based evaluators, and offline/online metrics.
  • Experience driving cross-functional projects, including close collaboration with AI application teams to translate product needs into evaluation frameworks.
  • Excellent written and verbal communication skills, able to bridge technical and non-technical audiences.
  • Advanced degree in Computer Science, Statistics, or a related quantitative field.

Benefits

  • Health Plans
  • Mental Health support
  • 401(k) Retirement Plan with employer match
  • Stock Option Program
  • Disability Programs
  • Health Savings and Flexible Spending Accounts
  • Family-forming benefits
  • Life and Serious Injury Benefits
  • Paid leave of absence programs
  • Flexible time off for salaried employees

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
large-scale ML systems, foundation models, LLMs, evaluation methodologies, benchmark design, model-based evaluators, offline metrics, online metrics, robustness, reproducibility
Soft skills
engineering management, cross-functional collaboration, written communication, verbal communication, team building, inclusive culture, high-performing culture, research innovation, engineering excellence, influence
Certifications
advanced degree in Computer Science, advanced degree in Statistics, advanced degree in related quantitative field