Senior Data Scientist, AI/ML Systems

Zigsaw

Senior Data Scientist at Pinterest, designing evaluation frameworks and measurement strategies for foundational AI models, collaborating across teams, and influencing model investment decisions and the pace of innovation.

Posted 4/29/2026 • Full-time • Remote • California • 🇺🇸 United States • Senior • 💰 $139,764 - $287,749 per year

Tech Stack

Tools & technologies

Python, Spark, SQL

About the role

Key responsibilities & impact

Design and execute system-level measurement frameworks for foundational model improvements spanning offline evaluation benchmarks, online A/B experiments, and longitudinal impact tracking across surfaces.
Define and own the success metrics that quantify foundational model value.
Build causal inference methodologies to isolate the incremental impact of individual model components within a complex, multi-model production system where changes co-occur and interact.
Work cross-functionally to build relationships, proactively communicate key findings, and collaborate closely with ML Engineers, Applied Scientists, and the Homefeed and Surface teams to ensure measurement rigor is embedded in every model launch.
Relentlessly focus on impact, whether through sharpening investment decisions with data, raising the bar for launch criteria, accelerating experimentation velocity, or surfacing hidden inefficiencies in the model ecosystem.

Requirements

What you’ll need

5+ years of experience analyzing data in a fast-paced, data-driven environment with proven ability to apply scientific methods to solve real-world problems on web-scale data.
Strong interest in, and hands-on experience with, one or more of: ML system evaluation, recommender system measurement, A/B experimentation at scale, causal inference.
Deep familiarity with large-scale recommendation or ranking systems and their evaluation, including an understanding of how representation learning, retrieval, ranking, and re-ranking stages interact and compound in production.
Experience designing and executing A/B experiments for complex ML systems, including multi-surface holdouts, metric decomposition, long-run effect estimation, and interference/spillover mitigation.
Strong quantitative programming (Python) and data manipulation skills (SQL/Spark); experience with ML pipelines, feature stores, and large-scale experimentation platforms.
Ability to work independently, drive ambiguous projects end-to-end, and operate with high ownership in a fast-moving research-to-production environment.
Excellent written and verbal communication skills, with the ability to translate complex system-level findings into clear narratives for technical and non-technical partners, including investment recommendations for leadership.
A team player eager to partner across teams to turn measurement insights into better models and faster launches.

Benefits

Comp & perks

Equity
Health insurance
Professional development

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

ML system evaluation, recommender system measurement, A/B experimentation, causal inference, quantitative programming, data manipulation, Python, SQL, Spark, metric decomposition

Soft Skills

communication, collaboration, independence, ownership, problem-solving, teamwork, narrative translation, relationship building, impact focus, project management