OpenAI

Data Scientist, Integrity Measurement

full-time

Location Type: Hybrid

Location: London, United Kingdom

About the role

  • own measurement and quantitative analysis for a group of severe, actor- and network-based usage harm verticals
  • develop and implement AI-first methods for prevalence measurement and other productionised safety metrics, which may need to include off-platform indicators or other non-standard datasets
  • build metrics that can be used for goal setting or A/B tests when prevalence or other top-line metrics are not suitable
  • own dashboards and metrics reporting for harm verticals
  • conduct analyses and generate insights that inform improvements to review, detection, or enforcement, and that influence roadmaps
  • optimise LLM prompts for the purpose of measurement
  • collaborate with other safety teams to understand key safety concerns and create relevant policies that will support safety needs
  • provide metrics for leadership and external reporting
  • develop automation to scale yourself, leveraging our agentic products

Requirements

  • are a senior data scientist with trust and safety experience who can drive measurement direction
  • have deep statistics skills, specifically around sampling methods and prevalence estimation of complicated problem areas (ideally activity- rather than content-based)
  • have experience working with severe and sensitive harm areas like child safety or violence
  • are capable in data programming languages (R or Python, and SQL)
  • (ideally) have experience with AI harms or leveraging AI for measurement

Benefits

  • We are committed to providing reasonable accommodations to applicants with disabilities
  • Background checks for applicants will be administered in accordance with applicable law
  • Equal opportunity employer
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
measurement and quantitative analysis, AI-first methods, prevalence measurement, A/B testing, dashboard reporting, LLM prompt optimisation, data programming languages, statistics, sampling methods, prevalence estimation
Soft Skills
collaboration, insight generation, communication, leadership