
Data Scientist, Integrity Measurement
OpenAI
full-time
Location Type: Hybrid
Location: London • United Kingdom
About the role
- own measurement and quantitative analysis for a group of severe, actor- and network-based usage harm verticals
- develop and implement AI-first methods for prevalence measurement and other productionised safety metrics, which may need to include off-platform indicators or other non-standard datasets
- build metrics that can be used for goal-setting or A/B tests when prevalence or other top-line metrics are not suitable
- own dashboards and metrics reporting for harm verticals
- conduct analyses and generate insights that inform improvements to review, detection, or enforcement, and that influence roadmaps
- optimise LLM prompts for the purpose of measurement
- collaborate with other safety teams to understand key safety concerns and create relevant policies that will support safety needs
- provide metrics for leadership and external reporting
- develop automation to scale yourself, leveraging our agentic products
Requirements
- are a senior data scientist with trust and safety experience who can drive measurement direction
- have deep statistics skills, specifically around sampling methods and prevalence estimation of complicated problem areas (ideally activity- rather than content-based)
- have experience working with severe and sensitive harm areas like child safety or violence
- are proficient in data programming languages (R or Python, SQL)
- (ideally) have experience with AI harms or leveraging AI for measurement
Benefits
- We are committed to providing reasonable accommodations to applicants with disabilities
- Background checks for applicants will be administered in accordance with applicable law
- Equal opportunity employer
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
measurement and quantitative analysis, AI-first methods, prevalence measurement, A/B testing, dashboard reporting, LLM prompt optimisation, data programming languages, statistics, sampling methods, prevalence estimation
Soft Skills
collaboration, insight generation, communication, leadership