People Research Data Scientist, AI Fairness, Bias
OpenAI
People Data Scientist focused on AI fairness and bias testing, helping OpenAI evaluate AI-assisted People systems. Collaborating with cross-functional teams to mitigate potential bias across decision processes.
Posted 6/15/2026
Full-time
San Francisco • California • 🇺🇸 United States
Mid-Level, Senior
💰 $198,000 - $220,000 per year
Tech Stack
Tools & technologies
- Python
- SQL
About the role
Key responsibilities & impact
- Define and lead fairness and bias-testing strategies for AI-assisted People processes, models, agents, and decision-support systems from development through deployment and ongoing monitoring.
- Design rigorous algorithmic audits and validation studies, including adverse-impact analysis, subgroup and intersectional evaluation, error-rate analysis, calibration, measurement invariance, reliability, criterion-related validity, and sensitivity testing.
- Identify the appropriate fairness criteria for each use case, evaluate tradeoffs among competing definitions of fairness, and clearly document the assumptions, limitations, and residual risks of each approach.
- Evaluate end-to-end human-AI decision systems, including model outputs, user behavior, human overrides, escalation pathways, and whether AI assistance changes the quality, consistency, or equity of decisions.
- Develop evaluation approaches for generative and agentic AI, including test-set design, counterfactual testing, behavioral evaluation, human-rating studies, robustness testing, and analysis of disparate performance across populations and contexts.
- Investigate the sources of observed disparities, including data representation, label and measurement bias, proxy variables, model design, decision thresholds, workflow design, and differential adoption or usage.
- Partner with engineering, People Operations, Legal, Privacy, Security, and People Systems teams to recommend and evaluate mitigations such as data improvements, model changes, threshold adjustments, workflow redesign, monitoring controls, and additional human oversight.
- Build scalable fairness-evaluation infrastructure, including reusable datasets, automated validation pipelines, regression tests, monitoring systems, self-service tools, and standardized reporting.
- Establish research and documentation standards for fairness test plans, dataset and model documentation, validation reports, limitations, monitoring plans, and decision records.
- Translate complex findings into concise, decision-ready narratives, helping leaders understand the significance of identified risks, the strength of the evidence, available mitigation options, and remaining uncertainty.
Requirements
What you’ll need
- Deep expertise in algorithmic fairness, bias measurement, responsible AI, psychometrics, applied statistics, or the evaluation of high-impact decision systems.
- Exceptional strength in research design, measurement, experimentation, causal inference, and statistical modeling.
- Hands-on experience applying methods such as subgroup and intersectional analysis, adverse-impact testing, equalized-odds and equal-opportunity analysis, demographic-parity assessment, calibration analysis, counterfactual testing, measurement invariance, reliability analysis, and validation studies.
- Strong judgment about the limitations of fairness metrics, including the ability to determine which measures are appropriate for a particular decision context rather than applying a single universal definition of fairness.
- Experience evaluating machine-learning models, generative AI systems, agents, or human-AI workflows using quantitative and qualitative evidence.
- High proficiency in Python or R and SQL, with experience working across complex, sensitive, and imperfect datasets.
- Experience building reproducible evaluation pipelines, automated testing frameworks, analytical tools, monitoring systems, or governed research workflows.
- Ability to distinguish statistical disparities from their potential causes and to communicate findings without overstating certainty or making unsupported causal or legal conclusions.
- Ability to work effectively with technical, operational, legal, privacy, and executive stakeholders and influence consequential decisions through evidence and sound judgment.
- Deep curiosity, intellectual humility, strong attention to detail, and a commitment to developing AI systems and organizational processes that work well for people across different backgrounds and circumstances.
Benefits
Comp & perks
- Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
- Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
- 401(k) retirement plan with employer match
- Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
- Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
- 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
- Mental health and wellness support
- Employer-paid basic life and disability coverage
- Annual learning and development stipend to fuel your professional growth
- Daily meals in our offices, and meal delivery credits as eligible
- Relocation support for eligible employees
- Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
algorithmic fairness, bias measurement, responsible AI, psychometrics, applied statistics, subgroup analysis, intersectional analysis, adverse-impact testing, statistical modeling, Python
Soft Skills
research design, measurement, experimentation, causal inference, strong judgment, communication, attention to detail, intellectual humility, curiosity, influence