METR

Senior Research Engineer / Research Scientist

full-time

Location Type: Hybrid

Location: Berkeley • California • 🇺🇸 United States

Salary

💰 $250,000 - $450,000 per year

Job Level

Senior

Tech Stack

Python

About the role

  • We're seeking a researcher to help us better understand AI capabilities.
  • Previous work in this vein includes agent time horizons, a commonly used metric for measuring AI progress, and RCTs on open-source developer productivity.
  • Lead a project investigating transcripts as a source of evidence about agent capabilities.
  • Improve METR's time-horizon metric to make it more externally valid, more interpretable, and more predictive of threat-model-relevant capabilities.
  • Design and build experiments testing agent capabilities in the wild.
  • Lead large-scale human-subjects experiments measuring the impacts of AI agents on economically valuable R&D.

Requirements

  • You can write code. At the very least, you should be able to quickly write a data analysis script in Python to answer an important question. Bonus points if you can write a clean PR too.
  • You're excited to get your hands dirty. METR researchers often interact with LLMs in a wide variety of scenarios, read lots of agent transcripts, and closely review human outputs (e.g. video recordings of developers in our productivity RCT).
  • You are undaunted by open-ended mandates. You can take a confusing or ill-posed question and produce insightful and helpful frameworks/proposals/results.
  • You should be able to read, understand, and critique a research proposal. You're able to understand how particular projects fit into METR's overall mission.
  • You're a good written communicator. Bonus points if you can write a great paper.
  • You work fast and are highly reliable.

Benefits

  • $21k referral bonus
