Research Scientist, RL Training
Snorkel AI
Research Scientist focusing on reinforcement learning for training large language models at Snorkel AI. Collaborating with research and engineering teams to advance RL data capabilities.
Posted 4/20/2026
full-time
Redwood City • California • 🇺🇸 United States
Mid-Level, Senior
💰 $200,000 - $275,000 per year
Tech Stack
Tools & technologies
- Cloud
- Node.js
- Python
- PyTorch
About the role
Key responsibilities & impact
- Research and implement reinforcement learning techniques including GRPO, RLHF, RLAIF, DPO, and reward modeling
- Design and build data pipelines that generate high-quality training signal for RL workflows
- Prototype and iterate on end-to-end RL training recipes
- Work closely with research scientists, ML engineers, and delivery teams
- Stay current with the latest developments in large-scale multi-node LLM training
Requirements
What you’ll need
- Deep expertise in reinforcement learning from human or AI feedback
- Experience training or fine-tuning 30B+ large language models at scale
- Strong proficiency in Python and ML frameworks, especially PyTorch and HuggingFace
- Solid software engineering fundamentals
- Familiarity with ML infrastructure and cloud platforms
- Comfort operating in a high-iteration environment
- Ph.D. in machine learning, reinforcement learning, or a related field strongly preferred
Benefits
Comp & perks
- Health insurance
- Professional development opportunities
- Flexible work arrangements
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- reinforcement learning
- GRPO
- RLHF
- RLAIF
- DPO
- reward modeling
- Python
- PyTorch
- HuggingFace
- large language models
Soft Skills
- collaboration
- iteration
- problem-solving
Certifications
- Ph.D. in machine learning
- Ph.D. in reinforcement learning