Machine Learning Researcher – RL and Agentic Systems
Grupo Protege
Machine Learning Researcher specializing in RL and agentic systems for evaluating AI datasets. Working with cross-functional teams to enhance data quality and model performance.
About the role
Key responsibilities & impact
- Design and build datasets, tasks, and environments for benchmarking agentic systems and multi-step model behavior.
- Translate real-world workflows into structured tasks, interaction traces, trajectories, stateful environments, and verifiable outcomes that can be used to evaluate advanced AI systems.
- Develop frameworks that assess diversity, realism, coverage, fidelity, informativeness, and downstream usefulness of datasets for agentic systems.
- Build quality scorecards and evaluation methods that make dataset strengths, weaknesses, and failure modes legible across teams.
- Evaluate planning, tool use, robustness, recovery from failure, task completion, and generalization behavior in RL-style or agentic environments.
- Connect model failures back to concrete dataset, environment, or task-design gaps and recommend improvements grounded in empirical evidence.
- Contribute to tools and systems that automate dataset validation, environment generation, rollout analysis, benchmark construction, and evaluation workflows.
- Improve internal infrastructure for reproducible experimentation, benchmark management, and evaluation quality.
- Collaborate closely with research and engineering teams to identify data bottlenecks, improve evaluation methodology, and shape internal best practices around task-grounded AI training data.
- Represent DataLab’s perspective in cross-functional discussions around dataset quality, benchmark design, and frontier agentic-system evaluation.
Requirements
What you’ll need
- PhD or equivalent (Master’s degree plus 4+ years of industry experience) in machine learning, computer science, statistics, engineering, mathematics, economics, or a related quantitative field.
- Strong understanding of AI model training pipelines, evaluation methodology, and the role of data in shaping model performance.
- Experience working with large, unstructured, or semi-structured datasets used to train or evaluate ML systems.
- Experience with reinforcement learning, sequential decision-making, agentic systems, tool-using models, or multi-step model evaluation.
- Experience designing tasks, benchmarks, environments, simulations, or evaluation frameworks for real-world model behavior.
- Strong intuition for realism, coverage, difficulty, fidelity, and meaningful outcome structure in datasets.
- Strong experimental design, evaluation, benchmarking, and data-validation skills.
- High ownership and ability to independently identify and solve high-impact problems.
Benefits
Comp & perks
- Health insurance
- 401(k) matching
- Paid time off
- Remote work options
ATS Keywords
Hard Skills & Tools
machine learning, reinforcement learning, experimental design, evaluation methodology, benchmarking, data validation, dataset design, task design, model evaluation, statistical analysis
Soft Skills
problem solving, collaboration, communication, ownership, intuition, critical thinking, independence, creativity, attention to detail, adaptability
Certifications
PhD, Master's Degree