AI Scientist – Reinforcement Learning

Resaro

Employment Type: Contract

Posted on: 1/15/2026

Location Type: Hybrid

Location: Singapore • Singapore

Job Level

Junior

Tech Stack

NumPy, Python, PyTorch

About the role

Execute a dedicated work plan to build frameworks that evaluate the performance, safety, and alignment of RL agents
Use Bayesian ML models (GPs, BNNs) to create metrics for model confidence and risk
Design and set up the debugging and automated testing frameworks required to evaluate non-deterministic systems
Perform "red-team" tests and benchmarks on models trained with Trust Region methods (PPO) and RL from Human Feedback (RLHF)
Work across the entire stack, from environment interfacing to policy optimization, with the opportunity to grow into Multi-Agent RL (MARL) technologies

Requirements

Strong proficiency in Python, NumPy, and PyTorch
A background in ML theory, Mathematics, or Physics
Experience with Bayesian ML models (e.g., Gaussian Processes, Bayesian Neural Networks)
Practical experience or familiarity with Trust Region methods (PPO) and RL from Human Feedback (RLHF)
Proven ability in debugging and setting up automated testing frameworks

Benefits

Hybrid work arrangement
Professional development opportunities

Applicant Tracking System Keywords

Hard skills

Python, NumPy, PyTorch, Bayesian ML models, Gaussian Processes, Bayesian Neural Networks, Trust Region methods, PPO, RL from Human Feedback, automated testing frameworks

Soft skills

problem-solving, analytical thinking