Resaro

AI Scientist – Reinforcement Learning

Employment Type: Contract

Location Type: Hybrid

Location: Singapore

About the role

  • Execute a dedicated work plan to build frameworks that evaluate the performance, safety, and alignment of RL agents
  • Use Bayesian ML models (GPs, BNNs) to create metrics for model confidence and risk
  • Design and set up the debugging and automated testing frameworks required to evaluate non-deterministic systems
  • Perform "red-team" tests and benchmarks on models trained with Trust Region methods (e.g., PPO) and RL from Human Feedback (RLHF)
  • Work across the entire stack, from environment interfacing to policy optimization, with the opportunity to grow into Multi-Agent RL (MARL) technologies

Requirements

  • Strong proficiency in Python, NumPy, and PyTorch
  • A background in ML theory, Mathematics, or Physics
  • Experience with Bayesian ML models (e.g., Gaussian Processes, Bayesian Neural Networks)
  • Practical experience or familiarity with Trust Region methods (e.g., PPO) and RL from Human Feedback (RLHF)
  • Proven ability in debugging and setting up automated testing frameworks

Benefits

  • Hybrid work arrangement
  • Professional development opportunities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills

  • Python
  • NumPy
  • PyTorch
  • Bayesian ML models
  • Gaussian Processes
  • Bayesian Neural Networks
  • Trust Region methods
  • PPO
  • RL from Human Feedback
  • Automated testing frameworks

Soft skills

  • Problem-solving
  • Analytical thinking