
AI Scientist – Reinforcement Learning
Resaro
Contract
Location Type: Hybrid
Location: Singapore • Singapore
About the role
- Execute a dedicated work plan to build frameworks that evaluate the performance, safety, and alignment of RL agents
- Use Bayesian ML models (GPs, BNNs) to create metrics for model confidence and risk
- Design and set up the debugging and automated testing frameworks required to evaluate non-deterministic systems
- Perform "red-team" tests and benchmarks on models using Trust Region methods (PPO) and RL from Human Feedback (RLHF)
- Work across the entire stack, from environment interfacing to policy optimization, with the opportunity to grow into Multi-Agent RL (MARL) technologies
Requirements
- Strong proficiency in Python, NumPy, and PyTorch
- A background in ML theory, Mathematics, or Physics
- Experience with Bayesian ML models (e.g., Gaussian Processes, Bayesian Neural Networks)
- Practical experience or familiarity with Trust Region methods (PPO) and RL from Human Feedback (RLHF)
- Proven ability to debug and set up automated testing frameworks
Benefits
- Hybrid work arrangement
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, NumPy, PyTorch, Bayesian ML models, Gaussian Processes, Bayesian Neural Networks, Trust Region methods, PPO, RL from Human Feedback, automated testing frameworks
Soft skills
problem-solving, analytical thinking