
Data Scientist, Reinforcement Learning
Binance
full-time
Location: 🇺🇸 United States
Job Level
Mid-Level, Senior
Tech Stack
Python, Web3
About the role
- Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users.
- You will develop and optimize RL models for enterprise-scale applications such as customer service, token reporting, compliance, and Web3 domain reasoning.
- You will explore and evaluate advanced algorithms including PPO, GRPO, DPO, RLHF, RLAIF, and Agentic RL to enhance the capabilities of LLMs, VLMs, and Agentic AI at Binance.
- The role requires a strong theoretical foundation in RL—covering policy optimization, reward modeling, and planning—paired with the engineering skills to build scalable production systems.
- You will take full ownership from research through deployment, driving experimentation with systematic evaluation and benchmarking.
- Collaboration across research, infrastructure, and application teams will be key to delivering impactful AI solutions.
Requirements
- Master’s degree in Computer Science, Applied Mathematics, Machine Learning, or a related field.
- 3+ years of hands-on experience in RL or LLM/VLM/Agentic AI optimization.
- Strong coding skills in Python, with experience in ML frameworks and RL libraries.
- Experience with large-scale distributed training and optimization.
- Self-driven, with an ownership mindset and strong problem-solving skills. Excellent communication skills for cross-functional collaboration.