
Research Engineer – AI Safety, Alignment
Character.AI
Full-time
Location Type: Hybrid
Location: Redwood City • California • United States
Salary
💰 $225,000 - $400,000 per year
About the role
- Develop and implement novel evaluation methodologies and metrics to assess the safety and alignment of large language models.
- Research and develop cutting-edge techniques for model alignment, value learning, and interpretability.
- Conduct adversarial testing to proactively uncover potential vulnerabilities and failure modes in our models.
- Analyze and mitigate biases, toxicity, and other harmful behaviors in large language models through techniques like reinforcement learning from human feedback (RLHF) and fine-tuning.
- Collaborate with engineering and product teams to translate safety research into practical, scalable solutions and best practices.
- Stay abreast of the latest advancements in AI safety research and contribute to the academic community through publications and presentations.
Requirements
- Hold a PhD (or have equivalent experience) in Computer Science, Machine Learning, or a related discipline.
- Write clear, clean production-facing and training code.
- Have experience working with GPUs (training, serving, debugging).
- Have experience with data pipelines and data infrastructure.
- Have a strong understanding of modern machine learning techniques, particularly transformers and reinforcement learning, with a focus on their safety implications.
- Are passionate about the responsible development of AI and dedicated to solving complex safety challenges.
Nice to Have
- Experience with product experimentation and A/B testing
- Experience training large models in a distributed setting
- Familiarity with ML deployment and orchestration (Kubernetes, Docker, cloud)
- Experience with explainable AI (XAI) and interpretability techniques
- Research experience in AI safety, alignment, ethics, or a related area
- Knowledge of the broader societal and ethical implications of AI, including policy and governance
- Publications in relevant academic journals or conferences in the field of machine learning
Benefits
- Top-notch health coverage for you & your family, with the majority of the premium covered
- We invest in your future with a generous 401(K) contribution
- New parents, we've got you covered with incredible paid leave: up to 20 weeks
- 4 weeks of PTO to explore, unwind & come back recharged
- Daily in-office catering plus a monthly DoorDash stipend to help keep you fueled no matter where you are
- Monthly wellness stipend to support you in your health journey
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
evaluation methodologies, model alignment, value learning, interpretability, adversarial testing, reinforcement learning, fine-tuning, transformers, data pipelines, machine learning techniques
Soft Skills
collaboration, communication, passion for responsible AI development, problem-solving, dedication to safety challenges
Certifications
PhD in Computer Science, PhD in Machine Learning