Character.AI

Research Engineer – AI Safety, Alignment

Full-time

Location Type: Hybrid

Location: Redwood City, California, United States

Salary

💰 $225,000 - $400,000 per year

About the role

  • Develop and implement novel evaluation methodologies and metrics to assess the safety and alignment of large language models.
  • Research and develop cutting-edge techniques for model alignment, value learning, and interpretability.
  • Conduct adversarial testing to proactively uncover potential vulnerabilities and failure modes in our models.
  • Analyze and mitigate biases, toxicity, and other harmful behaviors in large language models through techniques like reinforcement learning from human feedback (RLHF) and fine-tuning.
  • Collaborate with engineering and product teams to translate safety research into practical, scalable solutions and best practices.
  • Stay abreast of the latest advancements in AI safety research and contribute to the academic community through publications and presentations.

Requirements

  • Hold a PhD (or equivalent experience) in a relevant field such as Computer Science, Machine Learning, or a related discipline.
  • Write clear and clean production-facing and training code.
  • Experience working with GPUs (training, serving, debugging).
  • Experience with data pipelines and data infrastructure.
  • Strong understanding of modern machine learning techniques, particularly transformers and reinforcement learning, with a focus on their safety implications.
  • Passion for the responsible development of AI and dedication to solving complex safety challenges.

Nice to Have

  • Experience with product experimentation and A/B testing.
  • Experience training large models in a distributed setting.
  • Familiarity with ML deployment and orchestration (Kubernetes, Docker, cloud).
  • Experience with explainable AI (XAI) and interpretability techniques.
  • Research experience in AI safety, alignment, ethics, or a related area.
  • Knowledge of the broader societal and ethical implications of AI, including policy and governance.
  • Publications in relevant academic journals or conferences in the field of machine learning.

Benefits

  • Top-notch health coverage for you & your family, with the majority of the premium covered
  • We invest in your future with a generous 401(k) contribution
  • New parents, we've got you covered with incredible paid leave of up to 20 weeks
  • 4 weeks of PTO to explore, unwind & come back recharged
  • Daily in-office catering plus a monthly DoorDash stipend to help keep you fueled no matter where you are
  • Monthly wellness stipend to support you in your health journey
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
evaluation methodologies, model alignment, value learning, interpretability, adversarial testing, reinforcement learning, fine-tuning, transformers, data pipelines, machine learning techniques
Soft Skills
collaboration, communication, passion for responsible AI development, problem-solving, dedication to safety challenges
Certifications
PhD in Computer Science, PhD in Machine Learning