Aleph Alpha

Applied Research Engineer – LLM Training

Full-time

Location: 🇩🇪 Germany

Job Level

Mid-Level, Senior

Tech Stack

Go, Python, PyTorch, Rust

About the role

  • Research and develop novel approaches and algorithms that improve training of foundation models for practical use in real-world applications
  • Develop large-scale, robust, distributed training and data generation pipelines
  • Analyze and benchmark both state-of-the-art and newly proposed approaches in LLM research
  • Collaborate with scientists and engineers at Aleph Alpha, Aleph Alpha Research, external industrial and academic partners, and directly with customers
  • Publish individual and collaborative work at machine learning venues, and release code and models for use by the broader research community
  • Own the entire lifecycle of model training, contributing to pre-training, post-training, synthetic data generation, and scalable training/data pipelines
  • Deliver model artifacts that power product offerings and enable language model applications in finance, administration, R&D, logistics, and manufacturing processes

Requirements

  • Recent experience addressing complex, cutting-edge AI challenges, with expertise in at least one of: distributed training, training data, model architectures
  • Advanced knowledge of transformers, deep learning concepts and practices, and ideally experience coding and pretraining LLMs from scratch
  • Strong software engineering skills, with expertise in Python and related deep-learning frameworks (PyTorch)
  • Experience shipping production-ready models built on open-source AI libraries
  • Proven ability to apply advanced scientific methods to novel problems, resulting in impactful outputs such as publications or projects
  • Willingness to work from Heidelberg, Berlin, or in a hybrid setup within Germany; the employer covers travel expenses to the Research HQ in Heidelberg for occasional onsite work
  • (Preferred) PhD in machine learning or related fields, with publications in top-tier ML/AI venues (e.g., NeurIPS, ICML, ICLR, EMNLP, NAACL, ACL)
  • (Preferred) Experience writing kernels for GPUs (with CUDA, Triton, etc.)
  • (Preferred) Production-level skills with at least one other programming language, especially systems languages (Rust, C/C++, Go, etc.)
  • (Preferred) Fluency in writing scientific documentation and proposals, with strong public speaking skills in scientific contexts
  • (Preferred) Strong collaborative and interpersonal skills, with a track record of contributing to a multidisciplinary team's technical and strategic success