Research and develop novel approaches and algorithms that improve training of foundation models for practical use in real-world applications
Develop large-scale, robust, distributed training and data generation pipelines
Analyze and benchmark state-of-the-art and novel approaches in LLM research
Collaborate with scientists and engineers at Aleph Alpha, Aleph Alpha Research, external industrial and academic partners, and directly with customers
Publish your own and collaborative work at machine learning venues, and release code and models for use by the broader research community
Own the entire lifecycle of model training, contributing to pre-training, post-training, synthetic data generation, and scalable training/data pipelines
Deliver model artifacts that power product offerings and enable language model applications in finance, administration, R&D, logistics, and manufacturing processes
Requirements
Recent experience addressing complex, cutting-edge AI challenges, with expertise in at least one of: distributed training, training data, model architectures
Advanced knowledge of transformers, deep learning concepts and practices, and ideally experience coding and pretraining LLMs from scratch
Strong software engineering skills, with expertise in Python and related deep-learning frameworks (PyTorch)
Experience shipping production-ready models and building on open-source AI libraries
Proven ability to apply advanced scientific methods to novel problems, resulting in impactful outputs such as publications or projects
Willingness to work from Heidelberg, Berlin, or in a hybrid setup within Germany; employer covers travel expenses to Research HQ in Heidelberg for occasional onsite work
(Preferred) PhD in machine learning or a related field, with publications in top-tier ML/AI venues (e.g., NeurIPS, ICML, ICLR, EMNLP, NAACL, ACL)
(Preferred) Experience writing kernels for GPUs (with CUDA, Triton, etc.)
(Preferred) Production-level skills with at least one other programming language, especially systems languages (Rust, C/C++, Go, etc.)
(Preferred) Fluency in writing scientific documentation and proposals, with strong public speaking skills in scientific contexts
(Preferred) Strong collaborative and interpersonal skills, with a track record of contributing to a multidisciplinary team's technical and strategic success