
Research Scientist – Model Team
Mirelo AI
full-time
Location Type: Hybrid
Location: Berlin • Germany
About the role
- Design, implement and train large-scale multimodal generative models for audio generation (diffusion and/or autoregressive models).
- Explore new modeling ideas for audio generation (music, sound, speech) while taking inspiration from the language and image domains.
- Develop and experiment with post-training methods for new capabilities (fine-grained control, in/out-painting, editing, …).
- Conduct rigorous ablation studies, derive actionable insights, and communicate results to the team to inform new research directions.
- Contribute hands-on to all stages of model development including data curation, experimentation, evaluation, and deployment.
Requirements
- Hands-on experience in training large-scale generative models in a fast-paced research environment.
- Deep understanding of cutting-edge methods and ML research in at least one of the following domains: image, language, video, or audio (audio-specific experience is nice to have but not required).
- Strong proficiency in PyTorch, transformer architectures, and the full ecosystem of modern deep learning.
- Solid understanding of distributed training techniques: FSDP, low-precision training, and model parallelism.
- Strong track record of working on generative models (publications in top-tier venues, open-source contributions, or applied ML projects).
Benefits
- Competitive compensation and equity
- True ownership from day one
- Build for the next generation of creators
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
generative models, audio generation, diffusion models, autoregressive models, PyTorch, transformer architectures, distributed training techniques, FSDP, low precision training, model parallelism
Soft Skills
communication, team collaboration, research direction discussion, actionable insights