
Machine Learning Intern/Co-op, Spring 2026
Cohere
internship
Location Type: Remote
Location: Canada
About the role
- Design, train and improve upon cutting-edge models.
- Help us develop new techniques to train and serve models more safely, more effectively, and faster.
- Train extremely large-scale models on massive datasets.
- Explore continual and active learning strategies for streaming data.
- Learn from experienced senior machine learning technical staff.
- Work closely with product teams to develop solutions.
Requirements
- Proficiency in Python and related ML frameworks such as TensorFlow, TF-Serving, JAX, and XLA/MLIR.
- Experience using large-scale distributed training strategies.
- Familiarity with autoregressive sequence models, such as Transformers.
- Strong communication and problem-solving skills.
- A demonstrated passion for applied NLP models and products.
- Bonus: experience writing kernels for GPUs using CUDA.
- Bonus: experience training on TPUs.
- Bonus: papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP).
Benefits
- An open and inclusive culture and work environment
- Work closely with a team on the cutting edge of AI research
- Free daily lunch
- Full health and dental benefits, including a separate budget to take care of your mental health
- Remote-flexible, with offices in Toronto, New York, San Francisco, and London, plus coworking stipends
- Paid vacation
- Weekly lunch stipend, in-office lunches & snacks
- 100% Parental Leave top-up for up to 6 months
- Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
- 6 weeks of vacation (30 working days!)
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, TensorFlow, TF-Serving, JAX, XLA, MLIR, large-scale distributed training, autoregressive sequence models, CUDA, TPUs
Soft skills
communication, problem-solving