
Senior Machine Learning Engineer – Agents Data
Canva
full-time
Location Type: Remote
Location: Remote • 🇦🇹 Austria
Job Level
Senior
Tech Stack
AWS • Cloud • Distributed Systems • Python • Ray
About the role
- Design and build data pipelines for agent training: collection, filtering, deduplication, formatting, and versioning across text, image, and multimodal sources (a small illustrative sketch follows this list)
- Develop tooling for dataset construction—including human annotation workflows, synthetic data generation, and preference data collection for RLHF/DPO-style training
- Own data quality: build validation frameworks, monitor for drift and contamination, and establish standards that make datasets trustworthy and reproducible
- Create evaluation datasets and benchmarks in collaboration with researchers—curating task distributions that surface real failure modes
- Build and maintain infrastructure for efficient data loading, storage, and retrieval at scale (S3, distributed systems, streaming pipelines)
- Collaborate with research scientists to translate research requirements into concrete data specifications, and iterate as experiments reveal new needs
- Document datasets thoroughly: provenance, known limitations, intended use cases, and versioning history
- Profile and optimize research code for training and inference efficiency, and implement comprehensive test coverage for data pipelines and ML workflows to ensure reliability and catch regressions early
- Elevate codebase quality through code reviews, refactoring, and establishing engineering best practices that help research velocity scale sustainably
- Contribute to team roadmaps by identifying data bottlenecks and proposing solutions that unblock research velocity
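To make the first responsibility above concrete, here is a minimal sketch of an exact-deduplication and filtering step built on Ray Data. It is illustrative only, not Canva's actual pipeline; the S3 paths, column names, and the choice of SHA-256 content hashing are assumptions made for the example.

```python
# Minimal sketch: filter + exact-dedup a raw text corpus with Ray Data.
# Paths and thresholds are hypothetical.
import hashlib

import ray

# Read raw text documents (hypothetical S3 prefix); each row has a "text" column.
ds = ray.data.read_text("s3://example-bucket/raw-agent-logs/")

def add_hash(row):
    # Attach a content hash so exact duplicates can be grouped and dropped.
    row["doc_hash"] = hashlib.sha256(row["text"].encode("utf-8")).hexdigest()
    return row

def keep_first(group):
    # Keep one representative row per hash (exact deduplication).
    return {k: v[:1] for k, v in group.items()}

deduped = (
    ds.filter(lambda row: len(row["text"].strip()) > 0)  # drop empty documents
    .map(add_hash)
    .groupby("doc_hash")
    .map_groups(keep_first, batch_format="numpy")
)

# Write sharded Parquet output under an explicit dataset version.
deduped.write_parquet("s3://example-bucket/curated/v1/")
```

In practice a step like this would sit alongside fuzzy deduplication, quality filters, and contamination checks against evaluation sets, but the overall shape of the pipeline stays the same.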
Requirements
- Strong software engineering skills in Python, with experience building production-grade data pipelines and ML DevOps
- Practical experience with prompt engineering—designing, testing, and refining prompts for reliable LLM/VLM outputs
- Experience with ML data workflows: large-scale data processing and loading (Ray or similar), data versioning, and format considerations for training (tokenization, batching, sharding; see the sketch after this list)
- Hands-on experience working with data pipelines for large-scale distributed ML training runs
- Familiarity with annotation tooling and human-in-the-loop data collection (Label Studio or internal systems)
- Understanding of ML training requirements—you know what "good data" looks like for LLM/VLM fine-tuning and can anticipate downstream issues
- Experience loading and writing large datasets to/from cloud infrastructure (AWS) and distributed storage systems
- Strong communication skills: you can work with researchers to scope ambiguous problems and translate needs into actionable plans
- A collaborative approach: comfortable taking ownership and iterating quickly
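As an illustration of the format considerations named above (tokenization, batching, sharding), the sketch below tokenizes plain-text documents, packs the token stream into fixed-length rows, and writes size-bounded shards. The Hugging Face tokenizer, local .txt inputs, sequence length, and shard size are all assumptions chosen for the example.

```python
# Minimal sketch: tokenize -> pack -> shard text for LLM training.
# Model name, paths, SEQ_LEN and SHARD_SIZE are illustrative assumptions.
from pathlib import Path

import numpy as np
from transformers import AutoTokenizer

SEQ_LEN = 2048       # fixed context length for packed training rows
SHARD_SIZE = 10_000  # sequences per output shard

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def pack_and_shard(input_dir: str, output_dir: str) -> None:
    """Tokenize documents, pack them into fixed-length rows, and write shards."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    buffer: list[int] = []      # running token stream across documents
    rows: list[list[int]] = []  # packed rows waiting to be flushed
    shard_idx = 0

    for path in sorted(Path(input_dir).glob("*.txt")):
        ids = tokenizer(path.read_text(encoding="utf-8"))["input_ids"]
        buffer.extend(ids + [tokenizer.eos_token_id])  # separate documents with EOS
        while len(buffer) >= SEQ_LEN:
            rows.append(buffer[:SEQ_LEN])
            buffer = buffer[SEQ_LEN:]
            if len(rows) == SHARD_SIZE:
                # uint16 fits GPT-2's ~50k-token vocabulary; larger vocabularies need a wider dtype.
                np.save(out / f"shard_{shard_idx:05d}.npy",
                        np.array(rows, dtype=np.uint16))
                rows, shard_idx = [], shard_idx + 1

    if rows:  # flush the final partial shard
        np.save(out / f"shard_{shard_idx:05d}.npy", np.array(rows, dtype=np.uint16))
```

Fixed-length packing and predictable shard sizes keep batching and checkpoint-safe resumption simple during distributed training runs.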
Benefits
- Equity packages - we want our success to be yours too
- Inclusive parental leave policy that supports all parents & carers
- An annual Vibe & Thrive allowance to support your wellbeing, social connection, home office setup & more
- Flexible leave options that empower you to be a force for good, take time to recharge, and support you personally
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python • data pipelines • ML DevOps • prompt engineering • large-scale data processing • data versioning • tokenization • batching • sharding • data validation
Soft skills
strong communication • collaborative approach • ownership • problem scoping • actionable planning