Canva

Senior Machine Learning Engineer – Agents Data

full-time

Location Type: Remote

Location: Remote • 🇦🇹 Austria

Job Level

Senior

Tech Stack

AWS, Cloud, Distributed Systems, Python, Ray

About the role

  • Design and build data pipelines for agent training: collection, filtering, deduplication, formatting, and versioning across text, image, and multimodal sources
  • Develop tooling for dataset construction—including human annotation workflows, synthetic data generation, and preference data collection for RLHF/DPO-style training
  • Own data quality: build validation frameworks, monitor for drift and contamination, and establish standards that make datasets trustworthy and reproducible
  • Create evaluation datasets and benchmarks in collaboration with researchers—curating task distributions that surface real failure modes
  • Build and maintain infrastructure for efficient data loading, storage, and retrieval at scale (S3, distributed systems, streaming pipelines)
  • Collaborate with research scientists to translate research requirements into concrete data specifications, and iterate as experiments reveal new needs
  • Document datasets thoroughly: provenance, known limitations, intended use cases, and versioning history
  • Profile and optimize research code for training and inference efficiency
  • Implement comprehensive test coverage for data pipelines and ML workflows to ensure reliability and catch regressions early
  • Elevate codebase quality through code reviews, refactoring, and establishing engineering best practices that help research velocity scale sustainably
  • Contribute to team roadmaps by identifying data bottlenecks and proposing solutions that unblock research velocity

Requirements

  • Strong software engineering skills in Python, with experience building production-grade data pipelines and ML DevOps
  • Practical experience with prompt engineering—designing, testing, and refining prompts for reliable LLM/VLM outputs
  • Experience with ML data workflows: large-scale data processing and loading (Ray or similar), data versioning, and format considerations for training (tokenization, batching, sharding)
  • Hands-on experience working with data pipelines for large-scale distributed ML training runs
  • Familiarity with annotation tooling and human-in-the-loop data collection (Label Studio or internal systems)
  • Understanding of ML training requirements—you know what "good data" looks like for LLM/VLM fine-tuning and can anticipate downstream issues
  • Experience loading and writing large datasets to/from cloud infrastructure (AWS) and distributed storage systems
  • Strong communication skills: you can work with researchers to scope ambiguous problems and translate needs into actionable plans
  • A collaborative approach: you are comfortable taking ownership and iterating quickly

Benefits

  • Equity packages - we want our success to be yours too
  • Inclusive parental leave policy that supports all parents & carers
  • An annual Vibe & Thrive allowance to support your wellbeing, social connection, home office setup & more
  • Flexible leave options that empower you to be a force for good, take time to recharge, and support you personally

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Python, data pipelines, ML DevOps, prompt engineering, large-scale data processing, data versioning, tokenization, batching, sharding, data validation
Soft skills
strong communication, collaborative approach, ownership, problem scoping, actionable planning