Canva

Machine Learning Engineer, Training Optimization

full-time

Location Type: Remote

Location: China

About the role

  • You’ll design, implement, and optimize large-scale machine learning systems for training.
  • You’ll improve all aspects of performance, including GPU utilization, communication overhead, and memory efficiency.
  • You’ll partner with research and modeling teams to align systems with algorithmic needs.
  • You’ll evaluate and apply best practices for distributed training using industry-leading frameworks.
  • You’ll dive deep into low-level optimization, including custom CUDA or Triton kernels.
  • You’ll debug, profile, and fine-tune training workflows to unlock new levels of scalability.

Requirements

  • Strong background in LLMs, multimodal AI, or diffusion models.
  • Proficiency in Python.
  • Familiarity with a systems programming language (e.g., C++ or Rust) is a plus.
  • Deep knowledge of PyTorch or JAX as well as libraries such as Megatron-LM, NeMo, or DeepSpeed.
  • Familiarity with common optimization techniques such as FSDP/ZeRO, gradient checkpointing, or low-precision data types.
  • Hands-on experience writing custom GPU kernels in CUDA or Triton.
  • Excellent communication and problem-solving skills, including full proficiency in English.

Benefits

  • Employees can work remotely.