FirstPrinciples Holding Company

Member of Technical Staff, Training Engineer – Large Scale Foundation Models

Full-time

Location Type: Remote

Location: Remote • 🇨🇦 Canada

Job Level

Lead

Tech Stack

Node.js, PyTorch

About the role

  • Develop and lead end-to-end pre-training of large language models on GPU clusters.
  • Combine deep engineering expertise with research intuition.
  • Build data pipelines and perform distributed training at scale.
  • Make informed decisions about microbatch and global batch configurations.
  • Provide strategic insights to the executive team on financial implications.
  • Design capital allocation frameworks for sustainability.
  • Operate distributed training infrastructure using modern techniques.
  • Write production-grade PyTorch and Triton/CUDA kernels when required.
  • Lead cross-functional efforts and mentor engineers.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or related field.
  • 7-12+ years of total experience, including 2+ years training large Transformers at scale.
  • Hands-on experience with at least one frontier-style training run.
  • Expert-level proficiency in PyTorch (including compiled mode/torch.compile).
  • Deep facility with distributed frameworks (PyTorch FSDP or DeepSpeed ZeRO).
  • Proven success operating multi-node GPU jobs.
  • Demonstrated impact from data quality work.
  • Strong applied mathematics background.

Benefits

  • Health insurance
  • Innovative research environment
  • Collaboration with top experts
  • Opportunity to work on groundbreaking technology
  • Flexible remote work

Applicant Tracking System Keywords

Hard skills
large language models, GPU clusters, data pipelines, distributed training, microbatch configurations, global batch configurations, PyTorch, Triton, CUDA, applied mathematics
Soft skills
leadership, mentoring, strategic insights, cross-functional collaboration
Certifications
Bachelor's degree, Master's degree