Autodesk

Principal Machine Learning Operations Developer – AEC

full-time

Origin: 🇨🇦 Canada

Job Level

Lead

Tech Stack

Apache, AWS, Azure, Cloud, Distributed Systems, Docker, Kubernetes, Python, PyTorch, Ray, Spark

About the role

  • Support AI researchers by building scalable ML training pipelines and infrastructure for foundation model development
  • Design efficient data processing workflows for large-scale design datasets and industry-specific file formats
  • Optimize distributed training systems and develop solutions for model parallelism, checkpointing, and efficient resource management
  • Analyze performance bottlenecks and provide solutions to scaling problems
  • Implement and maintain robust, testable, well-documented code
  • Collaborate with researchers and engineers on projects at the intersection of research and product
  • Present results to collaborators and leadership
  • Contribute to infrastructure that enables ML-powered product features for AEC (architecture, engineering, construction)

Requirements

  • BSc or MSc in Computer Science or related field, or equivalent industry experience
  • Experience with distributed systems for machine learning and deep learning at scale
  • Strong knowledge of ML infrastructure and model parallelism techniques
  • Experience with frameworks such as PyTorch, Lightning, Megatron, DeepSpeed, and FSDP
  • Proficiency in Python and strong software engineering practices
  • Experience with cloud services and architectures (AWS, Azure, etc.)
  • Familiarity with version control, CI/CD, and deployment pipelines
  • Excellent written documentation skills
  • Preferred: Experience with AEC data formats (BIM models, IFC files, CAD files, Drawing Sets)
  • Preferred: Knowledge of the AEC industry and its specific data processing challenges
  • Preferred: Experience scaling ML training and data pipelines for large datasets
  • Preferred: Experience with distributed data processing and ML infrastructure (Apache Spark, Ray, Docker, Kubernetes)
  • Preferred: Experience with performance optimization, monitoring, and efficiency in large-scale ML systems
  • Preferred: Experience with Autodesk or similar products (Revit, SketchUp, Forma)
  • Ability to work effectively on a global, remote-first team; self-starter and adaptable in ambiguous environments