
Senior Principal Machine Learning Engineer, Foundational Models
Autodesk
full-time
Location Type: Hybrid
Location: Boston • Massachusetts • United States
About the role
- Define the long-term technical vision for Generative AI and Foundation Model infrastructure within the AEC Solutions team.
- Influence architectural decisions across the broader organization.
- Lead the design, development, and delivery of complex ML systems.
- Own the full lifecycle from model architecture selection and data strategy to distributed training and production deployment.
- Drive the development of large-scale training pipelines.
- Collaborate with Research Scientists to translate experimental ideas (custom architectures, novel loss functions) into scalable, performant code.
- Architect solutions for distributed training (e.g., FSDP, Megatron-LM, DeepSpeed) on massive compute clusters.
- Identify and resolve bottlenecks in data processing and model parallelism to maximize training throughput.
- Mentor Principal and Senior engineers, fostering a culture of technical ownership, rigorous experimentation, and best practices.
- Act as a technical partner to Product Management and Engineering leadership.
- Partner effectively with Data Engineering, Platform, and Research teams to integrate large-scale multimodal AEC data (3D geometry, images, text) into model development workflows.
- Establish standards for model evaluation, versioning, monitoring, and MLOps best practices to ensure reproducibility and reliability in a high-stakes production environment.
Requirements
- Master’s or PhD in a field related to AI/ML such as Computer Science, Mathematics, Statistics, Physics, Computational Linguistics, or related disciplines
- 10+ years of experience in machine learning, AI, or related fields, with a proven track record of technical leadership and hands-on implementation
- Demonstrated experience mentoring engineers and leading technical projects in cross-functional environments
- Proven history of leading the delivery of large-scale ML systems from conception to production
- Expert-level understanding of deep learning architectures (Transformers, Diffusion models) and modern frameworks (PyTorch is required)
- Hands-on experience with distributed training frameworks and techniques (e.g., PyTorch Distributed, Ray, DeepSpeed, Megatron, CUDA optimization) in HPC or cloud environments (AWS/Azure)
- Strong proficiency in Python, with an emphasis on performance profiling, debugging, and writing robust, maintainable production code
- Excellent ability to translate complex technical concepts into clear insights for executive leadership and cross-functional partners
Benefits
- N/A
Applicant Tracking System Keywords
Hard skills
machine learning, deep learning architectures, Transformers, Diffusion models, distributed training frameworks, PyTorch, performance profiling, debugging, production code, MLOps
Soft skills
technical leadership, mentoring, cross-functional collaboration, communication, technical ownership, rigorous experimentation, problem-solving, influencing, translating technical concepts, fostering culture
Certifications
Master’s degree, PhD