Staff Machine Learning Engineer – ML Training Infrastructure
General Motors
Staff ML Engineer defining architecture and driving scalable ML infrastructure for AI at GM. Collaborating with teams to enhance intelligent driving technologies and optimize model training.
Posted 6/6/2026 | full-time | Sunnyvale • California, Texas, Washington • 🇺🇸 United States | Lead | 💰 $185,000 - $335,300 per year
Tech Stack
Tools & technologies: AWS, Azure, Cloud, Distributed Systems, Google Cloud Platform, Python, PyTorch, TensorFlow
About the role
Key responsibilities & impact
- Define and drive the architecture, design, and development of scalable, reliable, and high-performance ML frameworks and platform capabilities to support model training at scale.
- Lead model training performance analysis and optimization efforts across distributed training workflows, improving scalability, efficiency, and cost across heterogeneous hardware environments.
- Raise the bar on system observability, debuggability, operational excellence, and developer experience across the ML training stack.
- Own large, ambiguous, cross-functional technical initiatives from strategy through execution, including technical roadmap definition, tradeoff analysis, and delivery.
- Influence platform direction by identifying long-term infrastructure investments, setting engineering standards, and driving adoption of best practices across teams.
- Collaborate across organizational boundaries to align requirements, resolve technical disagreements, and integrate new capabilities into the platform ecosystem.
- Mentor engineers through design reviews, technical guidance, and hands-on partnership, while elevating engineering quality across the team.
Requirements
What you’ll need
- Bachelor's degree or higher in Computer Science or a related field, or equivalent practical experience.
- 7+ years of professional software engineering experience.
- 5+ years of specialized experience in AI/ML infrastructure, such as enabling distributed training for large-scale ML models.
- Strong programming skills in Python, with deep proficiency in frameworks such as PyTorch (preferred), TensorFlow, or similar ML systems.
- Proven experience designing and operating distributed systems for ML training, including distributed computing, GPU computing, and cloud environments (AWS, GCP, Azure).
- Demonstrated track record of leading technically ambiguous, cross-team infrastructure initiatives and driving them to measurable impact.
- Strong architectural judgment and ability to make sound technical tradeoffs across performance, reliability, usability, and cost.
- Willingness to travel to Sunnyvale, CA as needed.
- Comfortable operating in highly ambiguous and dynamic environments.
Benefits
Comp & perks
- GM offers a variety of health and wellbeing benefit programs.
- Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts and more.
- Company Vehicle: Upon successful completion of a motor vehicle report review, you will be eligible to participate in a company vehicle evaluation program, through which you will be assigned a General Motors vehicle to drive and evaluate.
ATS (Applicant Tracking System) Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
machine learning, distributed training, Python, PyTorch, TensorFlow, distributed systems, GPU computing, cloud environments, performance analysis, optimization
Soft Skills
leadership, collaboration, mentoring, technical guidance, problem-solving, communication, strategic thinking, adaptability, influence, operational excellence