Design and implement core platform backend software components.
Collaborate with ML engineers and researchers to understand critical workflows, translate them into platform requirements, and deliver incremental value.
Lead technical decision-making on model serving strategies, orchestration, caching, model versioning, and auto-scaling mechanisms for highly optimized use of accelerators.
Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization of inference services.
Proactively research and integrate state-of-the-art model serving frameworks, hardware accelerators, and distributed computing techniques.
Lead technical initiatives across GM’s ML ecosystem.
Raise the engineering bar through technical leadership and by establishing best practices.
Contribute to open source projects; represent GM in relevant communities.

Requirements

5+ years of industry experience, with a focus on machine learning systems or high-performance backend services.
Expertise in Python, C++, or other relevant programming languages.
Expertise in ML inference and model serving frameworks (Triton, Ray Serve, vLLM, etc.).
Strong communication skills and a proven ability to drive cross-functional initiatives.
Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities.

Benefits
