Salary
💰 $280,000 - $340,000 per year
Tech Stack
AWS, Cloud, Distributed Systems, Google Cloud Platform, Kubernetes
About the role
- Build the platform infrastructure for ML at scale
- Keep training jobs running smoothly
- Enable model evaluation and exploration
- Scale production inference
Requirements
- 5+ years building distributed systems, data pipelines, and infrastructure at scale
- Experience managing engineering teams of 3-8 people
- Experience with cloud platforms (AWS/GCP)
- Experience with container orchestration (Kubernetes/ECS)
- Proven experience with monitoring and reliability
- Comfortable working directly with researchers and data scientists
Benefits
- Health insurance
- 401(k) plan
- Flexible work arrangements
- Professional development
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
distributed systems, data pipelines, infrastructure, cloud platforms, AWS, GCP, container orchestration, Kubernetes, ECS, monitoring
Soft skills
team management, leadership, collaboration, communication