
Senior ML Infrastructure Engineer
Gridware
full-time
Location Type: Hybrid
Location: San Francisco • California • 🇺🇸 United States
Salary
💰 $190,000 - $210,000 per year
Job Level
Senior
Tech Stack
AWS, Cloud, Kubernetes, Python
About the role
- Design, build, and maintain the infrastructure, tooling, and workflows that enable reliable, scalable deployment of ML models to production.
- Develop monitoring and observability systems to track model performance, data drift, data quality, and overall system health.
- Create and maintain end-to-end testing frameworks and simulation environments to validate models and pipelines prior to deployment.
- Work closely with Data Engineering and Platform Engineering teams to ensure ML systems integrate cleanly with broader Gridware infrastructure and operational standards.
- Improve CI/CD pipelines for ML workloads, ensuring reproducibility, safe rollout, and automated rollback strategies.
Requirements
- 5+ years of experience building production ML infrastructure
- Strong software engineering skills and proficiency in Python
- Experience with cloud platforms (AWS) and container orchestration (Kubernetes)
- Familiarity with feature stores, model registries, or centralized metadata systems (e.g., MLflow)
Benefits
- Health, Dental & Vision (Gold and Platinum, with some providers' plans fully covered)
- Paid parental leave
- Alternating day off (every other Monday)
- “Off the Grid”, a two-week paid break each year for all employees
- Commuter allowance
- Company-paid training
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
machine learning infrastructure, Python, CI/CD pipelines, end-to-end testing frameworks, monitoring systems, observability systems, data drift, data quality, Kubernetes, AWS