Gridware

Senior ML Infrastructure Engineer

Full-time

Location Type: Hybrid

Location: San Francisco • California • 🇺🇸 United States

Salary

💰 $190,000 - $210,000 per year

Job Level

Senior

Tech Stack

AWS • Cloud • Kubernetes • Python

About the role

  • Design, build, and maintain the infrastructure, tooling, and workflows that enable reliable, scalable deployment of ML models to production.
  • Develop monitoring and observability systems to track model performance, data drift, data quality, and overall system health.
  • Create and maintain end-to-end testing frameworks and simulation environments to validate models and pipelines prior to deployment.
  • Work closely with Data Engineering and Platform Engineering teams to ensure ML systems integrate cleanly with broader Gridware infrastructure and operational standards.
  • Improve CI/CD pipelines for ML workloads, ensuring reproducibility, safe rollout, and automated rollback strategies.

Requirements

  • 5+ years of experience building production ML infrastructure
  • Strong software engineering skills and proficiency in Python
  • Experience with cloud platforms (AWS) and container orchestration (Kubernetes)
  • Familiarity with feature stores, model registries, or centralized metadata systems (e.g., MLflow)

Benefits

  • Health, dental, and vision insurance (Gold and Platinum plans, with some providers' plans fully covered)
  • Paid parental leave
  • Alternating day off (every other Monday)
  • “Off the Grid,” a two-week paid break each year for all employees
  • Commuter allowance
  • Company-paid training

Applicant Tracking System Keywords

Hard skills
machine learning infrastructure • Python • CI/CD pipelines • end-to-end testing frameworks • monitoring systems • observability systems • data drift • data quality • Kubernetes • AWS