Deepgram

ML Ops Infrastructure Engineer

full-time

Location Type: Remote

Location: United States

Salary

💰 $160,000 - $220,000 per year

About the role

  • Design and build CI/CD pipelines specifically tailored for ML model development, validation, and deployment
  • Architect and maintain model deployment pipelines that move models from research environments through staging to production with confidence
  • Build A/B testing infrastructure that enables controlled rollouts of new models and measures real-world performance impact
  • Implement comprehensive monitoring for model performance in production: accuracy metrics, latency, drift detection, and regression alerts
  • Develop automated retraining pipelines that trigger on data changes, performance degradation, or scheduled cadences
  • Create and maintain build and test environments that mirror production, giving researchers high-fidelity feedback before deployment
  • Establish model versioning, artifact management, and rollback capabilities to ensure safe and reproducible deployments
  • Collaborate with research engineers to define and enforce model quality gates before production promotion
  • Build observability dashboards that give the team real-time insight into model health across all environments
  • Optimize model serving infrastructure for latency, throughput, and cost efficiency

Requirements

  • 4+ years of experience in MLOps, DevOps, or infrastructure engineering with a focus on ML systems
  • Strong proficiency in Python and experience building automation and tooling for ML workflows
  • Deep experience with CI/CD systems and building pipelines for software and model delivery
  • Hands-on experience with Docker and Kubernetes for containerized workload management
  • Practical experience deploying and serving ML models in production environments
  • Familiarity with model evaluation, validation, and quality assurance processes
  • Understanding of monitoring and observability principles as applied to ML systems
  • Strong problem-solving skills and a bias toward automation over manual processes

Benefits

  • Medical, dental, vision benefits
  • Annual wellness stipend
  • Mental health support
  • Life, short-term disability (STD), and long-term disability (LTD) income insurance plans
  • Unlimited PTO
  • Generous paid parental leave
  • Flexible schedule
  • 12 paid US company holidays
  • Quarterly personal productivity stipend
  • One-time stipend for home office upgrades
  • 401(k) plan with company match
  • Tax Savings Programs
  • Learning / Education stipend
  • Participation in talks and conferences
  • Employee Resource Groups
  • AI enablement workshops / sessions

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
CI/CD pipelines, ML model development, A/B testing, monitoring, automated retraining, model versioning, artifact management, Python, Docker, Kubernetes

Soft Skills
problem-solving, collaboration, automation bias