Fortive

Senior ML Ops Engineer

full-time

Location Type: Remote

Location: India

About the role

  • Own the ML platform infrastructure (training, serving, feature store, model registry) with attention to cost, reliability, and security.
  • Lead initiatives to improve observability and monitoring for data pipelines and ML services, including data drift, model performance, and latency.
  • Provide on-call support for production ML services; drive incident management and disaster recovery for ML workloads.
  • Lead the implementation and adoption of MLOps automation (CI/CD for ML, model packaging, deployment, rollback, and retraining orchestration).
  • Partner with Data Science and Engineering to improve reproducibility, experiment tracking, and model governance (versioning, lineage, approvals).
  • Establish quality gates for datasets, features, and models (tests, validation, bias/risk checks) before promotion to production.
  • Drive platform and tooling improvements (build in-house frameworks, templates, and reusable components to accelerate ML delivery).
  • Champion Responsible AI practices: auditability, explainability, access controls, and compliance processes.
  • Implement and maintain model monitoring systems to track:
      • Prediction accuracy and performance metrics over time.
      • Data drift and concept drift detection to trigger retraining workflows.
      • Latency and resource utilization for inference services.
      • Alerts and dashboards for anomalies, failures, and SLA breaches.
  • Develop automated retraining and rollback strategies based on monitoring insights.

Requirements

  • 8+ years of experience in Cloud Ops.
  • 3+ years of experience in MLOps, ML engineering, or platform/DevOps roles supporting ML in production.
  • Proficient with containerization and orchestration: Docker, Kubernetes.
  • Experience building CI/CD pipelines for ML (GitHub Actions, GitLab CI, Jenkins).
  • Proficient with ML lifecycle tooling: MLflow, Kubeflow, TFX, model registries.
  • Strong Python skills and familiarity with ML frameworks (TensorFlow, PyTorch).
  • Experience deploying online/batch inference services and optimizing for latency and throughput.
  • Proficient with cloud platforms (AWS/GCP/Azure) and managed ML services.
  • Knowledge of data engineering foundations: feature stores, data validation, lineage.
  • Experience with observability: logs, metrics, traces (Prometheus, Grafana) and model/data drift monitoring.
  • Solid understanding of security and governance for ML systems.
  • Bachelor’s/Master’s degree in Computer Science, Engineering, Data Science, or related fields.
  • Experience with infrastructure as code (Terraform, CloudFormation).
  • Familiarity with feature stores and data quality frameworks.
  • Hands-on experience with real-time/streaming data and online feature serving.
  • Experience with model explainability and Responsible AI risk checks.
  • Certifications in cloud ML services or Kubernetes are a plus.
