Senior ML Ops Engineer

Fortive

Full-time

Posted on: 1/5/2026

Location Type: Remote

Location: India

Job Level

Senior

Tech Stack

AWS, Azure, Cloud, Docker, Google Cloud Platform, Grafana, Jenkins, Kubernetes, Prometheus, Python, PyTorch, TensorFlow, Terraform

About the role

Own the ML platform infrastructure (training, serving, feature store, model registry) with attention to cost, reliability, and security.
Lead initiatives to improve observability and monitoring for data pipelines and ML services, including data drift, model performance, and latency.
Provide on-call support for production ML services; drive incident management and disaster recovery for ML workloads.
Lead the implementation and adoption of MLOps automation (CI/CD for ML, model packaging, deployment, rollback, and retraining orchestration).
Partner with Data Science and Engineering to improve reproducibility, experiment tracking, and model governance (versioning, lineage, approvals).
Establish quality gates for datasets, features, and models (tests, validation, bias/risk checks) before promotion to production.
Drive platform and tooling improvements by building in-house frameworks, templates, and reusable components that accelerate ML delivery.
Champion Responsible AI practices: auditability, explainability, access controls, and compliance processes.
Implement and maintain model monitoring systems to track:
Prediction accuracy and performance metrics over time.
Data drift and concept drift detection to trigger retraining workflows.
Latency and resource utilization for inference services.
Alerts and dashboards for anomalies, failures, and SLA breaches.
Develop automated retraining and rollback strategies based on monitoring insights.

Requirements

8+ years of experience in Cloud Ops.
3+ years of experience in MLOps, ML engineering, or platform/DevOps roles supporting ML in production.
Proficient with containerization and orchestration: Docker, Kubernetes.
Experience building CI/CD pipelines for ML (GitHub Actions, GitLab CI, Jenkins).
Proficient with ML lifecycle tooling: MLflow, Kubeflow, TFX, model registries.
Strong Python skills and familiarity with ML frameworks (TensorFlow, PyTorch).
Experience deploying online/batch inference services and optimizing for latency and throughput.
Proficient with cloud platforms (AWS/GCP/Azure) and managed ML services.
Knowledge of data engineering foundations: feature stores, data validation, lineage.
Experience with observability: logs, metrics, traces (Prometheus, Grafana) and model/data drift monitoring.
Solid understanding of security and governance for ML systems.
Bachelor’s/Master’s degree in Computer Science, Engineering, Data Science, or related fields.
Experience with infrastructure as code (Terraform, CloudFormation).
Familiarity with feature stores and data quality frameworks.
Hands-on experience with real-time/streaming data and online feature serving.
Experience with model explainability and Responsible AI risk checks.
Certifications in cloud ML services or Kubernetes are a plus.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

MLOps, ML engineering, Cloud Ops, Python, Docker, Kubernetes, CI/CD, MLflow, Kubeflow, TensorFlow

Soft Skills

leadership, communication, incident management, disaster recovery, collaboration, problem-solving, observability, quality assurance, responsible AI practices, monitoring

Certifications

cloud ML services, Kubernetes