Principal MLOps Engineer
Solventum
Lead MLOps engineering as a Principal MLOps Engineer at Solventum, shaping AI integration in healthcare systems. Define operational standards and ensure reliability in clinical environments.
Posted 6/4/2026 • Full-time • Remote • Pennsylvania • 🇺🇸 United States • Lead • 💰 $142,800 - $196,350 per year
Tech Stack
Tools & technologies: Airflow, AWS, Azure, Cloud, Go, Google Cloud Platform, Grafana, Java, Kubernetes, Microservices, Prometheus, Python
About the role
Key responsibilities & impact
- Lead the operational architecture, deployment strategy, and reliability engineering for integrating AI into high-stakes Healthcare Information Systems (HIS)
- Define the enterprise operational standards, govern the release processes, and build the resilient infrastructure required to maintain models in mission-critical clinical environments
- Architect and govern the comprehensive release process, defining enterprise checklists, automated approval gates, release notes, and deployment readiness standards
- Establish the deployment execution standards for promoting AI across all environments and ensure customer deployments adhere to strict internal production discipline
- Architect and oversee the enterprise model registry, ensuring seamless integration with CI/CD pipelines and full version control traceability
- Define and enforce monitoring standards, establishing critical SLAs/SLOs, service health metrics, and comprehensive dashboards across the AI ecosystem
- Architect automated checks for input/output data quality and model drift, ensuring proactive detection of system degradation
- Establish and lead the production incident process, including rigorous triage workflows, severity escalation paths, postmortems, rollback mechanisms, and recovery infrastructure
- Partner with Platform teams to provide essential ATO (Authority to Operate) and compliance support, ensuring complete deployment traceability and strict operational controls
- Oversee comprehensive operational reporting, providing leadership with status updates across production systems, pre-prod testing, customer rollouts, and incident metrics
- Foster a culture of production discipline, guiding junior engineers in maintaining operational runbooks and reliable deployment pipelines
Requirements
What you’ll need
- Bachelor's degree or higher in Computer Science, Software Engineering, or a related technical field
- 10+ years of experience in software engineering, with at least 6 years dedicated to deploying and maintaining large-scale ML systems in production
- Expert-level experience with Cloud Providers (AWS/GCP/Azure) and orchestration tools (Kubernetes, Kubeflow, or Airflow)
- Expert-level Python and Java/Go (or similar)
- Deep proficiency in backend frameworks, microservices, and system design patterns
- Expert knowledge of monitoring stacks (Prometheus, Grafana, Datadog) and establishing enterprise SLAs/SLOs for AI services
- Proven track record of designing automated deployment pipelines, managing complex rollback procedures, and enforcing model registry governance at scale
Benefits
Comp & perks
- Medical
- Dental & Vision
- Health Savings Accounts
- Health Care & Dependent Care Flexible Spending Accounts
- Disability Benefits
- Life Insurance
- Voluntary Benefits
- Paid Absences
- Retirement Benefits
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Java, Go, Cloud Providers, Kubernetes, Kubeflow, Airflow, monitoring stacks, backend frameworks, microservices
Soft Skills
leadership, communication, organizational, mentoring, incident management, problem-solving, collaboration, production discipline, governance, reporting
Certifications
Bachelor's Degree, Master's Degree