ML Ops Engineer
Pharmacy2U Ltd
ML Ops Engineer driving production-grade machine learning and LLM services for a leading pharmacy, ensuring models run reliably and efficiently within a hybrid work environment.
Tech Stack
Tools & technologies: Azure, Docker, Kubernetes, Python, PyTorch, TensorFlow
About the role
Key responsibilities & impact
- Design and operate CI/CD pipelines for ML models and LLM prompt‑flows, covering build, test, validation, deployment, and rollback
- Own model registration and promotion across environments, ensuring traceability, governance, and auditability
- Implement safe deployment strategies (e.g. blue/green, canary, champion/challenger)
- Package and deploy containerised inference services and batch pipelines, ensuring repeatability and rapid rollback
- Run ML and LLM services as production‑grade systems, defining SLOs/SLIs, dashboards, and alerting
- Lead incident response for runtime issues, including triage, mitigation, recovery, and post‑incident reviews
- Develop and maintain operational runbooks covering restart, rollback, secret rotation, and safe‑mode scenarios
- Improve service resilience and reduce MTTR through automation (e.g. self‑healing, retries, fallbacks, circuit breakers)
- Implement monitoring for availability, latency, errors, resource usage, and job performance
- Monitor data quality including freshness, volume, completeness, schema drift, and distribution changes
- Monitor model performance, including drift and prediction distribution shifts, and track accuracy where labels exist
- Instrument LLM services for token usage, latency, and safety signals, with clear visibility into cost, quotas, and risks
- Manage prompts and workflows as code, including versioning, code reviews, and automated regression testing
- Own production configuration for LLM deployments, including model updates, limits, and safeguards
- Partner with Data Science and Security to ensure robust safety practices, including PII protection and prompt‑injection testing
- Implement secure access controls, identity management, and secrets handling aligned to best practice
- Support production readiness through documentation, monitoring plans, cost models, and audit evidence
- Ensure all changes follow structured governance, with clear traceability and reproducibility
Requirements
What you’ll need
- Strong Python engineering skills, with experience in ML frameworks such as scikit‑learn, PyTorch, or TensorFlow, and familiarity with experiment tracking
- Comfortable working in regulated environments, with an understanding of privacy, auditability, change control, and handling sensitive data
- Strong DevOps/SRE background, including CI/CD, Infrastructure as Code, monitoring and alerting, incident management, and reliability engineering
- Hands‑on experience with containerisation using tools such as Docker and Kubernetes (e.g. AKS), including debugging, performance tuning, and working with container registries
- Experience working with Azure, ideally including Azure Machine Learning (pipelines, registries, online and batch endpoints) and Azure Monitor or Log Analytics
- Experience operationalising ML pipelines, including training, batch scoring, feature engineering workflows, and preventing training‑serving skew
- Experience implementing safe deployment practices such as blue/green or canary releases, supported by automated validation
- Understanding of data contracts, schema evolution, and data quality practices, with the ability to troubleshoot data drift and missing features
Benefits
Comp & perks
- Competitive contributory pension
- Occupational sick pay
- Long-service awards and refer-a-friend bonuses
- Professional registration fees covered (GPhC, NMC, CIPD and more)
- Cycle to Work and Green Car schemes (subject to eligibility)
- Enhanced maternity and paternity pay
- Flexible hybrid working to help balance work and home life
- Private healthcare insurance at discounted rates (Aviva)
- Employee Assistance Programme and in-house mental health support
- Access to discounted gym memberships via Blue Light Card and benefits schemes
- Regular health and wellbeing initiatives
- Strong commitment to CPD, training and professional development
- 25 days’ annual leave, increasing with service
- Buy and sell holiday scheme
- Blue Light Card and employee discount platform
- Exclusive discounts at The Springs, Leeds
- 25% off health & beauty purchases
- 25% off Pharmacy2U Private Online Doctor services
- Regular social events throughout the year
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, ML frameworks, scikit-learn, PyTorch, TensorFlow, CI/CD, Infrastructure as Code, containerisation, Azure Machine Learning, data quality
Soft Skills
incident management, leadership, communication, collaboration, problem-solving, organizational skills, adaptability, attention to detail, critical thinking, time management