Allstate

Cloud Platform Lead Engineer – ML DevOps

full-time

Location Type: Hybrid

Location: Illinois, North Carolina, United States

Salary

💰 $110,000 - $160,000 per year

About the role

  • Lead the design, build, and operation of cloud infrastructure supporting ML experimentation, training, and production deployments
  • Define technical direction and best practices for ML platforms, MLOps, reliability, and cloud infrastructure
  • Architect ML platforms for high availability, fault tolerance, and resiliency across supported environments
  • Build and oversee CI/CD pipelines and automation for infrastructure and ML workflows
  • Champion MLOps best practices including model versioning, validation, promotion, monitoring, and rollback strategies
  • Mentor engineers through design reviews, code reviews, and hands-on technical leadership

Requirements

  • Proven experience leading cloud platform or infrastructure initiatives
  • Strong hands-on experience with cloud platforms (Azure, AWS, and/or GCP)
  • Deep knowledge of infrastructure as code, automation, CI/CD, and reliability engineering
  • Experience designing highly available and resilient distributed systems
  • Experience with ML platforms or MLOps tooling (e.g., MLflow, Kubeflow, Azure ML, SageMaker, Vertex AI)
  • Familiarity with observability tools (e.g., Datadog, ELK, New Relic, Prometheus)
  • Strong communication skills and a leadership mindset
  • 6 or more years of experience (Preferred)

Benefits

  • Joining our team isn’t just a job; it’s an opportunity: one that takes your skills and pushes them to the next level, one that encourages you to challenge the status quo, and one where you can shape the future of protection while supporting causes that mean the most to you
  • Meaningful impact

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
cloud infrastructure, MLOps, CI/CD, infrastructure as code, automation, reliability engineering, distributed systems, model versioning, validation, monitoring

Soft Skills
leadership, communication, mentoring, technical leadership