PlanHub

MLOps Engineer

Full-time

Location Type: Remote

Location: Remote • 🇺🇸 United States

Job Level

Mid-Level, Senior

Tech Stack

Airflow, AWS, Docker, Jenkins, Python, Terraform

About the role

  • Design and implement end-to-end ML infrastructure using AWS services (SageMaker, Lambda, ECS, ECR, Glue, etc.)
  • Build and maintain feature stores for efficient feature engineering, storage, and serving
  • Create CI/CD pipelines for automated model testing, validation, deployment, and rollback
  • Implement comprehensive model monitoring, observability, and alerting systems to track performance, drift, and reliability
  • Manage compute resources and optimize infrastructure costs for training and inference workloads
  • Establish MLOps best practices, including version control for data, models (e.g., model lifecycle management), experiments, and infrastructure
  • Automate infrastructure provisioning and management using Terraform or CDK
  • Collaborate with AI Engineers to understand requirements, remove deployment friction, and accelerate the model development lifecycle
  • Build tools and automation that enable self-service model deployment and experimentation

Requirements

  • Bachelor's degree in Computer Science, Engineering, or related field
  • 5+ years of experience in MLOps, DevOps, or infrastructure engineering with exposure to ML systems
  • Strong understanding of the ML model lifecycle from training to production deployment and monitoring
  • Hands-on expertise with AWS services, including SageMaker, Lambda, ECS/ECR, Glue, Athena, S3, and CloudWatch
  • Proficiency with containerization (Docker) and orchestration tools
  • Solid software engineering skills in Python and experience with infrastructure-as-code (Terraform or CloudFormation)
  • Experience with CI/CD tools and platforms (GitHub Actions, GitLab CI, Jenkins, or similar)
  • Knowledge of ML platforms and tools (MLflow, Kubeflow, Airflow, or AWS-native alternatives)
  • Understanding of model monitoring concepts including data drift, performance degradation, and retraining triggers
  • Strong problem-solving skills and ability to design scalable, reliable systems

Benefits

  • Remote friendly
  • Open time-off policy
  • 401(k)/RRSP plan with a company match

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
MLOps, DevOps, infrastructure engineering, ML model lifecycle, AWS SageMaker, AWS Lambda, AWS ECS, AWS ECR, AWS Glue, Python
Soft skills
problem-solving, collaboration, communication
Certifications
Bachelor's degree in Computer Science, Bachelor's degree in Engineering