
MLOps Engineer
PlanHub
full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Job Level
Mid-Level, Senior
Tech Stack
Airflow, AWS, Docker, Jenkins, Python, Terraform
About the role
- Design and implement end-to-end ML infrastructure using AWS services (SageMaker, Lambda, ECS, ECR, Glue, etc.)
- Build and maintain feature stores for efficient feature engineering, storage, and serving
- Create CI/CD pipelines for automated model testing, validation, deployment, and rollback
- Implement comprehensive model monitoring, observability, and alerting systems to track performance, drift, and reliability
- Manage compute resources and optimize infrastructure costs for training and inference workloads
- Establish MLOps best practices, including version control for data, models (e.g., model lifecycle management), experiments, and infrastructure
- Automate infrastructure provisioning and management using Terraform or CDK
- Collaborate with AI Engineers to understand requirements, remove deployment friction, and accelerate the model development lifecycle
- Build tools and automation that enable self-service model deployment and experimentation
Requirements
- Bachelor's degree in Computer Science, Engineering, or related field
- 5+ years of experience in MLOps, DevOps, or infrastructure engineering with exposure to ML systems
- Strong understanding of the ML model lifecycle from training to production deployment and monitoring
- Hands-on expertise with AWS services, including SageMaker, Lambda, ECS/ECR, Glue, Athena, S3, and CloudWatch
- Proficiency with containerization (Docker) and orchestration tools
- Solid software engineering skills in Python and experience with infrastructure-as-code (Terraform or CloudFormation)
- Experience with CI/CD tools and platforms (GitHub Actions, GitLab CI, Jenkins, or similar)
- Knowledge of ML platforms and tools (MLflow, Kubeflow, Airflow, or AWS-native alternatives)
- Understanding of model monitoring concepts including data drift, performance degradation, and retraining triggers
- Strong problem-solving skills and ability to design scalable, reliable systems
Benefits
- Remote friendly
- Open time-off policy
- 401(k)/RRSP plan with a company match
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
MLOps, DevOps, infrastructure engineering, ML model lifecycle, AWS SageMaker, AWS Lambda, AWS ECS, AWS ECR, AWS Glue, Python
Soft skills
problem-solving, collaboration, communication
Certifications
Bachelor's degree in Computer Science, Bachelor's degree in Engineering