Design, build, and own the end-to-end MLOps infrastructure on AWS, with a heavy emphasis on scalable data engineering and reliable, cost-efficient ML systems.
Implement and manage high-throughput, event-driven ML workflows (S3, Lambda, SQS, Step Functions, Batch) to support both data-centric pipelines and model execution.
Develop and maintain robust CI/CD pipelines for model deployment and promotion, enforcing best practices for Git, semantic versioning, and multi-branch release strategies.
Orchestrate complex data pipelines for the ingestion, processing, and updating of embeddings in vector databases (e.g., Qdrant, ChromaDB).
Establish and manage systems for training phase management and experiment tracking (e.g., MLflow, SageMaker Experiments) and evaluate modern model serving tools (e.g., BentoML).
Implement comprehensive security measures, including least-privilege access control (IAM) and secure credential management for models and APIs.
Collaborate with data science teams to translate prototypes (including LLMs and standalone APIs) into production-grade services, with clear strategies for monitoring model health in production.
Requirements
Bachelor's degree in Computer Science, Engineering, or a related technical field.
5+ years of professional experience in MLOps, DevOps, or a senior Data Engineering role with a focus on operationalizing machine learning models.
Expert-level proficiency in Python and Bash for pipeline automation and scripting, including extensive experience with the AWS SDK for Python (Boto3).
Deep, hands-on experience with core AWS services, including S3, Lambda, SageMaker, IAM, and a solid understanding of networking within VPCs.
Proven experience building and deploying containerized applications (Docker), especially for serving ML models and LLM-based APIs.
Deep familiarity with Git workflows (branching, merging, rebasing) and experience implementing CI/CD pipelines using tools like GitHub Actions or AWS CodePipeline.
Demonstrated experience in designing and orchestrating complex, data-engineering-heavy pipelines, from data ingestion through to production inference.
Benefits
Fast-paced startup environment where your ideas can quickly become reality
Opportunity to wear multiple hats and grow beyond your job description
Remote-first culture with home office support
Comprehensive health benefits (Medical, Dental, Vision, HSA)
401(k) plan and life insurance
Flexible time off and 12 weeks of parental leave
Professional development reimbursement