MLOps Pipeline Ownership: Design, implement, and manage end-to-end MLOps pipelines for the continuous training, deployment, monitoring, and versioning of high-impact ML models.
Data Pipeline Design & Orchestration: Design, build, and maintain robust data ingestion and transformation pipelines.
Kubernetes and Cloud Architecture: Architect, deploy, and maintain highly scalable, fault-tolerant infrastructure using Kubernetes on Google Cloud Platform (GCP).
DevOps and Automation: Drive DevOps culture and best practices, focusing on Infrastructure as Code (IaC) using Terraform or similar tools.
CI/CD Implementation: Configure and manage automated deployment and testing pipelines using GitHub Actions and other integrated tools.
Core Development: Write clean, efficient, and well-tested code in Python for automation scripts and infrastructure tooling.
API and Service Deployment: Design, develop, and deploy robust, high-performance Python APIs to serve machine learning predictions.
Monitoring and Observability: Implement comprehensive monitoring, logging, and alerting solutions.
Security and Compliance: Ensure platform integrity by implementing security best practices.
Collaboration and Mentorship: Work closely with data scientists, software engineers, and product managers; provide technical guidance and mentorship.
Requirements
5+ years of professional experience in DevOps, Cloud Engineering, or a related field.
At least 2 years specifically focusing on MLOps in a production environment.
Deep expertise in Python for development, scripting, and automation.
Proven experience building and deploying production-ready APIs and backend services using Python frameworks (e.g., FastAPI, Flask, or Django).
Strong proficiency in SQL.
Experience designing and optimizing schemas for relational/NoSQL databases and data warehouses (e.g., BigQuery, Cloud SQL).
Experience with data pipeline and workflow orchestration tools (Dagster, Airflow).
Expert-level knowledge of containerization technologies (Docker) and orchestration platforms (Kubernetes).
Hands-on experience with Kubernetes package management (e.g., Helm charts) and Kubernetes-native Infrastructure as Code tools (e.g., Crossplane).
Extensive hands-on experience designing and managing scalable services within the Google Cloud Platform (GCP) ecosystem.
Fluency in version control and collaboration workflows using GitHub.
Strong understanding of network architecture, security principles, and large-scale data processing technologies.
Excellent communication and problem-solving skills.
Benefits
Competitive salary and comprehensive benefits package.
Health insurance
Dental insurance
Vision insurance
Disability insurance
Life insurance
401(k)
Paid holidays
Paid time off
Employee assistance programs
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
MLOps
Python
SQL
API development
Data pipeline design
Containerization
Infrastructure as Code
CI/CD
Monitoring and observability
Security best practices