Build and scale ML infrastructure: Design and maintain scalable, reliable, and efficient production pipelines for feature engineering, training, prediction, and model serving using tools including Airflow, BigQuery, and Kubeflow
Drive model performance: Train, validate, and deploy high-quality ML models, applying advanced techniques in feature selection, hyperparameter tuning, and model architecture choices to improve the accuracy of our products
Accelerate ML development: Optimize feature engineering pipelines for performance and scalability while collaborating with Data Science to research, develop, and deploy new features that improve model accuracy
Ensure reliability: Implement comprehensive model monitoring, automated training pipelines, and observability solutions to maintain model health and performance
Champion best practices: Apply CI/CD principles including automated testing, model validation, and deployment strategies
Requirements
2+ years building production ML systems at scale, including feature engineering, training, deployment, and monitoring
Strong proficiency in Python and ML frameworks (scikit-learn, PyTorch, XGBoost)
Hands-on experience with cloud ML platforms (AWS SageMaker, Vertex AI, or Azure ML)
Expertise in big data processing including SQL optimization and distributed computing (Spark/Dask)
Production experience with workflow orchestration tools (Airflow, Dagster, Prefect)
Proficiency with version control (Git) and CI/CD practices
Experience with real-time streaming data (Kafka, Flink, Pub/Sub)
Bachelor's degree in Computer Science, Statistics, or related field
Experience with MLOps tools (MLflow, Weights & Biases, etc.)
Benefits
equity
comprehensive medical and dental coverage
life and disability benefits
401(k) plan
flexible time off
paid parental leave