Tech Stack
Amazon Redshift, BigQuery, Python, PyTorch, Ray, Spark, SQL, TensorFlow
About the role
- Design, deploy, and scale ML systems that power real-time recommendations across millions of user journeys
- Build and deploy ML models serving 100M+ predictions per day to personalize user experiences at scale
- Enhance data processing pipelines (Spark, Beam, Dask) for efficiency and reliability
- Design ranking algorithms that balance relevance, diversity, and revenue
- Deliver real-time personalization with under 50 ms latency across key product surfaces
- Run statistically rigorous A/B tests to measure true business impact
- Optimize for latency, throughput, and cost efficiency in production
- Partner with product, engineering, and analytics teams to launch personalization features
- Implement monitoring systems and own model reliability in production
- Own modeling, feature engineering, data pipelines, and experimentation workflows
Requirements
- 5+ years building and scaling production ML systems with measurable business impact
- Experience deploying ML systems serving 100M+ predictions daily
- Strong background in ranking algorithms (collaborative filtering, learning-to-rank, deep learning)
- Proficiency with Python and ML frameworks (TensorFlow or PyTorch)
- Skilled with SQL and modern data warehouses (Snowflake, BigQuery, Redshift) plus data lakes
- Familiarity with distributed computing frameworks (Spark, Ray) and LLM/AI agent frameworks
- Track record of improving business KPIs via ML-powered personalization
- Experience with A/B testing platforms and experiment logging best practices
- Experience with experimentation infrastructure (MLflow, W&B)
- Experience with feature engineering, data pipelines, and model deployment at scale