Salary
💰 $126,000 - $196,000 per year
Tech Stack
Airflow, Apache, AWS, Azure, Cloud, Go, Google Cloud Platform, gRPC, Python, Ruby, Ruby on Rails, SaltStack, Scala, Spark, Terraform
About the role
- Design, build, and optimize ML pipelines, including data ingestion, feature engineering, training, and deployment for large-scale, real-time systems
- Improve and extend core ML Platform capabilities such as the feature store, model registry, and embedding-based retrieval services
- Collaborate with product software engineers to integrate ML models into user-facing features like recommendations, personalization, and AskAI
- Conduct model experimentation, A/B testing, and performance analysis to guide production deployment
- Optimize and refactor existing systems for performance, scalability, and reliability
- Ensure data accuracy, integrity, and quality through automated validation and monitoring
- Participate in code reviews and uphold engineering best practices
- Manage and maintain ML infrastructure in cloud environments, including deployment pipelines, security, and monitoring
Requirements
- 3+ years of experience as a professional software or machine learning engineer
- Proficiency in at least one key programming language (preferably Python or Golang; Scala or Ruby also considered)
- Hands-on experience building ML pipelines and working with distributed data processing frameworks like Apache Spark, Databricks, or similar
- Experience working with systems at scale and deploying to production environments
- Cloud experience (AWS, Azure, or GCP), including building, deploying, and optimizing solutions with ECS, EKS, or AWS Lambda
- Strong understanding of ML model trade-offs, scaling considerations, and performance optimization
- Bachelor’s degree in Computer Science or equivalent professional experience
- Experience with Languages: Python, Golang, Scala, Ruby on Rails
- Experience with Orchestration & Pipelines: Airflow, Databricks, Spark
- Experience with ML & AI: AWS SageMaker, embedding-based retrieval (Weaviate), feature store, model registry, model serving platforms, and LLM providers such as OpenAI, Anthropic, and Gemini
- Experience with APIs & Integration: HTTP APIs, gRPC
- Experience with Infrastructure & Cloud tools: AWS (Lambda, ECS, EKS, SQS, ElastiCache, CloudWatch), Datadog, Terraform
- Nice to have: experience with embedding-based retrieval, recommendation systems, ranking models, or large language model integration
- Nice to have: experience with feature stores, model serving & monitoring platforms, and experimentation systems
- Nice to have: familiarity with large-scale system design for ML
- Employees must have their primary residence in or near one of the listed cities (including surrounding metro areas): Atlanta, Austin, Boston, Dallas, Denver, Chicago, Houston, Jacksonville, Los Angeles, Miami, New York City, Phoenix, Portland, Sacramento, Salt Lake City, San Diego, San Francisco, Seattle, Washington D.C., Ottawa, Toronto, Vancouver, Mexico City