Salary
💰 $200,000 - $250,000 per year
Tech Stack
Airflow, AWS, Cloud, Distributed Systems, Docker, JavaScript, Kubernetes, Python, PyTorch, Ray, TensorFlow, TypeScript
About the role
- Design and build scalable ML infrastructure including model serving systems, feature stores, and training pipelines
- Develop and maintain robust data pipelines for ML workflows, from data ingestion to model deployment
- Implement MLOps best practices including CI/CD for ML models, monitoring, and automated retraining pipelines
- Build and optimize model serving infrastructure to support real-time and batch inference at scale
- Collaborate with ML engineers and data scientists to understand infrastructure requirements and translate them into reliable systems
- Monitor and optimize ML system performance, reliability, and cost efficiency
- Establish infrastructure standards and tooling to accelerate ML development workflows
- Own the deployment and operational aspects of ML models in production
Requirements
- 5+ years of relevant experience building and maintaining data/ML infrastructure systems
- Strong programming skills in Python or TypeScript and experience with infrastructure-as-code tools
- Experience with ML infrastructure components such as feature stores, model registries, and serving systems
- Solid understanding of distributed systems, containerization (Docker/Kubernetes), and cloud platforms
- Experience with data pipeline orchestration tools (e.g., Airflow, Prefect, or similar)
- Independent and autonomous. We're too small to micromanage, and we expect every person at the company to own their work and be able to lead
- Hold yourself and others to a high standard when working on production systems
- Enjoy collaborating with a diverse group of stakeholders while bringing your own experience and background to the team