Tech Stack
Airflow, AWS, Azure, Cloud, Google Cloud Platform, Terraform
About the role
- Architect and build the foundational ML infrastructure layer on top of the core data platform to power fleet-scale humanoid robotics.
- Define a long-term vision for ML infrastructure aligned with company goals and industry best practices.
- Deliver a roadmap for developing a level 2 MLOps platform, in consultation with researchers and perception engineers.
- Partner with data platform engineers to integrate ML workflow orchestration and tracking systems with existing data platform tooling.
- Design and implement ML development environments, including secure, scalable workspaces for interactive experimentation (e.g., JupyterHub).
- Develop core infrastructure including a model registry, feature store, and experiment tracking tooling.
- Define the CI/CD lifecycle for ML, enabling continuous retraining, automated testing, and seamless model delivery to production.
- Drive adoption of MLOps best practices: reproducibility, lineage, rollback, monitoring, and governance.
- Mentor junior engineers and influence the cloud platform organization’s roadmap.
- Lead greenfield, zero-to-one efforts to define company-wide ML infrastructure.
Requirements
- 8+ years of software engineering or ML infrastructure experience, with a demonstrated track record of building data platforms and MLOps pipelines.
- Expertise in modern data platform technologies.
- Significant experience with ML frameworks and orchestration tools (MLflow, WandB, Airflow, Kubeflow, etc.).
- Strong proficiency with cloud-native tooling (AWS, GCP, or Azure), containers, and IaC (e.g., CDK, Terraform).
- Experience working cross-functionally with ML, data, and platform teams.
- Ability to mentor junior engineers and influence broader platform roadmaps.
- Bonus: Experience with robotics, autonomous vehicles, drones, or embedded ML.