Salary
💰 $135,482 - $227,700 per year
Tech Stack
Cloud, Docker, Go, IoT, Java, Kubernetes, Python, PyTorch, Ray, Scala, Spark, TensorFlow
About the role
- Design and implement scalable machine learning infrastructure using Ray to support model training, deployment, and inference at scale.
- Leverage Kubernetes for orchestration of containerized applications, ensuring seamless deployment, scaling, and management of ML models and associated services.
- Develop and maintain CI/CD pipelines for automated testing, deployment, and management of ML applications and infrastructure.
- Implement robust monitoring, logging, and alerting systems to ensure high availability, performance, and security of the ML platform.
- Collaborate with data scientists and ML engineers to optimize data pipelines and model performance.
- Provide DevOps/SRE support for the ML platform, including incident response, performance tuning, and disaster recovery planning.
- Stay abreast of the latest advancements in machine learning technologies and infrastructure, and advocate for adoption of best practices and new technologies within the team.
- Work closely with engineering teams across ML, full-stack, and firmware, as well as cross-functional partners, to deliver core infrastructure, services, and optimizations.
- Champion and embed Samsara’s cultural principles (Focus on Customer Success, Build for the Long Term, Adopt a Growth Mindset, Be Inclusive, Win as a Team).
Requirements
- BS or MS in Computer Science or another relevant field.
- 6+ years of experience as a Machine Learning Engineer, Applied Scientist, or similar role.
- Strong proficiency in one or more common languages (e.g., C++, Golang, Java, Python, Scala).
- Proficiency with common ML tools (e.g., Spark, TensorFlow, PyTorch).
- Experience deploying and iteratively refining models using customer feedback loops.
- Comfortable working in full-stack / backend code to build a strong understanding of underlying data structures and other dependencies.
- This is a remote position open to candidates residing in the US.
- (Preferred) Ph.D. in Computer Science or a quantitative discipline (e.g., Applied Math, Physics, Statistics).
- (Preferred) Experience building, deploying, and optimizing ML models on the edge.
- (Preferred) Experience building end-to-end ML applications from scratch.
- (Preferred) Expertise in optimizing distributed model training with GPUs.