
Senior Machine Learning Engineer – DevOps
Roche
full-time
Location Type: Office
Location: New York City • California • New York • United States
Salary
💰 $141,100 - $262,100 per year
About the role
- Design, implement, and maintain scalable and reliable ML infrastructure on AWS.
- Automate deployment, monitoring, alerting, and operational tasks using tools like Terraform and Helm.
- Manage and optimize CI/CD pipelines and Git repositories for ML projects, ensuring efficient version control to support collaboration and deployment.
- Collaborate closely with ML engineers and data scientists to understand their infrastructure needs and provide solutions.
- Troubleshoot and resolve infrastructure-related issues in a timely manner.
- Implement and enforce security best practices for ML infrastructure.
- Document infrastructure designs, processes, and operational procedures.
- Contribute to initiatives both independently and as part of a team, delivering assigned outputs.
- Proactively identify issues and gaps, proposing ideas and suggestions for improvements.
Requirements
- BS/MS with 2-3 years of industry experience required.
- Proven experience in designing, deploying, and managing infrastructure on Amazon Web Services (AWS), including services such as EC2, S3, RDS, EKS, SageMaker, etc.
- Strong proficiency with Git and Git repository management.
- Hands-on experience with Terraform for infrastructure provisioning and management.
- Experience with Helm for deploying and managing applications on Kubernetes.
- Proficiency in scripting languages (e.g., Python, Bash) for automation.
- Excellent problem-solving skills and a strong ability to debug complex issues.
- Strong communication and interpersonal skills for collaborating effectively with cross-functional teams and handling user-facing interactions.
- Demonstrated ability to take initiative, anticipate needs, and drive projects to completion.
- Ability to thrive in a fast-paced environment and adapt to evolving requirements while adhering to corporate guidelines.
- Ability to write clean code that needs little feedback on syntax or conventions.
- Ability to apply software engineering best practices (automated linting, unit testing, documentation, CI/CD).
- Familiarity with modern machine learning methods.
- Knowledge of and experience with high-performance computing, distributed systems, and cloud computing.
Benefits
- A discretionary annual bonus may be available based on individual and Company performance.
- Benefits detailed at the link provided below.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
machine learning infrastructure, AWS, Terraform, Helm, CI/CD pipelines, Git, scripting languages, Python, Bash, high-performance computing
Soft Skills
problem-solving, communication, interpersonal skills, initiative, project management, adaptability, collaboration, debugging, attention to detail, proactive identification of issues
Certifications
BS, MS