Roche

Senior Machine Learning Engineer – DevOps

full-time

Location Type: Office

Location: New York City, California, New York, United States

Salary

💰 $141,100 - $262,100 per year

About the role

  • Design, implement, and maintain scalable and reliable ML infrastructure on AWS.
  • Automate deployment, monitoring, alerting, and operational tasks using tools like Terraform and Helm.
  • Manage and optimize CI/CD pipelines and Git repositories for ML projects, ensuring efficient version control to support collaboration and deployment.
  • Collaborate closely with ML engineers and data scientists to understand their infrastructure needs and provide solutions.
  • Troubleshoot and resolve infrastructure-related issues in a timely manner.
  • Implement and enforce security best practices for ML infrastructure.
  • Document infrastructure designs, processes, and operational procedures.
  • Contribute to team initiatives with a high degree of independence, delivering assigned outputs.
  • Proactively identify issues and gaps, proposing ideas and suggestions for improvements.

Requirements

  • BS/MS with 2-3 years of industry experience required.
  • Proven experience in designing, deploying, and managing infrastructure on Amazon Web Services (AWS), including services such as EC2, S3, RDS, EKS, SageMaker, etc.
  • Strong proficiency with Git and Git repository management.
  • Hands-on experience with Terraform for infrastructure provisioning and management.
  • Experience with Helm for deploying and managing applications on Kubernetes.
  • Proficiency in scripting languages (e.g., Python, Bash) for automation.
  • Excellent problem-solving skills and a strong ability to debug complex issues.
  • Strong communication and interpersonal skills for collaborating effectively with cross-functional teams and handling user-facing interactions.
  • Demonstrated ability to take initiative, anticipate needs, and drive projects to completion.
  • Ability to thrive in a fast-paced environment and adapt to evolving requirements while adhering to corporate guidelines.
  • Ability to write clean code that needs little feedback on syntax or conventions.
  • Applies software engineering best practices (linting automation, unit testing, documentation, CI/CD).
  • Familiarity with modern machine learning methods.
  • Knowledge of and experience with high-performance computing, distributed systems, and cloud computing.

Benefits

  • A discretionary annual bonus may be available based on individual and Company performance.
  • Benefits detailed at the link provided below.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
machine learning infrastructure, AWS, Terraform, Helm, CI/CD pipelines, Git, scripting languages, Python, Bash, high-performance computing
Soft Skills
problem-solving, communication, interpersonal skills, initiative, project management, adaptability, collaboration, debugging, attention to detail, proactive identification of issues
Certifications
BS, MS