Salary
💰 $189,600 - $312,730 per year
Tech Stack
Ansible, AWS, Azure, Cloud, Cyber Security, Google Cloud Platform, Jenkins, Kubernetes, OpenShift, Python, Terraform
About the role
- Build and release the Red Hat AI Inference runtimes and continuously improve processes and tooling used by the DevOps team
- Work closely with product and research teams to scale state-of-the-art (SOTA) deep learning products and software for enterprise deployments
- Create and manage model training and deployment pipelines
- Create DevOps and CI/CD infrastructure and scale the current technology stack
- Actively contribute to managing and releasing upstream and midstream product builds
- Test to ensure correctness, responsiveness, and efficiency
- Troubleshoot, debug, and upgrade development and test pipelines
- Identify and deploy cybersecurity measures via continuous vulnerability assessment and risk management
- Collaborate with cross-functional teams on market requirements and best practices
- Keep abreast of the latest technologies and standards in the field
Requirements
- 2+ years of experience in MLOps, DevOps, automation, and modern software deployment practices
- Strong experience with Git, GitHub Actions (including self-hosted runners), Terraform, Jenkins, Ansible, and common automation and monitoring technologies
- Extensive experience administering Kubernetes/OpenShift
- Familiar with Agile development methodology
- Experience with Cloud Computing using at least one of the following: AWS, GCP, Azure, or IBM Cloud
- Solid programming skills, especially in Python
- Solid troubleshooting skills
- Experience maintaining infrastructure and ensuring its stability
- Ability to interact comfortably with members of a large, geographically dispersed team
- Familiarity with contributing to the vLLM CI community is a big plus
- While a Bachelor’s degree or higher in computer science, mathematics, or a related discipline is valued, technical prowess, initiative, problem-solving, and practical experience are prioritized