Salary
$200,000 - $322,000 per year
Tech Stack
Cloud, Docker, Kubernetes, OpenStack
About the role
- Develop automation that runs reliable AI infrastructure services at scale (close to bare metal and over VMaaS)
- Develop and manage one or more teams to ensure internal and external cloud services atop accelerated computing hardware run reliably
- Recruit and retain talent and manage career development for your organization
- Be accountable for deliverables of team(s) in scope and for cross-team and cross-company communications
- Participate in KPI-driven strategic planning and foster a collaborative environment
Requirements
- 7+ years of overall experience
- BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics) or equivalent experience
- 3+ years of management experience with prior hands-on experience as an individual contributor
- Proven track record of impactful project deliveries while managing Software Engineers focused on cloud infrastructure or cloud application services
- Experience with DevOps and/or SRE practices and/or Platform Engineering
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
- Nice-to-haves: Developing ML/AI infrastructure; Developing bare metal as a service (BMaaS) systems; Developing multi-cloud infrastructure services; Teaching reliability or cloud systems practices; Running private/public cloud systems using Kubernetes, OpenStack, NVIDIA BCM, Docker, or Slurm