Salary
💰 $159,100 - $213,565 per year
Tech Stack
AWS, Azure, Cloud, Google Cloud Platform, Kubernetes, Python, Terraform
About the role
- Report to the Director of AI & ML in the Data Products group and collaborate to execute the vision for maturing AI/ML infrastructure
- Build and scale core AI and ML platforms that empower product teams and accelerate feature delivery
- Provide technical leadership across the full AI/ML stack, from model registry, CI/CD, and feature stores to LLM orchestration and observability
- Champion AI Trust & Safety by translating principles like clinical norms, fairness, and transparency into technical controls and guardrails
- Improve MLOps and LLMOps capabilities: establish monitoring for model performance, latency, and cost; define SLOs; build CI/CD pipelines
- Execute on strategy: break down large initiatives into clear, phased roadmaps and partner with product managers on prioritization and trade-offs
- Manage stakeholders across Product, Member Experience, and Clinical teams and communicate KPIs on platform health, adoption, and developer velocity
- Lead a high-performing team: foster psychological safety, hire and retain ML engineering talent, set clear goals and KPIs, and run performance reviews
- Coach and mentor engineers, using career ladders to create personalized development plans
- Hands-on player-coach: engage in technical details, guide architecture and AI safety decisions, and implement workflows using tools like LangGraph
Requirements
- 2-4+ years in a formal engineering management role, with direct experience leading teams of 4+ engineers
- History of productionizing successful AI/ML platforms and solutions
- 1+ years of experience iteratively building AI-empowered tools and ensuring they operate safely and at scale
- Hands-on experience with the modern AI stack, including orchestration frameworks like LangGraph, observability tools like LangSmith, and best practices for prompt engineering and building safety guardrails
- 5+ years of experience in software or machine learning engineering; background as a Senior MLE, SRE, or DevOps Engineer working on ML infrastructure
- Hands-on experience building, evaluating, and deploying machine learning models
- Strong understanding of cloud services (AWS, GCP, Azure), Kubernetes, IaC (Terraform), and CI/CD systems
- Proficient in Python
- Demonstrated ability to collaborate with product management and other cross-functional partners in an outcome-driven environment
- History of successfully delivering complex, multi-month technical projects
- Security and privacy awareness; previous experience in the medical or health records industry is preferred
- Ability to work in a hybrid role based in San Francisco, with an expectation of being in the office 2-3 days a week