
Manager, Data – AI Platform Engineering
Stitch Fix
full-time
Location Type: Remote
Location: United States
Salary
💰 $146,300 - $195,000 per year
About the role
- Lead, in a player-coach capacity, the execution of Stitch Fix’s next-gen Data, ML, and GenAI platforms.
- Contribute to the modernization of data and ML foundations to support unified signals, adaptive models, experimentation velocity, and scalable AI/ML workloads.
- Provide foundational APIs, SDKs, frameworks, and self-service tools that make it easy for data scientists, ML engineers, analysts, and application teams to build and deploy AI solutions quickly, safely, and at scale.
- Partner with Data Science, Engineering, and Product teams to translate Data/ML/GenAI platform capabilities into production-grade features and intelligent experiences that deliver measurable business value.
- Drive responsible AI and data adoption by creating reusable templates, documentation, and enablement programs.
- Contribute to improving governance practices, including data contracts, lineage, metric definitions, access policies, and responsible AI guardrails, to support trust, safety, and compliance.
- Ensure operational excellence through platform reliability, performance, observability, cost efficiency, and simplification of legacy systems.
- Lead and develop high-performing engineering teams, fostering a culture of clarity, excellence, and trust.
- Balance speed of innovation with platform stability, ensuring engineering efforts are tightly aligned to business priorities and long-term client value.
Requirements
- 5+ years of experience in software, data, ML, or platform engineering; 1+ years of leading individual-contributor engineers is a plus.
- Demonstrated success contributing to large-scale data platforms, ML platforms, or AI/GenAI platforms in cloud environments.
- Experience delivering platform modernization, unification, and multi-year architectural transformation.
- Strong software engineering foundation, with experience designing and building large-scale distributed systems and resilient, high-quality APIs and services using modern programming languages and cloud-native architectures.
- Track record operating and evolving modern data infrastructure, including some of the following: distributed compute and storage technologies (Spark, Trino, Iceberg), real-time processing frameworks (Kafka/Flink), metadata / catalog systems, and Kubernetes-based orchestration.
- Expertise across the ML lifecycle: feature engineering, training pipelines, model deployment and serving, monitoring, validation, fine-tuning, and MLOps best practices.
- Proven capability in building self-service platform abstractions and tooling that enable teams to develop, experiment, and deploy data and ML products efficiently.
- Experience with modern GenAI architectures: semantic retrieval, knowledge-grounded indexing, LLM orchestration, agent workflows, and evaluation frameworks.
- Familiarity with modern ML frameworks like PyTorch and Ray is a plus.
- Strategic thinker able to align platform investments with business priorities and emerging AI opportunities.
- Potential to be a strong people leader, with a track record of contributing to inclusive, high-performing engineering teams.
- Excellent communicator who can influence both technical and business stakeholders across domains.