
Technical Lead Manager, ML Platform
Whatnot
full-time
Location Type: Hybrid
Location: San Francisco • California • New York • United States
Salary
💰 $255,000 - $345,000 per year
About the role
- Own the infrastructure powering AI and ML models across critical business surfaces, supporting growth, recommendations, trust and safety, fraud, seller tooling, and more.
- Guide the prototyping, deployment, and productionization of novel ML architectures that directly shape user experience and marketplace dynamics.
- Help design and scale inference infrastructure capable of serving large models with low latency and high throughput.
- Oversee and evolve real-time feature pipelines that feed both our online and offline stores, ensuring single-second feedback from behavioral signals, high reliability, and model training fidelity.
- Drive feature platform improvements and expand scope to cover non-ML use cases such as fraud rules where point-in-time backtesting is also critical.
- Lead the development of distributed training and inference pipelines leveraging GPUs and both model and data parallelism.
- Optimize system performance by managing resource utilization and developing intelligent feature caching strategies.
- Empower scientists to iterate faster by building abstractions, APIs, and developer tools that simplify the development of near-real-time features and model iteration.
- Roll out ever-better ergonomics around model training and deployment.
- Stretch beyond your comfort zone to take on new technical challenges as we scale AI across Whatnot’s ecosystem.
Requirements
- 1+ years of TLM experience developing production machine learning systems at consumer-scale loads.
- Bachelor’s degree in Computer Science, Statistics, Applied Mathematics or a related technical field, or equivalent work experience.
- 5+ years of hands-on software engineering experience building and maintaining production systems for consumer-scale loads.
- 1+ years of professional experience developing software in Python.
- Ability to work autonomously and drive initiatives across multiple product areas and communicate findings with leadership and product teams.
- Experience with operational, search, and key-value databases such as PostgreSQL, DynamoDB, Elasticsearch, Redis.
- Experience working with ML-specific tools and frameworks such as MLflow, LitServe, TorchServe, and Triton.
- Firm grasp of visualization tools for monitoring and logging, e.g., Datadog and Grafana.
- Familiarity with cloud computing platforms and managed services such as AWS SageMaker, Lambda, Kinesis, S3, EC2, EKS/ECS, Apache Kafka, and Flink.
- Professionalism in collaborating in a remote working environment and in producing well-tested, reproducible work.
- Exceptional documentation and communication skills.
Benefits
- Generous Holiday and Time off Policy
- Health Insurance options including Medical, Dental, Vision
- Work From Home Support
  - Home office setup allowance
  - Monthly allowance for cell phone and internet
- Care benefits
  - Monthly allowance for wellness
  - Annual allowance towards childcare
  - Lifetime benefit for family planning, such as adoption or fertility expenses
- Retirement: 401(k) offering for Traditional and Roth accounts in the US (employer match up to 4% of base salary) and pension plans internationally
- Monthly allowance to dogfood the app
  - All Whatnauts are expected to develop a deep understanding of our product. We're passionate about building the best user experience, and all employees are expected to use Whatnot as both a buyer and a seller as part of their job (our dogfooding budget makes this fun and easy!).
- Parental Leave
  - 16 weeks of paid parental leave plus a one-month gradual return to work. Company leave allowances run concurrently with country leave requirements, which take precedence.