
Machine Learning Engineer, Platform Integrations
TwelveLabs
Full-time
Location Type: Hybrid
Location: San Francisco • California • United States
Salary
💰 $225,000 - $325,000 per year
About the role
- Optimize TwelveLabs' video foundation models for deployment on model inference platforms across public clouds (AWS, Azure, GCP, OCI) and data platforms (Databricks, Snowflake)
- Conduct experiments to benchmark and optimize model performance across inference stacks — measuring latency, throughput, and cost across different accelerator and serving configurations
- Collaborate with platform partner engineering teams as a peer to resolve inference-level technical challenges and inform how their infrastructure evolves to support multimodal workloads
- Work closely with TwelveLabs' core ML research team to ensure model architecture decisions account for multi-platform deployment requirements
Requirements
- 8+ years building ML systems in production, with deep experience in model serving, inference optimization, capacity planning, and GPU compute
- Deep understanding of the full model inference stack — from model weights and tensor operations through serving runtimes to accelerator hardware
- Designed production services using Python, Postgres, FastAPI, SQLAlchemy, Pydantic (and friends)
- Strong hands-on experience with cloud infrastructure (AWS, GCP, or Azure), Docker, Kubernetes, and distributed systems in real-world environments, specifically for ML inference and model hosting
- Defined technical roadmap and prioritization for large, ambiguous, cross-functional projects, driving high-impact technical decisions
Benefits
- Full health, dental, and vision benefits
- Extremely flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.
- Visa support where applicable
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
model serving, inference optimization, capacity planning, GPU compute, Python, Postgres, FastAPI, SQLAlchemy, Pydantic, distributed systems
Soft Skills
collaboration, problem-solving, technical decision-making, project prioritization