About the role
- Fine-tune state-of-the-art models, design evaluation frameworks, and bring AI features into production
- Run and manage open-source models efficiently, optimizing for cost and reliability
- Ensure high performance and stability across GPU, CPU, and memory resources
- Monitor and troubleshoot model inference to maintain low latency and high throughput
- Collaborate with engineers to implement scalable and reliable model serving solutions
- Work closely with regional teams across product, engineering, operations, infrastructure and data to build and scale impactful AI solutions
- Participate in prototyping, testing, and iterating on AI features in a hybrid global team environment
Requirements
- Experience with model serving platforms such as vLLM or Hugging Face TGI
- Proficiency in GPU orchestration using tools like Kubernetes, Ray, Modal, RunPod, or Lambda Labs
- Ability to monitor latency and costs, and to scale systems efficiently as traffic demands change
- Experience setting up inference endpoints for backend engineers
- Experience with monitoring and troubleshooting model inference to maintain low latency and high throughput
- Experience running and managing open-source models efficiently, optimizing for cost and reliability
Benefits
- Flat structure & real ownership
- Full involvement in setting direction and in consensus-based decision-making
- Flexible work arrangements
- High-impact role with visibility across product, data, and engineering
- Top-of-market compensation and performance-based bonuses
- Global exposure to product development
- Lots of perks: housing rental subsidies, a quality company cafeteria, and overtime meals
- Health, dental & vision insurance
- Global travel insurance (for you & your dependents)
- Unlimited, flexible time off
ATS Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
model fine-tuning, model evaluation frameworks, AI feature production, open-source model management, GPU orchestration, latency monitoring, inference endpoint setup, model troubleshooting, cost optimization, high throughput maintenance
Soft skills
collaboration, communication, problem-solving, teamwork, adaptability