About the role
Key responsibilities & impact
- Drive innovation in model serving and inference architectures for advanced AI systems
- Design and deploy state-of-the-art model serving architectures that deliver high throughput and low latency
- Ensure pipelines run efficiently across diverse environments
- Establish clear performance targets
- Build, run, and monitor controlled inference tests
- Identify and prepare high-quality test datasets and simulation scenarios
- Analyze computational efficiency and diagnose bottlenecks in the serving pipeline
- Work closely with cross-functional teams to integrate optimized serving and inference frameworks into production pipelines
Requirements
What you’ll need
- A degree in Computer Science or a related field
- Ideally a PhD in NLP, Machine Learning, or a related field
- Must have knowledge of Metal Shading Language (MSL)
- Proven experience in low-level kernel optimizations and inference optimization on mobile devices
- A deep understanding of modern model serving architectures and inference optimization techniques
- Strong expertise in writing GPU kernels for mobile devices
- Practical experience in developing and deploying end-to-end inference pipelines
- Demonstrated ability to apply empirical research to overcome challenges in model serving
- Experience with distributed inference systems, including designing and optimizing high-performance inference engines
Benefits
Comp & perks
- Work remotely from anywhere in the world
- Opportunity to collaborate with a global team
- Professional development opportunities to hone your skills
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- Metal Shading Language
- low-level kernel optimizations
- inference optimization
- GPU kernels
- end-to-end inference pipelines
- model serving architectures
- inference optimization techniques
- computational efficiency analysis
- performance targets establishment
- controlled inference tests
Soft Skills
- innovation
- cross-functional collaboration
- problem-solving
- analytical skills
- communication
Certifications
- PhD in NLP
- PhD in Machine Learning
