AI Inference Intern
Perplexity
AI Inference Internship for exceptional Master’s or PhD students at Perplexity, an AI startup. Gain hands-on experience in performance optimization and inference technologies.
Tech Stack
Tools & technologies
- Distributed Systems
About the role
Key responsibilities & impact
- Work with the inference team to improve serving latency and throughput
- Bring up support for new models and state-of-the-art inference optimizations or quantization schemes
- Optimize inference across the entire stack, from GPU kernels to serving endpoints
Requirements
What you’ll need
- Strong engineering track record with proven knowledge of fundamentals and programming languages (multi-threaded programming, networking, compilation, systems programming, etc.)
- Pursuing a Master's or PhD in Computer Science with a focus on performance-related subjects (HPC, Compilers, Distributed Systems)
- Experience with ML frameworks (Torch, JAX)
- Experience with GPU programming (CUDA, Triton)
- Experience with High-Performance Computing (OpenMPI)
Benefits
Comp & perks
- Unfortunately we cannot provide housing.
- Unfortunately we cannot provide health insurance for interns. Full-time employees receive full health insurance and benefits.
- There is no limit on the number of full-time offers. All outstanding performers will be given a full-time offer!
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- multi-threaded programming
- networking
- compilation
- systems programming
- inference optimizations
- quantization schemes
- GPU programming
- CUDA
- Triton
- High-Performance Computing
Certifications
- Master's in Computer Science
- PhD in Computer Science