Tech Stack
Tools & technologies
- Python
- PyTorch
About the role
Key responsibilities & impact
- Research and implement techniques for LLM inference and LLM optimizations.
- Conduct experiments to evaluate the impact of optimization methods on model accuracy, latency, and throughput.
- Collaborate with researchers and engineers to integrate optimizations into real-world machine learning workflows.
- Document findings and contribute to technical reports, blog posts, or research publications.
Requirements
What you’ll need
- Currently pursuing a Ph.D. degree in Computer Science, Electrical Engineering, Machine Learning, or a related field
- Strong programming skills in C++, CUDA, and Python
- Experience with tensor math libraries such as PyTorch
- Familiarity with AI model optimization techniques such as quantization (e.g., INT4, FP8), pruning, and knowledge distillation
- Deep understanding of, and experience with, GPU performance optimization
- Excellent knowledge of large language model architectures
- Strong analytical and problem-solving skills
- Excellent communication skills and ability to work in a team-oriented research environment
- Background in efficient inference techniques for large-scale language models or computer vision models
- Prior experience contributing to open-source ML frameworks or research publications
- One or more co-authored papers at a top-tier conference such as NeurIPS, ICLR, ACL, CVPR, or MLSys is a big plus
Benefits
Comp & perks
- Competitive stipend
- Mentorship from leading experts in machine learning and model efficiency
- Opportunity to contribute to research papers, patents, or open-source projects
- Flexible work arrangements
