
Senior Principal Machine Learning Engineer, vLLM Inference
Red Hat
full-time
Location Type: Remote
Location: California • Massachusetts • United States
Salary
💰 $206,600 - $351,050 per year
About the role
- Write robust Python and C++, working on vLLM systems, high performance machine learning primitives, performance analysis and modeling, and numerical methods.
- Contribute to the design, development, and testing of various inference optimization algorithms
- Participate in technical design discussions and provide innovative solutions to complex problems
- Act as a core contributor for the vLLM open-source project: reviewing PRs, authoring RFCs, and mentoring external contributors
- Mentor and guide other engineers on the team and foster a culture of continuous learning and innovation
Requirements
- Extensive experience in writing high performance code for GPUs and deep knowledge of GPU hardware
- Strong understanding of computer architecture, parallel processing, and distributed computing concepts
- Experience with tensor math libraries such as PyTorch
- Deep understanding of and experience with GPU performance optimization, including the ability to reason about memory-bandwidth-bound vs. compute-bound operations
- Experience optimizing kernels for deep neural networks
- Experience with NVIDIA Nsight is a plus
- Solid understanding of LLM Inference Optimization fundamentals: Continuous Batching, PagedAttention, Quantization, Speculative Decoding, Tensor Parallelism, etc.
- Strong communication skills with both technical and non-technical team members
- Experience optimizing for non-NVIDIA hardware (AMD ROCm, TPUs, etc.) is a plus
- BS or MS in computer science, computer engineering, or a related field
- A PhD in an ML-related domain is a plus
Benefits
- Comprehensive medical, dental, and vision coverage
- Flexible Spending Account - healthcare and dependent care
- Health Savings Account - high deductible medical plan
- Retirement 401(k) with employer match
- Paid time off and holidays
- Paid parental leave plans for all new parents
- Leave benefits including disability, paid family medical leave, and paid military leave
- Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, C++, high performance code, GPU performance optimizations, tensor math libraries, PyTorch, kernel optimization, LLM Inference Optimization, memory bandwidth, compute bound operations
Soft skills
strong communication skills, mentoring, team collaboration, problem solving, continuous learning, innovation
Certifications
BS in computer science, MS in computer science, PhD in ML related domain