Red Hat

Machine Learning Engineer, Distributed vLLM

Contribute to the design, development, and testing of new features and solutions for Red Hat AI Inference.

Posted 5/7/2026 • full-time • Boston, Massachusetts, 🇺🇸 United States • Mid-Level, Senior • 💰 $136,320 - $225,090 per year

Tech Stack

Tools & technologies
Cloud, Go, gRPC, Kubernetes, Open Source, Python, Rust

About the role

Key responsibilities & impact
  • Contribute to the design, development, and testing of new features and solutions for Red Hat AI Inference
  • Innovate in the inference domain by participating in upstream communities
  • Develop and maintain distributed inference infrastructure leveraging Kubernetes APIs, operators, and the Gateway API Inference Extension for scalable LLM deployments.
  • Develop and maintain system components in Go and/or Rust to integrate with the vLLM project and manage distributed inference workloads.
  • Develop and maintain KV cache-aware routing and scoring algorithms to optimize memory utilization and request distribution in large-scale inference deployments.
  • Enhance the resource utilization, fault tolerance, and stability of the inference stack.
  • Develop and test various inference optimization algorithms.
  • Actively participate in technical design discussions
  • Contribute to a culture of continuous improvement by sharing recommendations and technical knowledge with team members
  • Collaborate with other engineering and cross-functional teams to meet engineering deliverables
  • Communicate effectively with team members to ensure proper visibility of development efforts
  • Be taught, coached, and mentored by senior members of the team
  • Provide timely and constructive code reviews

Requirements

What you’ll need
  • Strong proficiency in Python and/or Go, or a similar language
  • Experience with cloud-native Kubernetes service mesh technologies/stacks such as Istio, Cilium, Envoy (WASM filters), and CNI.
  • Working understanding of Layer 7 networking, HTTP/2, gRPC, and the fundamentals of API gateways and reverse proxies.
  • Knowledge of serving runtime technologies for hosting LLMs, such as vLLM, SGLang, TensorRT-LLM, etc.
  • Excellent written and verbal communication skills, capable of interacting effectively with both technical and non-technical team members.
  • Ability to work independently in a dynamic, fast-paced environment
  • Proficiency in C, C++, or Rust is considered a plus
  • Experience with the Kubernetes ecosystem, including core concepts, custom APIs, operators, and the Gateway API Inference Extension for GenAI workloads, is a plus.
  • Working knowledge of high-performance networking protocols and technologies including UCX, RoCE, InfiniBand, and RDMA is a plus.
  • Experience with GPU performance benchmarking and profiling tools like NVIDIA Nsight or distributed tracing libraries/techniques like OpenTelemetry is a plus.
  • Experience in writing high performance code for GPUs and deep knowledge of GPU hardware is a plus.
  • Strong understanding of computer architecture, parallel processing, and distributed computing concepts is a plus.
  • Bachelor's degree in computer science or related field is an advantage, though we prioritize hands-on experience.
  • Active engagement in the ML research community (publications, conference participation, or open source contributions) is a significant advantage

Benefits

Comp & perks
  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, GoLang, Kubernetes, vLLM, SGLang, TensorRT-LLM, C, C++, Rust, GPU performance benchmarking
Soft Skills
written communication, verbal communication, independent work, collaboration, continuous improvement, technical design discussions, mentorship, code reviews
Certifications
Bachelor's degree in computer science