NVIDIA

Senior Performance Engineer – LLM Inference Frameworks

Senior Performance Engineer on the TensorRT‑LLM team, optimizing LLM inference on NVIDIA GPUs by developing efficient pipelines and new techniques for high-performance infrastructure.

Posted 4/20/2026 • Full-time • Yokneam, 🇮🇱 Israel • Senior

Tech Stack

Tools & technologies
Python, PyTorch

About the role

Key responsibilities & impact
  • Design, implement, and optimize high‑performance inference pipelines for large language models running on GPUs
  • Profile and tune model execution across the stack, from scheduler design to kernel fusion and everything in between
  • Design and experiment with memory management strategies that improve memory bandwidth utilization and cache efficiency
  • Develop and implement cutting-edge techniques such as speculative decoding, context caching, and FP8/INT4 quantization to push the boundaries of tokens per second per watt
  • Develop and maintain benchmarking and testing systems that quantify latency, utilization, and efficiency

Requirements

What you’ll need
  • Bachelor's, Master's, or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
  • 5+ years of relevant software development experience
  • Excellent Python programming, software design, and software engineering skills
  • Experience working with deep learning frameworks and libraries such as PyTorch and Hugging Face
  • Experience profiling and debugging performance at all levels: Python runtime, PyTorch internals, and GPU utilization metrics
  • Awareness of the latest developments in LLM architectures and LLM inference techniques
  • Proactive and able to work without supervision
  • Excellent written and oral communication skills in English

Benefits

Comp & perks
  • Competitive salaries
  • Comprehensive benefits package

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, deep learning, inference pipelines, memory management, benchmarking, performance profiling, FP8 quantization, INT4 quantization, scheduler design, kernel fusion
Soft Skills
proactive, communication, independent work