NVIDIA

Senior Performance Engineer – LLM Inference Frameworks

Design, implement, and optimize high‑performance inference pipelines for large language models running on GPUs.

Posted 4/20/2026 • Full-time • Yokneam • 🇮🇱 Israel • Senior

Tech Stack

Tools & technologies
Python, PyTorch

About the role

Key responsibilities & impact
  • Design, implement, and optimize high‑performance inference pipelines for large language models running on GPUs
  • Profile and tune model execution across the stack, from scheduler design to kernel fusion and everything in between
  • Design and experiment with memory management strategies that improve memory bandwidth utilization and cache efficiency
  • Develop and implement cutting-edge techniques such as speculative decoding, context caching, and FP8/INT4 quantization to push the boundaries of tokens per second per watt
  • Develop and maintain benchmarking and testing systems that quantify latency, utilization, and efficiency

Requirements

What you’ll need
  • Bachelor's, Master's, or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
  • 5+ years of relevant software development experience
  • Excellent Python programming, software design, and software engineering skills
  • Experience working with deep learning frameworks and libraries such as PyTorch and Hugging Face
  • Experience profiling and debugging performance at all levels: Python runtime, PyTorch internals, and GPU utilization metrics
  • Awareness of the latest developments in LLM architectures and LLM inference techniques
  • Proactive and able to work without supervision
  • Excellent written and oral communication skills in English

Benefits

Comp & perks
  • Competitive salaries
  • Comprehensive benefits package

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, deep learning, inference pipelines, memory management, benchmarking, performance profiling, FP8 quantization, INT4 quantization, scheduler design, kernel fusion
Soft Skills
proactive, communication, independent work