Senior Performance Engineer – LLM Inference Frameworks
NVIDIA
The Senior Performance Engineer optimizes LLM inference on NVIDIA GPUs, developing efficient pipelines and new techniques on the TensorRT-LLM team for high-performance infrastructure.
Tech Stack
Tools & technologies: Python, PyTorch
About the role
Key responsibilities & impact
- Design, implement, and optimize high-performance inference pipelines for large language models running on GPUs
- Profile and tune model execution across the stack, from scheduler design to kernel fusion and everything in between
- Design and experiment with memory management strategies that improve memory bandwidth use and cache efficiency
- Develop and implement cutting-edge techniques such as speculative decoding, context caching, and FP8/INT4 quantization to push the boundaries of tokens per second per watt
- Develop and maintain benchmarking and testing systems that quantify latency, utilization, and efficiency
Requirements
What you’ll need
- Bachelor's, Master's, or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
- 5+ years of relevant software development experience
- Excellent Python programming, software design, and software engineering skills
- Experience working with deep learning frameworks like PyTorch and HuggingFace
- Experience profiling and debugging performance at all levels - Python runtime, PyTorch internals, and GPU utilization metrics
- Awareness of the latest developments in LLM architectures and LLM inference techniques
- Proactive and able to work without supervision
- Excellent written and oral communication skills in English
Benefits
Comp & perks
- Competitive salaries
- Comprehensive benefits package
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, deep learning, inference pipelines, memory management, benchmarking, performance profiling, FP8 quantization, INT4 quantization, scheduler design, kernel fusion
Soft Skills
proactive, communication, independent work