
Senior Software Engineer – TensorRT, Edge-LLM
NVIDIA
full-time
Location Type: Hybrid
Location: Santa Clara • California • Texas • United States
Salary
💰 $152,000 - $287,500 per year
About the role
- Develop and evolve a state-of-the-art inference framework in modern C++ that extends TensorRT with autoregressive model serving capabilities, including speculative decoding, LoRA, MoE, and KV cache management.
- Design and implement compiler and runtime optimizations tailored for transformer-based models running on constrained, real-time platforms.
- Collaborate with teams across CUDA, kernel libraries, compilers, and robotics to deliver high-performance, production-ready solutions.
- Contribute to CUDA kernel and operator development for critical transformer components such as attention, GEMM, and MoE.
- Benchmark, profile, and optimize inference performance across diverse embedded and automotive environments.
- Stay ahead of the rapidly evolving LLM/VLM ecosystem and bring emerging techniques into product-grade software.
Requirements
- BS, MS, PhD, or equivalent experience in Computer Science, Electrical/Computer Engineering, or a closely related field.
- 4+ years of relevant software development experience.
- Deep understanding of transformer models and inference optimization techniques (e.g., quantization, tensor parallelism, or memory-efficient scheduling).
- Proficiency in modern C++ (C++11/14/17 and beyond).
- Familiarity with popular LLM frameworks and libraries such as TensorRT, TensorRT-LLM, vLLM, SGLang, MLC-LLM, or FlashInfer.
- A track record of strong software design, execution, and collaboration across fields.
Benefits
- Eligible for equity and benefits.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
C++, TensorRT, transformer models, inference optimization, quantization, tensor parallelism, memory-efficient scheduling, compiler optimizations, runtime optimizations, benchmarking
Soft Skills
collaboration, software design, execution
Certifications
BS in Computer Science, MS in Computer Science, PhD in Computer Science, BS in Electrical/Computer Engineering, MS in Electrical/Computer Engineering, PhD in Electrical/Computer Engineering