NVIDIA

Senior Deep Learning Software Engineer, TensorRT Performance

NVIDIA

full-time

Posted on:

Location Type: Hybrid

Location: Santa ClaraCaliforniaUnited States

Visit company website

Explore more

AI Apply
Apply

Salary

💰 $152,000 - $241,500 per year

Job Level

About the role

  • Establish groundbreaking performance benchmarking methodologies and analysis workflows and identify performance issues and opportunities for NVIDIA’s inference ecosystem (e.g. TensorRT/TensorRT-EdgeLLM/Torch-TensorRT)
  • Contribute features and code to NVIDIA/OSS inference frameworks including but not limited to TensorRT/TensorRT-EdgeLLM/Torch-TensorRT.
  • Develop new model pipelines for NVIDIA’s inference ecosystem with optimized performance including but not limited to areas like quantization, scheduling, memory management, and distributed inference to set the gold standard for Gen AI performance.
  • Work with cross-collaborative teams inside and outside of NVIDIA across generative AI, automotive, robotics, image understanding, and speech understanding to set directions and develop innovative inference solutions.
  • Scale performance of deep learning models across different architectures and types of NVIDIA accelerators.

Requirements

  • Bachelors, Masters, PhD, or equivalent experience in relevant fields (Computer Science, Computer Engineering, EECS, AI).
  • At least 3 years of relevant software development experience.
  • Strong C++, Python programming and software engineering skills
  • Experience with DL frameworks (e.g. PyTorch, JAX, TensorFlow, ONNX) and inference libraries (e.g. TensorRT, TensorRT-LLM, vLLM, SGLang, FlashInfer).
  • Experience with performance analysis and performance optimization
Benefits
  • equity
  • benefits 📊 Check your resume score for this job Improve your chances of getting an interview by checking your resume score before you apply. Check Resume Score
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
C++Pythondeep learning frameworksPyTorchJAXTensorFlowONNXTensorRTperformance analysisperformance optimization
Soft Skills
cross-collaborationcommunicationproblem-solvinginnovation
Certifications
BachelorsMastersPhD