NVIDIA

Senior Deep Learning Architect, LLM Inference

full-time

Origin: 🇺🇸 United States • California

Salary

💰 $184,000 - $356,500 per year

Job Level

Senior

Tech Stack

D3.js, JavaScript, Python, PyTorch

About the role

  • Characterize the latest LLMs and inference servers like vLLM and SGLang to ensure TRT-LLM maintains leadership
  • Build engaging content with performance marketing (blog posts and written materials) highlighting TRT-LLM achievements
  • Collaborate with engineers from AI startups to debug and establish standard methodologies
  • Profile GPU kernel-level performance to identify hardware and software optimization opportunities
  • Develop profiling and analysis software tools to keep up with rapid network scaling
  • Contribute to deep learning software projects (PyTorch, TRT-LLM, vLLM, SGLang)
  • Verify TRT-LLM performance for new GPU product launches
  • Collaborate across software, research, and product teams to guide direction of inference serving

Requirements

  • Master's or PhD degree in Computer Science, Computer Engineering, or related fields, or equivalent experience
  • 6+ years of relevant industry experience
  • Detailed knowledge of deep learning inference serving, PyTorch programming, profiling, and compiler optimizations
  • Proficiency in Python and C++ programming languages and familiarity with CUDA
  • Experience with LLMs and their performance challenges and opportunities
  • Solid understanding of CPU and GPU microarchitecture and performance characteristics
  • Experience with complex software projects like frameworks, compilers, or operating systems
  • Good written and verbal communication skills and the ability to work independently and collaboratively in a fast-paced environment
  • (Ways to stand out) Drive to continuously improve software and hardware performance
  • (Nice to have) Examples of novel use cases for agentic AI tools in the workplace
  • (Nice to have) Experience with database and visualization tools like D3.js
NVIDIA

Senior Math Libraries Engineer - Sparsity in AI

NVIDIA
Senior
full-time
$184k–$357k / year
California · 🇺🇸 United States
Posted: 37 days ago
Source: nvidia.wd5.myworkdayjobs.com
Python, PyTorch
NVIDIA

Senior Software Engineer, cuBLASDx and cuSolverDx

NVIDIA
Senior
full-time
$148k–$288k / year
California · 🇺🇸 United States
Posted: 5 days ago
Source: nvidia.wd5.myworkdayjobs.com
Python, PyTorch
NVIDIA

Senior Software Engineer – cuBLASDx and cuSolverDx

NVIDIA
Senior
full-time
🇺🇸 United States
Posted: 1 day ago
Source: nvidia.wd5.myworkdayjobs.com
Python, PyTorch
Precision Neuroscience

Staff Software Engineer

Precision Neuroscience
Lead
full-time
$200k–$210k / year
California, Illinois, New York · 🇺🇸 United States
Posted: 3 days ago
Source: precisionneuro.pinpointhq.com
Python, PyTorch, Rust, TensorFlow
NVIDIA

Solutions Architect – Higher Education and Research

NVIDIA
Mid · Senior
full-time
$148k–$288k / year
Illinois, Ohio · 🇺🇸 United States
Posted: 1 day ago
Source: nvidia.wd5.myworkdayjobs.com
Python, PyTorch, TensorFlow