NVIDIA

DL Performance Software Engineer, LLM Inference

full-time

Origin: 🇺🇸 United States • California

Salary

💰 $120,000 - $235,750 per year

Job Level

Junior, Mid-Level

Tech Stack

Distributed Systems, Python, PyTorch

About the role

  • Write safe, scalable, modular, high-quality C++ and Python code for our core LLM inference backend.
  • Perform benchmarking, profiling, and system-level programming for GPU applications.
  • Provide code reviews, design docs, and tutorials to facilitate collaboration among the team.
  • Conduct unit tests and performance tests for different stages of the inference pipeline.
  • Collaborate with teams working on resource orchestration, distributed systems, inference engine optimization, and high-performance GPU kernels.
  • Architect and implement inference stacks to enable efficient, scalable, and accessible LLM inference.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, or a related technical field, or equivalent experience.
  • Strong coding skills in Python and C/C++.
  • 2+ years of industry experience in software engineering or equivalent research experience.
  • Knowledgeable and passionate about machine learning and performance engineering.
  • Proven project experience building software where performance is a core offering.
  • Solid fundamentals in machine learning, deep learning, operating systems, computer architecture and parallel programming.
  • Research experience in systems or machine learning.
  • Project experience in modern DL software such as PyTorch, CUDA, vLLM, SGLang, and TensorRT-LLM.
  • Experience with performance modeling, profiling, debugging, and code optimization, or architectural knowledge of CPUs and GPUs.
  • We strongly encourage including sample projects (e.g., on GitHub) that demonstrate the qualifications above.