NVIDIA

Senior DL Algorithms Engineer – Inference Performance

Full-time

Location Type: Remote

Location: California, United States

Salary

💰 $184,000 - $356,500 per year

About the role

  • Implement language and multimodal model inference as part of NVIDIA Inference Microservices (NIMs).
  • Contribute new features, fix bugs and deliver production code to TRT-LLM, NVIDIA’s open-source inference serving library.
  • Profile and analyze bottlenecks across the full inference stack to push the boundaries of inference performance.
  • Benchmark state-of-the-art inference offerings across a range of DL models and perform competitive analysis of the NVIDIA SW/HW stack.
  • Collaborate heavily with other SW/HW co-design teams to enable the creation of the next generation of AI-powered services.

Requirements

  • PhD in CS, EE or CSEE or equivalent experience.
  • 5+ years of experience.
  • Strong background in deep learning and neural networks, with a particular focus on inference.
  • Experience with performance profiling, analysis and optimization, especially for GPU-based applications.
  • Proficient in C++, PyTorch or equivalent frameworks.
  • Deep understanding of computer architecture, and familiarity with the fundamentals of GPU architecture.
  • Proven experience with processor and system-level performance optimization.
  • Deep understanding of modern LLM architectures.
  • Strong fundamentals in algorithms.
  • GPU programming experience (CUDA or OpenCL) is a plus.
Benefits

  • Equity
  • Benefits
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
C++, PyTorch, deep learning, neural networks, performance profiling, performance optimization, GPU programming, CUDA, OpenCL, algorithms
Certifications
PhD in CS, PhD in EE, PhD in CSEE