NVIDIA

Senior Software Engineer – NIM Factory Infrastructure

full-time

Location Type: Remote

Location: California, New York, United States

Salary

💰 $148,000 - $287,500 per year

About the role

  • Develop, analyze, and optimize factory infrastructure that takes in an AI model and produces a deployable service validated across cloud, on-prem, and Kubernetes environments.
  • With the team, define and deliver rapid iterations on the group's technical strategies and roadmaps to deliver and improve the NIM factory.
  • You will develop test harnesses, automate hardware acceptance, analyze benchmarks, and gather data for statistical analysis of system health and performance analysis of NIMs.
  • Work with technical leaders to design and develop scalable and reliable factory acceptance and performance tuning of hardware platforms.
  • You will collaborate with multiple AI model teams to understand their requirements to build an efficient infrastructure that improves every team's productivity.
  • Define metrics and drive improvements based on user feedback.
  • You will mentor and collaborate throughout the team and with other teams to grow your colleagues and yourself.

Requirements

  • A history of using your advanced programming skills to build tooling and automation for hardware system characterization and benchmarking.
  • Proven experience debugging and analyzing the performance of compute applications and systems.
  • Deep technical expertise working with system software and platform layers, including the kernel, device drivers, memory, storage, networking, and PCIe devices.
  • Experience working with hardware clusters, distributed systems, networking, GPU interconnects (PCIe, NVLink), and node and cluster interconnects (InfiniBand).
  • Passion for building platform engineering components and automation of system benchmarking and characterization.
  • Excellent interpersonal skills and the ability to lead multi-functional efforts.
  • BS or MS in Computer Science, Computer Engineering, or a related field (or equivalent experience).
  • 5+ years of demonstrated experience in roles developing performant microservices, cloud software, and/or tooling.

Benefits

  • equity
  • benefits