NVIDIA

Software Engineering Manager

full-time

Origin: 🇺🇸 United States • California, Washington

Salary

💰 $224,000 - $356,500 per year

Job Level

Senior, Lead

Tech Stack

Cloud, Distributed Systems, Node.js

About the role

  • We are seeking a Software Engineering Manager to lead the engineering team for Dynamo, NVIDIA’s high-performance, low-latency inference platform for serving generative AI and reasoning workloads at scale.
  • The team accelerates deployment of cutting-edge models across diverse engines and architectures, enabling breakthroughs from real-time LLM serving to complex multi-GPU, multi-node pipelines.
  • The ideal candidate is strong in software development, can design and build fault-tolerant distributed systems, and can implement a well-thought-out long-term maintenance strategy.
  • What you'll be doing: Mentor, grow, and develop the Dynamo engineering team, and take responsibility for planning and executing projects and workflows. Work across several teams and orgs to build platforms that use the latest developments in LLM inference. Collaborate with research and development teams and serve a large user base (software teams both internal and external to NVIDIA). Align priorities across collaborators and define metrics for measuring the success of the product and team. Stay current with trends in AI, ML, and infrastructure, and proactively seek opportunities to integrate advancements into NVIDIA's LLM and AI infrastructure solutions.
  • Ways to stand out from the crowd: Strong technical background in cloud/distributed systems. Experience working in a globally distributed organization. Good knowledge of CPU and/or GPU hardware architecture. Background in developing LLM inference systems. Experience with LLM frameworks like vLLM & TRT-LLM.
  • NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most expert and passionate people in the world working for us. …

Requirements

  • Master's or PhD, or equivalent experience, in Computer Science, Computer Architecture, or a related field
  • 10+ years of overall experience in developing large distributed systems
  • 2+ years of experience managing AI and software development teams
  • Experience in developing and maintaining LLM or GenAI infrastructure
  • Excellent communication, collaboration and problem-solving skills, with a dedication to encouraging an inclusive and diverse workplace