Principal Developer, AI Networking

NVIDIA

Senior Software Engineer role optimizing AI workloads for large-scale LLM training and inference on NVIDIA supercomputers. The focus is on distributed systems with high-performance networking and NVIDIA communication libraries.

Posted 6/12/2026 • full-time • Remote • California, Colorado, Texas, Washington • 🇺🇸 United States • Lead • 💰 $272,000 - $431,250 per year

Tech Stack

Tools & technologies
  • Distributed Systems
  • Python
  • PyTorch
  • TensorFlow

About the role

Key responsibilities & impact
  • Characterizing AI workloads and deep learning models aimed at large-scale LLM training and inference on NVIDIA supercomputers.
  • Working on distributed systems with a focus on high-performance networking and NVIDIA communication libraries.
  • Benchmarking, profiling, and analyzing performance to find bottlenecks and identify optimizations, with a strong emphasis on networking.
  • Developing a PyTorch trace-based profiling, analysis, and replay toolset to aid in benchmarking, debugging, and co-designing network systems for LLM workloads.
  • Collaborating with multiple teams from hardware to software to provide performance analysis insights.
  • Defining performance test plans, setting performance expectations for new technologies and solutions, and working to achieve performance targets.

Requirements

What you’ll need
  • B.Sc. in Computer Science or Software Engineering, or equivalent experience.
  • 15+ years of experience with high-performance networking (RDMA, MPI, NCCL, SHARP).
  • Demonstrated ability in performance evaluation techniques and approaches.
  • Experience with NVIDIA GPUs and the CUDA library.
  • Knowledge of deep learning frameworks like TensorFlow or PyTorch.
  • Expertise in collective communication libraries such as NCCL and networking protocols such as RoCE and RDMA.
  • Ability to learn quickly and independently, with strong analytical and problem-solving skills.
  • Proficiency in programming languages: Python, Bash, and C++.
  • Experience with a container-based development environment.
  • Great teammate who communicates clearly and works well with others.

Benefits

Comp & perks
  • equity
  • benefits

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • high-performance networking
  • RDMA
  • MPI
  • NCCL
  • SHARP
  • performance evaluation techniques
  • NVIDIA GPUs
  • CUDA
  • deep learning frameworks
  • Python
Soft Skills
  • analytical skills
  • problem-solving skills
  • communication
  • collaboration
  • self-learning
  • teammate