NVIDIA

Senior HPC and AI Networking Performance Engineer

full-time

Location Type: Office

Location: Shanghai, China

About the role

  • Profile and analyze AI workloads on large-scale GPU and CPU clusters for distributed deep learning LLM training and inference, focusing on communication patterns
  • Benchmark, profile, and analyze performance to find bottlenecks and identify areas for improvement
  • Implement performance analysis tools
  • Collaborate with teams across hardware and software to provide performance analysis insights
  • Define performance test planning and set performance expectations

Requirements

  • B.Sc in Computer Science or Software Engineering
  • 8+ years of experience with high-performance networking (RDMA, MPI, NCCL)
  • Demonstrated performance analysis skills and methodologies
  • Experience with NVIDIA GPUs, CUDA libraries, and deep learning frameworks such as TensorFlow or PyTorch
  • Fast self-learner with strong analytical and problem-solving skills
  • Programming languages: Python, Bash, and C
  • Experience with Linux distributions
  • Team player with good communication and interpersonal skills

Benefits

  • Health insurance
  • Retirement plans
  • Professional development