Senior Solutions Architect, GPU System

NVIDIA

full-time

Location Type: Office

Location: Beijing, China

About the role

  • Lead presales and architecture engagements with AI industry customers, focusing on GPU servers, AI clusters, and large-scale training/inference platforms built on NVIDIA HGX, GPU systems, and reference architectures
  • Design and validate end-to-end AI data center solutions, including server platforms, storage connectivity, and high-performance networking based on Spectrum, Quantum, ConnectX, and BlueField
  • Define system architectures for AI supercomputing, LLM training, and inference workloads, including node configuration, GPU topology, PCIe/NVLink considerations, and network design
  • Support business teams in exploring, developing, and deploying NVIDIA server and GPU solution opportunities, from early technical discovery through POC and production rollout
  • Own and execute POCs and hands-on labs that validate GPU server performance, scalability, reliability, and interoperability across compute, storage, and network domains
  • Troubleshoot complex end-to-end issues involving GPU servers, firmware, drivers, operating systems, and networking stacks, and drive fixes with internal R&D and partners
  • Provide structured feedback on platform features, system requirements, and customer needs to server OEMs, engineering, and product teams to improve NVIDIA AI platforms and ecosystems

Requirements

  • BS/BA in Computer Science, Electrical/Computer Engineering, or equivalent experience
  • 6+ years of experience with data center servers, GPU platforms, or large-scale AI/HPC infrastructure
  • Strong understanding of GPU server architecture: CPU/GPU balance, memory and PCIe/NVLink topology, storage and NIC placement, and power/cooling considerations
  • Proven experience designing or operating AI or HPC clusters using GPU-accelerated servers in cloud or on-prem data centers
  • Solid background in data center and cloud networking for AI workloads, including leaf-spine fabrics, RDMA, and high-bandwidth/low-latency designs
  • Strong Linux system and Linux networking skills, including driver, firmware, and OS-level tuning for GPU and NIC performance
  • Knowledge of and experience with Kubernetes (K8s) and RDMA/RoCE, ideally on RoCE- or InfiniBand-based AI clusters
  • Excellent communication skills to collaborate with customers, server OEMs, and internal architecture and engineering teams

Benefits

  • Competitive salaries
  • Generous benefits package

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
GPU architecture, AI data center solutions, AI supercomputing, LLM training, inference workloads, data center servers, HPC infrastructure, Linux system administration, Kubernetes, RDMA
Soft Skills
communication, collaboration, problem-solving, feedback provision
Certifications
BS in Computer Science, BA in Computer Science, BS in Electrical Engineering, BA in Electrical Engineering