
Senior GPU Networking Architect
NVIDIA
full-time
Location Type: Remote
Location: Poland
Salary
💰 PLN 292,500 - PLN 650,000 per year
About the role
- Build, implement, and optimize GPU communication kernels that underpin collective and point-to-point operations in large-scale AI systems.
- Leverage deep knowledge of GPU architecture (thread scheduling, memory hierarchy, execution pipelines) to improve kernel efficiency, minimize latency, and overlap computation with communication.
- Develop GPU-resident communication primitives and device-side APIs that enable fine-grained, kernel-initiated data movement across nodes and accelerators.
- Profile and tune GPU kernels end-to-end, identifying bottlenecks at the intersection of compute, memory, and network, and driving targeted optimizations.
- Collaborate with network software, hardware, and AI framework teams to co-design communication strategies that align with GPU execution patterns and emerging model architectures.
- Build proofs-of-concept, conduct experiments, and perform quantitative modeling to evaluate and validate new communication strategies before committing them to production.
- Contribute to the evolution of programming models that expose GPU-aware networking capabilities to application developers.
Requirements
- 5+ years of hands-on CUDA programming, including writing and optimizing non-trivial GPU kernels.
- M.Sc. or equivalent experience in computer science, computer engineering, or a closely related field.
- Strong understanding of GPU architecture fundamentals: warp scheduling, shared memory, L2 cache, memory coalescing, occupancy tuning, and asynchronous execution.
- Experience with systems-level C/C++ development in performance-critical environments.
- Familiarity with GPU data movement mechanisms such as GPUDirect RDMA and GPU-initiated communication.
- Ability to read and reason about GPU performance profiles (e.g., Nsight Compute, Nsight Systems) and translate observations into actionable optimizations.
- Strong collaboration skills in a multi-national, interdisciplinary environment.
Benefits
- Health insurance
- 401(k) matching
- Flexible work hours
- Paid time off
- Remote work options
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
CUDA programming, GPU architecture, GPU kernels, C/C++ development, GPU data movement, performance optimization, kernel profiling, asynchronous execution, memory coalescing, occupancy tuning
Soft Skills
collaboration, interdisciplinary teamwork, communication
Certifications
M.Sc. in computer science, M.Sc. in computer engineering