
Senior Staff Engineer
DDN
full-time
Location Type: Remote
Location: United States
About the role
- Lead the design and implementation of high-performance data movement pipelines using NVIDIA NIXL across GPU, CPU, and storage tiers.
- Architect and drive integration of DDN Infinia with GPU-accelerated inference platforms for large-scale, real-time AI workloads.
- Own end-to-end optimization of I/O paths between GPU memory and storage using technologies such as NVIDIA GPUDirect Storage, RDMA, and NVMe-over-Fabrics.
- Define and implement multi-tier storage architectures (NVMe, SSD, object storage) optimized for inference latency, throughput, and scalability.
- Lead development of advanced KV cache management strategies, including offloading, prefetching, and persistence across distributed storage layers.
- Partner with AI/ML engineering teams to optimize inference performance in frameworks such as PyTorch and TensorFlow.
- Establish benchmarking frameworks and lead performance tuning efforts for storage and data movement in production inference environments.
- Diagnose and resolve complex system bottlenecks across storage, networking, and GPU subsystems.
- Influence architecture decisions for distributed inference systems, ensuring scalability, resilience, and efficient data locality.
- Drive engineering excellence through best practices in observability, performance monitoring, automation, and reliability engineering.
- Mentor junior engineers and provide technical leadership across cross-functional teams.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 12+ years of experience in storage systems, distributed systems, or performance engineering.
- Proven track record of architecting and delivering large-scale, high-performance infrastructure systems.
- Deep expertise in distributed storage architectures (object storage, scalable file systems, or cloud-native storage platforms).
- Strong understanding of the Linux I/O stack, filesystem internals, and storage protocols.
- Extensive hands-on experience with NVMe, SSD optimization, and high-performance storage environments.
- Strong experience with RDMA, InfiniBand, or other high-speed data transfer technologies.
- Solid understanding of GPU computing concepts and CPU–GPU data movement patterns.
- Proficiency in Python and/or C/C++, with advanced debugging, profiling, and performance tuning skills.
- Demonstrated ability to optimize latency-sensitive, high-throughput production systems.
Benefits
- Competitive salary
- Flexible working hours
- Professional development opportunities
- Remote work options
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
data movement pipelines, NVIDIA NIXL, DDN Infinia, GPU-accelerated inference, NVIDIA GPUDirect Storage, RDMA, NVMe-over-Fabrics, KV cache management, PyTorch, TensorFlow
Soft Skills
technical leadership, mentoring, collaboration, performance tuning, problem-solving, influencing architecture decisions, engineering excellence, observability, automation, reliability engineering