Senior Staff Engineer

DDN

Senior Staff Engineer leading hands-on development of the AI data path and storage architecture at DDN, architecting high-performance data movement systems for real-time AI workloads.

Posted 4/14/2026 • Full-time • Remote • 🇺🇸 United States • Senior

Tech Stack

Tools & technologies

Cloud, Distributed Systems, Linux, Python, PyTorch, TensorFlow

About the role

Key responsibilities & impact

Lead the design and implementation of high-performance data movement pipelines using NVIDIA NIXL across GPU, CPU, and storage tiers.
Architect and drive integration of DDN Infinia with GPU-accelerated inference platforms for large-scale, real-time AI workloads.
Own end-to-end optimization of I/O paths between GPU memory and storage using technologies such as NVIDIA GPUDirect Storage, RDMA, and NVMe-over-Fabrics.
Define and implement multi-tier storage architectures (NVMe, SSD, object storage) optimized for inference latency, throughput, and scalability.
Lead development of advanced KV cache management strategies, including offloading, prefetching, and persistence across distributed storage layers.
Partner with AI/ML engineering teams to optimize inference performance in frameworks such as PyTorch and TensorFlow.
Establish benchmarking frameworks and lead performance tuning efforts for storage and data movement in production inference environments.
Diagnose and resolve complex system bottlenecks across storage, networking, and GPU subsystems.
Influence architecture decisions for distributed inference systems, ensuring scalability, resilience, and efficient data locality.
Drive engineering excellence through best practices in observability, performance monitoring, automation, and reliability engineering.
Mentor junior engineers and provide technical leadership across cross-functional teams.

Requirements

What you’ll need

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
12+ years of experience in storage systems, distributed systems, or performance engineering.
Proven track record of architecting and delivering large-scale, high-performance infrastructure systems.
Deep expertise in distributed storage architectures (object storage, scalable file systems, or cloud-native storage platforms).
Strong understanding of the Linux I/O stack, filesystem internals, and storage protocols.
Extensive hands-on experience with NVMe, SSD optimization, and high-performance storage environments.
Strong experience with RDMA, InfiniBand, or other high-speed data transfer technologies.
Solid understanding of GPU computing concepts and CPU–GPU data movement patterns.
Proficiency in Python and/or C/C++, with advanced debugging, profiling, and performance tuning skills.
Demonstrated ability to optimize latency-sensitive, high-throughput production systems.

Benefits

Comp & perks

Competitive salary
Flexible working hours
Professional development opportunities
Remote work options

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

data movement pipelines, NVIDIA NIXL, DDN Infinia, GPU-accelerated inference, NVIDIA GPUDirect Storage, RDMA, NVMe-over-Fabrics, KV cache management, PyTorch, TensorFlow

Soft Skills

technical leadership, mentoring, collaboration, performance tuning, problem-solving, influencing architecture decisions, engineering excellence, observability, automation, reliability engineering