NVIDIA

Senior HPC Storage Engineer

full-time

Location Type: Office

Location: Santa Clara, California; Texas; United States

Salary

💰 $184,000 - $356,500 per year

About the role

  • Research and analyze existing internal distributed storage services.
  • Research, design, and implement scalable, next-gen distributed storage services for HPC workloads, optimizing both performance and cost-effectiveness to meet NVIDIA’s growing infrastructure needs.
  • Develop tooling to automate management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
  • Document general procedures and practices, and perform technology evaluations, related to distributed file systems.
  • Collaborate across teams to better understand developers' workflows and capture their infrastructure requirements.
  • Influence and guide methodologies for building, testing, and deploying applications to ensure efficient performance and resource utilization.
  • Support our researchers in running their flows on our clusters, including performance analysis and optimization of deep learning workflows.
  • Perform root cause analysis and suggest corrective actions for problems at large and small scales.

Requirements

  • Bachelor’s degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
  • 8+ years of experience designing and/or operating large-scale storage infrastructure.
  • Experience analyzing and tuning storage performance for a variety of workloads.
  • Proficiency in CentOS/RHEL and/or Ubuntu Linux distributions, as well as Python programming and Bash scripting.
  • In-depth understanding of container technologies such as Docker and Enroot.

Benefits

  • Health insurance
  • Stock options
  • Comprehensive benefits package

Applicant Tracking System Keywords

Hard Skills & Tools
distributed storage services, HPC workloads, performance analysis, deep learning workflows, Python programming, bash scripting, storage performance tuning, container technologies, Docker, Enroot
Soft Skills
collaboration, influence, guidance, problem-solving, communication
Certifications
Bachelor’s degree in Computer Science, Bachelor’s degree in Electrical Engineering