
Senior HPC Storage Engineer
NVIDIA
full-time
Location Type: Office
Location: Santa Clara • California • Texas • United States
Salary
💰 $184,000 - $356,500 per year
About the role
- Research and analyze existing internal distributed storage services.
- Research, design, and implement scalable, next-generation distributed storage services for HPC workloads, optimizing both performance and cost-effectiveness to meet NVIDIA’s growing infrastructure needs.
- Develop tooling to automate management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
- Document general procedures and practices, and perform technology evaluations, for distributed file systems.
- Collaborate across teams to better understand developers' workflows and capture their infrastructure requirements.
- Influence and guide methodologies for building, testing, and deploying applications to ensure efficient performance and resource utilization.
- Support our researchers in running their workflows on our clusters, including performance analysis and optimization of deep learning workflows.
- Perform root cause analysis and suggest corrective actions for problems at both large and small scales.
Requirements
- Bachelor’s degree in Computer Science, Electrical Engineering or related field or equivalent experience.
- 8+ years of experience designing and/or operating large-scale storage infrastructure.
- Experience analyzing and tuning storage performance for a variety of workloads.
- Proficiency with CentOS/RHEL and/or Ubuntu Linux distributions, as well as Python programming and Bash scripting.
- In-depth understanding of container technologies such as Docker and Enroot.
Benefits
- Health insurance
- Stock options
- Comprehensive benefits package
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
distributed storage services, HPC workloads, performance analysis, deep learning workflows, Python programming, bash scripting, storage performance tuning, container technologies, Docker, Enroot
Soft Skills
collaboration, influence, guidance, problem-solving, communication
Certifications
Bachelor’s degree in Computer Science, Bachelor’s degree in Electrical Engineering