NVIDIA

Senior DevOps Engineer – Build Systems

full-time

Location Type: Hybrid

Location: Santa Clara • California • 🇺🇸 United States


Salary

💰 $144,000 - $230,000 per year

Job Level

Senior

Tech Stack

Ansible, AWS, Azure, Cloud, Docker, Google Cloud Platform, Kubernetes, Python, Terraform

About the role

  • Build and maintain, from first principles, the infrastructure needed to deliver TensorRT LLM.
  • Maintain CI/CD pipelines that automate the build, test, and deployment process, and resolve their bottlenecks.
  • Manage tools and automate redundant manual workflows using GitHub Actions, GitLab, Terraform, etc.
  • Enable scanning for and handling of security CVEs in infrastructure components.
  • Improve the modularity of our build systems using CMake.
  • Use AI to help build automated triaging workflows.
  • Collaborate extensively with cross-functional teams to integrate pipelines from deep learning frameworks and components, which is essential to ensuring seamless deployment and inference of deep learning models on our platform.

Requirements

  • Master's degree or equivalent experience.
  • 3+ years of experience in computer science, computer architecture, or a related field.
  • Ability to work in a fast-paced, agile team environment.
  • Excellent Bash, CI/CD, and Python programming and software design skills, including debugging, performance analysis, and test design.
  • Experience with CMake.
  • Background in security best practices for releasing libraries.
  • Experience in administering, monitoring, and deploying systems and services on GitHub and cloud platforms.
  • Highly skilled in Kubernetes and Docker/containerd.
  • Automation expert with hands-on skills in frameworks like Ansible and Terraform.
  • Experience with AWS, Azure, or GCP.

Benefits

  • equity
  • benefits

SS&C Technologies

Site Reliability Engineer – SRE

SS&C Technologies
Mid · Senior · full-time · $150k–$180k / year · California, Colorado, Massachusetts, New York, Texas · 🇺🇸 United States
Posted: 3 days ago · Source: ssctech.wd1.myworkdayjobs.com
AWS, Cloud, Kubernetes, OpenShift, OpenStack, Prometheus, Splunk, VMware
EEOC

DevOps Engineer

EEOC
Mid · Senior · full-time · $78k–$176k / year · Alabama, California, Colorado, Virginia · 🇺🇸 United States
Posted: 3 days ago · Source: bah.wd1.myworkdayjobs.com
Cloud
Concentrix

Senior Director, Platform Engineering, Software Engineering, Site Reliability

Concentrix
Senior · full-time · $190k–$250k / year · California · 🇺🇸 United States
Posted: 4 days ago · Source: cnx.wd1.myworkdayjobs.com
Cloud
Extreme Networks

Cloud Operations Engineer

Extreme Networks
Senior · Lead · full-time · California · 🇺🇸 United States
Posted: 4 days ago · Source: jobs.lever.co
Ansible, AWS, Azure, Cloud, Docker, ElasticSearch, Google Cloud Platform, Kubernetes, Linux, Microservices, Postgres, Prometheus, +3 more