NVIDIA

Senior System Engineer

full-time

Origin: 🇺🇸 United States • California

Salary

💰 $184,000 - $356,500 per year

Job Level

Senior

Tech Stack

Cloud • Kubernetes • Node.js

About the role

  • Platform fundamentals: design, build, and operate core services and node/cluster foundations for the Lepton platform; automate deployments, upgrades, and day-2 operations.
  • Vulnerability & patch management: own intake, prioritization, rollout, and rollback rhythms across OS, drivers/firmware, and platform components for the Lepton product.
  • Security as a product quality: define, deliver, and maintain secure-by-default baselines (host hardening, workload isolation, network segmentation, least-privilege access) for AI infrastructure at scale.
  • Identity & access stewardship: standardize patterns for service identity, role scoping, secrets handling, and certificate hygiene.
  • Trusted releases: drive change control and release practices that ensure traceability and integrity of what runs in production.
  • Monitoring & incident practice: establish health signals and SLOs; lead investigations, root-cause analysis, and follow-through actions that improve both reliability and security.
  • Risk & readiness: partner with product, SRE, and security stakeholders to assess risks for new features and close gaps with pragmatic controls.
  • Documentation & mentorship: publish runbooks and standards; review designs and coach engineers on secure operational practices.

Requirements

  • 7+ years in systems/platform engineering, operating large-scale production environments.
  • Demonstrated ability to deliver secure, reliable platforms (hardening, access control, isolation, monitoring, and strong operational runbooks).
  • Experience with containerized/managed cluster environments; familiarity with GPU-accelerated platforms or the ability to ramp quickly.
  • Automation mindset with infrastructure-as-code and CI/CD; disciplined change management.
  • Clear communication and documentation skills; ability to turn requirements into practical, supportable designs.
  • Bachelor's degree or higher in Computer Science or a related technical field (or equivalent experience).