
Senior Manager, Engineering – Data Center Telemetry, RAS
NVIDIA
full-time
Location Type: Hybrid
Location: Santa Clara • California • United States
Salary
💰 $272,000 - $431,250 per year
About the role
- Own the end-to-end architecture and delivery for telemetry solutions, including fleet health monitoring, fault remediation, and data visualization at scale
- Own the out-of-band (OOB) telemetry solution and validate the telemetry data from each underlying device
- Recruit, develop, and motivate a high-performing engineering team focused on platform telemetry, RAS, and observability
- Continuously improve software development processes for optimal productivity and quality
- Work across teams to ensure seamless integration of telemetry solutions with platform firmware, server architecture, and data center management
- Drive product life cycles with QA teams, ensuring robust testing, productization, and delivery
- Conduct performance reviews, foster a culture of excellence, and ensure high productivity
Requirements
- 12+ years of overall relevant experience
- 5+ years managing systems/platform software teams
- BS, MS, or PhD in EE/CS or related field (or equivalent experience)
- Strong knowledge of DMTF standards such as PLDM for OOB telemetry collection
- Experience with time series databases (e.g., InfluxDB, Prometheus) and REST APIs (e.g., Redfish)
- Deep understanding of server and firmware architecture, including optimization for low-latency APIs
- Proven track record of delivering scalable server products and telemetry solutions
- Experience with SCM (Git, Perforce) and project management tools (Jira)
- Hands-on experience with x86/ARM system architecture and coding (C/C++, Python)
- Familiarity with Confidential Compute and notification systems
- Demonstrated ability to analyze algorithms for time/space complexity and system resource requirements
Benefits
- Equity
- Benefits