
Principal Platform Software Engineer – RAS
NVIDIA
full-time
Location Type: Remote
Location: California • United States
Salary
💰 $272,000 - $431,250 per year
About the role
- Drive next-generation fleet management solutions for scaling AI infrastructure built on NVIDIA GPUs and Grace solutions
- Work with customers, product management, and other architects to narrow down requirements for implementation
- Bring clarity to the architecture for fleet health monitoring and fault-remediation solutions at scale
- Work with customers and other architects to understand their health monitoring requirements
- Detail the architecture and build POCs to validate it
- Educate customers about the product architecture and gather feedback
- Write architecture specs and design documents, and own end-to-end delivery of the product
- Review the code written to implement the architecture specs
- Ensure product is properly tested by working with the development team
- Drive product life cycles with QA teams to productize the code, and take responsibility as product owner
- Articulate requirements in Jira and bug management tools, and work out an end-to-end execution plan
- Contribute to all phases of product development, from product definition, architecture, and design, through implementation, debugging, testing and early customer support.
Requirements
- BS, MS, or PhD in EE/CS or related field of education (or equivalent experience)
- 15+ years of hands-on coding experience
- Strong knowledge of time series databases like InfluxDB and Prometheus
- Strong knowledge of building and consuming REST APIs (Redfish is a big plus)
- Strong knowledge of telemetry visualization solutions like Grafana & Influx
- Strong knowledge of firmware architecture, including optimizing firmware for low-latency APIs
- Strong knowledge of analyzing algorithms for time and space complexity and projecting system resource requirements
- Proven record of delivering scalable solutions
- Strong and demonstrable skill in C/C++ and Python
- Programming and debugging experience on server platforms
- Experience in SCM (e.g., Git, Perforce) and project management tools like Jira.
Benefits
- Equity
- Benefits
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
C, C++, Python, time series databases, REST APIs, telemetry visualization, firmware architecture, algorithm analysis, scalability solutions, server platform debugging
Soft Skills
communication, collaboration, customer education, feedback incorporation, architecture specification, end-to-end delivery, code review, product ownership, execution planning, problem-solving
Certifications
BS in EE, MS in EE, PhD in EE, BS in CS, MS in CS, PhD in CS