Salary
💰 $175,000 - $260,000 per year
About the role
- Validate reliability, stability, and performance of d-Matrix accelerator platforms integrated into industry-standard server systems and rack-scale solutions.
- Design and execute system-level test plans that ensure platforms meet quality standards required for customer deployment and internal development.
- Develop and execute system-level tests including stress, thermal, and PCIe interoperability tests.
- Automate test frameworks and validation workflows to improve test coverage and efficiency.
- Drive root cause analysis and debug failures in collaboration with hardware, firmware, and software teams.
- Ensure platforms meet internal and external quality criteria for production readiness.
- Document test procedures, results, and validation status across SKUs.
Requirements
- Bachelor’s or Master’s in EE, CS, or related field.
- 5+ years of experience in GPU server platform validation, preferably with PCIe-based hardware.
- Strong understanding of server architecture, Linux environments, and hardware-software interactions.
- Experience with test automation (e.g., Python, Bash) and validation tools.
- Detail-oriented with strong debugging and documentation skills.
- Experience working with ODM and OEM vendors for GPU servers and rack scale solutions.
- Hands-on experience with Electrical Validation, Functional Validation and High Speed Bus Validation.
- Hands-on experience updating firmware on servers and utilizing custom vendor software tools for debug.
- Stress testing experience.
- Strong debugging skills across hardware (compute and networking) and host/embedded software.