Lead Systems HPC Engineer
Nebius Group
Lead Systems HPC Engineer optimizing large-scale GPU clusters at Nebius, enhancing performance across hardware and software in cloud computing.
Tech Stack
Tools & technologies: Go, Linux, Python
About the role
Key responsibilities & impact
- Focus on understanding system behavior across multiple layers, identifying performance bottlenecks, and driving improvements that shape how our clusters are built, operated, tuned, and validated.
- Investigate and troubleshoot performance issues in GPU clusters under real workloads (training and inference).
- Evaluate and integrate new hardware, system configurations, and tuning approaches throughout the software stack.
- Support complex performance-related escalations from internal teams and customers.
- Work closely with infrastructure, software engineering and hardware vendor teams (e.g. NVIDIA, Mellanox, Intel).
- Contribute to hardware and cluster qualification (acceptance), ensuring systems meet performance expectations.
Requirements
What you’ll need
- 5+ years of professional experience in system-level software development (focused on performance optimization, low-level programming).
- 3+ years of hands-on experience with Linux systems (administration, troubleshooting, and performance tuning).
- In-depth understanding of server architecture, including PCIe devices, NICs, Linux OS/Kernel, and high-performance computing (HPC) systems.
- Strong proficiency in one or more performance-oriented programming languages (C/C++, Go, Python).
Benefits
Comp & perks
- Health insurance: 100% company-paid medical, dental and vision coverage for employees and families.
- 401(k) plan: Up to 4% company match with immediate vesting.
- Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
- Remote work reimbursement: Up to $85/month for mobile and internet.
- Disability & life insurance: Company-paid short-term, long-term and life insurance coverage.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
system-level software development, performance optimization, low-level programming, Linux systems administration, Linux troubleshooting, performance tuning, server architecture, performance-oriented programming languages, high-performance computing, GPU cluster performance
Soft Skills
problem-solving, collaboration, communication, troubleshooting, investigation