Nebius Group

Lead Systems HPC Engineer

Posted 4/21/2026 • full-time • Remote • 🇺🇸 United States • Senior • 💰 $170,000 - $300,000 per year

Tech Stack

Tools & technologies
Go, Linux, Python

About the role

Key responsibilities & impact
  • Focus on understanding system behavior across multiple layers, identifying performance bottlenecks, and driving improvements that shape how our clusters are built, operated, tuned, and validated.
  • Investigate and troubleshoot performance issues on GPU clusters under real workloads (training and inference).
  • Evaluate and integrate new hardware, system configurations, and tuning approaches across the software stack.
  • Support complex performance-related escalations from internal teams and customers.
  • Work closely with infrastructure, software engineering and hardware vendor teams (e.g. NVIDIA, Mellanox, Intel).
  • Contribute to hardware and cluster qualification (acceptance), ensuring systems meet performance expectations.

Requirements

What you’ll need
  • 5+ years of professional experience in system-level software development, with a focus on performance optimization and low-level programming.
  • 3+ years of hands-on experience with Linux systems (administration, troubleshooting, and performance tuning).
  • In-depth understanding of server architecture, including PCIe devices, NICs, Linux OS/Kernel, and high-performance computing (HPC) systems.
  • Strong proficiency in one or more performance-oriented programming languages (C/C++, Go, Python).

Benefits

Comp & perks
  • Health insurance: 100% company-paid medical, dental and vision coverage for employees and families.
  • 401(k) plan: Up to 4% company match with immediate vesting.
  • Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
  • Remote work reimbursement: Up to $85/month for mobile and internet.
  • Disability & life insurance: Company-paid short-term, long-term and life insurance coverage.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
system-level software development, performance optimization, low-level programming, Linux systems administration, Linux troubleshooting, performance tuning, server architecture, performance-oriented programming languages, high-performance computing, GPU cluster performance
Soft Skills
problem-solving, collaboration, communication, troubleshooting, investigation