Marvell Technology

Staff Engineer – Compute Infrastructure and Grid Operations

Staff Engineer role at Marvell designing and enhancing the engineering compute infrastructure used for chip design and verification, with a focus on grid job management, storage systems, and reliability in compute environments.

Posted 4/29/2026 • Full-time • Westborough • California, Massachusetts, Texas • 🇺🇸 United States • Lead • 💰 $128,000 - $189,370 per year

Tech Stack

Tools & technologies
Distributed Systems • Linux • NFS • Python

About the role

Key responsibilities & impact
  • Design, operate, and continuously improve the engineering compute infrastructure for large-scale chip design and verification
  • Own and evolve grid job management infrastructure used for large regressions and high-volume batch workloads
  • Debug and resolve grid job failures, including scheduling issues, hung jobs, resource starvation, and intermittent infrastructure faults
  • Improve job reliability through watchdogs, retries, heartbeats, timeouts, and failure detection mechanisms
  • Develop deep operational understanding of shared engineering storage systems used by compute jobs
  • Collaborate with IT teams on filesystem migrations, maintenance windows, and outage prevention
  • Proactively identify systemic issues that lead to grid instability or job loss
  • Act as a technical bridge between engineering users, tools teams, and central IT

Requirements

What you’ll need
  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent experience
  • 8+ years of experience in compute infrastructure, grid operations, or large-scale engineering environments
  • Strong experience with grid or batch schedulers (e.g., LSF, UGE, Slurm, PBS)
  • Hands-on experience debugging distributed systems and batch job failures
  • Strong Linux systems knowledge, including process management and resource monitoring
  • Experience with shared storage systems (NFS, enterprise filers, high-performance filesystems)
  • Strong scripting skills in Python, shell, or similar languages

Benefits

Comp & perks
  • Employee stock purchase plan with a 2-year lookback
  • Family support programs to help balance work and home life
  • Robust mental health resources to prioritize emotional well-being
  • Recognition and service awards to celebrate contributions and milestones

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
grid job management • debugging distributed systems • batch job failures • Linux systems • process management • resource monitoring • scripting in Python • scripting in shell • shared storage systems • high-performance filesystems
Soft Skills
collaboration • problem-solving • technical communication • proactive identification of issues • operational understanding
Certifications
Bachelor’s degree in Computer Science • Bachelor’s degree in Computer Engineering • Bachelor’s degree in Electrical Engineering