Staff Engineer – Compute Infrastructure and Grid Operations
Marvell Technology
Senior Engineer at Marvell designing and enhancing engineering compute infrastructure for chip design and verification. Focused on grid job management, storage systems, and reliability in compute environments.
Posted 4/29/2026
full-time
Westborough • California, Massachusetts, Texas • 🇺🇸 United States
Lead
💰 $128,000 - $189,370 per year
Tech Stack
Tools & technologies
- Distributed Systems
- Linux
- NFS
- Python
About the role
Key responsibilities & impact
- Design, operate, and continuously improve the engineering compute infrastructure for large-scale chip design and verification
- Own and evolve grid job management infrastructure used for large regressions and high-volume batch workloads
- Debug and resolve grid job failures, including scheduling issues, hung jobs, resource starvation, and intermittent infrastructure faults
- Improve job reliability through watchdogs, retries, heartbeats, timeouts, and failure detection mechanisms
- Develop deep operational understanding of shared engineering storage systems used by compute jobs
- Collaborate with IT teams on filesystem migrations, maintenance windows, and outage prevention
- Proactively identify systemic issues that lead to grid instability or job loss
- Act as a technical bridge between engineering users, tools teams, and central IT
Requirements
What you’ll need
- Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent experience
- 8+ years of experience in compute infrastructure, grid operations, or large-scale engineering environments
- Strong experience with grid or batch schedulers (e.g., LSF, UGE, Slurm, PBS)
- Hands-on experience debugging distributed systems and batch job failures
- Strong Linux systems knowledge, including process management and resource monitoring
- Experience with shared storage systems (NFS, enterprise filers, high-performance filesystems)
- Strong scripting skills in Python, shell, or similar languages
Benefits
Comp & perks
- Employee stock purchase plan with a 2-year look-back
- Family support programs to help balance work and home life
- Robust mental health resources to prioritize emotional well-being
- Recognition and service awards to celebrate contributions and milestones
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- grid job management
- debugging distributed systems
- batch job failures
- Linux systems
- process management
- resource monitoring
- scripting in Python
- scripting in shell
- shared storage systems
- high-performance filesystems
Soft Skills
- collaboration
- problem-solving
- technical communication
- proactive identification of issues
- operational understanding
Certifications
- Bachelor’s degree in Computer Science
- Bachelor’s degree in Computer Engineering
- Bachelor’s degree in Electrical Engineering