Senior Manager, Platform, Lifecycle, Troubleshooting
Vultr
Senior Manager overseeing the Platform & Lifecycle team at Vultr, a leading cloud infrastructure company. Focused on technical troubleshooting and lifecycle excellence for a high-performance server fleet.
Tech Stack
Tools & technologies: Ansible, Cloud, Linux, Python
About the role
Key responsibilities & impact
- Lead the Platform, Lifecycle & Troubleshooting team in resolving complex incidents and platform issues.
- Own server repurposing, migrations (e.g., OS/distribution upgrades), and deeper lifecycle management.
- Perform and guide advanced troubleshooting for RDMA links, GPU, storage, and server-side networking.
- Validate firmware choices and handle complex/ongoing firmware updates.
- Provide 24/7 on-call leadership and drive incident response improvements.
- Develop runbooks, automation, and self-healing processes to reduce toil and improve MTTR.
- Collaborate closely with Hardware and Onboarding teams on handoffs and mixed tickets.
- Partner with Engineering, Networking, and Solutions teams on technical escalations and improvements.
- Mentor senior engineers and build a high-performing team focused on root-cause analysis.
- Track key metrics (uptime, incident trends, migration success) and drive operational maturity.
Requirements
What you’ll need
- 8+ years of experience in Linux systems administration, platform engineering, or SRE-style operations in cloud or large-scale infrastructure environments.
- Deep expertise in troubleshooting GPU, storage, RDMA, and high-performance networking issues.
- Proven track record leading technical teams, including on-call rotations and complex migrations.
- Strong scripting/automation skills (Python, Bash, Ansible, etc.) and experience with monitoring tools.
- Excellent problem-solving, documentation, and cross-team communication abilities.
- Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
Benefits
Comp & perks
- 100% company-paid insurance premiums for employee medical, dental and vision plans.
- 401(k) plan that matches 100% up to 4%, with immediate vesting
- Professional Development Reimbursement of $2,500 each year
- 11 Holidays + Paid Time Off Accrual + Rollover Plan
- Commitment matters to Vultr! Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year
- $500 stipend for remote office setup in first year + $400 each following year
- Internet reimbursement up to $75 per month
- Gym membership reimbursement up to $50 per month
- Company paid Wellable subscription
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Linux systems administration, platform engineering, SRE operations, troubleshooting, GPU, storage, RDMA, high-performance networking, scripting, automation
Soft Skills
problem-solving, documentation, cross-team communication, leadership, mentoring, collaboration, incident response, root-cause analysis, operational maturity, team building