
Head of Recovery and Problem Management
LPL Financial
full-time
Location Type: Office
Location: Fort Mill • North Carolina • Texas • United States
Salary
💰 $132,793 - $221,321 per year
About the role
- Lead the 24/7 Command Center/NOC operations, ensuring 100% visibility of system health and security events
- Develop, maintain, and execute the monitoring and incident response strategy, including playbooks, automation, and tooling to shift from reactive to proactive, data-driven operations
- Define, monitor, and report on key performance indicators to measure operational efficiency of the Command Center team
- Develop playbooks in coordination with domain owners to improve the team's self-sufficiency and diagnostic accuracy
- Maintain the knowledge base and Known Error Database (KEDB) for repeat alerts and incidents, and produce trend analysis reports for senior leadership
- Act as the primary liaison to senior leadership, providing timely, accurate updates
- Oversee root cause analysis (RCA) for repeat alerts and incidents
- Prevent major incidents by handling alerts and dashboard anomalies effectively and intervening where required
- Build, lead and empower a high-performing team of incident managers, command center analysts, and technical leads
- Identify opportunities to improve IT service reliability and reduce operational risks related to people, process and technology
- Provide continuous feedback to Observability, Automation, Resiliency and Domain teams on improving observability posture and automation, and on addressing single points of failure and architectural and design gaps
- Mentor and develop other team members, provide training, and stay current with industry best practices and technologies
Requirements
- Progressive, proven experience and expertise in Production Services, including running a NOC/Command Center and SRE
- Well versed in the principles of ITSM
- Experience with observability tools, dashboards and diagnostics, sufficient to troubleshoot issues and coach others
- Knowledge of key technology components and architectural principles
Benefits
- 401K matching
- health benefits
- employee stock options
- paid time off
- volunteer time off
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
incident response, monitoring, root cause analysis, data-driven operations, performance indicators, trend analysis, observability, automation, diagnostics, IT service management
Soft Skills
leadership, team building, mentoring, communication, collaboration, problem-solving, coaching, feedback, training, organizational skills