AVP, Recovery Manager

LPL Financial

The Recovery Manager leads major and critical incident responses at LPL Financial, collaborating with technical teams to restore service and manage customer impact.

Posted 5/15/2026 • Full-time • Fort Mill • South Carolina, Texas • 🇺🇸 United States • Lead • 💰 $112,476 - $187,460 per year

Tech Stack

Tools & technologies

Cloud

About the role

Key responsibilities & impact

Lead and coordinate cross‑functional technical teams during major and critical incidents, ensuring timely recovery and effective stakeholder engagement.
Serve as a recovery lead during declared major incidents, maintaining focus on service restoration and customer impact.
Participate in and facilitate post‑incident reviews and post‑mortems, ensuring outcomes are actionable and measurable.
Drive high‑quality root cause analysis for major incidents using structured techniques such as 5‑Why, Fishbone, and Blameless RCA.
Ensure contributing factors (process, technology, observability, automation, or human factors) are clearly identified and documented.
Partner with domain teams to translate findings into concrete remediation actions.
Develop, document, and maintain incident recovery plans, SOPs, runbooks, and playbooks in collaboration with domain owners.
Support and execute mock drills, recovery tests, and readiness exercises to improve response effectiveness.
Ensure recovery documentation remains accurate, consumable, and operationally relevant.
Work with application, infrastructure, and platform teams to improve diagnostic accuracy and time‑to‑engage during incidents.
Help establish clear ownership, escalation paths, and recovery patterns to reduce dependency on ad‑hoc tribal knowledge.
Promote repeatable recovery patterns across services.
Identify opportunities to improve service reliability, operational maturity, and recovery effectiveness.
Analyze incident data and trends to recommend targeted improvements across people, process, and technology.
Support adoption of SRE‑aligned practices, including error budgets, readiness reviews, and failure mode awareness.
Provide structured feedback to Observability, Automation, Resiliency, and Domain teams on: gaps in monitoring, alerts, and diagnostics; single points of failure; and architectural or design weaknesses impacting recoverability.
Act as an operational voice to ensure post‑incident learnings inform engineering and platform decisions.
Mentor junior recovery managers or operational staff through hands‑on incident participation and coaching.
Contribute to operational training sessions, tabletop exercises, and knowledge‑sharing initiatives.
Maintain awareness of industry best practices in production operations, incident management, and SRE.

Requirements

What you’ll need

5+ years of experience in Production Services, Incident Management, Recovery Management, Problem Management, SRE, DevOps, or related disciplines
2+ years of experience with application, infrastructure, and/or cloud technologies, enabling effective triage and informed recovery leadership
2+ years of experience using observability tools, logs, metrics, and diagnostics to troubleshoot production issues

Benefits

Comp & perks

401K matching
health benefits
employee stock options
paid time off
volunteer time off

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

incident management, recovery management, problem management, site reliability engineering (SRE), DevOps, root cause analysis, observability, diagnostics, cloud technologies, production services

Soft Skills

leadership, stakeholder engagement, communication, mentoring, collaboration, analytical thinking, problem-solving, organizational skills, coaching, training