
VP of Engineering, Reliability
Filevine
full-time
Location Type: Remote
Location: United States
Salary
💰 $260,000 - $360,000 per year
About the role
- Define and execute the reliability engineering roadmap, aligning infrastructure and AI-native architecture with Filevine’s enterprise growth and platform modernization.
- Balance centralized platform capabilities with distributed ownership, ensuring the reliability model scales across a diversifying technology portfolio.
- Establish and manage SLO/SLI/error budget frameworks to create a shared language for balancing feature velocity with system stability.
- Lead infrastructure cost management (optimization and forecasting), capacity planning, and disaster recovery to meet rigorous enterprise contractual commitments.
- Lead and scale a multi-disciplinary organization (DevOps, SRE, DBRE, Tooling), fostering a culture of ownership, high craftsmanship, and clear career growth.
- Drive continuous improvement through DORA metrics, incident trend analysis, and systematic toil reduction to enhance service availability and deployment health.
- Deliver self-service tooling, guardrails, and documentation that allow feature teams to operate their own services effectively without bottlenecks.
- Act as the primary engineering interface for the CISO to advance compliance posture (FedRAMP, SOC 2, CJIS, ISO) and translate security needs into pragmatic action.
- Collaborate with the CTO, CPO, and Architect to communicate risks and investment needs, positioning reliability as a key enabler for enterprise go-to-market success.
Requirements
- 15+ years of engineering experience, with 7+ years specifically leading infrastructure, reliability, or platform teams at scale in product-driven companies.
- Proven track record managing organizations of 40+ engineers across SRE, DevOps, and Tooling, including developing multiple layers of management.
- Demonstrated experience evolving reliability operating models to meet the shifting needs of a scaling business.
- Deep expertise operating in regulated sectors (Legal Tech, Fintech, Gov, or Healthcare) where compliance and data sensitivity are primary constraints.
- Practical, production-hardened understanding of SRE principles, including SLOs, error budgets, toil reduction, and incident management.
- Strong technical command of AWS, container orchestration, Terraform (IaC), CI/CD, and modern observability stacks.
- Direct experience owning cloud infrastructure budgets and successfully driving meaningful cost optimization and forecasting.
- Familiarity with the reliability requirements for modern AI workloads, such as model serving, vector search, and data pipeline integrity.
- Ability to engage the C-suite on risk trade-offs and transformation progress with a "builder mentality" that thrives on solving complex, high-stakes problems.
Benefits
- Medical, Dental, & Vision Insurance (for full-time employees)
- Competitive & Fair Pay
- Maternity & paternity leave (for full-time employees)
- Short & long-term disability
- Opportunity to learn from a dedicated leadership team
- Top-of-the-line company swag
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
reliability engineering, SLO, SLI, error budget, cost management, capacity planning, disaster recovery, DORA metrics, incident management, toil reduction
Soft Skills
leadership, collaboration, communication, ownership, high craftsmanship, continuous improvement, risk management, problem-solving, organizational development, career growth
Certifications
FedRAMP, SOC 2, CJIS, ISO