Global Head of SRE

Socure

Full-time

Posted on: 11/20/2025

Location Type: Remote

Location: Remote • 🇺🇸 United States

Salary

💰 $260,000 - $285,000 per year

Job Level

Lead

Tech Stack

AWS, Cloud

About the role

Define the global reliability strategy and roadmap across availability, latency, durability, data integrity, cost efficiency, and safety—mapped to clear business outcomes and service level objectives.
Architect multi‑region, multi‑zone resilience patterns with automated failover, graceful degradation, and progressive delivery; validate readiness through continuous game days and fault‑injection experiments.
Build and lead a world‑class red‑team QA and chaos engineering program across infrastructure, data pipelines, and applications; codify attack playbooks and steady‑state guardrails to improve detection and recovery.
Establish a unified observability practice: end‑to‑end tracing, high‑signal alerting, health and saturation indicators, user‑journey telemetry, and incident command protocols—standardized into a single, actionable operations view.
Drive rigorous incident management: real‑time incident command, rapid mitigation, blameless post‑incident reviews, durable corrective actions, and automated safeguards.
Ensure public sector readiness and continuous authorization: sustain FedRAMP Moderate posture, prove environmental parity between commercial and GovCloud, and strengthen controls for data residency, deletion, and audit evidence.
Partner with product engineering to make reliability a product feature: embed reliability patterns into RiskOS workflows and make Identity Graph‑based decisions observable, explainable, and resilient by default.
Lead developer tooling and release engineering: own CI/CD pipelines, test sandboxes and ephemeral environments, and the golden paths that make shipping changes safe, repeatable, and fast.
Advance an AI‑first SRE strategy: deploy ML for anomaly detection, incident prediction, adaptive alerting, automated runbooks, incident summarization, and capacity forecasts; measure impact via concrete reliability and efficiency wins.
Lead capacity planning and performance engineering across compute, storage, and networking—delivering consistently low‑latency decisions at peak volumes.
Attract, grow, and retain exceptional reliability engineers and leaders across regions; run a humane, effective, continuously improving on‑call program.

Requirements

Deep experience leading reliability for large‑scale, always‑on platforms with highly sensitive data—owning availability, latency, durability, and security across multiple product lines and regions.
Mastery of modern cloud architecture (AWS), product‑aligned multi‑account patterns, real‑time observability, progressive delivery, and automated disaster recovery, with a track record of measurable reliability gains.
Experience building red‑team and chaos engineering programs that surface systemic weaknesses, improve mean time to mitigate, and harden systems over time.
Proven leadership of developer tooling at scale: CI/CD, release engineering, and ephemeral environment strategies that increase velocity while reducing risk.
Strong partnership with product, data, and security; fluency in data lifecycle, retention and deletion, privacy, and governance for regulated industries and public sector.
A people‑first leadership style: you raise the bar on hiring and mentoring, set crisp principles, and build an ownership culture grounded in curiosity, accountability, and continuous learning.

Benefits

Offers Equity
Offers Bonus

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills

reliability engineering, cloud architecture, CI/CD, chaos engineering, incident management, capacity planning, performance engineering, data integrity, automated disaster recovery, anomaly detection

Soft skills

leadership, mentoring, accountability, curiosity, continuous learning, partnership, communication, problem-solving, team building, strategic thinking

Certifications

FedRAMP Moderate, AWS Certified Solutions Architect, Certified Kubernetes Administrator, Certified Information Systems Security Professional (CISSP), Certified Reliability Engineer (CRE), ITIL Certification, Certified ScrumMaster (CSM), Google Cloud Professional Cloud Architect, Microsoft Certified: Azure Solutions Architect Expert, CompTIA Security+