CoreLogic

Site Reliability Engineering Manager, Windows Server, IIS, Azure

full-time

Origin: 🇺🇸 United States

Salary

💰 $112,700 - $160,000 per year

Job Level

Senior, Lead

Tech Stack

AWS, Azure, Cloud, Cyber Security, SQL

About the role

  • Lead, mentor, and work alongside a 24x7 site reliability team responsible for monitoring, incident response, and resolution of mission-critical systems
  • Take hands-on ownership of incident management processes to ensure rapid detection, escalation, communication, and resolution in line with SLA targets
  • Design, plan, and conduct disaster recovery (DR) drills to validate system resilience and recovery readiness
  • Maintain and enforce SOC and ISO controls related to site reliability, security, and compliance
  • Drive alignment and compliance with NIST cybersecurity and risk management frameworks in collaboration with security and audit teams
  • Ensure the reliability and availability of complex high-availability (HA) systems architected on Azure and AWS
  • Collaborate with development, infrastructure, and security teams to implement best practices for system reliability, automation, and scalability
  • Drive continuous improvement of site reliability processes, automation, and tooling to enhance system performance and minimize downtime
  • Manage capacity planning and resource allocation to sustain a resilient and responsive site reliability function
  • Develop and maintain runbooks, documentation, and standards for incident response, recovery, and compliance
  • Lead root cause analysis efforts and implement preventive measures to reduce recurrence of issues

Requirements

  • Proven experience managing and working hands-on with SRE or 24x7 site reliability teams in a high-availability environment
  • 8-15 years of relevant experience
  • Bachelor of Science degree or equivalent work experience is highly preferred
  • Expertise in incident management and ensuring system reliability for mission-critical applications
  • Experience designing and executing disaster recovery drills
  • Knowledge and practical experience maintaining SOC and ISO compliance controls
  • Strong understanding of NIST frameworks and ability to drive organizational alignment
  • Strong hands-on experience with Microsoft Azure or AWS cloud platforms is preferred
  • Experience with New Relic or similar monitoring and observability tools is preferred
  • Hands-on familiarity with IIS, Windows Server environments, and SQL databases is a plus
  • Proficiency with infrastructure automation, monitoring, alerting, and incident response tools
  • Exceptional leadership, communication, and collaboration skills
  • Demonstrated self-initiative, accountability, and a growth mindset
  • Ability to thrive in a fast-paced, dynamic environment with multiple stakeholders
  • Relevant cloud certifications (e.g., Azure Solutions Architect, AWS Certified SysOps Administrator) are advantageous