Director, Site Reliability Engineering – Cloud Operations
ESA - Electronic Security Association
Director overseeing global cloud infrastructure and SRE for a large-scale IoT ecosystem at Resideo, leading platform engineering with a focus on innovation, reliability, and automation.
Tech Stack
Tools & technologies: Ansible, Azure, Cloud, Grafana, IoT, Kubernetes, Prometheus, Terraform
About the role
Key responsibilities & impact
- Define and execute global cloud operations and SRE strategies, ensuring 99.999%+ uptime for mission-critical IoT services.
- Architect, implement, and optimize multi-cloud infrastructure to support IoT devices with low-latency data processing, scalability, and high availability.
- Drive cost optimization strategies while balancing performance, redundancy, and financial efficiency across cloud platforms (Azure).
- Develop automated deployment, monitoring, and recovery systems using technologies like Kubernetes, Terraform, Ansible, and CI/CD pipelines.
- Establish and refine SLOs, SLIs, and KPIs for service reliability, performance, and capacity planning.
- Build and optimize incident management, disaster recovery, and resilience engineering frameworks.
- Leverage AI/ML-driven automation for proactive failure detection and remediation.
- Implement robust security practices, ensure cloud security and compliance with standards such as SOC2, GDPR, and NIST, and oversee the zero-trust security model for IoT data protection.
- Collaborate with security and compliance teams to manage risk and ensure regulatory adherence across cloud platforms.
- Lead and mentor a global team of Cloud Engineers, SREs, and software professionals, fostering a culture of continuous learning and innovation.
- Partner with product management, software engineering, and customer support to optimize IoT device onboarding, firmware updates, and cloud-to-edge performance.
- Collaborate with finance and executive leadership to develop long-term cloud investment strategies.
Requirements
What you’ll need
- 15+ years in Computer Science, Electrical Engineering, or a related field
- 15+ years of experience in Cloud Operations, SRE, or Infrastructure Engineering, with 8+ years in technical leadership roles
- 5+ years of experience managing large-scale, distributed IoT cloud environments supporting billions of data points per day
- 5+ years of deep professional experience in Azure cloud platforms including networking, storage, compute, and database services
- 5+ years of experience in Kubernetes, Terraform, CI/CD pipelines, and observability tools (e.g., Prometheus, Grafana, ELK)
- 5+ years of experience in large-scale systems design and architecture, with a focus on reliability, performance, and scalability of cloud-native platforms
- 5+ years of hands-on experience with Infrastructure-as-Code (IaC) tools such as Terraform, Ansible, CDK, and Pulumi, and with managing cloud-native architectures
Benefits
Comp & perks
- Resideo provides comprehensive benefits, including life and health insurance, a life assistance program, accidental death and dismemberment insurance, disability insurance, a 401(k) plan, and vacation and holidays.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
cloud operations, site reliability engineering (SRE), multi-cloud infrastructure, IoT services, cost optimization, automated deployment, monitoring systems, disaster recovery, security compliance, large-scale systems design
Soft Skills
leadership, mentoring, collaboration, continuous learning, innovation
Certifications
SOC2, GDPR, NIST