
Senior Site Reliability Engineer – Hybrid
Broadridge
full-time
Location Type: Hybrid
Location: Manila • Philippines
About the role
- Design and implement high-availability, fault-tolerant architectures across on-prem and cloud platforms (AWS)
- Lead multi-region DR planning, implementation, and testing, including RTO/RPO definition and validation
- Define and enforce SLOs, SLIs, and error budgets to balance reliability with delivery velocity
- Drive self-healing automation and proactive remediation strategies
- Build and maintain infrastructure using Terraform and configuration management tools (e.g., Chef)
- Develop automation to eliminate manual operational tasks (toil reduction)
- Create reusable modules, pipelines, and guardrails for standardized deployments
- Automate certificate lifecycle management, key rotation, and security updates
- Design and implement end-to-end observability (metrics, logs, traces, synthetic monitoring)
- Build dashboards, alerts, and runbooks to enable fast detection and resolution of incidents
- Perform root cause analysis (RCA) and lead post-incident reviews with actionable follow-ups
- Engineer and operate platforms on AWS, including services such as EKS, EC2, RDS/Aurora, Lambda, API Gateway, CloudFront, WAF, ALB/NLB, CloudWatch, X-Ray, IAM, Secrets Manager
- Lead cloud migrations and modernization initiatives, including legacy system refactoring
- Identify and resolve performance bottlenecks through testing and analysis
- Design and support CI/CD pipelines enabling safe, repeatable deployments
- Partner with security and legal teams to meet regulatory and compliance requirements (e.g., data residency, GDPR-related controls)
Requirements
- 8+ years of experience in Site Reliability Engineering, Platform Engineering, DevOps, or Systems Engineering
- Strong programming experience in Python, Java, or similar languages
- Deep experience with Linux/Unix systems
- Hands-on expertise with AWS and cloud-native architectures
- Proven experience with Terraform and Infrastructure as Code
- Strong understanding of networking, security, and distributed systems
- Experience operating mission-critical, high-volume platforms
Benefits
- Professional development opportunities
- Flexible working hours
- Health insurance
- Paid time off
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Java, Linux, AWS, Terraform, Infrastructure as Code, CI/CD, observability, automation, root cause analysis
Soft Skills
leadership, communication, problem-solving, collaboration, proactive remediation