Senior Site Reliability Engineer

finova

full-time

Posted on: 9/29/2025

Location Type: Hybrid

Location: London • 🇬🇧 United Kingdom

Senior

Ansible, AWS, Azure, Cloud, Grafana, Kubernetes, .NET, Prometheus, Python, Terraform

About the role

Spearhead the Site Reliability Engineering function to ensure availability, scalability, and performance of core systems
Take responsibility for monitoring .NET applications deployed in AKS, EKS, App Services, and VMs
Design, implement, and maintain robust monitoring and alerting systems
Analyse system performance metrics, establish baselines, identify bottlenecks, and implement improvements for scalability and efficiency
Set up, configure, and optimise observability tools (Prometheus, Grafana, Datadog) to monitor metrics, logs, and traces
Ensure high availability and disaster recovery for critical systems; lead incident response and post-incident analysis
Develop and maintain SLOs, SLIs, and error budgets to meet reliability targets
Automate routine tasks and use infrastructure-as-code (Terraform, Ansible, Bicep) to manage cloud resources
Collaborate with DevOps/CloudOps and product development teams to build and deploy infrastructure via CI/CD (Azure DevOps, GitLab CI)
Mentor junior SREs and drive best practices across the engineering organisation
Identify areas for continuous improvement and stay up to date with industry trends, tools, and technologies

Requirements

5+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering with a strong focus on monitoring, alerting, and incident management
Hands-on experience monitoring .NET applications in production (Grafana, Datadog, Azure Monitor)
Extensive experience with AKS, EKS, App Services, and VMs in cloud environments (AWS, Azure)
Strong proficiency in cloud platforms (AWS, Azure) and container orchestration (Kubernetes, AKS, EKS)
Proficiency in infrastructure-as-code tools (Terraform, Azure Resource Manager, Bicep, Ansible)
Experience with monitoring and observability tools (Prometheus, Grafana, Datadog)
Strong scripting skills (PowerShell, Bash, Python)
Proven ability to work independently and manage multiple projects in a fast-paced environment
Excellent verbal and written communication skills and strong problem-solving abilities
Preferred: experience with monitoring and maintaining financial services or FinOps platforms
Preferred: cloud and Kubernetes certifications (AWS Certified Solutions Architect, Azure DevOps Engineer Expert, Certified Kubernetes Administrator)
Preferred: experience scaling and maintaining high-performance systems with large data throughput

Benefits
