Apply

Ready to go for it?

AI Apply speeds things up—apply directly if you prefer.

FREE ACCESS
5,000–10,000 jobs/day
JobTailor Logo

See all jobs on JobTailor

Search thousands of fresh jobs every day.

Discover
  • Fresh listings
  • Fast filters
  • No subscription required
Create a free account and start exploring right away.
DOCOsoft

Site Reliability Engineer

DOCOsoft

Senior Azure Site Reliability Engineer focused on automation and continuous improvement of the Vew SaaS platform. Collaborating with Engineering and DevOps to enhance system reliability as it scales.

Posted 7/2/2026full-timeDublin • 🇮🇪 IrelandMid-LevelSeniorWebsite

Tech Stack

Tools & technologies
AzureCloudDistributed SystemsDockerGrafanaKubernetesPrometheusTerraform

About the role

Key responsibilities & impact
  • Design, implement, and operate highly available, scalable, and fault-tolerant systems on Microsoft Azure.
  • Define, track, and improve reliability metrics, including service health indicators and operational performance reporting.
  • Develop and maintain Infrastructure as Code using Bicep, ARM, Terraform, or similar tooling to ensure consistent and reproducible environments.
  • Build automation for provisioning, deployment, scaling, and operational workflows, reducing manual intervention and operational toil.
  • Enhance CI/CD pipelines in collaboration with DevOps to improve deployment safety, reliability, and efficiency.
  • Implement and maintain monitoring, logging, tracing, and alerting solutions to ensure real-time visibility and rapid issue detection.
  • Define meaningful alerting strategies that reduce noise and improve response effectiveness.
  • Lead incident response activities, including structured troubleshooting, stakeholder communication, root cause analysis, and post-incident reviews.
  • Strengthen incident and problem management processes to improve SLA adherence and customer impact mitigation.
  • Implement systemic improvements to prevent repeat incidents rather than applying short-term workarounds.
  • Embed security and compliance best practices across infrastructure, including access control, encryption, and policy enforcement.
  • Drive continuous service improvement initiatives to enhance performance, reliability, efficiency, and operational maturity.
  • Collaborate closely with Development and QA teams to improve application resilience and supportability.

Requirements

What you’ll need
  • Proven experience as a Site Reliability Engineer or in a similar reliability-focused role within a SaaS or cloud-native environment.
  • Strong hands-on experience operating production workloads on Microsoft Azure across compute, networking, storage, and monitoring services.
  • Infrastructure as Code expertise using Bicep, ARM, Terraform, or similar tools.
  • Strong automation and scripting capability (PowerShell essential; additional scripting languages advantageous).
  • Experience working with containerised environments (Docker) and orchestration concepts such as Kubernetes.
  • Practical experience with observability tooling such as Azure Monitor, Grafana, Prometheus, Datadog, or OpenTelemetry.
  • Strong understanding of structured incident response, root cause analysis, SLA/SLO concepts, and reliability engineering practices.
  • Knowledge of security best practices and compliance standards such as ISO27001, SOC 2, and GDPR.
  • Strong problem-solving capability with the ability to troubleshoot complex, distributed systems.
  • Effective communication skills and ability to collaborate across engineering, operations, and business stakeholders.
  • Azure certifications (e.g., Azure Administrator Associate, Azure Solutions Architect Expert) are desirable.

Benefits

Comp & perks
  • 25 days Annual Leave
  • Private pension
  • Bonus scheme
  • Private health
  • Life assurance

ATS Keywords

✓ Tailor your resume
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Site Reliability EngineeringProduction Workloads ManagementContainerization (Docker)Kubernetes OrchestrationReliability Engineering Practices
Soft Skills
Effective CommunicationCollaboration Across TeamsProblem-Solving Capability
Certifications
Azure Administrator AssociateAzure Solutions Architect Expert