
SRE – Incident Response
XBOW
Full-time
Location Type: Remote
Location: Ireland
About the role
- Keeping XBOW’s production systems stable, observable, and resilient as the product scales.
- Building and maintaining automated reliability tooling, covering monitoring, alerting, and self-healing.
- Defining and tracking service-level goals for both production and development environments.
- Collaborating closely with infrastructure and feature teams to manage cloud systems through infrastructure as code (IaC).
- Conducting root-cause investigations and incident analysis across the organization.
- Helping maintain internal and customer-facing status dashboards that clearly communicate system health and uptime.
Requirements
- Strong experience with TypeScript
- Hands-on experience with AWS
- Solid expertise in Linux, plus experience with infrastructure and DevOps tooling such as Kubernetes, Docker, Terraform, and CI/CD pipelines (especially GitHub Actions)
- Background in infrastructure automation and/or incident response (depth may vary by candidate)
- Familiarity with monitoring and observability tools such as OpenTelemetry, Prometheus, VictoriaMetrics, Grafana, and Datadog
- Experience with Python and/or Go (advantageous)
- Experience with additional cloud providers beyond AWS (advantageous)
Benefits
- Competitive salary and a generous equity package, making you a true owner of the company.
- Shape your role, lead the function, and grow with the company as we redefine cybersecurity.
- Tackle technically complex challenges and play a pivotal role in the growth of our business, working alongside an amazing team and some of the world’s leading experts to shape how AI transforms cybersecurity.