Senior Site Reliability Engineer

Cribl

Senior Site Reliability Engineer at Cribl, enhancing observability data for improved incident response. Engaging with teams for reliable service delivery and operational excellence in a remote setting.

Posted 5/15/2026 • Full-time • Remote • 🇵🇱 Poland • Senior

Tech Stack

Tools & technologies

Ansible, AWS, Azure, Cloud, Grafana, JavaScript, Linux, Node.js, Prometheus, Splunk, Terraform, TypeScript

About the role

Key responsibilities & impact

Engage with teams and improve service delivery and reliability across their entire lifecycle
Measure and monitor all production systems with an eye towards availability, latency and overall system health
Seek out the cause of errors and instability in our production cloud services and drive teams towards better operational excellence
Engage with product and platform teams to improve and evolve systems by lobbying for changes that improve reliability, resilience, and observability
Help identify and drive down toil with creative innovation and automation
This position will require stand-by, on-call, or off-hours duties

Requirements

What you’ll need

Proven experience designing, implementing, and operating observability systems for complex cloud-based platforms
Experience with configuration management and Infrastructure as Code tools such as Terraform (preferred) or Ansible
Knowledge of cloud platforms (AWS and Azure preferred)
Experience with APM and observability tools such as New Relic, Splunk, CloudWatch, Prometheus, Grafana/Kibana, and Sentry
Extensive experience with enterprise-scale continuous delivery environments
Development with JavaScript/Node.js/TypeScript in a Linux/Mac environment
Experience with sustainable incident response in a blameless environment
Background in Linux Systems Engineering
Experience with incident response tools such as PagerDuty, FireHydrant, and Blameless
Comfortable with a high level of autonomy and working with a distributed team
Knowledge of cloud and application security best practices
Strong knowledge of cloud design patterns for scale, data management, resiliency, etc.
A love for high quality and a knack for testing
Opinions about business metrics and SLOs

Benefits

Comp & perks

Diversity drives innovation and better decisions
Remote-first culture
Welcoming and valuing differences

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

observability systems, Configuration Management, Infrastructure as Code, Terraform, Ansible, cloud platforms, AWS, Azure, JavaScript, Node.js

Soft Skills

service delivery, operational excellence, creative innovation, autonomy, distributed team collaboration, incident response, blameless environment, high quality, testing, business metrics