Cribl

Senior Site Reliability Engineer

Engage with teams and improve service delivery and reliability across their entire lifecycle.

Posted 5/15/2026 • Full-time • Remote • 🇵🇱 Poland • Senior

Tech Stack

Tools & technologies
Ansible, AWS, Azure, Cloud, Grafana, JavaScript, Linux, Node.js, Prometheus, Splunk, Terraform, TypeScript

About the role

Key responsibilities & impact
  • Engage with teams and improve service delivery and reliability across their entire lifecycle
  • Measure and monitor all production systems with an eye towards availability, latency and overall system health
  • Seek out the cause of errors and instability in our production cloud services and drive teams towards better operational excellence
  • Engage with product and platform teams to improve and evolve systems by lobbying for changes that improve reliability, resilience, and observability
  • Help identify and drive down toil with creative innovation and automation
  • This position will require stand-by, on-call, or off-hours duties

Requirements

What you’ll need
  • Proven experience designing, implementing, and operating observability systems for complex cloud-based platforms
  • Experience with configuration management and Infrastructure as Code tools such as Terraform (preferred) or Ansible
  • Knowledge of cloud platforms (AWS and Azure preferred)
  • Experience with APM and observability tools such as New Relic, Splunk, CloudWatch, Prometheus, Grafana/Kibana, and Sentry
  • Extensive experience with enterprise-scale continuous delivery environments
  • Development with JavaScript/Node.js/TypeScript in a Linux/Mac environment
  • Experience with sustainable incident response in a blameless environment
  • Background in Linux Systems Engineering
  • Experience with incident response tools such as PagerDuty, FireHydrant, and Blameless
  • Comfortable with a high level of autonomy and working with a distributed team
  • Knowledge of cloud and application security best practices
  • Strong knowledge of cloud design patterns for scale, data management, resiliency, etc.
  • A love for high quality and a knack for testing
  • Opinions about business metrics and SLOs

Benefits

Comp & perks
  • Diversity drives innovation and better decisions
  • Remote-first culture
  • Welcoming and valuing differences

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
observability systems, Configuration Management, Infrastructure as Code, Terraform, Ansible, cloud platforms, AWS, Azure, JavaScript, Node.js
Soft Skills
service delivery, operational excellence, creative innovation, autonomy, distributed team collaboration, incident response, blameless environment, high quality, testing, business metrics