Oscilar

Sr./Staff Infrastructure/Site Reliability Engineer, SRE

full-time

Origin: 🇵🇱 Poland

Job Level

Senior

Tech Stack

AWS, Cloud, Distributed Systems, Go, Java, Kafka, Kubernetes, Microservices, Terraform

About the role

  • Take ownership of reliability across the multi-region, cloud-native infrastructure powering Oscilar's AI Risk Decisioning™ platform.
  • Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes).
  • Lead initiatives to improve availability, latency, and performance at scale.
  • Design and evolve CI/CD pipelines for speed, safety, and repeatability.
  • Define metrics, alerts, and runbooks forming the observability backbone.
  • Run chaos experiments and failure simulations to harden the platform.
  • Mentor engineers and set SRE best practices across the company.

Requirements

  • Proven track record as a senior SRE, DevOps, or infrastructure engineer in high-scale environments.
  • Expert-level skills in AWS and Infrastructure as Code (Pulumi, Terraform).
  • Strong programming ability in Go and Java.
  • Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture.
  • Mastery of container orchestration (Kubernetes) and production debugging.
  • Strong sense of ownership and judgment to balance velocity with reliability.