Finom

Senior Site Reliability Engineer

Senior Site Reliability Engineer at Finom, driving the design and implementation of a Kubernetes-based platform, with a focus on reliability and scalability in a high-load, multi-cloud environment.

Posted 5/27/2026 • Full-time • Remote • 🇧🇬 Bulgaria • Senior

Tech Stack

Tools & technologies
AWS, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Terraform

About the role

Key responsibilities & impact
  • Lead the Platform Evolution: Design and operate our Kubernetes ecosystem (GKE, multi-cluster) with a focus on high availability and zero-downtime operations.
  • Build "Paved Roads": Own and evolve our PaaS strategy, using GitOps (ArgoCD) and CI/CD (GitLab) to empower domain teams to deploy independently.
  • Architect Reliability: Define and implement our observability strategy across metrics, logs, and tracing (Prometheus, VictoriaMetrics, OpenTelemetry).
  • Drive Infrastructure-as-Code: Lead the automation of our infrastructure using Terraform, ensuring all resources are standardized and version-controlled.
  • Own the Error Budget: Partner with engineering teams to establish and manage SLOs, SLAs, and incident management frameworks.
  • Disaster Recovery Mastery: Design and participate in regular DR drills, implementing blue/green and active/passive strategies across regions to ensure service continuity.
  • Innovate Operations: Proactively apply AI-driven approaches to improve operational efficiency and automated bottleneck detection.

Requirements

What you’ll need
  • Strong hands-on experience managing Kubernetes (GKE preferred) in high-load, multi-cluster production environments
  • Deep experience with GCP (AWS is a strong plus) and Terraform for large-scale infrastructure
  • Solid experience with ArgoCD, GitLab CI, and the "Infrastructure as Code" philosophy
  • Deep knowledge of the Prometheus/Grafana stack and implementing tracing/logging at scale
  • Proven ability to design highly available 24/7 systems with automated failover and rollback capabilities
  • English level B2+ for effective cross-functional communication

Benefits

Comp & perks
  • Make a genuine impact on the product
  • Work in the EU
  • Become a stock options holder
  • Receive unwavering support and care
  • Work & Swim program

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Kubernetes, GKE, GitOps, ArgoCD, CI/CD, GitLab, Terraform, Prometheus, OpenTelemetry, Infrastructure as Code
Soft Skills
cross-functional communication, leadership, problem-solving, collaboration, proactive approach