Epidemic Sound

Site Reliability Engineer

As a Site Reliability Engineer at Epidemic Sound, you ensure the platform's reliability and scalability while collaborating with product teams. You are responsible for CI/CD, traffic management, and observability.

Posted 6/26/2026 • full-time • Stockholm • 🇸🇪 Sweden • Mid-Level • Senior

Tech Stack

Tools & technologies
Cloud, Distributed Systems, Firewalls, Kubernetes, Linux, Terraform, Unix

About the role

Key responsibilities & impact
  • Build and operate the platform our services run on - GKE clusters, the controllers that extend them, and the Terraform that defines our cloud.
  • Own the path from commit to production - CI/CD, GitOps, and the progressive-delivery patterns that turn a merge into a safe release.
  • Strengthen the networking and routing layer - traffic management on top of the VPC, firewalls, and network policies that keep it safe and predictable.
  • Govern access and guardrails - IAM across every layer, policy-as-code, and break-glass paths - so teams move fast within safe defaults rather than waiting on tickets.
  • Grow reliability and observability - alert hygiene, runbooks, SLOs, and the metrics and tracing that show how the platform behaves in production.
  • Enable product teams and raise the bar - make production readiness the default, and drive healthy adoption of the standards and docs you would rather share than gatekeep.

Requirements

What you’ll need
  • Kubernetes fundamentals: a solid grasp of controllers, core components, and CNI and networking - depth in the domain matters more than any single tool (GKE a plus).
  • Infrastructure as code and delivery: Terraform, Helm or Kustomize, CI/CD and GitOps (ArgoCD), and the traffic-management and progressive-delivery mechanisms that move releases out safely.
  • Networking and access: routing fundamentals, the VPC, firewall, and network-policy primitives beneath it, and IAM and access management at different levels.
  • Operational depth: monitoring fundamentals (a clear view of when to reach for metrics versus tracing, and experience with an open-source observability stack), strong troubleshooting across distributed systems, and solid Unix/Linux.
  • Agentic development mindset: you use AI agents actively in your own work, knowing where they add leverage and where human judgement is non-negotiable.
  • Collaboration and judgement: you do your best work on large, cross-cutting projects, communicate openly, and stay opinionated but open to discussion - reaching for the right tool over your own creation.

Benefits

Comp & perks
  • Equal opportunity employer
  • We value diversity and encourage everyone to come and soundtrack the world with us.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Kubernetes, Terraform, CI/CD, GitOps, traffic management, networking, monitoring, troubleshooting, Unix, Linux
Soft Skills
collaboration, judgement, communication, troubleshooting, agentic development mindset