Staff Site Reliability Engineer – Volcano

Kong Inc.

The Staff Site Reliability Engineer for Kong's Volcano platform oversees reliability and infrastructure scaling, and collaborates on SRE practices and emerging technology evaluations.

Posted 6/22/2026 • Full-time • Remote • 🇺🇸 United States • Lead • 💰 $150,000 - $210,000 per year

Tech Stack

Tools & technologies

Grafana, Kubernetes, Postgres, Prometheus, Redis, Terraform

About the role

Key responsibilities & impact

Own reliability for Volcano end-to-end: Define and drive SLOs, error budgets, and incident response practices for all Volcano services — edge deployments, managed Postgres, auth, realtime, storage, and the control plane.
Architect the platform's infrastructure: Design and build the multi-region Kubernetes infrastructure, networking, and data plane that powers Volcano's edge deployment pipeline and backend-as-a-service capabilities.
Build the GitOps and CI/CD backbone: Establish deployment automation, canary pipelines, and preview environment provisioning using ArgoCD, Helm, and Terraform/Terragrunt — setting patterns the broader team will follow.
Scale managed data services: Design, operate, and harden multi-tenant PostgreSQL clusters, Redis caching layers, and object storage — with a focus on data isolation, performance, and disaster recovery.
Drive observability from day one: Instrument every Volcano service with meaningful SLIs; build dashboards, alerts, and runbooks using Datadog, Prometheus, and Grafana before services go live, not after incidents.
Lead cross-functional reliability work: Collaborate with the OCTO team, product engineering, and security to bake reliability and compliance into Volcano's architecture — not bolt it on later.
Set SRE culture and standards: Mentor engineers across Volcano's contributing teams on reliability principles; lead postmortems, define on-call practices, and build a blameless engineering culture.
Evaluate and adopt emerging technologies: Given Volcano's greenfield nature, evaluate and make architectural decisions on edge runtimes, serverless compute, vector databases, and AI-native infrastructure components.

Requirements

What you’ll need

BS in Computer Science or equivalent; substantial experience at Staff or Principal IC level in SRE/Platform Engineering.
Proven track record building SRE or platform engineering practices for developer-facing platforms or PaaS/SaaS products — ideally at greenfield stage.
Deep Kubernetes expertise: multi-tenant cluster design, networking (CNI, service mesh, ingress), autoscaling, and security hardening.

Benefits

Comp & perks

Healthcare benefits
401(k) plan
Short- and long-term disability benefits
Basic life and AD&D insurance

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Kubernetes, PostgreSQL, Redis, GitOps, CI/CD, Terraform, ArgoCD, Helm, Datadog, Prometheus

Soft Skills

leadership, collaboration, mentoring, incident response, reliability engineering, blameless culture, cross-functional teamwork, communication

Certifications

BS in Computer Science