Everbridge

Site Reliability Specialist – Observability, Kubernetes

Site reliability specialist leading the design and operation of Everbridge's observability platform at scale, supporting observability and security operations to enhance organizational resilience.

Posted 5/2/2026 • Full-time • Remote • 🇺🇸 United States • Mid-Level, Senior • 💰 $118,700 - $145,000 per year

Tech Stack

Tools & technologies
Grafana, Kubernetes, Terraform

About the role

Key responsibilities & impact
  • Lead the design, operation, and evolution of Everbridge’s observability stack
  • Build and maintain a highly available, scalable observability platform
  • Standardize instrumentation, dashboards, alerts, and SLOs
  • Support incident response, root cause analysis, and capacity planning
  • Operate and scale Grafana and related technology
  • Maintain reliability and security of EKS clusters running observability
  • Manage cluster lifecycle and upgrades
  • Use Terraform for infrastructure provisioning
  • Run GitLab CI/CD at scale

Requirements

What you’ll need
  • 6+ years in SRE / Platform Engineering
  • Strong Grafana ecosystem experience
  • Kubernetes and Amazon EKS expertise
  • Terraform proficiency
  • OpenTelemetry experience (preferred)
  • Large-scale observability systems (preferred)
  • Cost optimization experience (preferred)

Benefits

Comp & perks
  • healthcare
  • dental
  • parental planning
  • mental health benefits
  • disability income benefits
  • life and AD&D insurance
  • 401(k) plan and match
  • paid time off
  • fitness reimbursements

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
observability, incident response, root cause analysis, capacity planning, infrastructure provisioning, Grafana, Kubernetes, Amazon EKS, Terraform, OpenTelemetry