Grafana Labs

Staff Backend Engineer – Grafana Databases, Managed Services

full-time

Location Type: Remote

Location: United States

Salary

💰 $174,986 - $209,983 per year

About the role

  • Operating and evolving 100+ multi-cloud streaming clusters and related database infrastructure.
  • Diagnosing and eliminating cross-layer failure modes (e.g., object storage latency, noisy neighbors, control-plane bottlenecks, and query performance regressions).
  • Designing safe upgrade and rollout strategies at scale.
  • Improving observability, automation, and operational ergonomics.
  • Partnering closely with database and platform teams to ensure safe scaling, partitioning, consumer fan-out, and query performance.
  • Working directly with distributed systems behavior, Kubernetes scheduling dynamics, storage engines, compression trade-offs, and similar low-level concerns.
  • Serving as a primary escalation point and on-call for relevant incidents.
  • Owning the relationship with all system vendors, including WarpStream Labs and others.

Requirements

  • 8+ years of engineering experience, including meaningful time in SRE, platform engineering, production engineering, infrastructure engineering, or distributed systems roles.
  • Experience with high-throughput streaming systems, analytical or storage backends, or large-scale database infrastructure. Examples include Kafka, Redpanda, WarpStream, Postgres, ClickHouse, Snowflake, and Cassandra.
  • Strong Kubernetes experience in AWS, GCP, or Azure, and familiarity with infrastructure-as-code tooling (Helm, Terraform, Jsonnet, etc.).
  • Experience leading or driving complex technical efforts, even without formal management responsibilities.
  • Ability to influence technical direction and align teams around reliability improvements.
  • Strong understanding of distributed systems failure modes in multi-cloud environments.
  • Proficiency in at least one systems-oriented language (Go preferred, but not required).
  • Working knowledge of Linux internals, networking, cloud storage, and performance/scaling behavior.
  • Experience participating in blameless incident response and writing high-quality post-incident reviews.
  • Clear communicator who can collaborate across teams and work autonomously.
  • Intellectually curious, transparent, action-oriented, and kind (this is important!).

Benefits

  • All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs' success.
  • We believe in shared outcomes; RSUs help us stay aligned and invested as we scale globally.
  • 30 days annual leave.
  • 3 days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect.