
Senior Software Engineer – Grafana Databases, SRE
Grafana Labs
full-time
Location Type: Remote
Location: Sweden
Salary
💰 SEK 775,444 - SEK 930,533 per year
About the role
- Help support our highest value Grafana Cloud customers by increasing the reliability of our Cloud databases
- Own production reliability for high-SLA and complex customer environments
- Design and implement automation to scale our reliability practices
- Ensure we meet our SLO targets for our customers
- Define and evolve per-tenant SLOs and reliability models
- Proactively reduce SLO burn to prevent repeat incidents
- Serve as a primary escalation point and on-call for relevant incidents
- Lead customer-impacting incident response and post-incident reviews
- Contribute to design docs and code reviews
- Influence feature design to ensure production scalability and operability
- Build automation to eliminate toil where needed
- Improve alert quality and reduce noisy escalations
Requirements
- 6+ years of engineering experience, with 3+ years in SRE, CRE, or production engineering. Strong preference for those with formal customer reliability engineering experience.
- Strong Kubernetes experience in AWS, GCP, or Azure, and familiarity with infrastructure-as-code tooling (Helm, Terraform, Jsonnet, etc.).
- Experience operating multi-tenant systems in production
- Strong experience designing and implementing SLOs
- Experience with one or more programming languages (e.g. Go, Python, or Java)
- Experience with Linux operating system internals, and some knowledge of networking, cloud storage, and scaling.
- Excellent problem-solving and troubleshooting skills.
- Experience participating calmly and actively in blame-free incident response, following up on actions, and writing high-quality PIRs (Post-Incident Reviews, a.k.a. post-mortem documents)
- Ability to reason about performance, scaling, and failure modes
- Comfortable working within an engineering team where individuals are encouraged to have a strong sense of autonomy and self-direction.
- Ability to partner deeply with product engineering teams
- We highly value those who are intellectually curious, default to transparency, have a strong bias towards action, and are also kind (this is important!)
Benefits
- 100% Remote, Global Culture
- Scaling Organization
- Transparent Communication
- Innovation-Driven
- Open Source Roots
- Empowered Teams
- Career Growth Pathways
- Approachable Leadership
- Passionate People
- In-Person onboarding
- Balance is Key - 30 days of annual leave
Applicant Tracking System Keywords
Hard Skills & Tools
SRE, CRE, production engineering, Kubernetes, AWS, GCP, Azure, infrastructure-as-code, SLO design, programming languages
Soft Skills
problem-solving, troubleshooting, incident response, autonomy, self-direction, collaboration, intellectual curiosity, transparency, bias towards action, kindness