Senior Site Reliability Engineer

Coterie

Senior Site Reliability Engineer at Coterie Insurance, responsible for managing Azure infrastructure and enhancing CI/CD processes. Join a mission-driven team focusing on small business insurance solutions.

Posted 5/17/2026 • Full-time • Remote • 🇺🇸 United States • Senior • 💰 $140,000 - $170,000 per year

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills

Azure, Kubernetes, AKS, CI/CD pipelines, GitHub Actions, Grafana, Prometheus, Loki, infrastructure as code, scripting

Soft Skills

analytical skills, communication skills, collaboration, incident management, problem-solving

Tools & Technologies

Azure Kubernetes Service, GitHub Actions, Grafana, Prometheus, Loki

Industry Keywords

Site Reliability Engineering, DevOps, cloud-based infrastructure, capacity planning, performance tuning

Tech Stack

Tools & technologies

Azure, Cloud, DNS, Grafana, Kubernetes, Prometheus, Python

About the role

Key responsibilities & impact

Manage and maintain cloud infrastructure on Azure, including Azure Kubernetes Service (AKS) clusters and supporting resources
Build, improve, and maintain CI/CD pipelines using GitHub Actions to support reliable and repeatable deployments
Own and enhance our Grafana implementation: design dashboards, configure alerts, and support incident management workflows
Monitor system health, triage incidents, and drive root cause analysis to prevent recurrence
Collaborate with development teams to define and track SLIs, SLOs, and error budgets that align with business goals
Contribute to infrastructure-as-code practices using Pulumi
Identify and resolve reliability risks through capacity planning, performance tuning, and proactive system improvements
Participate in an on-call rotation to support production systems and respond to incidents
Document runbooks, operational procedures, and architectural decisions to support team knowledge sharing

Requirements

What you’ll need

5+ years of experience in a Site Reliability Engineering, DevOps, or Infrastructure role
3+ years of experience working with infrastructure as code
2+ years of experience architecting CI/CD pipelines and cloud-based infrastructure
Strong hands-on experience with:
    Azure Cloud services and resource management
    Kubernetes and AKS administration, including deployments, networking, and troubleshooting
    GitHub Actions for CI/CD pipeline development and maintenance
3+ years of experience with Grafana or similar tooling, including dashboard creation, alerting configuration, and incident management
Hands-on experience with Prometheus, Loki, or other observability tools in the Grafana ecosystem
Proficiency in at least one scripting or programming language such as Python or Bash
Understanding of networking fundamentals, DNS, load balancing, and container orchestration concepts
Strong analytical and communication skills; able to diagnose complex system issues and clearly communicate findings
Demonstrated ability to collaborate across teams and contribute to a culture of reliability
Experience working in an agile environment with modern DevOps practices

Benefits

Comp & perks

100% remote
Health insurance through Aetna (we pay 100% of premiums)
Dental and vision insurance through Guardian (we pay 100% of premiums)
Basic life insurance (we pay 100% of premiums)
Access to a flexible spending account (FSA) or health savings account (HSA) (for those on HSA-eligible plans)
401(k) plan (up to 4% match with immediate vesting)
Must be 21 years of age or older to participate
Flexible PTO policy offering employees up to 4 weeks of PTO in their first 12 months. Thereafter, PTO usage aligns with company standards and typically does not exceed 5 weeks per calendar year.
12 company-paid holidays each year
Continuing education annual stipend