
Principal SRE
Chalkboard
Full-time
Location Type: Hybrid
Location: New York City • New York • United States
Salary
💰 $230,000 - $250,000 per year
About the role
- Own platform reliability end-to-end, proactively identifying and mitigating risks before they impact users
- Build and evolve observability (metrics, logs, tracing) to enable fast detection, diagnosis, and resolution of issues
- Scale infrastructure ahead of demand by identifying bottlenecks and implementing durable architecture improvements
- Reduce developer friction by improving CI/CD pipelines, deployment workflows, and internal tooling
- Lead incident response and root cause analysis, driving systemic fixes rather than short-term patches
- Establish and enforce best practices for infrastructure, deployments, and system reliability
- Build reusable, self-service infrastructure that enables teams to ship quickly and safely
- Continuously improve systems through automation and Infrastructure-as-Code
Requirements
- Cloud Infrastructure (GCP preferred): networking, IAM, databases, storage
- Kubernetes: cluster operations and workload management
- Infrastructure as Code: Terraform, Helm
- CI/CD: GitHub Actions or similar
- Observability: metrics, logging, tracing, alerting
- 8+ years of experience in SRE, platform engineering, or infrastructure roles
- Strong experience with distributed systems and backend architectures
- Proven ability to improve system reliability, scalability, and performance
- Experience building and improving CI/CD pipelines and deployment workflows
- Strong debugging skills using data (logs, metrics, traces)
- Experience leading incident response and driving root cause analysis
- Ability to make pragmatic tradeoffs between speed, reliability, and scale
- Experience partnering across engineering teams to improve developer velocity
Benefits
- Comprehensive medical, dental, and vision coverage starting day 1, with the majority of premiums covered by Chalkboard
- 401(k) with company match
- Lunch on us every day with a corporate DoorDash account
- Refuel in the office with protein shakes, energy drinks, and a snack buffet
- Flexible time-off policy, plus 10 company holidays and WFH during the holidays