
Lead Site Reliability Engineer
DraftKings Inc.
full-time
Location Type: Remote
Location: United States
Salary
💰 $148,000 - $185,000 per year
About the role
- Lead SRE initiatives across multiple projects and products, collaborating with cross-functional teams to shape platform and infrastructure engineering efforts across the organization.
- Drive technical excellence by mentoring and guiding engineers, fostering a culture of continuous learning and innovation.
- Architect and automate self-healing, fault-tolerant infrastructure with declarative configurations, GitOps, and event-driven automation for scalable deployments across public clouds and on-premises environments.
- Design, develop, and maintain software-driven infrastructure automation to build internal tools and eliminate repetitive operational tasks.
- Own and drive decisions on product deployment, performance tuning, monitoring, and alerting to ensure high availability and system efficiency in production.
- Define key metrics and SLAs for new web services built to support our rapid traffic growth.
- Design and implement monitoring and alerting strategies to enforce application SLAs.
Requirements
- At least 6 years of experience managing distributed cloud environments (GCP, AWS, vSphere, Nutanix) and platform automation at scale.
- Deep expertise in container orchestration (Kubernetes) and container runtimes (Docker, containerd), with the ability to design, scale, and troubleshoot complex workloads.
- Expert-level understanding of networking and web concepts, with the ability to debug issues down to the packet level.
- Strong experience developing software for automation and infrastructure tooling (Go, Python).
- Strong understanding of Linux-based operating systems, including performance tuning, bootloaders, storage, partitioning, kernel debugging, and low-level system optimizations.
- Experience with Infrastructure as Code (IaC) and configuration management tools (Terraform, Ansible, Chef, etc.), ensuring scalable and repeatable infrastructure provisioning.
- Understanding of applications written in various programming languages (C#/.NET, Java, Elixir, Ruby, etc.).
- Experience with AWS IoT Greengrass management and A/B booting.
Benefits
- bonus
- equity
- benefits as applicable
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
GitOps, event-driven automation, infrastructure automation, Kubernetes, Docker, Go, Python, Terraform, Ansible, Chef
Soft Skills
mentoring, collaboration, continuous learning, innovation, decision making