DraftKings Inc.

Lead Site Reliability Engineer

full-time

Location Type: Remote

Location: United States

Salary

💰 $148,000 - $185,000 per year

About the role

  • Lead SRE initiatives across multiple projects and products, collaborating with cross-functional teams to shape platform and infrastructure engineering efforts across the organization.
  • Drive technical excellence by mentoring and guiding engineers, fostering a culture of continuous learning and innovation.
  • Architect and automate self-healing, fault-tolerant infrastructure with declarative configurations, GitOps, and event-driven automation for scalable deployments across public clouds and on-premises environments.
  • Design, develop, and maintain software-driven infrastructure automation to build internal tools and eliminate repetitive operational tasks.
  • Own and drive decisions on product deployment, performance tuning, monitoring, and alerting to ensure high availability and system efficiency in production.
  • Define key metrics and SLAs for new web services built to support our rapid traffic growth.
  • Design and implement monitoring and alerting strategies to enforce application SLAs.

Requirements

  • At least 6 years of experience managing distributed cloud environments (GCP, AWS, vSphere, Nutanix) and platform automation at scale.
  • Deep expertise in container orchestration (Kubernetes) and container runtimes (Docker, containerd), with the ability to design, scale, and troubleshoot complex workloads.
  • Expert-level understanding of networking and web concepts, with the ability to debug issues down to the packet level.
  • Strong experience developing software for automation and infrastructure tooling (Go, Python).
  • Strong understanding of Linux-based operating systems, including performance tuning, bootloaders, storage, partitioning, kernel debugging, and low-level system optimizations.
  • Experience with Infrastructure as Code (IaC) and configuration management tools (Terraform, Ansible, Chef, etc.), ensuring scalable and repeatable infrastructure provisioning.
  • Understanding of applications written in various programming languages (C#/.NET, Java, Elixir, Ruby, etc.).
  • Experience in AWS Greengrass IoT management and A/B booting.

Benefits

  • bonus
  • equity
  • benefits as applicable

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

  • GitOps
  • event-driven automation
  • infrastructure automation
  • Kubernetes
  • Docker
  • Go
  • Python
  • Terraform
  • Ansible
  • Chef

Soft Skills

  • mentoring
  • collaboration
  • continuous learning
  • innovation
  • decision making