Underdog Fantasy

Senior Site Reliability Engineer – Infrastructure

full-time

Location Type: Remote

Location: United States

Salary

💰 $160,000 - $240,000 per year

About the role

  • Own and maintain the incident response process, including defining procedures, tools, and best practices
  • Guide teams in establishing and monitoring Service Level Objectives (SLOs), including setting up alerts and reporting systems
  • Lead capacity planning initiatives, focusing on both short and long-term scalability while optimizing costs
  • Develop and implement disaster recovery plans, including regular testing and regulatory compliance
  • Collaborate with teams on architecture decisions to ensure high availability and scalability
  • Manage launch and event planning for high-traffic occasions, focusing on infrastructure preparation and capacity management (a.k.a. Launch Readiness)
  • Act as an internal expert and consultant for monitoring tools like Datadog and PagerDuty and infrastructure like AWS and Kubernetes
  • Emphasize automation and tooling to scale our workload
  • Contribute across codebases in Ruby, Python, Go, TypeScript, Swift, and Kotlin as needed to support the initiatives described above

Requirements

  • You are a strong written and verbal communicator
  • You are collaborative by nature
  • You enjoy using research, data, and experiments to make decisions; you believe “Hope is not a strategy.”
  • You enjoy working directly with customers (generally engineers or other people inside the company)
  • You think long-term about what is best for the business and its customers
  • You are excited to take ownership
  • You are very comfortable in an IDE and working with multiple languages, multiple web application frameworks, AWS services, Kubernetes, and PostgreSQL
  • You can work independently to learn new languages/technologies as needed
  • You enjoy deploying changes to production quickly, multiple times a week if necessary

Benefits

  • Unlimited PTO (we're extremely flexible, with the exception of the few weeks leading up to and into the start of the NFL season)
  • 16 weeks of fully paid parental leave
  • Home office stipend
  • A connected, virtual-first culture with a highly engaged distributed workforce
  • 5% 401(k) match, FSA, and company-paid health, dental, and vision plan options for employees and dependents
