SearchStax

Staff Site Reliability Engineer, AWS

full-time

Location: 🇺🇸 United States

Salary

💰 $170,000 - $240,000 per year

Job Level

Lead

Tech Stack

Apache, AWS, Cloud, Distributed Systems, Docker, EC2, Elasticsearch, Go, Grafana, Jenkins, Kubernetes, Open Source, Prometheus, Python, Terraform

About the role

  • Lead and own scaling AWS infrastructure to support thousands of servers and high-growth workloads.
  • Design and implement automation frameworks for provisioning, monitoring/logging, scaling, and recovery to minimize manual operations.
  • Continuously evaluate and tune systems for latency, throughput, and cost efficiency.
  • Build resilient, self-healing, and observable systems using SLOs, error budgets, and reliability best practices.
  • Partner closely with Development, QA, and Product Engineering teams to deliver highly available and performant systems.
  • Own on-call processes, lead incident management and root-cause analysis, and implement preventive measures.
  • Mentor engineers, act as a technical leader, and set standards for best practices.

Requirements

  • 7+ years in Site Reliability, DevOps, or Infrastructure Engineering roles.
  • Startup experience and a track record of scaling infrastructure to thousands of servers.
  • Hands-on mastery of AWS services (EC2, EKS, RDS, S3, CloudFront, VPC, IAM).
  • Proficiency in Infrastructure as Code (Terraform, CloudFormation, or similar tools).
  • Strong automation and scripting skills in Python, Go, or similar languages (beyond basic scripting).
  • Expertise with monitoring & observability tools (Prometheus, Grafana, Loki, ELK/EFK, Datadog).
  • Experience with CI/CD and containers (Docker, Kubernetes, Jenkins or GitHub Actions).
  • Performance engineering experience: identify bottlenecks and optimize systems for scalability and efficiency.
  • Proven problem-solving skills in diagnosing complex production issues at scale.
  • Experience designing, deploying, and managing multi-region, highly available AWS architectures in production.
  • Experience owning end-to-end observability and leading production incident response/root-cause analysis.
  • Legal authorization to work in the United States (E-Verify and the application questions indicate this requirement).
Coates Group

Senior DevOps Engineer

Coates Group
Senior · full-time · $125k–$140k / year · Illinois · 🇺🇸 United States
Posted: 3 hours ago · Source: jobs.lever.co
AWS, Cloud, Docker, IoT, Linux, Microservices, Python
Eduphoria! Inc.

AWS DevOps Engineer

Eduphoria! Inc.
Mid · Senior · full-time · $110k–$125k / year · Florida, Illinois, Kansas, Maryland, North Carolina, Ohio, Tennessee, Texas, Virginia · 🇺🇸 United States
Posted: 17 hours ago · Source: eduphoria.applytojob.com
AWS, Azure, Cloud, EC2, Linux, MySQL, .NET, SQL, Terraform
GEICO

DevOps Engineer II – FinTech Commissions, Substantiation

GEICO
Mid · Senior · full-time · $75k–$160k / year · District of Columbia, Maryland, Texas, Virginia · 🇺🇸 United States
Posted: 18 hours ago · Source: geico.wd1.myworkdayjobs.com
AWS, Azure, Cloud, Distributed Systems, Java, .NET, NoSQL, Python, SQL
ParentSquare

Site Reliability Engineer

ParentSquare
Mid · Senior · full-time · $170k–$200k / year · 🇺🇸 United States
Posted: 18 hours ago · Source: ats.rippling.com
Ansible, AWS, Azure, Chef, Cloud, Distributed Systems, Docker, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, +4 more
Leidos

DevOps Technical Lead

Leidos
Senior · full-time · $105k–$189k / year · 🇺🇸 United States
Posted: 19 hours ago · Source: leidos.wd5.myworkdayjobs.com
AWS, Cloud, Grafana, Jenkins, JMeter, Kafka, Linux, Maven, Selenium, Splunk, Zookeeper