Lirio

Senior System Reliability Engineer

full-time

Location Type: Hybrid

Location: Tennessee, United States

Salary

💰 $130,000 - $150,000 per year

About the role

  • Reliability Engineering & Automation
  • Develop and manage infrastructure as code (e.g., Terraform, AWS CloudFormation).
  • Review infrastructure changes, automation scripts, and reliability-impacting code changes to ensure production readiness.
  • Monitor system health using modern observability tools (e.g., Prometheus, Grafana, Datadog).
  • Lead incident response, root cause analysis, and postmortems for production issues.

Requirements

  • 5-7 years of related experience
  • Bachelor's degree in a related field
  • Linux systems and networking fundamentals (DNS, TCP/IP, TLS)
  • Distributed systems debugging and failure analysis
  • Load, stress, and fault-injection testing
  • CI/CD tools and processes
  • Version control (e.g., Git)
  • Cloud platforms (e.g., AWS, Azure)
  • Containers and orchestration (Kubernetes)
  • Kafka (messaging/streaming)
  • Scripting and programming languages (e.g., Java, TypeScript, Groovy, Python)
  • Agile methodologies (e.g., Scrum, XP, SAFe)
  • Databases/SQL
  • Observability/monitoring tools (Datadog)

Benefits

  • Medical (HSA available)
  • Dental
  • Vision
  • Short-term & long-term disability (company-paid)
  • Life & AD&D (company-paid)
  • 401(k) with company match
  • 10 paid holidays, quarterly company closure dates, and a holiday-week company closure
  • Flexible time off policy
  • Work from home
  • 6 weeks paid parental leave