
Senior System Reliability Engineer
Lirio
full-time
Location Type: Hybrid
Location: Tennessee • United States
Salary
$130,000 - $150,000 per year
About the role
- Reliability Engineering & Automation
- Develop and manage infrastructure as code (e.g., Terraform, AWS CloudFormation).
- Review infrastructure changes, automation scripts, and reliability-impacting code changes to ensure production readiness.
- Monitor system health using modern observability tools (e.g., Prometheus, Grafana, Datadog).
- Lead incident response, root cause analysis, and postmortems for production issues.
Requirements
- 5-7 years of related experience
- Bachelor's degree in a related field
- Linux systems and networking fundamentals (DNS, TCP/IP, TLS)
- Distributed systems debugging and failure analysis
- Load, stress, and fault-injection testing
- CI/CD tools and processes
- Version control (e.g., Git)
- Cloud platforms (e.g., AWS, Azure)
- Containers and orchestration (Kubernetes)
- Kafka (messaging/streaming)
- Scripting and programming languages (e.g., Java, TypeScript, Groovy, Python)
- Agile methodologies (e.g., Scrum, XP, SAFe)
- Databases/SQL
- Observability/monitoring tools (Datadog)
Benefits
- Medical (HSA available)
- Dental
- Vision
- Short-term & long-term disability (company-paid)
- Life & AD&D (company-paid)
- 401(k) with company match
- 10 paid holidays, quarterly company closure dates, and a holiday-week company closure
- Flexible time off policy
- Work from home
- 6 weeks paid parental leave
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
infrastructure as code, Terraform, AWS CloudFormation, Linux systems, networking fundamentals, distributed systems debugging, load testing, CI/CD, containers, scripting languages
Soft Skills
incident response, root cause analysis, postmortems, leadership
Certifications
Bachelor's Degree