
Senior Site Reliability Engineer – Production Engineering
Yelp
full-time
Location Type: Remote
Location: Ireland
About the role
- Work with engineers across Yelp to support new features and services.
- Integrate tools to monitor platform stability and performance.
- Help scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs.
- Ensure the reliability of Yelp’s primary datastores (MySQL and Cassandra).
- Troubleshoot site issues using industry-leading tools like Splunk, Grafana, and Prometheus.
- Automate everything with Python, Puppet, Git, Jenkins, Terraform and more!
- Develop custom tools when off-the-shelf solutions don’t work at our scale, and contribute upstream to open source projects.
- Design and implement new systems, tests, and procedures.
- Foster and build a fun, diverse, and inclusive culture that reflects Yelp’s values.
- Participate in light on-call rotations: our geographically distributed SRE teams provide follow-the-sun support, which reduces the need to be on call 24 hours a day!
Requirements
- Mastery of Linux (we use Ubuntu, but any distro is fine), with a view to debugging ambiguous OS behaviours.
- Command of your favorite modern programming language (Python, TypeScript, Ruby, Go, Rust, Java, C++, etc.), with an appreciation for delivering safe and secure services.
- A solid understanding of the fundamental technologies used to deliver services on the Internet (TCP/IP, HTTP, DNS, etc.).
- Experience with public cloud platforms (we use AWS and GCP, but others are also fine) and related tooling (Terraform, Puppet, Chef, Ansible, etc.).
- Experience with Linux containerisation and orchestration (e.g., Docker, Podman and Kubernetes).
- Self-motivation to investigate, fix, and improve Yelp in an ever-changing environment.
- Experience leading, collaborating on, and sharing technical activities with teams.
- Ownership of the full lifecycle of a system.
Benefits
- Competitive salary, a pension scheme, and an optional employee stock purchase plan.
- 25 days paid holiday (rising to 29 with service), plus one floating holiday.
- €150 monthly reimbursement to help cover remote working expenses.
- €95 caregiver reimbursement to support dependent care for families.
- Private health insurance, including dental and vision.
- Flexible working hours and meeting-free Wednesdays.
- Regular 3-day hackathons, bi-weekly learning groups, and productivity spending to support and encourage your career growth.
- Opportunities to participate in digital events and conferences.
- €95 per month to use toward qualifying wellness expenses.
- Quarterly team offsites.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Linux, Python, TypeScript, Ruby, Go, Rust, Java, C++, TCP/IP, HTTP
Soft Skills
self-motivated, collaboration, leadership, communication, problem-solving, investigation, adaptability, teamwork, inclusivity, culture building