
Site Reliability Engineer – SRE Team
Semrush
full-time
Location Type: Remote
Location: Spain
About the role
- Collaborate with development teams to design and implement scalable, reliable, and efficient system architectures
- Establish and refine SLOs in partnership with stakeholders to guarantee service reliability and performance
- Read and write code in Python/Go
- Induce application failures and work to recover the application from the failed state
- Debug applications using metrics and add traces/metrics as needed
- Participate in on-call duties to provide continuous support
- Lead changes to common engineering practices across the company
- Possible night shifts (on-call)
Requirements
- 3+ years of experience as a Site Reliability Engineer
- Experience with Kubernetes, Helm, and cloud providers
- Experience coding in Python/Go
- Strong understanding of what an application failure is and how to handle it
- Ability to debug applications using metrics
- Familiarity with traces and the ability to implement them in an application
- Willingness to be on call and work flexible hours
- Team player with good communication abilities
- GCP knowledge (not required, but a plus)
Benefits
- Flexible working hours
- Unlimited PTO
- Flexi Benefit for your hobby
- Employee Support Program
- Financial aid in the event of the loss of a family member
- Employee Resource Groups
- Meals, snacks, and drinks at the office
- Corporate events
- Teambuilding
- Training, courses, conferences
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Go, debugging, application failure handling, metrics, traces, SLOs, scalable system architecture, reliable system architecture, efficient system architecture
Soft Skills
team player, communication, leadership, collaboration, flexibility, problem-solving, support, on-call duties, stakeholder engagement, engineering practices