Semrush

Site Reliability Engineer – SRE Team

full-time

Location Type: Remote

Location: Spain

About the role

  • Collaborate with development teams to design and implement scalable, reliable, and efficient system architectures
  • Establish and refine SLOs in partnership with stakeholders to guarantee service reliability and performance
  • Read and write code in Python/Go
  • Deliberately induce application failures and recover the application from the failed state
  • Debug applications using metrics and add traces/metrics as needed
  • Participate in on-call duties to provide constant support
  • Lead changes to common engineering practices across the company
  • Possible night shifts (on-call)

Requirements

  • 3+ years of experience as a Site Reliability Engineer
  • Experience with Kubernetes, Helm, and cloud providers
  • Experience coding in Python/Go
  • Strong understanding of what an application failure is and how to handle it
  • Ability to debug applications using metrics
  • Familiarity with traces and the ability to implement them in an application
  • Willingness to be on call and work flexible hours
  • Team player with good communication abilities
  • GCP knowledge (not required, but a plus)

Benefits

  • Flexible working hours
  • Unlimited PTO
  • Flexi Benefit for your hobby
  • Employee Support Program
  • Financial aid in the event of the loss of a family member
  • Employee Resource Groups
  • Meals, snacks, and drinks at the office
  • Corporate events
  • Teambuilding
  • Training, courses, conferences

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, Go, debugging, application failure handling, metrics, traces, SLOs, scalable system architecture, reliable system architecture, efficient system architecture
Soft Skills
team player, communication, leadership, collaboration, flexibility, problem-solving, support, on-call duties, stakeholder engagement, engineering practices