Rover.com

Production Support Engineer

full-time

Location Type: Hybrid

Location: Barcelona, Spain

About the role

  • Act as the primary technical point of contact when CX escalates customer-impacting issues, translating business impact into clear technical problem statements.
  • Triage incoming incidents, assess severity and urgency, and communicate status updates across stakeholders in a clear and timely manner.
  • Manage the incident lifecycle from initial report through resolution, communicating status updates clearly to relevant stakeholders (CX, Product, Engineering).
  • Develop and maintain comprehensive runbooks and knowledge base articles for common issues and standard operational procedures.
  • Troubleshoot and debug complex production issues utilizing logging platforms (e.g., Splunk, ELK stack), monitoring tools (e.g., Datadog, Prometheus, Grafana), and database query tools (SQL, NoSQL) to diagnose the root cause of problems.
  • Perform code-level analysis when necessary to pinpoint defects or architectural weaknesses contributing to production instability.
  • Collaborate effectively with Product Development teams to prioritize, document, and hand off confirmed bugs and large-scale systemic issues for permanent resolution.
  • Act as the escalation point for the CX team when issues require deeper technical investigation or coordination with engineering teams.

Requirements

  • 2+ years of experience in a Production Support, Application Support, Technical Operations, Site Reliability Engineering (SRE), Support Helpdesk or Engineering role focused on production system operations.
  • Hands-on experience using monitoring and observability platforms (e.g., Splunk, Datadog, ELK) to investigate live incidents.
  • Solid experience with database systems, including the ability to write and execute complex SQL queries for data analysis and issue resolution.
  • Experience coordinating between CX or other non-technical teams and engineering, with the ability to communicate comfortably with both technical and non-technical audiences.
  • Proficiency in at least one scripting language (e.g., Python, Bash) for automation and ad-hoc analysis.
  • Bonus: Experience with incident management tools (e.g., PagerDuty, OpsGenie) and platforms such as Jira Service Management or Zendesk.

Benefits

  • Long-term incentive plan with a company performance-based cash payout
  • Pension plan
  • Private medical insurance
  • 25 days PTO
  • Meal allowance and flexible compensation plan (transport and nursery)
  • Gym membership
  • €450 to cover the costs associated with the adoption of a pet
  • Annual €150 wellness reimbursement
  • Flexible work hours: sometimes you'll need to be in at certain times, but on the whole we're pretty flexible when it comes to managing workload and time
  • Snacks and fresh fruit in our kitchen to keep you going
  • Regular team activities, events, game nights, and more
  • Dog-friendly office

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

SQL, NoSQL, Python, Bash, incident management, troubleshooting, debugging, code-level analysis, production support, application support

Soft Skills

communication, collaboration, problem-solving, prioritization, stakeholder management, technical documentation, incident triage, status updates, customer impact assessment, coordination