Expedia Group

Software Development Engineer III – SRE

full-time

Location Type: Office

Location: Gurgaon, India

About the role

  • Engage with domain owners on Business Continuity plans and run simulated Disaster Recovery scenarios to improve application resilience.
  • Implement and support monitoring and alerting strategies to ensure that health, availability, capacity, and performance standards are met.
  • Monitor systems to proactively identify errors and opportunities to improve the customer experience.
  • Share domain and industry knowledge between cross-functional teams.
  • Facilitate collaboration among stakeholders with varied perspectives to develop effective Disaster Recovery processes.
  • Build reporting capabilities to showcase operational health and quality.
  • Provide technical support by identifying, troubleshooting, and resolving issues and their impacts.
  • Drive a culture of root cause analysis and continuous improvement.
  • Operationally support applications and services across multiple environments.

Requirements

  • Bachelor’s or Master’s degree in a technical field with 6+ years of experience, or equivalent related professional experience
  • Excellent problem-solving and analytical skills with strong attention to detail
  • Experience in system design and architecture
  • Strong written and verbal communication skills
  • Expert in AWS and EKS, with in-depth knowledge of infrastructure setup and multi-region environments
  • In-depth knowledge of Reliability Concepts such as SLOs, SLIs, Error Budgets, and Disaster Recovery processes
  • Exposure to Python scripting (preferred)
  • Knowledge of AI/ML concepts (preferred)
  • Knowledge of monitoring tools such as Splunk, Datadog, and Catchpoint (preferred)

Benefits

  • exciting travel perks
  • generous time-off
  • parental leave
  • flexible work model
  • career development resources

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
System Design, Architecture, AWS, EKS, Python Scripting, Reliability Concepts, SLOs, SLIs, Error Budgets, Disaster Recovery
Soft skills
problem-solving, analytical skills, attention to detail, written communication, verbal communication, collaboration, stakeholder engagement, root cause analysis, continuous improvement