Salla E-Commerce Platform

Senior Site Reliability Engineer (SRE)

full-time

Location Type: Hybrid

Location: Makkah, Saudi Arabia

About the role

  • Lead reliability initiatives, handle complex incidents, improve platform performance, and guide engineering teams toward building resilient systems.
  • Participate in the **on-call rotation** as part of our commitment to platform reliability.
  • Troubleshoot complex issues across applications, infrastructure, and networks.
  • Identify and resolve performance bottlenecks and scaling challenges.
  • Enhance cloud-native infrastructure, deployment processes, and automation.
  • Build and refine dashboards, alerts, metrics, logs, and traces.
  • Develop tools that reduce operational toil and increase reliability.
  • Mentor engineers on reliability, debugging, and operational best practices.

Requirements

  • Strong experience with **Kubernetes**, **service mesh technologies**, and cloud platforms (AWS/GCP/Azure).
  • Deep understanding of **Linux**, networking, distributed systems, and load balancers.
  • Hands-on experience with **Terraform** or similar IaC tools.
  • Experience with **Prometheus**, **Grafana**, **Loki**, **Mimir**, **Elastic**, or similar observability tools.
  • Proficiency in scripting/programming (Bash, Python, Go).
  • Experience with CI/CD and GitOps.
  • Strong debugging, incident response, and performance analysis skills.