Site Reliability Engineer – SRE

SiGMA World

full-time

Posted on: 1/21/2026

Location Type: Hybrid

Location: Belgrade • Serbia

Job Level

Mid-Level, Senior

Tech Stack

Cloud, Go, Grafana, Linux, Prometheus, Python

About the role

Ensures the availability, performance, and resilience of production systems supporting live events and iGaming platforms.
Builds and maintains monitoring, alerting, and observability systems to detect issues before they impact users.
Conducts capacity planning, load testing, and performance tuning to support traffic spikes during major events.
Leads incident response, root‑cause analysis, and post‑mortem processes.
Develops automation for deployments, scaling, configuration management, and routine operational tasks.
Implements Infrastructure‑as‑Code (IaC) to standardise and automate environment provisioning.
Improves CI/CD pipelines to ensure fast, reliable, and repeatable releases.
Reduces manual toil through scripting, tooling, and process optimisation.
Introduces AI‑powered tools to enhance reliability and operational efficiency.
Supports the deployment and scaling of AI‑enabled products and services across event and iGaming platforms.
Collaborates with AI and data teams to ensure infrastructure supports model training, inference, and real‑time AI workloads.
Manages cloud infrastructure (compute, storage, networking) with a focus on scalability and cost efficiency.
Implements best practices for security, resilience, and compliance across cloud environments.
Ensures systems are event‑ready, with robust failover, redundancy, and real‑time monitoring.
Supports event operations teams with technical readiness, live‑event monitoring, and rapid issue resolution.
Builds systems capable of handling unpredictable traffic patterns common in iGaming and live events.
Implements secure‑by‑design principles across infrastructure and operations.
Ensures compliance with data‑privacy regulations and responsible‑gaming requirements where applicable.
Identifies and mitigates operational risks, vulnerabilities, and single points of failure.
Works closely with engineering, product, data, and platform teams to ensure reliability is embedded throughout the development lifecycle.
Provides guidance on best practices for performance, scalability, and operational readiness.
Communicates system health, risks, and improvements to stakeholders.

Requirements

Strong proficiency in cloud platforms, Linux systems, and distributed architectures
Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog, New Relic)
Strong scripting and automation skills (Python, Bash, Go, or similar)
Familiarity with AI‑assisted operations and emerging intelligent‑monitoring tools
Experience with CI/CD, containerisation, and orchestration
Strong problem‑solving and analytical skills
Ability to thrive in fast‑paced, event‑driven environments
Excellent communication and collaboration skills
Educated to degree level in a numerate or technical discipline; Master's preferred
5–7+ years of technical experience in SRE, DevOps, platform engineering, or systems engineering
1–2+ years of management or mentorship experience, such as leading incident response, guiding junior engineers, or owning reliability initiatives
Experience supporting high‑availability, high‑traffic systems in production
Background working with event‑driven architectures or iGaming platforms
Proven track record of implementing automation and reliability improvements

Benefits

Free iGaming Academy access - Learn the ins and outs of the industry with access to courses.
Travel perks - Visit our international offices and attend industry events worldwide.
Performance rewards - High performers are recognized and fast-tracked with annual reviews and bi-yearly performance check-ins.
Interest-free car loan after probation (T&Cs apply)

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

cloud platforms, Linux systems, distributed architectures, monitoring tools, scripting, automation, CI/CD, containerisation, orchestration, Infrastructure-as-Code

Soft Skills

problem-solving, analytical skills, communication, collaboration, leadership, mentorship, adaptability, technical readiness, incident response, stakeholder communication

Certifications

degree in numerate or technical discipline, Master's preferred