Senior Site Reliability Engineer

MariaDB

full-time

Posted on: 1/6/2026

Location Type: Remote

Location: Malaysia


About the role

Design, implement, and evolve large-scale, cloud-native infrastructure supporting our global SaaS platform.
Lead reliability and scalability initiatives that span multiple teams and services, driving automation and resilience through infrastructure-as-code and GitOps practices.
Proactively identify and remediate systemic reliability issues, ensuring high service availability and performance across multi-cloud environments.
Collaborate with software and platform teams to integrate reliability principles, SLOs, and observability standards into every stage of the development lifecycle.
Act as a key technical leader during major incidents: coordinate response efforts, conduct root cause analysis, and implement long-term corrective actions.
Contribute to continuous improvement by defining infrastructure patterns, refining CI/CD workflows, and mentoring other engineers in automation and reliability best practices.

Requirements

At least 7 years of hands-on experience as an SRE, DevOps, or Infrastructure Engineer in production cloud environments.
Strong expertise with Kubernetes operations and ecosystem tooling in production-scale clusters.
Proven experience designing and maintaining multi-cloud infrastructure across Azure, AWS, or GCP.
Advanced proficiency with Terraform and Terragrunt, including the ability to design modular, reusable, and secure IaC components.
Solid understanding of GitOps principles and deployment automation using ArgoCD or similar tools.
Deep experience with Linux systems administration, performance tuning, and troubleshooting.
Proficiency in one or more programming/scripting languages (Python, Bash, or Go preferred).
Strong understanding of observability concepts and experience working with monitoring and alerting tools such as Prometheus, Grafana, and Thanos.
Experience participating in or leading on-call rotations, handling incident response, and conducting post-incident reviews.

Benefits
