Pave Bank

Senior Site Reliability Engineer

Senior Site Reliability Engineer responsible for the high availability and performance of production systems at Pave Bank, collaborating with other teams on infrastructure reliability in a fintech environment.

Posted 6/22/2026 • Full-time • Remote • 🇲🇾 Malaysia • Senior

Tech Stack

Tools & technologies
Cloud, Distributed Systems, Docker, Go, Google Cloud Platform, Grafana, Kubernetes, Microservices, Prometheus, Python, Terraform

About the role

Key responsibilities & impact
  • Monitor, maintain, and improve the reliability, availability, and performance of production systems and services.
  • Build and maintain infrastructure as code (IaC), deployment pipelines, and automation to support continuous delivery, scalability, and disaster recovery.
  • Respond to incidents, perform root-cause analysis, and drive postmortems to ensure lessons learned are applied.
  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration.
  • Document operational runbooks, on-call procedures, and system architecture to support maintenance, knowledge sharing, and compliance.

Requirements

What you’ll need
  • Strong programming or scripting skills (Go, Python, Bash, or similar) for automation, tooling, and operational tasks.
  • Hands-on experience with cloud infrastructure, ideally Google Cloud Platform (GCP).
  • Familiarity with containerization and orchestration (Docker, Kubernetes, or equivalent).
  • Experience with infrastructure-as-code tools (Terraform, Cloud Deployment Manager, or similar).
  • Experience with either FluxCD or ArgoCD for GitOps-based delivery.
  • Solid understanding of distributed systems, microservices architecture, and reliability patterns.
  • Experience setting up monitoring, logging, alerting, and observability (e.g., Prometheus, Grafana, ELK, distributed tracing).
  • Strong troubleshooting skills and ability to respond to incidents under pressure.
  • Knowledge of backup and disaster recovery strategies, database management, and secure operations.

Benefits

Comp & perks
  • Competitive salary and meaningful equity with room for growth.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Go, Python, Bash, cloud infrastructure, Google Cloud Platform, Docker, Kubernetes, Terraform, FluxCD, ArgoCD
Soft Skills
troubleshooting, incident response, collaboration, root-cause analysis, postmortem analysis, knowledge sharing, capacity planning, operational best practices