BTSE

DevOps – Infrastructure Engineer

Employment Type: Full-time

Location Type: Remote

Location: Hong Kong

About the role

  • Set up a multi-tenant Kubernetes cluster: shared services namespace, per-tenant namespaces for isolated workloads, GPU node pools for model inference.
  • Build a CI/CD pipeline: source control → container build → automated deployment with zero-downtime rolling updates.
  • Configure GPU management: scheduling, resource quotas per tenant, device plugins.
  • Set up comprehensive monitoring: per-tenant metrics, SLA tracking, data pipeline health, GPU utilisation, API latency percentiles, WebSocket connection stability.
  • Implement backup and disaster recovery: cross-region replication, automated database backups.
  • Build tenant provisioning automation: scripted creation of new tenant namespaces, storage, network policies, and service accounts.
  • Harden security: network policies between namespaces, vulnerability scanning, audit logging.
  • Provide 24/7 on-call coverage during the initial pilot (rotating with the Tech Lead).

Requirements

  • 4+ years of DevOps/SRE experience, including Kubernetes cluster operations and multi-tenant patterns.
  • GPU workloads on Kubernetes (GPU Operator, device plugins, resource scheduling).
  • CI/CD pipelines: GitHub Actions, ArgoCD or FluxCD.
  • Terraform IaC.
  • On-call experience and incident management.
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Kubernetes, CI/CD, GPU management, Terraform, incident management, backup and disaster recovery, network policies, vulnerability scanning, API latency, data pipeline
Soft Skills
on-call experience, collaboration, problem-solving, communication, organizational skills