Salary
💰 $91,500 - $219,700 per year
Tech Stack
Ansible, Cloud, Go, Grafana, Kubernetes, Linux, OpenStack, Prometheus, Python, Terraform
About the role
- This role spans the Infrastructure Reliability and Platform Engineering domains, with equal responsibility for OpenStack and Rancher-managed Kubernetes environments.
- Lead the lifecycle of both platforms, ensuring scalable, secure, and observable foundations that support mission-critical workloads.
- Define how Five9 builds and runs both OpenStack and Kubernetes platforms at scale, ensuring parity in automation, performance, security, and observability.
- OpenStack: Design, automate, and operate multi-tenant clusters across Nova, Neutron, Octavia, Manila, Cinder, Ironic, and Glance.
- Kubernetes (Rancher RKE2/K3s): Design and maintain bare-metal clusters with GitOps-driven automation (Fleet, Argo CD, Terraform, Helm), optimized for performance and secured with RBAC, network policies, and compliance guardrails.
- Cross-Platform Integration: Enable interoperability (e.g., Magnum, storage backends, network overlays) and shared observability stacks (Prometheus, Grafana, Loki/EFK).
- Apply SRE principles uniformly across both stacks for capacity planning, high availability, resilience, incident response, and continuous improvement.
Requirements
- 7+ years in infrastructure, platform, or SRE engineering roles.
- Hands-on mastery of both OpenStack and Rancher Kubernetes in production, with a focus on automation, lifecycle management, and high availability.
- Proficiency with Terraform, Ansible, Helm, Argo CD, and scripting in Python/Go/Bash.
- Deep understanding of Linux internals and system performance.
- Proven application of SRE practices in high-stakes, always-on environments.
- Strong experience with observability platforms (Prometheus, Grafana, Loki/ELK/EFK).
- Preferred: Advanced networking and scheduling (VXLAN, BGP, NIC offload, NUMA-aware scheduling, SR-IOV, PCI passthrough).
- Preferred: OpenStack Magnum integration, and Tempest/Rally for upgrades and validation.
- Preferred: Rancher Kubernetes add-ons (Fleet, Longhorn, CIS Scanning) and security hardening (CKA/CKS certification).
- Preferred: Familiarity with compliance frameworks (PCI, FedRAMP, SOC 2, FIPS, CIS, NIST standards).
- Preferred: Upstream contributions to OpenStack or Kubernetes projects.