Tech Stack
AWS, Azure, Cloud, Distributed Systems, DNS, Flux, Google Cloud Platform, Grafana, JavaScript, Kubernetes, Prometheus, Terraform, TypeScript
About the role
- Design and implement scalable, secure, and cost-efficient cloud infrastructure using Infrastructure as Code (IaC) principles.
- Manage and optimize multi-account cloud environments to ensure resilience, performance, and compliance.
- Continuously improve CI/CD pipelines to support safe, rapid, and reliable releases.
- Implement and maintain monitoring, logging, and alerting systems to proactively detect and resolve issues.
- Collaborate with cross-functional teams to streamline deployments, troubleshoot issues, and optimize performance.
- Identify technical risks and gaps, and propose solutions aligned with the team’s long-term vision.
- Participate in on-call rotations, incident response, and post-mortem reviews to drive learning and improvement.
- Resolve escalated support issues while contributing to root-cause analysis and systemic improvements.
- Document processes, designs, and operational runbooks to share knowledge and improve team efficiency.
Requirements
- Proficiency in JavaScript/TypeScript, with experience building backend services or tooling.
- 5+ years of experience managing production cloud infrastructure (AWS required; GCP/Azure a plus).
- 3+ years of experience operating Kubernetes clusters at scale.
- Knowledge of observability platforms (Grafana, Prometheus, Honeycomb, Datadog).
- Solid understanding of networking, DNS, and multi-region infrastructure design.
- Experience with CI/CD pipelines (GitHub Actions required).
- Hands-on experience with IaC tools (Pulumi preferred; Terraform or AWS CDK acceptable).
- Experience participating in on-call rotations and incident management.
- Strong problem-solving and debugging skills in complex, distributed systems.
- Excellent communication and collaboration skills, with experience working in a distributed, remote team environment.
- Optional: Familiarity with GitOps tooling (e.g., Argo CD, Flux).
- Optional: Background in security and compliance standards (SOC 2, ISO 27001, etc.).