Contribute to driving the overall QA strategy for Kubernetes and cloud-native environments, aligning quality goals with business objectives and release deadlines.
Evaluate and champion best-in-class testing practices and frameworks for distributed systems.
Lead risk analysis and test planning for large-scale, multi-cloud infrastructure and container orchestration.
Architect test frameworks and infrastructure for validating microservices and infrastructure components in multi-cluster and hybrid-cloud environments.
Oversee the design of complex test scenarios simulating production-like workloads, resource scaling, failure injection, and recovery across distributed clusters.
Spearhead the development of scalable and maintainable test automation integrated with CI/CD (Jenkins, GitHub Actions, etc.).
Leverage Kubernetes APIs, Helm, and service mesh tools to build comprehensive automation coverage, including system health, failover behavior, and network resilience.
Promote test infrastructure-as-code (IaC) and drive its adoption on the team, ensuring the infrastructure code is repeatable, extensible, and reliable.
Mentor QA engineers and developers in advanced testing techniques such as risk-based testing, chaos engineering, and performance and load testing.
Serve as the QA authority in design reviews, production readiness assessments, and incident retrospectives (escaped defects).
Collaborate with DevOps engineers to refine monitoring, alerting, and debugging strategies for test automation running in CI/CD.
Establish quality gates, release readiness metrics, and data-driven feedback loops to ensure released software is production quality.
Drive initiatives to integrate performance, security, and chaos testing into the development lifecycle.
Advocate for a culture of quality across teams and influence architecture decisions with testing in mind.
Requirements
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
10+ years in software testing or quality engineering roles, with 4+ years focused on Kubernetes, cloud-native applications, and distributed systems.
Proven leadership experience in building and scaling test strategies in enterprise-grade environments.
Deep understanding of Kubernetes internals, cluster lifecycle management, Helm, service meshes (e.g., Istio or Linkerd), and network policies.
Strong scripting and automation capabilities (Python, Pytest, Bash, etc.).
Familiarity with observability stacks (Prometheus, Grafana, Jaeger), Kubernetes security (RBAC, secrets management), and performance benchmarking tools (e.g., k6).
Solid grounding in cloud architecture (AWS, Azure, GCP), infrastructure provisioning, and containerized CI/CD.
Moderate to advanced Linux proficiency is required: Bash scripting and debugging, systemd/logs, networking/firewalling/routing, certificate/PKI management, containers (Docker/containerd), and Kubernetes tooling (kubectl/Helm with OCI registries, GitOps/Flux).
Benefits
Work with an established Silicon Valley leader in the cloud infrastructure industry;
Work with exceptionally passionate, talented and engaging colleagues, helping Fortune 500 and Global 2000 customers implement next-generation cloud technologies;
Be a part of cutting-edge, open-source innovation;
Thrive in the high-energy environment of a young company where openness, collaboration, risk-taking, and continuous growth are valued;
Professional development and training;
Attend conferences and working groups;
Company outings, happy hours, hackathons, and tech talks;
Receive a competitive compensation package with a strong benefits plan.