Extreme Networks

Staff Cloud Operations Engineer

full-time

Location Type: Remote

Location: Ireland

About the role

  • Architect & Scale Infrastructure: Design and implement multi-cluster, multi-region Kubernetes deployments using EKS, GKE, and AKS. Build infrastructure that scales across regions and cloud providers.
  • Own Production Systems: Take end-to-end ownership of production infrastructure. Drive incident response, postmortems, and improvements to prevent recurrence.
  • Infrastructure as Code at Scale: Build and maintain Terraform modules for complex infrastructure patterns. Manage thousands of configuration files across clusters, regions, and environments using GitOps principles.
  • GitOps & Deployment Excellence: Design and optimize ArgoCD ApplicationSets and Helm chart architectures. Build deployment pipelines that enable safe, automated releases across hundreds of microservices.
  • Performance & Reliability Engineering: Analyze system performance, identify bottlenecks, and implement optimizations. Improve SLOs through capacity planning, autoscaling, and architectural improvements.
  • Observability & Monitoring: Build and enhance monitoring, alerting, and observability using Prometheus, Grafana, Loki, and custom tooling. Drive visibility into complex distributed systems.
  • Security & Compliance: Implement security controls, compliance frameworks, and best practices across cloud infrastructure. Design secure multi-tenant architectures.
  • Technical Leadership: Mentor engineers, establish best practices, and drive technical decisions. Collaborate with platform, SRE, and product teams to deliver reliable infrastructure.

Requirements

  • 5+ years in cloud infrastructure engineering, with deep expertise in at least one major cloud provider (AWS preferred)
  • Strong Kubernetes experience: cluster design, operators, controllers, and multi-cluster management
  • Proficiency with Infrastructure as Code: Terraform, CloudFormation, or similar
  • GitOps expertise: ArgoCD, Flux, or similar; experience with ApplicationSets and complex deployment patterns
  • Deep Linux and networking knowledge
  • Experience with distributed systems: Elasticsearch, PostgreSQL, Redis, Kafka, RabbitMQ
  • Monitoring and observability: Prometheus, Grafana, ELK stack, or similar
  • Strong problem-solving skills and experience debugging complex distributed systems
  • Experience with cloud security, compliance frameworks (SOC 2, ISO 27001), and secure-by-design practices
  • Excellent communication skills for working across time zones and with distributed teams
  • Self-directed with a track record of owning problems end-to-end

Benefits

  • Equal employment opportunities for all employees and applicants
  • Discrimination and harassment of any type are prohibited