Tech Stack
Ansible, Cloud, DNS, Docker, Google Cloud Platform, Grafana, Jenkins, Kubernetes, Linux, Prometheus, Python, ServiceNow, Terraform
About the role
- Ensure high availability, performance, and security of production systems across Windows, Linux, and GCP environments.
- Engineer and support containerized workloads using Kubernetes (GKE) and Docker.
- Lead infrastructure provisioning and configuration using Terraform, Ansible, and GCP-native tools.
- Develop automation scripts and pipelines to eliminate manual toil and accelerate incident response.
- Implement observability frameworks using SLIs/SLOs, Prometheus, Grafana, and GCP Operations Suite.
- Drive proactive monitoring, alerting, and telemetry across hybrid environments.
- Lead incident response, root cause analysis, and postmortems.
- Build self-healing systems and automated remediation workflows using GCP-native services and scripting.
- Collaborate with InfoSec to enforce hardening standards, manage vulnerabilities, and support compliance initiatives.
- Partner with developers, application owners, and infrastructure teams to deliver reliable, cloud-native platforms.
- Document configurations, runbooks, and operational procedures to enable cross-team reuse and transparency.
Requirements
- 4+ years of Technology Infrastructure Engineering and Solutions experience.
- 4+ years of experience in Windows Server administration and production support.
- Strong scripting skills in PowerShell, Python, or Shell.
- Hands-on experience with GCP services, including GKE, IAM, Cloud Functions, and Cloud Monitoring.
- Proficiency in container technologies: Docker and Kubernetes.
- Familiarity with Linux system administration and hybrid cloud environments.
- Experience with infrastructure-as-code tools: Terraform, Ansible.
- Strong understanding of Active Directory, DNS, DHCP, and Windows security principles.
- Security certifications (e.g., CISSP, Security+, GCP Professional Cloud Security Engineer).
- Experience with CI/CD tools (e.g., GitLab CI, Jenkins).
- Familiarity with ITIL practices and change management.
- Exposure to ServiceNow, load balancers, certificate management, and endpoint protection tools.
- Ability to work on-site in one of the listed locations in a hybrid environment.
- Ability to work outside of normal business hours, including nights and weekends, on a limited/rotational basis.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
GCP, Kubernetes, Docker, Terraform, Ansible, PowerShell, Python, Shell scripting, CI/CD, Linux administration
Soft skills
leadership, collaboration, incident response, root cause analysis, documentation
Certifications
CISSP, Security+, GCP Professional Cloud Security Engineer