Palo Alto Networks

Senior Staff Production Engineer – Cloud Platform, Reliability, Machine Identity Security

Senior Staff Production Engineer at Palo Alto Networks, responsible for cloud platform reliability and operational excellence. The role leads design and improvements across production environments and mentors engineers.

Posted 6/11/2026 | Full-time | Santa Clara, California, 🇺🇸 United States | Senior | 💰 $126,000 - $203,500 per year

Tech Stack

Tools & technologies
Ansible, AWS, Azure, Cloud, Distributed Systems, DNS, Go, Google Cloud Platform, Jenkins, Kubernetes, Linux, Python, TCP/IP, Terraform

About the role

Key responsibilities & impact
  • Design, build, and evolve highly available cloud infrastructure platforms with a focus on scalability, resilience, and reliability
  • Lead improvements across production systems, including performance, availability, and incident response
  • Drive and standardize Infrastructure as Code (IaC) practices to improve consistency and reduce operational overhead
  • Design and optimize CI/CD pipelines to support fast, secure, and reliable software delivery at scale
  • Partner with development teams to improve system reliability, observability, and cloud-native design patterns
  • Define and implement monitoring, alerting, and observability strategies across distributed systems
  • Lead incident response efforts, including root cause analysis and long-term remediation strategies
  • Identify and eliminate operational toil through automation and system improvements
  • Mentor engineers and contribute to raising the bar for production engineering practices

Requirements

What you’ll need
  • 5+ years of experience in DevOps, Platform Engineering, or Site Reliability Engineering (SRE)
  • Strong experience designing and operating cloud infrastructure on AWS, Azure, or GCP
  • Deep expertise managing and scaling Kubernetes environments (EKS, AKS, or GKE)
  • Strong experience with Infrastructure as Code tools (Terraform, Ansible, or Pulumi)
  • Proven experience designing and maintaining complex CI/CD systems (Jenkins, GitLab CI, ArgoCD, GitHub Actions)
  • Strong programming/scripting skills (Python, Go, or similar) for automation and tooling
  • Experience operating in high-scale, 24/7 production environments with ownership of incident response and reliability
  • Solid understanding of Linux systems and networking fundamentals (DNS, TCP/IP, load balancing, VPC, mTLS)
  • Strong problem-solving skills and ability to work across teams
  • Nice to have: experience implementing DevSecOps practices in cloud environments; professional certifications (CKA/CKAD, AWS Solutions Architect, Azure Administrator)

Benefits

Comp & perks
  • Employee benefits may include restricted stock units and a bonus.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
cloud infrastructure, Infrastructure as Code, CI/CD pipelines, Kubernetes, Terraform, Ansible, Jenkins, Python, Go, Linux systems
Soft Skills
problem-solving, mentoring, collaboration, incident response, root cause analysis, automation, communication, leadership, organizational skills, scalability
Certifications
CKA, CKAD, AWS Solutions Architect, Azure Administrator