Cisco

Kubernetes Platform Engineer – AI Infrastructure

Kubernetes Platform Engineer leading on-prem Kubernetes infrastructure for AI/ML platforms, with a focus on scalable solutions, automation, and collaboration with data scientists and engineers.

Posted 5/15/2026 | full-time | San Jose • California, North Carolina • 🇺🇸 United States | Mid-Level, Senior | 💰 $152,500 - $219,200 per year

Tech Stack

Tools & technologies
Distributed Systems, Go, Kubernetes, OpenShift, Python

About the role

Key responsibilities & impact
  • Design, build, and operate large-scale on-prem Kubernetes platforms (OpenShift/Anthos), with ownership of control plane, etcd, and cluster lifecycle.
  • Architect scalable, multi-tenant platform infrastructure as the foundation for AI/ML and GenAI workloads.
  • Enable and optimize AI/ML workloads, including GPU-based environments for training, inference, and model deployment.
  • Partner with data scientists and ML engineers to onboard and scale ML pipelines and workflows.
  • Build platform capabilities using Kubernetes controllers, operators, CRDs, and Golang/Python services.
  • Implement Infrastructure as Code, automation, and AIOps-driven self-healing using platform telemetry and observability.
  • Ensure reliability through performance tuning (scheduling, resource utilization) and participate in on-call support and incident response.

Requirements

What you’ll need
  • 5+ years of software engineering experience, including supporting AI/ML or GPU-based workloads on Kubernetes platforms
  • 3+ years operating Kubernetes in production with control plane ownership, preferably in on-prem or self-managed environments
  • Strong experience with etcd management (backup, restore, recovery) and Kubernetes cluster upgrades
  • Proficiency in Go with experience building Kubernetes controllers/operators, CRDs, and webhooks
  • Deep understanding of Kubernetes internals (API server, scheduler, controller loops, reconciliation patterns)
  • Proven ability to debug and operate large-scale distributed systems in production environments, including participation in on-call rotations

Benefits

Comp & perks
  • Medical, dental and vision insurance
  • 401(k) plan with a Cisco matching contribution
  • Paid parental leave
  • Short- and long-term disability coverage
  • Basic life insurance
  • 10 paid holidays per full calendar year
  • 1 floating holiday for non-exempt employees
  • 1 paid day off for employee’s birthday
  • Paid year-end holiday shutdown
  • 4 paid days off for personal wellness determined by Cisco
  • 16 days of paid vacation time per full calendar year for non-exempt employees
  • Flexible vacation time off program for exempt employees
  • 80 hours of sick time off provided on hire date and each January 1st
  • Up to 80 hours of unused sick time carried forward from one calendar year to the next
  • Additional paid time away for critical issues for family members
  • Optional 10 paid days per year to volunteer

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Kubernetes, OpenShift, Anthos, Golang, Python, Infrastructure as Code, AIOps, GPU-based environments, etcd management, distributed systems
Soft Skills
collaboration, problem-solving, debugging, incident response, performance tuning