Kubernetes Platform Engineer – AI Infrastructure
Cisco
Kubernetes Platform Engineer leading on-prem Kubernetes infrastructure for AI/ML platforms, with a focus on scalable solutions, automation, and collaboration with data scientists and engineers.
Posted 5/15/2026
Full-time
San Jose • California, North Carolina • 🇺🇸 United States
Mid-Level, Senior
💰 $152,500 - $219,200 per year
Tech Stack
Tools & technologies: Distributed Systems, Go, Kubernetes, OpenShift, Python
About the role
Key responsibilities & impact
- Design, build, and operate large-scale on-prem Kubernetes platforms (OpenShift/Anthos), with ownership of control plane, etcd, and cluster lifecycle.
- Architect scalable, multi-tenant platform infrastructure as the foundation for AI/ML and GenAI workloads.
- Enable and optimize AI/ML workloads, including GPU-based environments for training, inference, and model deployment.
- Partner with data scientists and ML engineers to onboard and scale ML pipelines and workflows.
- Build platform capabilities using Kubernetes controllers, operators, CRDs, and Golang/Python services.
- Implement Infrastructure as Code, automation, and AIOps-driven self-healing using platform telemetry and observability.
- Ensure reliability through performance tuning (scheduling, resource utilization) and participate in on-call support and incident response.
Requirements
What you’ll need
- 5+ years of software engineering experience, including supporting AI/ML or GPU-based workloads on Kubernetes platforms
- 3+ years operating Kubernetes in production with control plane ownership, preferably in on-prem or self-managed environments
- Strong experience with etcd management (backup, restore, recovery) and Kubernetes cluster upgrades
- Proficiency in Go with experience building Kubernetes controllers/operators, CRDs, and webhooks
- Deep understanding of Kubernetes internals (API server, scheduler, controller loops, reconciliation patterns)
- Proven ability to debug and operate large-scale distributed systems in production environments, including participation in on-call rotations
Benefits
Comp & perks
- Medical, dental and vision insurance
- 401(k) plan with a Cisco matching contribution
- Paid parental leave
- Short and long-term disability coverage
- Basic life insurance
- 10 paid holidays per full calendar year
- 1 floating holiday for non-exempt employees
- 1 paid day off for employee’s birthday
- Paid year-end holiday shutdown
- 4 paid days off for personal wellness determined by Cisco
- 16 days of paid vacation time per full calendar year for non-exempt employees
- Flexible vacation time off program for exempt employees
- 80 hours of sick time off provided on hire date and each January 1st
- Up to 80 hours of unused sick time carried forward from one calendar year to the next
- Additional paid time away for critical issues for family members
- Optional 10 paid days per year to volunteer
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Kubernetes, OpenShift, Anthos, Golang, Python, Infrastructure as Code, AIOps, GPU-based environments, etcd management, distributed systems
Soft Skills
collaboration, problem-solving, debugging, incident response, performance tuning