Cisco

Senior Kubernetes Platform Engineer – AI/ML Infrastructure

The Senior Kubernetes Platform Engineer designs and operates large-scale Kubernetes infrastructure for AI/ML workloads, leads technical direction, and ensures performance, reliability, and scalability within complex systems.

Posted 5/15/2026 • full-time • RTP • North Carolina, Texas • 🇺🇸 United States • Senior • 💰 $137,000 - $200,500 per year

Tech Stack

Tools & technologies
  • Distributed Systems
  • Go
  • Kubernetes
  • OpenShift

About the role

Key responsibilities & impact
  • Architect, build, and operate large-scale on-prem Kubernetes platforms (OpenShift/Anthos), including control plane and etcd lifecycle management
  • Define and evolve scalable, multi-tenant platform architecture supporting AI/ML and GPU-based workloads
  • Enable and optimize ML workloads including training, inference, and LLM deployment pipelines on Kubernetes
  • Build platform extensions using Kubernetes controllers, operators, CRDs, and Golang-based services
  • Implement Infrastructure as Code and automation to improve scalability, consistency, and operational efficiency
  • Drive AIOps capabilities using telemetry, automation, anomaly detection, and self-healing systems for platform reliability
  • Improve observability (metrics, logs, traces) and optimize resource utilization, scheduling, and cluster performance
  • Partner with ML engineers and data scientists to operationalize ML workflows and ensure platform readiness for AI workloads
  • Participate in on-call rotations, owning incident response, reliability, and continuous operational improvement
  • Mentor engineers and contribute to defining platform engineering standards and best practices

Requirements

What you’ll need
  • 8+ years of software engineering experience
  • 4+ years of hands-on Kubernetes production experience with control plane ownership
  • Strong experience operating on-prem or self-managed Kubernetes environments
  • Deep expertise in etcd management (backup, restore, recovery, upgrades)
  • Strong proficiency in Go with experience building Kubernetes controllers, operators, CRDs, and webhooks
  • Deep understanding of Kubernetes internals (API server, scheduler, controller loops, reconciliation)
  • Experience supporting AI/ML or GPU-based workloads on Kubernetes platforms
  • Proven experience operating and debugging large-scale distributed systems
  • Experience participating in on-call rotations and production incident management

Benefits

Comp & perks
  • Medical, dental and vision insurance
  • 401(k) plan with Cisco matching contribution
  • Paid parental leave
  • Short- and long-term disability coverage
  • Basic life insurance
  • 10 paid holidays per full calendar year
  • 1 floating holiday for non-exempt employees
  • 1 paid day off for employee’s birthday
  • Paid year-end holiday shutdown
  • 4 paid days off for personal wellness
  • 16 days of paid vacation time for non-exempt employees
  • Flexible vacation time off program for exempt employees
  • 80 hours of sick time off provided on hire date and each January 1st thereafter
  • Optional 10 paid days per full calendar year to volunteer

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • Kubernetes
  • OpenShift
  • Anthos
  • etcd management
  • Golang
  • Infrastructure as Code
  • AIOps
  • ML workloads
  • Kubernetes controllers
  • distributed systems
Soft Skills
  • mentoring
  • incident response
  • operational improvement
  • collaboration