AI Infrastructure & Platform Operations Engineer
Mirantis
AI Infrastructure & Platform Operations Engineer at Mirantis, which helps organizations run scalable AI infrastructure. The role supports NVIDIA GPU platforms and collaborates on operational stability across environments.
Tech Stack
Tools & technologies
- Cloud
- Kubernetes
- Linux
About the role
Key responsibilities & impact
- Monitor, operate, and support production AI infrastructure platforms.
- Investigate and resolve infrastructure, networking, hardware, and platform-related incidents.
- Support NVIDIA GPU infrastructure and associated platform services.
- Monitor and troubleshoot Kubernetes-based environments.
- Investigate performance, availability, and reliability issues across infrastructure and platform components.
- Collaborate with engineering teams, hardware vendors, datacenter personnel, and service delivery teams to resolve technical issues.
- Participate in incident response, root cause analysis, and operational improvement activities.
- Contribute to improvements in monitoring, observability, automation, and operational processes.
- Maintain operational documentation, runbooks, and knowledge articles.
Requirements
What you’ll need
- 3+ years of experience in infrastructure operations, platform operations, network operations, site reliability engineering, cloud operations, datacenter operations, or related technical roles.
- Strong Linux administration and troubleshooting skills.
- Good understanding of networking concepts and experience diagnosing infrastructure-related issues.
- Working knowledge of Kubernetes in production environments.
- Experience supporting production infrastructure and services.
- Strong analytical and problem-solving skills.
- Experience working within structured operational and incident management processes.
- Excellent communication and collaboration skills.
- Ability to work within a shift-based operational environment.
Benefits
Comp & perks
- Work with some of the most advanced AI infrastructure environments in production today.
- Gain exposure to NVIDIA GPU technologies, Kubernetes platforms, and high-performance networking environments.
- Help define how next-generation AI infrastructure is operated and supported.
- Be part of a team shaping the future of AI-powered operations through k0rdent AI.
- Join a growing organisation investing heavily in AI infrastructure and platform services.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- Linux administration
- Kubernetes
- networking concepts
- infrastructure operations
- platform operations
- site reliability engineering
- cloud operations
- datacenter operations
- troubleshooting
- incident management
Soft Skills
- analytical skills
- problem-solving skills
- communication skills
- collaboration skills