Gcore

DevOps Engineer, Cloud AIaaS

Full-time

Location Type: Remote

Location: Cyprus

About the role

  • Design, develop, and maintain infrastructure for AI inference workloads, including GPU scheduling, model deployment pipelines, and data access patterns in on-prem environments
  • Build and manage monitoring and observability tools for AI inference platforms, including dashboards, alerts, and runbooks for model health and system performance
  • Collaborate with ML engineers and platform teams to design system architecture for AI workloads, integrate inference runtimes, and test performance at scale

Requirements

  • Hands-on experience with containerization and container orchestration: Kubernetes, Helm, Docker/CRI-O
  • Linux and networking
  • Programming and Scripting: Python/Go/Bash
  • Infrastructure as Code (IaC) approach: Ansible, Terraform
  • Creating CI/CD pipelines: GitLab CI/CD, GitHub Actions
  • Experience with Cluster API or any other "Kubeception" technology
  • Deep experience with Kubernetes CNI, CSI, and Operators
  • Nice to have: knowledge of Kubernetes-related technologies such as ArgoCD and Helmfile
  • Experience with Prometheus stack
  • Experience with other cloud-native technologies

Benefits

  • Competitive compensation
  • Flexible working hours and hybrid or remote options, depending on your role
  • Work from anywhere in the world for up to 45 days per year
  • Private medical insurance for you and your family*
  • Extra paid vacation and sick leave days*
  • Support for life’s important moments and celebrations
  • Language courses to help you connect and grow
  • Modern, welcoming offices with snacks, drinks, and entertainment*
  • Team sports and social activities*

*Benefits may vary depending on your location.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
GPU scheduling, model deployment pipelines, data access patterns, monitoring tools, observability tools, Python, Go, Bash, Ansible, Terraform
Soft Skills
collaboration, system architecture design, performance testing