Tech Stack
Tools & technologies: Ansible, Cloud, Docker, Go, Kubernetes, Linux, Prometheus, Python, Terraform
About the role
Key responsibilities & impact
- Design, develop, and maintain infrastructure for AI inference workloads, including GPU scheduling, model deployment pipelines, and data access patterns in on-prem environments
- Build and manage monitoring and observability tools for AI inference platforms, including dashboards, alerts, and runbooks for model health and system performance
- Collaborate with ML engineers and platform teams to design system architecture for AI workloads, integrate inference runtimes, and test performance at scale
Requirements
What you’ll need
- Hands-on experience deploying, operating, and troubleshooting Kubernetes clusters, including Helm, Docker, or CRI-O.
- Strong understanding of Linux systems and networking concepts, including troubleshooting connectivity and performance issues.
- Ability to develop automation and operational tooling using Python, Go, or Bash.
- Experience provisioning and managing infrastructure with tools such as Terraform and Ansible.
- Experience designing, implementing, and maintaining CI/CD pipelines using GitLab CI or GitHub Actions.
Preferred Qualifications
- Experience operating or administering Slurm clusters.
- Experience with Cluster API (CAPI) or other Kubernetes cluster lifecycle management ("Kubeception") technologies.
- Deep understanding of Kubernetes internals, including CNI, CSI, Operators, and cluster architecture.
Nice to Have
- Experience with Kubernetes ecosystem tools such as Argo CD and Helmfile.
- Experience with Prometheus.
- Familiarity with other Cloud Native technologies.
Benefits
Comp & perks
- Competitive compensation
- Flexible working hours and hybrid or remote options, depending on your role
- Work from anywhere in the world for up to 45 days per year
- Private medical insurance for you and your family*
- Extra paid vacation and sick leave days*
- Support for life’s important moments and celebrations
- Language courses to help you connect and grow
- Modern, welcoming offices with snacks, drinks, and entertainment*
- Team sports and social activities*
