ClearML

Senior Python Systems Engineer, Agent & Infrastructure

Full-time

Location Type: Remote

Location: Germany

About the role

  • Agent Development: Design and optimize the clearml-agent, a Python service responsible for pulling jobs, setting up environments, and executing ML pipelines.
  • Kubernetes Integration: Write logic to interact directly with K8s APIs, manage Pod lifecycles, and handle Custom Resource Definitions (CRDs).
  • Resource Management: Implement logic for dynamic resource allocation (GPU/CPU/Memory) and container orchestration.
  • Systems Programming: Build robust daemons and services that interact with OS-level primitives (systemd, signals, I/O streams).
  • Networking: Troubleshoot and optimize TCP/IP connections, DNS resolution, and firewall traversal to ensure seamless connectivity for users.

Requirements

  • 8+ years of development experience with a strong focus on Systems Programming.
  • Kubernetes Mastery: Deep understanding of Kubernetes architecture (beyond just writing YAML). You should know how to write code that controls K8s.
  • Container Internals: Extensive experience with Docker, including building and maintaining images.
  • Python for Systems: Experience using Python for automation, daemons, or CLI tools (using libraries like subprocess, socket, asyncio).
  • Networking Fundamentals: Strong grasp of HTTP/S, WebSockets, TCP/IP, Proxies, and Reverse Proxies.
  • OS Knowledge: Strong understanding of Linux internals and shell scripting.

Benefits

  • Fully Remote & Global – Work from anywhere with a distributed team of top-tier engineers.
  • Engineering-First & Autonomous – High ownership, real responsibility, and freedom to design and ship impactful solutions.
  • High Growth, High Impact – Your work directly affects thousands of users, from startups to large enterprises.
  • Technically Deep Challenges – Build complex, performance-critical systems at the core of modern AI infrastructure.
  • Fast Feedback, Real Users – See your work in production quickly and make a measurable difference.

Applicant Tracking System Keywords

Hard Skills & Tools
Python, Kubernetes, Docker, Systems Programming, Resource Management, Networking, Dynamic Resource Allocation, Container Orchestration, OS-level Programming, Shell Scripting