
Senior Python Systems Engineer, Agent & Infrastructure
ClearML
full-time
Location Type: Remote
Location: Germany
About the role
- Agent Development: Design and optimize the clearml-agent, a Python service responsible for pulling jobs, setting up environments, and executing ML pipelines.
- Kubernetes Integration: Write logic to interact directly with K8s APIs, manage Pod lifecycles, and handle Custom Resource Definitions (CRDs).
- Resource Management: Implement logic for dynamic resource allocation (GPU/CPU/Memory) and container orchestration.
- Systems Programming: Build robust daemons and services that interact with OS-level primitives (systemd, signals, I/O streams).
- Networking: Troubleshoot and optimize TCP/IP connections, DNS resolution, and firewall traversal to ensure seamless connectivity for users.
Requirements
- 8+ years of development experience with a strong focus on Systems Programming.
- Kubernetes Mastery: Deep understanding of Kubernetes architecture (beyond just writing YAML). You should know how to write code that controls K8s.
- Container Internals: Extensive experience with Docker, including building and maintaining images.
- Python for Systems: Experience using Python for automation, daemons, or CLI tools (using libraries like subprocess, socket, asyncio).
- Networking Fundamentals: Strong grasp of HTTP/S, WebSockets, TCP/IP, Proxies, and Reverse Proxies.
- OS Knowledge: Strong understanding of Linux internals and shell scripting.
Benefits
- Fully Remote & Global – Work from anywhere with a distributed team of top-tier engineers.
- Engineering-First & Autonomous – High ownership, real responsibility, and freedom to design and ship impactful solutions.
- High Growth, High Impact – Your work directly affects thousands of users, from startups to large enterprises.
- Technically Deep Challenges – Build complex, performance-critical systems at the core of modern AI infrastructure.
- Fast Feedback, Real Users – See your work in production quickly and make a measurable difference.
Applicant Tracking System Keywords
Hard Skills & Tools
Python, Kubernetes, Docker, Systems Programming, Resource Management, Networking, Dynamic Resource Allocation, Container Orchestration, OS-level Programming, Shell Scripting