Responsibilities
Deploy AI inference replicas and cluster software across multiple datacenters.
Operate across heterogeneous datacenter environments undergoing rapid 10x growth.
Maximize capacity allocation and optimize replica placement using constraint-solver algorithms.
Operate bare-metal inference infrastructure while supporting transition to K8S-based platform.
Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale.
Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale.
Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams.
Stay up to date with the latest advancements in AI compute infrastructure and related technologies.
Requirements
5-7 years of experience operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Computing) or developing and managing complex AWS infrastructure for hybrid deployments.
Strong proficiency in Python for automation, orchestration, and deployment tooling.
Solid understanding of Linux-based systems and command-line tools.
Extensive knowledge of Docker containers and container orchestration platforms such as K8S.
Familiarity with spine-leaf (Clos) networking architecture.
Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB, and Grafana.
Strong ownership mindset and accountability for complex deployments.
Ability to work effectively in a fast-paced environment.
Benefits
Build a breakthrough AI platform beyond the constraints of the GPU.
Publish and open-source cutting-edge AI research.
Work on one of the fastest AI supercomputers in the world.
Enjoy job stability with startup vitality.
Enjoy a simple, non-corporate work culture that respects individual beliefs.