Full-stack Engineer

• Design, build, and operate Kubernetes clusters on bare metal at scale
• Engineer full cluster lifecycle management (Talos bootstrapping, upgrades, node reprovisioning, HA control planes, recovery workflows)
• Architect networking, load balancing, and service mesh solutions optimized for bare metal
• Implement performant CNIs (Calico, Cilium), integrate L2/L3 networking and routing (BGP/ECMP), and optimize traffic across racks and datacenters
• Automate provisioning via PXE/iPXE, Tinkerbell, and MAAS; manage BMCs over IPMI/Redfish and standardize BIOS/firmware across heterogeneous hardware fleets
• Design and operate persistent storage (local disks, block, object), including Ceph, Rook, and OpenEBS
• Build automation and tooling (Go, Python, Bash) for provisioning, drift detection, upgrades, and incident response
• Extend observability with Prometheus, Alertmanager, Grafana, OpenTelemetry, and define SLOs for cluster health, latency, and workload availability
• Implement security best practices: Vault, cert-manager, RBAC hardening, network policies, and OS/K8s patch pipelines
• Mentor engineers and shape technical direction for Crusoe’s Kubernetes platform
