DataCrunch

Engineering Manager – Bare Metal & Clusters

full-time

Origin: 🇫🇮 Finland

Job Level

Mid-Level, Senior

Tech Stack

Cloud, Grafana, Kubernetes, Prometheus

About the role

  • Lead and coordinate the development of bare-metal and virtualized GPU cluster offerings.
  • Work closely with SRE, hardware, and cluster teams to deliver robust infrastructure.
  • Drive the infrastructure and cluster roadmaps, aligning priorities across teams and ensuring clear delivery goals.
  • Oversee tracking of server infrastructure, ensuring visibility and accountability for hardware usage and deployments.
  • Align team efforts with company objectives, resolving priorities across multiple streams of work.
  • Implement and improve processes for roadmapping, prioritization, and cross-team collaboration.
  • Mentor and support engineers, fostering a culture of collaboration, delivery, and innovation.
  • Contribute to long-term scalability by identifying and addressing systemic infrastructure challenges.

Requirements

  • Proven experience as an Engineering Manager or Senior Technical Lead, ideally in a start-up or scale-up environment.
  • Strong background in bare-metal server management and distributed computing/HPC.
  • Experience with virtualization in large-scale environments.
  • Strong leadership, organizational, and cross-functional communication skills.
  • Ability to communicate clearly with both technical and non-technical audiences.

Nice-to-haves

  • Experience with MaaS and infrastructure automation.
  • Experience with latest-generation GPU systems.
  • Experience with high-performance networking (RDMA, InfiniBand, RoCE).
  • Experience with HPC workload orchestration using Slurm and/or Kubernetes.
  • Experience with observability and monitoring stacks (Grafana, Prometheus, ELK).
  • Exposure to hardware lifecycle management and data center operations.