Senior Infrastructure Engineer

Somnia

Senior Infrastructure Engineer building and operating Somnia's key backend services. Responsible for SLIs, SLOs, monitoring, and incident response.

Posted 7/2/2026 • Full-time • Remote • 🇪🇺 Anywhere in Europe • Senior

Tech Stack

Tools & technologies

Cloud, Distributed Systems, Docker, Go, Grafana, Kubernetes, Linux, Node.js, Prometheus, Terraform, TypeScript, Web3

About the role

Key responsibilities & impact

Define and maintain SLOs, SLIs, and error budgets, plus the observability—metrics, logs, traces and alerts—that catches regressions before users do.
Build repeatable, self-service infrastructure through infrastructure-as-code, CI/CD and golden paths so teams can provision, deploy and recover without reinventing the wheel.
Own rollouts end-to-end—progressive delivery, canaries, safe migrations and clean rollbacks.
Operate the systems behind Somnia's nodes, validators, RPC and indexing, tuning for performance and cost across regions.
Lead incident response and on-call, run blameless postmortems, and continuously harden the platform.
Partner with product and protocol teams to design and operate production-ready services. You'll rotate between embedding with engineering teams and building the shared platform, tooling and operational standards that underpin the wider organisation.

Requirements

What you’ll need

Strong experience operating production infrastructure at scale (cloud and/or bare metal), with deep Linux fundamentals.
Experience with infrastructure-as-code such as Terraform or Pulumi, alongside configuration management.
Experience running containers and orchestration platforms (Docker, Kubernetes) in production.
Strong programming skills, ideally in Go and/or TypeScript, for building automation and internal tooling.
Experience with observability stacks (Prometheus, Grafana, OpenTelemetry or equivalents).
Experience operating and monitoring distributed systems, including capacity planning and performance tuning.
Comfortable operating in high-stakes production environments and responding to incidents.
Genuine interest in crypto and on-chain systems.
Experience operating blockchain node infrastructure (validators, RPC, archive nodes) for an L1/L2.
Experience with high-performance networking, low-latency systems or load balancing at scale.
Experience with multi-region and geo-distributed deployments, including failover strategies.
Experience with security and key management (HSMs, secrets management, hardening).
Familiarity with EVM tooling and the wider Web3 infrastructure ecosystem.

Benefits

Comp & perks

Competitive compensation with token incentives

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Linux Fundamentals, Distributed Systems Monitoring, Capacity Planning, Performance Tuning, Security and Key Management, High-Performance Networking, Load Balancing, Multi-Region Deployments, Incident Response, Blockchain Node Infrastructure

Soft Skills

Collaboration, Problem-Solving, Leadership