SentinelOne

Senior Infrastructure Engineer – Data Streaming, Kafka, Redis

full-time

Location Type: Remote

Location: Remote • 🇺🇸 United States

Salary

💰 $128,000 - $176,000 per year

Job Level

Senior

Tech Stack

AWS, Cassandra, Cloud, Distributed Systems, Flux, Go, Google Cloud Platform, Jenkins, Kafka, Kubernetes, Python, Redis, Terraform

About the role

  • Operate and maintain distributed data services—including Kafka and Redis—running at massive scale across Kubernetes clusters and multi-cloud environments.
  • Unlock complete cloud portability for SentinelOne’s services by building a highly automated, self-service infrastructure that can run seamlessly across AWS, GCP, and air-gapped on-prem environments.
  • Manage data infrastructure supporting 5+ PB/day ingestion, ensuring low-latency, high-throughput, and cost-effective operation at global scale.
  • Consolidate and optimize multi-tenant Kafka clusters to reduce cost, improve resilience, and streamline operations.
  • Drive Redis and Kafka lifecycle automation using GitOps principles (ArgoCD, Terraform), reducing operational toil and minimizing pager fatigue.
  • Define and implement standards for observability, HA, backup, and DR of stateful workloads in Kubernetes.
  • Partner with FinOps and engineering stakeholders to continuously optimize performance, cost, and operational overhead across data platform components.
  • Own the end-to-end platform experience for mission-critical open-source systems such as Kafka, Redis, and Cassandra, serving hundreds of product teams.

Requirements

  • 5+ years of experience in infrastructure/platform engineering, with a proven track record of operating stateful distributed systems at scale.
  • Deep hands-on experience with Kafka and Redis running in Kubernetes, including performance tuning, scaling, partitioning, persistence, and operator-based lifecycle management.
  • Strong understanding of Kubernetes internals and best practices for managing both stateless and stateful workloads in production environments.
  • Hands-on experience with GitOps and IaC: ArgoCD and/or Flux
  • Terraform/Terragrunt experience is desired
  • CI/CD: GitHub Actions / Jenkins
  • Python/Golang knowledge (ability to automate day-to-day operations, read and understand the comments, and provide improvement suggestions)
  • Understanding of SRE principles, including SLAs/SLOs and incident response (the role includes on-call duty)
  • US citizenship and the ability to work in a government-regulated environment.

Benefits

  • Medical, Vision, Dental, 401(k), Commuter, Health and Dependent FSA
  • Unlimited PTO
  • Industry-leading gender-neutral parental leave
  • Paid Company Holidays
  • Paid Sick Time
  • Employee stock purchase program
  • Disability and life insurance
  • Employee assistance program
  • Gym membership reimbursement
  • Cell phone reimbursement
  • Numerous company-sponsored events, including regular happy hours and team-building events

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Kafka, Redis, Kubernetes, Terraform, GitOps, CI/CD, Python, Golang, Cassandra, IaC
Soft skills
performance optimization, collaboration, problem-solving, communication, incident response