
Senior DevOps Data Engineer
EverOps
full-time
Location Type: Remote
Location: United States
About the role
- Design, implement, and validate disaster recovery architectures for relational, NoSQL, and managed data services across AWS, Azure, or GCP
- Plan and execute database migration cutovers including blue-green database swaps, read-replica promotion, and zero-downtime schema migration workflows
- Architect replication topologies (cross-region, cross-account, active-passive, active-active) and validate RPO/RTO targets through runbook-driven DR drills
- Build and maintain Infrastructure as Code for data platform provisioning (RDS, Aurora, DynamoDB, ElastiCache, Redshift, managed Kafka/MSK, etc.) using Terraform, Atlantis, and/or CloudFormation
- Design backup, snapshot, and point-in-time recovery strategies with automated validation and alerting
- Develop automation tooling for data platform operations: failover orchestration, health checks, capacity scaling, and credential rotation
- Implement observability for data infrastructure: replication lag monitoring, connection pool health, query performance baselines, and storage growth forecasting
- Support production workload migrations including data tier cutovers with rollback plans and data integrity verification
- Contribute to multi-tenant Kubernetes platform operations where data services intersect (e.g., External Secrets Operator for DB credentials, sidecar patterns for connection pooling)
- Participate in regular customer and internal EverOps scrums, providing data architecture guidance and operational status
- Document runbooks, architecture decision records (ADRs), and operational playbooks for data platform operations
Requirements
- 5+ years of professional experience as a DevOps Engineer, Data Platform Engineer, Database Reliability Engineer, or Site Reliability Engineer with a data infrastructure focus
- Deep hands-on experience designing and operating disaster recovery architectures for production databases (failover, replication, backup/restore, cross-region DR)
- Production experience planning and executing database cutover workflows: blue-green database swaps, read-replica promotions, DMS-based migrations, and zero-downtime schema changes
- Strong experience with AWS managed data services: RDS/Aurora (Multi-AZ, Global Database, cross-region replicas), DynamoDB (Global Tables, PITR, on-demand backup), ElastiCache, Redshift, and/or MSK
- Hands-on experience with Infrastructure as Code (Terraform + Atlantis and/or CloudFormation) for data platform provisioning and lifecycle management
- Hands-on experience with, and deep understanding of, Linux
- Strong professional experience with at least one of: Python, Golang, Bash, or Rust for automation and tooling
- Production experience with Amazon EKS including understanding of how data workloads intersect with Kubernetes (StatefulSets, PVCs, External Secrets Operator, connection pooling)
- Experience with HashiCorp Vault for secrets management, particularly database credential rotation and dynamic secrets
- Understanding of GitOps workflows, repository structures, and governance patterns
- Experience with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD
- Experience with monitoring tools such as Datadog, Splunk, ELK, or Prometheus/Grafana, specifically for data infrastructure observability (replication lag, connection health, query latency)
- Relational database experience with PostgreSQL or MySQL including operational knowledge of replication, failover, and performance tuning
- NoSQL experience with at least one of: DynamoDB, Cassandra, or MongoDB including understanding of consistency models and partition strategies
Benefits
- 100% remote workplace – We’ve been remote since Day 1!
- Unlimited Paid Time Off
- Equity – If you display ownership of the work you’re doing, you’ll become a true owner of the company
- 401K with company contribution
- Company sponsored healthcare
- Competitive compensation
- Opportunities to accelerate professional growth with access to training and certification programs
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
disaster recovery architecture, database migration, Infrastructure as Code, Terraform, AWS RDS, Kubernetes, Python, Linux, CI/CD, NoSQL
Soft Skills
communication, collaboration, problem-solving, documentation, customer engagement, operational guidance