Staff Database Reliability Engineer
Scribe
Staff Database Reliability Engineer managing data infrastructure and leading database initiatives at Scribe. Ensuring operational excellence and driving observability across database systems.
Posted 5/7/2026
full-time
Remote • California • 🇺🇸 United States
Lead
💰 $225,000 - $250,000 per year
Tech Stack
Tools & technologies
Amazon Redshift, AWS, BigQuery, Django, Go, Kafka, Postgres, Python, RabbitMQ, Redis, SQL, Terraform
About the role
Key responsibilities & impact
- Own the data tier end-to-end
- Design schemas and access patterns that scale, tune Aurora for latency and throughput, and set the standards for how engineers interact with our databases
- Review migrations for safety at scale — locks, backfills, concurrent index builds, NOT VALID constraints
- Catch N+1 patterns and missing select_related/prefetch_related in review
- Establish conventions for QuerySet usage and physical schema design (indexes, constraints, partitioning)
- Scale review through automation, not heroics — author AGENTS.md files and DNA scaffolding that encode our conventions, configure AI review bots (Claude Code, Cursor, etc.) to flag risky migrations and ORM anti-patterns, and iterate on those configs as new failure modes emerge
- Capacity planning as traffic and engineering throughput grow
- Zero-downtime schema migrations and cutovers
- Multi-AZ resilience within a single region — Aurora writer/reader placement, failover behavior and RTO/RPO, ElastiCache and OpenSearch AZ topology, RabbitMQ survivability across AZs
- Backups, PITR, failover testing, retention
- Own the CDC pipeline (Aurora → DMS → S3 Parquet → Snowflake)
- DMS task design and tuning, replication slot hygiene on the Postgres side
- Schema evolution as Django migrations roll through — so a column rename doesn't silently break the warehouse at 6 AM
- Parquet layout and partitioning, reliability of the Snowflake handoff
- Automated checks that flag migrations likely to break downstream consumers
- Drive observability across three complementary tools: pganalyze, CloudWatch, Honeycomb
Requirements
What you’ll need
- Deep PostgreSQL - EXPLAIN (ANALYZE, BUFFERS), MVCC, bloat, lock contention, vacuum/autovacuum. Aurora Serverless V2 / Limitless experience strongly preferred (storage model, reader/writer split, ACU scaling)
- Strong ORM fluency (Django, SQLAlchemy, ActiveRecord, or similar) - predict the SQL a query will generate, spot N+1 problems on sight, and know how to control eager loading (joins vs. batched IN queries), column projection, aggregations, and subqueries
- Single-region multi-AZ design - practical understanding of what it does and doesn't protect against
- Production CDC experience, ideally AWS DMS - comfortable with logical replication, slot hygiene, schema evolution, and Parquet-based data lakes feeding Snowflake (or BigQuery/Redshift)
- Hands-on with pganalyze (or Datadog DBM / Performance Insights / pg_stat_statements pipelines), CloudWatch (custom metrics, composite alarms, log insights), and Honeycomb (or another high-cardinality tracing tool) - comfortable with OpenTelemetry and opinionated about what makes a trace useful
- Real experience making AI coding and review tools useful for a team - writing AGENTS.md files, configuring review agents, versioning and iterating on prompts and configs
- OpenSearch at scale - sizing, sharding, JVM tuning, rolling upgrades, snapshots
- Production Redis - persistence tradeoffs, cluster mode, hot keys, thundering herds
- At least one production message broker (SQS, RabbitMQ, Kafka) - delivery semantics, idempotency, failure modes
- Strong automation and IaC background - real code (Python, Go, or similar) and Terraform
- Track record leading cross-team initiatives, writing design docs that hold up, influencing without authority
- Comfortable in a high-growth environment where the right answer for 50 engineers isn't the right answer for 100
- Pragmatic outlook during incidents - focused on preventing the next one
Benefits
Comp & perks
- Some of the nicest and smartest teammates you’ll ever work with
- Competitive salaries
- Comprehensive healthcare benefits
- Exciting and motivating equity
- Flexible PTO
- 401k
- Parental Leave
- Commuter Benefits (SF office employees)
- WFH Stipend
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
PostgreSQL, Aurora Serverless V2, Django, SQLAlchemy, ActiveRecord, AWS DMS, OpenSearch, Redis, Terraform, Python
Soft Skills
leadership, influencing without authority, automation, pragmatic outlook, cross-team initiatives, design documentation, high-growth adaptability, incident prevention