Tech Stack
Tools & technologies
AWS, Azure, Cassandra, Cloud, Docker, Elasticsearch, Go, Google Cloud Platform, Kubernetes, Linux, MongoDB, MySQL, Prometheus, Python
About the role
Key responsibilities & impact
- Design, deploy, and operate highly available MySQL and MongoDB clusters across our cloud environments
- Tune query performance, schema, and index strategy in partnership with application engineers, and push fixes upstream into the application when that's the right answer
- Extend our observability stack — Prometheus, Loki, and Tempo
- Participate in the Platform on-call rotation, lead incident response for data-tier issues, and write postmortems that drive durable change
- Improve disaster recovery, security posture, and compliance for our database footprint — encryption, access control, audit logging, backup integrity
- Evaluate and operate ScyllaDB/Cassandra and Elasticsearch where they fit the workload
- Write the automation, tooling, and operators that take repetitive work off the team's plate
- Use AI tooling to speed up incident response and root-cause analysis
Requirements
What you’ll need
- 7+ years in SRE, DevOps, platform, infrastructure, or database reliability roles, with at least 3 years owning production databases
- BSc in Computer Science or equivalent practical experience
- You've operated highly available MySQL and MongoDB in production at scale: replication, sharding, backups, point-in-time recovery, and failover drills you've actually run, not just designed on paper
- You diagnose database performance end-to-end: query plan, indexes, locking, OS, storage, network
- You've shipped meaningful work on at least two of bare metal Linux, containerized workloads (Docker, Kubernetes, or similar), and a major cloud (GCP preferred; AWS or Azure equivalent is fine)
- You instrument what you build. You've used Prometheus, OpenTelemetry, or comparable systems to close real incidents, and you've written the dashboard the next on-call engineer will actually open.
- You write code that runs in production: Python, Go, Bash, or similar for automation, tooling, or operators. You don't hand off scripting to someone else.
- You communicate clearly under pressure and after the fact. Your postmortems are blameless, specific, and lead to changes that stick.
- You bring an opinion on managed vs. self-managed databases, and can defend the trade-off based on availability, cost, and operational burden.
- ScyllaDB/Cassandra or Elasticsearch experience is a plus
- You've used AI tooling (copilots, agents, or custom automation) to speed up incident response, root-cause analysis, or developer workflows.
Benefits
Comp & perks
- competitive pay
- equity with significant upside
- sabbatical after every five years of service
- flexible schedules and time off
- free snacks in our break room
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
MySQL, MongoDB, ScyllaDB, Cassandra, Elasticsearch, Python, Go, Bash, Docker, Kubernetes
Soft Skills
communication under pressure, incident response, postmortem writing, blameless culture, diagnostic skills
Certifications
BSc in Computer Science
