
Senior Database Reliability Engineer
ClickUp
full-time
Location Type: Remote
Location: Poland
About the role
- Build a deep understanding of how ClickUp's systems behave, scale, interact and fail, and use that insight to identify risks and opportunities for remediation
- Own, drive and improve the incident management process across the engineering org and participate in the team's follow-the-sun model
- Define SLOs and SLIs for all of our services and introduce error budgeting
- Own and improve our observability on all of our services
- Build software solutions to enable reliability and operability of large scale distributed systems handling petabytes of data and serving
- Build tools and automation to eliminate toil and reduce operational overhead. Create frameworks, processes and best practices to be used across ClickUp Engineering
- Automate critical portions of ClickUp engineering processes, to minimize risk and maximize the speed of innovation
- Manage capacity and performance to help scale our infrastructure both on public and private clouds around the world
Requirements
- Software engineering: At the very core, we are looking for strong software engineers with an operational, infrastructure or SRE mindset who can design and build systems for the platform and infrastructure layers
- Cloud experience: Production experience in a major cloud environment, including CI/CD deployments, managed services, and bootstrapping and provisioning services via infrastructure-as-code (IaC) systems, automation and operations
- Infrastructure management: You have worked with and managed production-grade infrastructure using IaC or configuration management tools
- Operating systems: Strong knowledge of *nix-based operating systems, their internals and advanced troubleshooting commands
- Compute: Experience of working with VMs, containers and container orchestration systems
- Database: Experience working with RDBMS and NoSQL storage solutions in production, and you know your way around running and inspecting queries. A good understanding of indexing, locking, replication and sharding is a bonus!
- Observability: You have worked with logging, monitoring and alerting tools before and you know how logs are collected, aggregated and ingested. You have set up monitors and alerts for production services and know your way around concepts such as SLOs and SLIs
- Bonus points: We believe strong engineers can pick up any technology or tool fast and hit the ground running. Therefore, we avoid listing specific technologies. However, if you have worked with at least one of the technologies in our stack, that would definitely be a bonus:
- CloudFormation/CDK, ECS, Elastic Beanstalk
- PostgreSQL, DynamoDB, AuroraDB
- TypeScript or any JavaScript-based framework
Benefits
- Unsure if you meet all the qualifications of this job description but are deeply excited about the role? We hire based on ambition, grit, and a passion for improving the way people work. If you think ClickUp is the company for you, we encourage you to apply!
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
software engineering, infrastructure management, cloud experience, observability, operating systems, compute, database, automation, error budgeting, incident management
Soft Skills
problem-solving, collaboration, communication, leadership, organizational skills