GitLab

Intermediate Site Reliability Engineer, Database Operations

full-time

Location Type: Remote

Location: Remote • 🇪🇺 Anywhere in Europe


Job Level

Mid-Level, Senior

Tech Stack

Ansible, Chef, Distributed Systems, Go, Kubernetes, Postgres, Puppet, Ruby, SQL, Terraform

About the role

  • Automate every operational task; this is a core requirement for the role. Examples include package updates, configuration changes across all environments, and tools for automatic provisioning of user-facing services.
  • Respond to platform emergencies, alerts, and escalations from Customer Support.
  • Ensure systems exist to manage software life cycles (e.g. operating systems) with a minimum of manual effort.
  • Develop a fully automated multi-environment observability stack based on the existing SaaS system, and extend it to predict capacity needs based on the usage patterns.
  • Plan for new service roll-outs, expansion and capacity management of existing services, and work with users to optimize their resource consumption.
  • Work on database reliability and performance aspects for GitLab.com from within the SRE team as well as work on shipping solutions with the product.
  • Analyze solutions and implement best practices for our PostgreSQL database clusters and their components.
  • Work on observability of relevant database metrics and make sure we reach our database objectives.
  • Work with peer SREs to roll out changes to our production environment and help mitigate database-related production incidents.
  • Provide on-call support in rotation with the team.
  • Provide database expertise to engineering teams (for example through reviews of database migrations, queries and performance optimizations).
  • Work on automation of database infrastructure and help engineering succeed by providing self-service tools.
  • Use the GitLab product to run GitLab.com as a first resort and improve the product as much as possible.
  • Plan the growth of GitLab's database infrastructure.
  • Design, build and maintain core database infrastructure components that allow GitLab to scale to support hundreds of thousands of concurrent users.
  • Support and debug database production issues across services and levels of the stack.
  • Make monitoring and alerting trigger on symptoms and not on outages.
  • Document every action so your learnings turn into repeatable actions and then into automation.

Requirements

  • Have primary experience running PostgreSQL in high-growth, large production environments using both self-managed deployments (VMs, Kubernetes with modern PostgreSQL Operators) and DBaaS offerings.
  • Have hands-on experience using data from PostgreSQL internals to design, build and troubleshoot systems.
  • Have primary experience with infrastructure automation, orchestration and configuration management (Chef, Ansible, Puppet, Terraform).
  • Have a solid understanding of SQL and PL/pgSQL.
  • Have significant experience working in a large SaaS distributed-systems production environment.
  • Share our values, and work in accordance with those values.
  • Have excellent written and verbal English communication skills, with an urge to collaborate and communicate asynchronously.
  • Have an urge to document all the things so you don't need to learn the same thing twice, and an urge for delivering quickly and iterating fast.
  • Have a proactive, go-for-it attitude. When you see something broken, you can't help but fix it.
  • Have solid data modeling and data structure design skills.
  • Bonus: Solid programming skills as a (former) backend engineer, preferably with Ruby and/or Go.
  • Bonus: Experience with ClickHouse or another modern OLAP database.

Benefits

  • GitLab is proud to be an equal opportunity workplace
  • GitLab’s policies and practices are based solely on merit

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
PostgreSQL, SQL, PL/pgSQL, infrastructure automation, orchestration, configuration management, data modeling, data structure design, Ruby, Go
Soft skills
written communication, verbal communication, collaboration, documentation, proactive attitude, problem-solving, iteration, asynchronous communication, teamwork, urgency