ClickHouse

Senior Site Reliability Engineer

ClickHouse

full-time

Posted on:

Location Type: Remote

Location: Australia

Visit company website

Explore more

AI Apply
Apply

Job Level

About the role

  • Collaborate with various engineering teams in ClickHouse to design and implement scalable, secure, and highly available systems for ClickHouse.
  • Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud.
  • Ensure all the infrastructure components in ClickHouse Cloud (including Data Plane, Control Plane, ClickHouse Core, etc) have monitoring and alerting in place to ensure timely detection and resolution of incidents.
  • Enhance and refine incident response processes and post-mortem analysis for any outages in ClickHouse Cloud including working with the support team to communicate to the impacted customers.
  • Continuously improve the reliability and performance of our ClickHouse services.
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities.
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.

Requirements

  • Bachelor’s or Master’s degree in Computer Science or a related field.
  • At least 8 years of experience in Site Reliability Engineering or a related field.
  • Hands-on experience with Go and/or Python.
  • Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
  • Excellent understanding of distributed databases and SQL, particularly ClickHouse is a major plus.
  • Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm.
  • Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.
  • You are a strong problem solver and have solid production debugging skills.
  • You are passionate about efficiency, availability, scalability, and data governance.
  • You thrive in a fast paced environment, and see yourself as a partner with the business with the shared goal of moving the business forward.
  • You have a high level of responsibility, ownership, and accountability.
  • Excellent communication and interpersonal skills.
Benefits
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
  • Healthcare - Employer contributions towards your healthcare.
  • Equity in the company - Every new team member who joins our company receives stock options.
  • Time off - Flexible time off in the US, generous entitlement in other countries.
  • A $500 Home office setup if you’re a remote employee.
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
GoPythoncloud computingdistributed databasesSQLClickHouseKubernetesDocker SwarmAnsibleTerraform
Soft Skills
problem solvingproduction debuggingefficiencyavailabilityscalabilitydata governanceresponsibilityownershipaccountabilitycommunication
Certifications
Bachelor’s degreeMaster’s degree