ClickHouse is hiring a Senior Site Reliability Engineer, Remote - United Kingdom

Senior Site Reliability Engineer

🏢 ClickHouse

💵 ~$150k-$222k
📍 United Kingdom

Summary

The role is for a Senior Site Reliability Engineer at ClickHouse, the company behind the fast and scalable open-source database management system of the same name. It involves collaborating with engineering teams to design and implement scalable systems for ClickHouse, managing service level objectives (SLOs) and service level agreements (SLAs), ensuring monitoring and alerting are in place, improving reliability and performance, planning Chaos initiatives, and managing on-call processes.

Requirements

  • Bachelor’s or Master’s degree in Computer Science or a related field
  • At least 8 years of experience in Site Reliability Engineering or a related field
  • Previous experience using ClickHouse in production
  • Hands-on experience with Go and/or Python
  • Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform
  • Excellent understanding of distributed databases and SQL; knowledge of ClickHouse in particular is a major plus
  • Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm
  • Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet

Responsibilities

  • Collaborate with engineering teams across the company to design and implement scalable, secure, and highly available systems for ClickHouse
  • Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud
  • Ensure all infrastructure components in ClickHouse Cloud have monitoring and alerting in place so that incidents are detected and resolved promptly
  • Enhance and refine incident response processes and post-mortem analysis for outages in ClickHouse Cloud, including working with the support team to communicate with impacted customers
  • Continuously improve the reliability and performance of our ClickHouse services
  • Plan, enable, and drive Chaos initiatives across Engineering teams based on internal priorities
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime

Benefits

  • Flexible work environment - ClickHouse is a distributed company offering remote-first work to all employees
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in all countries
  • Home office setup - $500 towards your home office if you’re a remote employee
  • Employee-driven international mobility - we enable you to relocate internationally if you wish (within certain countries and timelines and subject to role requirements, time zones and work permit considerations)
