Senior Cloud Performance Engineer

ClickHouse

📍 Remote - Netherlands

Summary

Join ClickHouse's Cloud Performance Engineering team and build the cloud-native ClickHouse Cloud Platform. This role requires 6+ years of experience in building and operating scalable, fault-tolerant, distributed systems, proficiency in languages like Go, C/C++, or Java, and expertise with cloud infrastructure (preferably Kubernetes and a public cloud provider). You will benchmark system performance, analyze database performance, optimize capacity, troubleshoot errors, and collaborate with various teams. The ideal candidate is a strong problem solver with excellent communication skills and a passion for efficiency and scalability. ClickHouse offers a remote-first work environment, healthcare contributions, stock options, flexible time off, a home office setup allowance, and opportunities for international mobility.

Requirements

  • 6+ years of relevant software development industry experience building and operating scalable, fault-tolerant, distributed systems
  • Software development experience in Go, C/C++, Java, or similar
  • Experience with concurrency, multithreading, and the deployment of distributed system architectures
  • Experience developing cloud infrastructure services, preferably with Kubernetes
  • Experience leading and shipping large-scope technical projects in collaboration with multiple experienced engineers
  • Expertise with a public cloud provider (AWS, GCP, Azure) and its infrastructure-as-a-service offerings (e.g. EC2)
  • Excellent communication skills and the ability to work well within a team and across engineering teams
  • Strong problem solver and solid production debugging skills
  • Passionate about efficiency, availability, scalability and data governance
  • Thrive in a fast-paced environment and see yourself as a partner to the business, with the shared goal of moving it forward
  • High level of responsibility, ownership, and accountability

Responsibilities

  • Benchmark system performance, analyze database performance, and handle capacity sizing and optimization
  • Troubleshoot and debug application and server errors, analyze logs, and triage accordingly
  • Recommend configuration tuning/optimizations for performance bottlenecks
  • Work closely with the ClickHouse core development, cloud, and security teams to improve the performance of ClickHouse Cloud
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities
  • Develop, deploy and manage tools to systematically run chaos experiments and measure impact
  • Enjoy working on, and gaining a deep understanding of, large scale distributed systems
  • Study the problems in the software resilience, operational, and delivery spaces
  • Extend our entire backend to enable Chaos Engineering techniques in the system
  • Observe running systems, and determine/prioritize innovative ways to disrupt them

Benefits

  • Flexible work environment - ClickHouse is a distributed company offering remote-first work to all employees
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in all countries
  • A $500 Home office setup if you're a remote employee
  • Employee-driven international mobility - we enable you to relocate internationally if you wish (within certain countries and timelines and subject to role requirements, time zones and work permit considerations)
  • Cash compensation and a stock options grant
