Senior Software Engineer

Airbnb
Summary
Join Airbnb's Transactional Storage Services team as a Senior Engineer and contribute to the design, development, and operation of a new, open-source NewSQL database. You will work with a talented team to build a unified storage backend for Airbnb's online data, ensuring reliability, scalability, and security. Responsibilities include designing the control plane and its operations, automating critical database operations, delivering a generalized database platform, and building a robust backup and restore system. The ideal candidate has 5+ years of relevant experience, a strong understanding of distributed systems, and expertise with public cloud providers. This is a remote-eligible position with a competitive salary and benefits package.
Requirements
- 5+ years of relevant industry experience
- Solid understanding of distributed systems and infrastructure fundamentals
- Experience diving deep into, and then owning, a complex codebase
- Knack for writing clean, readable, testable, maintainable code
- Ability to decompose large-scale distributed systems, identify their monitoring metrics and failure scenarios, and debug them efficiently
- Strong collaboration and communication skills in a remote-working environment
- Expertise with a public cloud provider (AWS, GCP, Azure) and its storage, VM, networking, and security offerings, e.g. external-dns, Route 53, EBS
Responsibilities
- Design frameworks and maintain the general ecosystem around our NewSQL database's monitoring, permissions, service discovery integration, etc.
- Design and automate critical database operations, such as a centralized, hierarchical config management system; fully automated image building and release certification for major version upgrades; and zero-downtime blue/green deployment
- Be part of the team that defines and delivers a generalized database platform for the partner KVStore, ORM, and MySQL teams
- Deliver a zero-downtime forward and reverse replication pipeline with near-real-time consistency between two transactional databases, with correctness guarantees across transactional boundaries
- Deliver a robust failover/failback mechanism to guarantee correctness and continuity during unexpected outages
- Conduct a case study of all of Airbnb's disaster recovery scenarios, then leverage existing open-source software and/or design and implement software that satisfies Airbnb's requirements for database backup and restore, cross-region data resiliency, point-in-time recovery (PITR), etc.
- Design the right cluster topology, restore logic, and ransomware policy to safeguard Airbnb’s business continuity
Preferred Qualifications
- Experience in Java, Go, Rust, or C++
- Experience with writing robust automation frameworks and tooling
- Experience with Kubernetes, the operator pattern, Helm, etc.; experience with Infrastructure as Code tools such as Chef and Terraform
Benefits
- Bonus
- Equity
- Benefits
- Employee Travel Credits