Site Reliability Engineer

Bitwarden
Summary
Join Bitwarden's Site Reliability Engineering team as a remote employee based in the U.S. and contribute to the operational success and growth of the Bitwarden product through cloud technology. You will own the current and future state of the cloud infrastructure, design and build new infrastructure, and operationalize tools and technologies to enable scaling. Responsibilities include evaluating cloud environments, making infrastructure design recommendations, identifying optimization opportunities, and rolling out future cloud offerings. You will also build observability, alerts, and automation capabilities. The role requires experience with multi-region deployments in public cloud environments, Kubernetes, a programming language, cloud deployment tools, and Git. The company offers a competitive salary range of $90,000-$145,000 and additional benefits (see careers page for details). Visa sponsorship is not offered.
Requirements
- Sense of curiosity, resourcefulness, pragmatism
- Experience with multi-region deployments in public cloud environments
- Demonstrable production Kubernetes experience (Managed Kubernetes, Helm, kubectl, kOps, etc.)
- Fluency with at least one programming language, such as C#, Python, or Go
- Working knowledge of cloud deployment and automation tools/methodologies (e.g., GitOps, Terraform, Pulumi)
- Proficiency using source control such as Git
- Ability to maintain discretion, handle sensitive information, and improve security best practices
- Technocrat at heart, staying current with trends and new technologies
- Collaborative and adaptable mindset
- Openness and authenticity combined with excellent communication skills
- Excitement and enthusiasm for open source and for better internet security
- Excellent problem-solving skills: you might not know all the answers, but you know how to find and communicate the possible solutions
Responsibilities
- Take ownership of the Bitwarden cloud infrastructure, with an emphasis on quality that translates directly to user delight
- Evaluate current infrastructure and, on a regular basis, make recommendations for reliability, security, availability, scalability and cost management
- Implement site reliability tools, monitoring, early warning and alert systems, and observability across Bitwarden cloud environments
- Respond to infrastructure-based outages; participate in and contribute to the ongoing strategy for 24x7 support (there is an on-call rotation with a weekend shift every 5-6 weeks)
- Participate actively in code reviews, learning and sharing technical knowledge
- Contribute to and mature incident management/escalation processes
- Collaborate with cross-functional teams to refine priorities and deliverables
- Engage with product owners on an ongoing basis to align SLIs/SLOs/SLAs
- Evaluate and identify opportunities for new initiatives to support organizational needs
Preferred Qualifications
- Startup experience
- Open source experience
- User of Bitwarden
- Prior SaaS experience