Senior Site Reliability Engineer

BenchSci
Summary
Join BenchSci's growing Platform Infrastructure group as a Senior Site Reliability Engineer. Reporting to the Engineering Manager - Infrastructure, you will leverage your expertise to solve complex challenges, respond to production incidents, participate in design discussions and code reviews, and collaborate on innovative solutions. Key responsibilities include building and maintaining observability platforms, leading design initiatives, improving incident response processes, and promoting SRE best practices. The ideal candidate has 5+ years of experience as a Senior Site Reliability Engineer, expert knowledge of incident response, observability, and reliability tools, and experience with cloud design patterns and with Python and JavaScript/TypeScript codebases. BenchSci offers a remote-first culture, competitive compensation, a robust vacation policy, comprehensive benefits, and professional development opportunities.
Requirements
- 5+ years of experience working as a Senior Site Reliability Engineer preferred
- Expert knowledge of incident response, observability, and reliability tools and techniques in a cloud-native environment (Google Cloud is preferred, but AWS experience is also valuable)
- Experience with cloud design patterns (Google Cloud is considered an asset) and developing specialized application stacks on cloud services (Python backend, TypeScript frontend)
- Experience working in Python and JavaScript/TypeScript codebases
- Eagerness to share your own ideas, and openness to those of others
Responsibilities
- Build, deploy, and maintain observability platforms that let teams self-serve their metrics-gathering and dashboarding needs
- Lead software and system design initiatives by leveraging cloud-native design patterns and injecting your cloud expertise into the entire development lifecycle
- Partner with other teams to iterate on and improve BenchSci's Incident Response processes
- Help other teams respond to, mitigate, and remediate production incidents
- Help other teams write effective post-mortems and improve our reliability culture and processes
- Work with your team, Staff Engineers, and Engineering Managers to help promote SRE best practices
- Help reduce toil and improve developer productivity by automating our team and business processes
- Partner with engineering and product stakeholders and other cross-functional teams to devise and refine requirements
- Communicate cross-cutting decisions to all potentially impacted teams
Benefits
- An engaging remote-first culture
- A great compensation package that includes BenchSci equity options
- A robust vacation policy plus an additional vacation day every year
- Company closures for 14 more days throughout the year
- Flex time for sick days, personal days, and religious holidays
- Comprehensive health and dental benefits
- Annual learning & development budget
- A one-time home office set-up budget to use upon joining BenchSci
- An annual lifestyle spending account allowance
- Generous parental leave benefits with a top-up plan or paid time off options
- The ability to save for your retirement coupled with a company match!