Senior Site Reliability Engineer

Filevine
Summary
Join Filevine's Reliability team as a Site Reliability Engineer and play a pivotal role in building and maintaining our autonomous systems. You will collaborate with a cross-functional team, developing and improving systems for reliability, scalability, and performance. Over time, you will take on mission-critical objectives, contributing to the growth of our internet-scale applications. This position requires a minimum of 8 years of hands-on technical experience, including 2 years in SRE, and a bachelor's degree in a related field or equivalent experience. The ideal candidate possesses a passion for continuous improvement and a proactive approach to problem-solving. Filevine offers a competitive compensation and benefits package, including 100% remote work for engineers.
Requirements
- Curiosity, a willingness to learn, a passion for continual improvement, and unbridled enthusiasm for making things better every day without needing to be directed to do so
- Proficiency in all of the skills expected of our SRE IIs
- A bachelor's degree in computer science, information systems, or a related field; comparable certifications; or equivalent direct work experience
- A minimum of 8 years of experience in hands-on technical roles
- A minimum of 2 years of Site Reliability Engineering experience
- Experience building autonomous systems that manage software operational details without human intervention
Responsibilities
- Develop autonomous systems that manage the details necessary to build, deploy, test, and operate all Filevine Inc. products
- Be the voice of Reliability on your team throughout the SDLC
- Collect, monitor, aggregate, dashboard, and alert on software and server events
- Improve the CI/CD pipeline
- Develop playbooks, tools, and scripts to streamline processes and shorten problem resolution time
- Identify and fix gaps in the availability of systems
- Improve and defend the security of software and systems
- Document and diagram processes, procedures, and best practices
- Find, learn, improve, or create tools that are reliable, usable, and helpful so that other engineers can work more efficiently
- Work within your assigned team to complete duties as assigned, while mentoring and training more junior engineers and reviewing their work
- Work either individually or in conjunction with other engineers to complete assignments
- Be part of an on-call rotation with other team members to provide 24/7/365 production reliability support
- Be part of an on-call rotation with other team members to provide escalated emergency support for the services your team owns
- Communicate frequently, clearly, and effectively with various technical and management audiences
Preferred Qualifications
- M.S. in computer science, information systems, or a related field; comparable certifications; or equivalent direct work experience
- 2-6 years of Site Reliability Engineering experience
- Experience developing, deploying, and maintaining internet-scale applications
- Experience incorporating Artificial Intelligence or Machine Learning into internet-scale applications
Benefits
- Competitive compensation and benefits package
- Opportunity to learn from a dedicated leadership team
- Dynamic, rapidly growing company, focused on helping organizations thrive
- 100% remote work environment for engineers
- Hackathons for innovation
- Medical, Dental, & Vision Insurance (for full-time employees)
- Competitive & Fair Pay
- Maternity & paternity leave (for full-time employees)
- Short & long-term disability
- Centrally located open office building in Sugar House (onsite employees)
- Top-of-the-line company swag