Principal Site Reliability Engineer
![Boomi Logo](https://cdn.jobscollider.com/logo/boomi.com-73e4-1.webp)
Boomi
Remote - India
Summary
Join Boomi as a Principal Site Reliability Engineer and contribute to a fast-growing company making a global impact. You will be responsible for developing sophisticated systems and software, ensuring the reliability and scalability of Boomi's production systems. This role involves collaborating with various teams, participating in on-call rotations, and mentoring other engineers. The ideal candidate possesses expertise in SRE, DevOps, automation, and cloud technologies, particularly AWS. Boomi offers a remote work opportunity and a commitment to creating an inclusive and accessible work environment.
Requirements
- Passionate about SRE, DevOps, automation, and infrastructure platforms. Expert in developing Ansible playbooks and automating infrastructure as code with Terraform and CloudFormation templates
- Expert in defining, measuring, and improving reliability metrics (SLOs, SLIs, error budgets)
- Strong in implementing observability practices (monitoring, logging, distributed tracing, etc.), preferably using Splunk and New Relic. Experience should extend beyond using dashboards to creating them from scratch
- Experience in conducting and automating DR exercises in the AWS cloud to validate RPOs and RTOs
- Strong understanding and working experience with AWS components
- Ability to design and implement APIs for use by internal teams
Responsibilities
- Participate actively in detecting, remediating, and reporting on production incidents, ensuring that SLAs/SLOs are defined and met
- Participate in on-call rotation to ensure coverage for planned/unplanned events
- Engage with other Engineering organizations to implement processes, identify improvements, and drive consistent results
- Work with your SRE and Engineering counterparts to drive DR exercises, game days, training, and other response-readiness efforts
- Collaborate with Service Engineering organizations to build and automate tooling, implement observability best practices, and manage Boomi services in production so that we consistently achieve our market-leading SLA
- Improve the scalability and reliability of Boomi's systems in production
- Automate the provisioning and maintenance of Boomi's infrastructure
- Work independently with a minimal level of guidance from technical leadership
- Mentor other Boomi engineers, including design collaboration and code reviews
Preferred Qualifications
- 6 to 8 years of related experience in the software engineering industry, including experience supporting large-scale software systems in production
- Cloud certification (AWS/Azure/GCP), with experience using services such as compute, containers, and databases
- Experience in Ansible/Terraform and Python
- A grasp of cloud-native concepts, containerization best practices, and cloud security awareness is a strong plus
- Experience in observability, including creating dashboards for SLAs/SLIs/SLOs
Benefits
Remote work