Sprinto is hiring a
Lead Site Reliability Engineer

Sprinto

💵 ~$170k-$200k
📍 Remote - India

Summary

The job is for a Lead Site Reliability Engineer at Sprinto, a leading information security compliance automation platform. The role involves managing observability pipelines, developing CI/CD pipelines, managing infrastructure, collaborating with application engineers, and developing incident response and on-call processes. Requirements include expertise in IaC tools, experience with APM tools, experience in application capacity planning and incident response, and strong problem-solving and communication skills. Familiarity with the tech stack (Node.js, React, Apollo GraphQL, PostgreSQL, AWS) is a bonus. Benefits include remote work, flexible hours, group medical insurance, group accident cover, a company-sponsored device, and an education reimbursement policy.

Requirements

  • Expertise in Infrastructure as Code (IaC) Tools: Proficiency with tools such as Terraform and Ansible
  • Experience with APM Tools: Skilled in using Application Performance Monitoring tools, setting up on-call practices, identifying bottlenecks across the stack, and collaborating with teams to address these issues effectively
  • Application Capacity Planning and Incident Response: Proven experience in application capacity planning, owning incident response workflows, and running processes such as Root Cause Analyses (RCAs) and maintaining runbooks
  • Problem-Solving and Communication Skills: Strong problem-solving abilities and excellent communication skills, both spoken and written

Responsibilities

  • Observability Pipeline Management: Take ownership of the observability pipeline to ensure high availability and optimal performance of applications
  • CI/CD Pipeline Development: Design, build, and maintain the Continuous Integration/Continuous Deployment (CI/CD) pipelines to facilitate smooth and reliable product deliveries
  • Infrastructure Management: Own the complete infrastructure stack of the product, contributing to scalability and enhancements of the overall offering
  • Collaboration with Application Engineers: Work closely with application engineers to develop and refine tooling necessary for efficient operations management
  • Incident Response and On-Call Process Development: Establish and maintain on-call protocols and incident response processes to ensure timely resolution of issues and maintain service reliability

Preferred Qualifications

Familiarity with Our Tech Stack (Bonus): Experience with our current tech stack is not mandatory, but familiarity with it is a plus because it will enable you to start contributing sooner. Our tech stack includes Node.js, React, Apollo GraphQL, PostgreSQL, and AWS.

Benefits

  • Remote First Policy
  • 5-Day Work Week With Flexible Hours
  • Group Medical Insurance (Parents, Spouse, Children)
  • Group Accident Cover
  • Company Sponsored Device
  • Education Reimbursement Policy

Please let Sprinto know you found this job on JobsCollider. Thanks! 🙏