Incident Response Manager

Stripe

📍 Remote - Worldwide

Job highlights

Summary

Join Stripe's Incident Ops team as an Incident Response Manager (IRM) and play a key role in driving incident response and management. You will lead incident resolution, collaborate with cross-functional teams, and ensure timely communication with users. Responsibilities include acting as Incident Commander, leading user-facing incidents, and contributing to root cause analysis. The ideal candidate possesses 5+ years of major incident experience, strong technical skills, and excellent communication abilities. Preferred qualifications include domain expertise in various incident classes and experience with user-facing communications. This is a 24/7 role requiring strong problem-solving and decision-making skills in high-pressure situations.

Requirements

  • 5+ years of demonstrable major incident experience for organizations that run mission-critical applications or always-on SaaS environments
  • Demonstrated ability to lead multiple incidents concurrently with authority, and to influence responders, using agency and reasoning skills to resolve ambiguous problems and drive to root cause
  • Strong full-stack technical skills with development/support experience with cloud-based technologies
  • Demonstrated experience developing code and automation using Python, Ruby, JavaScript or shell scripting
  • Solid understanding of infrastructure, including physical, virtual, and container-based compute platforms
  • Strong quantitative and analytical skills in data manipulation using SQL, Splunk, or other tools
  • Excellent task management skills and attention to detail, with the ability to remain composed, methodical, and quick-thinking in a high-pressure environment
  • Exceptional written and verbal English communication skills, with the ability to translate complex technical issues for internal and external stakeholders

Responsibilities

  • Act as an on-call Incident Commander, responsible for driving and managing incident resolution with a high level of urgency, cross-functional collaboration, and accuracy, while partnering with a global and diverse set of teams, including Engineering, Product, Policy, Risks, PR, Legal, Execs, etc.
  • Lead all user-facing incidents across domains at Stripe - including reliability, technical, security, and data privacy
  • Take a "User First" approach to determining impact, providing accurate situation reports, facilitating comms bridges, and ensuring useful and timely external communications to users
  • Proactively update internal stakeholders, make data-driven decisions, and influence outcomes by partnering with Engineering, Sales, Support, and other cross-functional teams
  • Contribute to the root cause analysis process by conducting post-mortems, identifying remediations, and ensuring problem management tasks meet SLA and user expectations
  • Drive improvements in the incident handling process and incident management metrics and tooling based on trends and data of Stripe's incidents in collaboration with engineering, product and operations teams
  • Collaborate closely with leadership to build team strategy based on the team vision
  • Collaborate with and coach other Incident Response Managers on the team

Preferred Qualifications

  • Domain expertise in classes of incidents such as technical, privacy, security or crisis with a strong desire to continuously learn about Stripe's products, technical issues and systems
  • Ability to review complex technical details regarding ongoing issues/events and convey the key details to senior stakeholders to facilitate real-time decision making
  • Experience with broad user-facing communications (e.g. status pages, tweets) and/or targeted communications (e.g. direct emails, support ticket responses)
  • Familiarity with operating or managing distributed architectures, with the ability to correlate system behaviors based on known inter-dependencies
  • Demonstrated experience with full-stack development and support
