Incident Response Manager
closed
Stripe
Remote - Worldwide
Summary
Join Stripe's Incident Ops team as an Incident Response Manager (IRM) and play a crucial role in driving incident resolution. Lead user-facing incidents across various domains, ensuring timely communication and remediation. Collaborate with cross-functional teams to improve incident handling processes. Develop your skills in incident management, communication, and technical understanding of Stripe's products and services. Contribute to a 24/7 global team dedicated to maintaining Stripe's high reliability. This role requires strong incident management experience and excellent communication skills.
Requirements
- 3+ years of demonstrable major incident experience at organizations that run mission-critical applications or always-on SaaS environments
- Demonstrated ability to independently lead multiple incidents concurrently with minimal support and guidance from senior team members
- Basic understanding of application development, architectures, and cloud environments
- Familiarity with infrastructure concepts, including physical, virtual, and container-based compute platforms
- Practical experience using modern monitoring and telemetry tools such as Splunk, Prometheus, and Grafana
- Basic data analysis skills using SQL, Splunk, or other tools
- Strong task management skills, with attention to detail and ability to remain composed in high-pressure situations
- Good written and verbal English communication skills, with the ability to translate complex technical issues for various stakeholders
Responsibilities
- Act as an Incident Commander for incidents across various classes (reliability, technical, data privacy, product, or security), driving incident resolution with urgency and cross-functional collaboration
- Lead all user-facing incidents across domains at Stripe - including reliability, technical, security, and data privacy
- Take a "User First" approach to determining impact, providing accurate situation reports, facilitating comms bridges, and ensuring useful and timely external communications to users
- Update internal stakeholders and support decision-making processes during incidents
- Participate in the root cause analysis process, conduct post-mortems for routine incidents, and identify remediations
- Collaborate with engineering, product, and operations teams to improve incident handling processes and tooling
- Contribute to team culture and processes that enhance incident response capabilities
Preferred Qualifications
- Familiarity with different types of incidents (such as technical, privacy, security, or crisis) and eagerness to continually learn about Stripe's products and systems
- Experience in conveying key details of technical issues to stakeholders
- Experience with broad public-facing communications (e.g. status pages, tweets) and/or targeted communications (e.g. direct emails, support ticket responses)
- Familiarity with distributed architectures and system interdependencies that operate in a cloud environment
This job is filled or no longer available