Summary

Join AlphaSense as a Senior SRE to tackle complex reliability challenges at scale. You will support developers and the AlphaSense platform, working on platforming and tooling, proposing and experimenting with solutions. The role involves adapting open-source solutions to meet our needs and shaping the SRE culture. You will elevate product reliability, collaborate with engineering teams, participate in on-call rotations, and diagnose and resolve production issues. This remote position, based in India, requires covering 8 hours of the US working day, with some flexibility on start time. The team is composed of talented individuals across Product, User Experience & Engineering.

Requirements

Strong experience with Linux, Kubernetes, and Helm
Knowledge of cloud-native architectures and modern web application behavior
Experience with Python (or a similar language) and AWS or GCP
Be adept at diagnosing complex issues and bottlenecks in operating systems, networks, and distributed systems
Understand complex, layered tech stacks from infrastructure to frontend. Identify failure points and implement preventative measures to protect software and systems
Thrive under pressure, demonstrating strong ownership and accountability
Quickly synthesize complex information for effective decision-making
Maintain composure and clear judgment under stress, ready to act decisively when needed while avoiding overreaction
Exhibit excellent communication skills, fostering effective collaboration and problem-solving during incident response and day-to-day operations
Have experience in system monitoring and an understanding of SLO, MTTI, MTTR, and MTTF
Actively monitor projects driven by other teams to uncover dependencies and understand their impact
Have the ability to decompose reliability problems or business scenarios into multi-component solutions

Responsibilities

Support developers and the AlphaSense platform, working on platforming and tooling, proposing and experimenting with solutions
Adapt open-source solutions to meet our needs and shape the SRE culture
Elevate product reliability to the level of precision associated with Swiss watch brands, targeting 99.99% uptime
Engage with our engineering teams and contribute to the improvement of their software applications through first-class observability
Participate in an on-call rotation, promptly addressing AlphaSense availability incidents and supporting application engineers during incidents
Diagnose and resolve production issues spanning multiple services and technology stacks
Analyze complex systems to pinpoint and address root causes of problems efficiently
Assist with daily production system operations, including incident response, troubleshooting, and maintenance
Help ensure smooth system operation, contribute to automation, and address operational challenges

Preferred Qualifications

Experience working with on-call Incident Response solutions (PagerDuty, FireHydrant)
Experience in modern monitoring systems (like Grafana LGTM stack)
Experience running production systems at scale

Benefits

Remote work from India
The role requires covering a full 8 hours of the United States working day (7 PM - 3 AM IST), with flexibility to start 1-2 hours later

Senior Site Reliability Engineer

AlphaSense

Remote

DevOps

Senior
