StarTree is hiring a
Senior Staff - Site Reliability Engineer
Salary: ~$150k-$222k
Location: Remote - India
Summary
StarTree is seeking a seasoned Site Reliability Engineer (SRE) to manage, tune, and debug large-scale distributed systems, with a focus on Apache Pinot and SQL databases. The role involves collaborating with customers, executing disaster recovery strategies, and influencing the roadmaps of other teams.
Requirements
- 12+ years of experience as an engineer (SRE, SDET, or development)
- Experience managing highly available, production-facing distributed systems and in-depth knowledge of Java are a plus
- Experience with cloud platforms such as AWS, GCP, or Azure
- Experience with Kubernetes and container orchestration
- Familiarity with streaming systems, such as Kafka, Pulsar, Flume, Flink, Spark, or similar
- Knowledge of standard methodologies related to security, performance, and disaster recovery
- Strong troubleshooting and critical thinking skills
Responsibilities
- Leverage various monitoring and alerting services to solve intricate programming problems at scale
- Manage and tune multiple critical customer-facing Apache Pinot clusters
- Monitor availability, read/write latencies, and other key telemetry to proactively identify SLO misses and help mitigate issues
- Build rapport and work closely with customers to mitigate and resolve incidents
- Execute disaster recovery strategies with minimal downtime
- Collaborate with other engineers to understand and troubleshoot systems, and use that experience to influence the roadmaps of other teams
This job is filled or no longer available