Senior Site Reliability Engineer

Obsidian Security
Summary
Join Obsidian Security's DevOps/SRE team and ensure engineering excellence translates into stable, scalable, and high-performing production systems. Collaborate with engineering, quality engineering, and customer support teams to deliver end-to-end services. Support and maintain the service quality of the customer-facing SaaS security platform, addressing complex challenges around scalability, reliability, observability, and cost efficiency. Maintain and enhance Helm charts, application deployment, monitoring, and CI/CD pipelines. Define service verification strategies and implement them as part of the CI/CD process to meet SLAs. Improve developer experience by optimizing CI/CD workflows and performance. Participate in the on-call rotation, providing 24/7 support. Monitor, debug, and optimize production infrastructure and services on AWS/GCP. Obsidian offers competitive benefits, including comprehensive healthcare, flexible paid time off, parental leave, and professional development resources.
Requirements
- 4+ years of experience in a DevOps or SRE role supporting SaaS services on GCP and/or AWS
- Bachelor's degree in Computer Science or related field
- Strong proficiency in Kubernetes, microservices architecture, Helm, GitLab CI/CD, ArgoCD, Prometheus, and Grafana
- Programming experience in at least one language; Golang or Python preferred
- Deep understanding of autoscaling, version upgrades, and cloud service optimization
Responsibilities
- Support and maintain the service quality of our customer-facing SaaS security platform
- Address complex challenges around scalability, reliability, observability, and cost efficiency
- Collaborate with engineering teams to maintain and enhance Helm charts, application deployment, monitoring, and CI/CD pipelines
- Embed with the engineering team to develop a deep understanding of the application
- Define service verification strategies and implement them as part of the CI/CD process to meet SLAs
- Improve developer experience by optimizing CI/CD workflows and performance
- Participate in the on-call rotation, providing 24/7 support in coordination with our global SRE team
- Monitor, debug, and optimize production infrastructure and services on AWS/GCP
Preferred Qualifications
- Familiarity with technologies such as Kafka, Elasticsearch, PostgreSQL, ScyllaDB, Databricks, Dagster, Sentry, and Kong is a bonus
Benefits
- Competitive compensation with equity and 401(k)
- Comprehensive healthcare with dental and vision coverage
- Flexible paid time off and paid holidays
- 12 weeks of new parent or family leave
- Personal and professional development resources