Summary
Join Strike, a company revolutionizing global finance with its Bitcoin app, as its Head of SRE. You will lead and mentor a team of Site Reliability Engineers, set the SRE strategy, and drive operational excellence. As a hands-on leader, you will define best practices, collaborate with development teams, and ensure optimal system performance. The role requires strong leadership, technical expertise in cloud computing and related technologies, and excellent communication skills. Strike offers competitive compensation, equity, comprehensive benefits, and a collaborative work environment.
Requirements
- 5+ years in software engineering, systems administration, or site reliability engineering
- 2+ years in a leadership or management role
- Strong understanding of cloud computing, containerization, and orchestration technologies (e.g., GCP, Kubernetes, ArgoCD, Helm, Terraform)
- Extensive experience in cloud environments, including but not limited to GCP
- Proficiency in programming and scripting languages (e.g., Python, Go, Bash)
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
- Excellent problem-solving abilities and a proactive approach to challenges
- Strong communication and interpersonal skills, with the ability to work effectively in a team environment
Responsibilities
- Lead, mentor, and coach a team of Site Reliability Engineers, fostering their professional development and growth
- Define the SRE strategy and roadmap, aligning with company-wide goals and objectives
- Establish and champion SRE best practices, processes, and tools across the organization
- Work closely with development teams to design and build scalable, resilient systems
- Establish and monitor service level objectives (SLOs) and service level indicators (SLIs) to ensure optimal performance
- Respond to incidents, conduct thorough post-mortems, and implement preventive measures so issues do not recur
- Lead initiatives to automate manual tasks, improving operational efficiency and reducing errors
- Ensure adequate coverage for incident response through effective on-call rotation management
- Promote a culture of blameless post-mortems and continuous learning within the team
Benefits
- Salary range: $120K to $202K
- Equity in a high-growth startup
- Health, dental, and vision insurance premium contributions; short- and long-term disability insurance; and basic life insurance
- Cell phone and internet reimbursement
- Flexible PTO, sick leave, and parental leave
- Access to a company 401(k) plan
- No trading fees when you buy and sell bitcoin on Strike