Summary

Join Strike, a company building a global Bitcoin app, and become a Site Reliability Engineer based in Europe. Lead technical initiatives to improve system reliability, performance, and scalability. Architect and implement advanced, resilient solutions leveraging your deep understanding of distributed systems. Master troubleshooting and optimization, and build automation frameworks. Elevate observability practices and provide leadership in incident management. Mentor and guide other engineers. This role requires extensive experience in SRE, systems engineering, or software development with a strong operational focus.

Requirements

At least 5 years of experience in SRE, systems engineering, or software development with a strong operational focus
Demonstrated experience in providing technical leadership, guidance, or mentorship to engineering teams
Expert-level practical knowledge of cloud platforms, especially GCP
Deep hands-on experience with container orchestration (Kubernetes) and with infrastructure-as-code and deployment tooling (Terraform, Helm, ArgoCD)
Strong command of multiple scripting and programming languages (Python, Go, Bash)
Proven expertise in building and leveraging advanced monitoring and observability tools (Prometheus, Grafana, ELK stack)
Exceptional analytical, problem-solving, and debugging skills at a senior level
Excellent communication, collaboration, and influencing skills

Responsibilities

Lead Technical Initiatives: Drive key technical initiatives focused on improving the reliability, performance, and scalability of our critical systems, often leading technical aspects within projects
Architect and Implement Advanced Solutions: Design and implement sophisticated, resilient, and scalable solutions, leveraging your deep understanding of distributed systems
Master Troubleshooting and Optimization: Lead complex troubleshooting efforts, identify deep-seated root causes, and implement advanced optimizations
Build and Evangelize Automation: Develop and champion the adoption of robust automation frameworks and tools, potentially guiding more junior engineers in their development
Elevate Observability Practices: Design and implement comprehensive and insightful monitoring and logging solutions, ensuring actionable insights are available across teams
Provide Leadership in Incident Management: Take a leadership role in incident response, providing critical technical direction and mentorship during high-pressure situations
Champion Post-Mortem Excellence: Lead and contribute to in-depth, blameless post-mortem analyses, driving significant improvements based on what is learned
Mentor and Guide Team Members: Share your extensive knowledge and experience to mentor and guide other SREs and engineers, fostering their technical growth

Benefits

Compensation is location-dependent

Site Reliability Engineer

Strike

Remote

DevOps

Mid-level
