Remote Weekend Site Reliability Engineer

Sporty Group

📍 Remote - United States

Job highlights

Summary

Join our Site Reliability Team as we strive to be the best in the industry for our millions of weekly active users. We are building a team that will focus on site reliability and security and be involved in deployment, configuration, and monitoring, as well as availability, latency, change management, emergency response, and capacity management of services in production.

Requirements

  • 4+ years SRE/DevOps experience
  • Be based in Europe or Latin America
  • Experience independently leading the planning and deployment of a project
  • Experience with cloud platforms, especially AWS, including solid knowledge of how to use cloud resources to meet the demands of other teams and production
  • Familiarity with one programming or scripting language (Python, Java, etc.)
  • Experience managing multiple Kubernetes clusters in production (virtualization, orchestration, scalability, security, and high availability), with tools such as Helm, Rancher, and ArgoCD
  • Solid knowledge of networking protocols and cyber security, especially the TCP/IP stack and HTTP
  • A strong understanding of caching, including CDN and HTTP caching (Cloudflare, AWS CloudFront)
  • Experience with cloud-native monitoring solutions for large distributed systems using the observability model (traces, metrics, logs), with tools such as Prometheus, Jaeger, Loki, ELK, and Grafana
  • Excellent troubleshooting skills, including Linux OS issue diagnosis and OS parameter optimization

Responsibilities

  • Work across the weekend and your choice of weekdays
  • Work with a team of DevOps/SRE and DBA professionals
  • Improve the existing infrastructure and processes currently deployed, and streamline the processes for deploying to new countries in the future
  • Holistically improve all aspects of our current infrastructure, including reducing costs, streamlining environment provisioning, lowering response times, and incorporating the latest techniques and technologies
  • Monitor and maintain the existing cloud infrastructure via autoscaling, automated alerts, and OpsWorks and Grafana dashboards
  • Take ownership and responsibility for our cloud operation activities
  • Liaise with external security agencies for annual audits and perform our own internal security sweeps
  • Aid in reconfiguring existing architecture to allow for rapid deployments to new countries
  • Mentor less experienced team members

Benefits

  • Quarterly and flash bonuses
  • Core hours of 10am-3pm in your local time zone, with flexible hours outside of this
  • Top-of-the-line equipment
  • Referral bonuses
  • 28 days paid annual leave
  • Annual company retreat
