Site Reliability Engineer

Arista Networks

πŸ“Remote - United States, Ireland

Summary

Join Arista Networks, a leader in data-driven networking, as an SRE responsible for the global CloudVision service fleet. You will build and manage the CI/CD lifecycle, improve operational processes through automation, identify key service indicators, own disaster recovery, drive infrastructure and cloud-based application security design, lead incident response, and be an active member of the on-call team. CloudVision is a SaaS offering deployed on Kubernetes across global regions using Spinnaker for CI/CD. The tech stack includes GKE, HBase/Hadoop, Elasticsearch, ClickHouse, Kafka, TensorFlow, Prometheus, Grafana, Loki, and other OSS tools.

Requirements

  • BS/MS degree in Computer Science or a related subject, or relevant experience
  • 4+ years of software engineering experience
  • Experience developing or managing deployments of distributed database systems or scale-out applications in a SaaS environment

Responsibilities

  • Building the CI/CD lifecycle for services, from inception and design to deployment and scaling
  • Improving operational processes through automation
  • Identifying key service indicators to be used in capacity planning
  • Owning disaster recovery and management
  • Driving infrastructure and cloud-based application security design
  • Leading sustainable incident response and blameless postmortems
  • Being an active member of our globally distributed on-call team
