Aurora Labs is hiring a
Site Reliability Engineer

Aurora Labs

💵 ~$82k-$120k
📍 Remote - Worldwide

Summary

Join Aurora Labs as a Site Reliability Engineer to help keep the critical systems behind its blockchain network running smoothly. The role is roughly 80% site reliability and 20% software engineering. Responsibilities include ensuring high availability, automating configuration and maintenance, optimizing latency and throughput, and contributing to various software engineering projects.

Requirements

  • Strong emphasis on SRE as an engineering subject area, with proficiency in Golang
  • Proven track record as a backend software developer for internet services
  • Knowledge of SDLC, including continuous integration and testing methodologies
  • Understanding of base internet infrastructure services, including DNS, HTTP, server virtualization, and server monitoring in critical, large-scale distributed systems
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts
  • Excellent verbal and written communication skills in English

Responsibilities

  • Ensuring high availability and failure tolerance of our infrastructure
  • Automating configuration and maintenance of software components such as K8s, NATS, InfluxDB, Postgres, and Cloudflare, using tools such as Ansible, Terraform, Helm, and Kubernetes operators
  • Designing and implementing cloud-agnostic solutions that do not rely exclusively on specific cloud vendors
  • Automating the management of validator and RPC nodes
  • Optimizing the latency and throughput of the pub-sub infrastructure
  • Incident management, monitoring, distributed tracing and recovery automation

Preferred Qualifications

  • Experience with development within the Kubernetes ecosystem, including the operator framework, controllers, and CRDs
  • Experience with streaming and pub-sub systems such as NATS, Apache Kafka, and Apache Pulsar
  • Hardware bootstrap and associated security
  • Structured or unstructured storage and caching
  • Automating operations processes via services and tools
  • Configuration management and fleet orchestration via Puppet, Chef, Ansible, or others
  • Cloud Services (AWS S3/EC2/CloudFront or equivalent)

Please let Aurora Labs know you found this job on JobsCollider. Thanks! 🙏