Aurora Labs is hiring a
Site Reliability Engineer, Remote - Worldwide

Site Reliability Engineer

🏢 Aurora Labs

💵 ~$180k-$252k
📍 Worldwide

Summary

This role is for a Site Reliability Engineer at Aurora Labs, responsible for ensuring high availability and failure tolerance of infrastructure, automating the configuration and maintenance of software components, designing cloud-agnostic solutions, and working on various software engineering projects. Candidates should have experience in SRE, Golang, and backend internet services development, knowledge of the SDLC, an understanding of base internet infrastructure services, and excellent communication skills.

Requirements

  • Strong focus on SRE as an engineering discipline, with proficiency in Golang
  • Proven track record as a backend internet services software developer
  • Knowledge of SDLC, including continuous integration and testing methodologies
  • Understanding of base internet infrastructure services, including DNS, HTTP, server virtualization, and server monitoring, in critical, large-scale distributed systems
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts
  • Excellent verbal and written communication skills in English

Responsibilities

  • Ensuring high availability and failure tolerance of our infrastructure
  • Automating configuration and maintenance of software components such as K8s, NATS, InfluxDB, Postgres, and Cloudflare, using tools such as Ansible, Terraform, Helm, and Kubernetes operators
  • Designing and implementing cloud-agnostic solutions that do not rely exclusively on any specific cloud vendor
  • Automating the management of validator and RPC nodes
  • Optimizing the latency and throughput of the pub-sub infrastructure

Preferred Qualifications

  • Experience with development within the Kubernetes ecosystem, including the operator framework, controllers, and CRDs
  • Experience with streaming and pub-sub systems such as NATS, Apache Kafka, and Apache Pulsar
  • Hardware bootstrap and associated security
  • Structured or unstructured storage and caching
  • Automating operations processes via services and tools
  • Configuration management and fleet orchestration via Puppet, Chef, Ansible, or others
  • Cloud Services (AWS S3/EC2/CloudFront or equivalent)
