Remote Site Reliability Engineer, Cloud

closed

YugaByte

📍 Remote - Canada

Job highlights

Summary

Join the Database Revolution at Yugabyte. As a Site Reliability Engineer focused on database availability and reliability, you will operate and automate the lifecycle of the YugabyteDB DBaaS.

Requirements

  • Strong software design and implementation skills in building infrastructure frameworks
  • Experience building and operating data systems for production applications, including fault-tolerant designs, software lifecycles, and automation of critical operations
  • Strong track record of incident response and management in a managed service that is mission-critical for its customers
  • Experience with:
      • Relational database systems (PostgreSQL preferred)
      • Public cloud infrastructure (AWS, GCP, and/or Azure)
      • Containerization tooling, theory, and design (Docker, Kubernetes)
      • Infrastructure as Code (Terraform preferred)
      • Configuration management tooling (Ansible preferred)
      • Automation scripting (Python and Bash preferred)
      • Monitoring systems (Prometheus preferred)
      • Version control systems (Git preferred)
      • CI/CD systems (GitHub Actions preferred)
  • Solid understanding of Linux systems operations and troubleshooting
  • Willingness and ability to learn new languages and concepts
  • 1-6 years of relevant experience

Responsibilities

  • Design, develop, test, debug, troubleshoot, and maintain components of the DBaaS cloud infrastructure
  • Manage operational priorities of the DBaaS infrastructure
  • Establish processes for handling incidents on databases or infrastructure, and lead the response to them
  • Automate and manage regular maintenance operations such as upgrades
  • Design and build DBaaS processes for encryption, security key/password management, storage management, etc.
  • Utilize SRE golden signals to analyze and optimize the DBaaS system's performance and reliability strategies

This job is filled or no longer available.