Summary

Join Vultr as a Staff Site Reliability Engineer. You will work with cross-functional teams to automate operations, design state-of-the-art cloud provider solutions, and enhance the resilience and stability of our systems.

Requirements

3+ years of experience in a hands-on SRE role delivering distributed architectures
2+ years working with and maintaining Kubernetes clusters for highly available and regulated environments
2+ years of hands-on experience with a modern Grafana stack, including Mimir, Loki, and Tempo
Comfortable working with complex CI/CD pipelines (GitLab/Jenkins), configuration management (Puppet/Salt), and IaC solutions such as Terraform
Experience working with observability pipelines or OpenTelemetry is a plus
A background in performance optimization for web stacks, including components such as PHP-FPM, Nginx, and MySQL
Strong programming skills in Python, Golang, or PHP, and a willingness to pick up new technologies quickly

Responsibilities

Collaborate with cross-functional teams to craft and implement a modern observability stack and refine our incident-handling processes
Design and contribute to state-of-the-art cloud provider solutions for high-performance computing, AI training, and inference workloads, focusing on Observability and MLOps
Enhance the resilience and stability of our systems, as part of the platform team, through thoughtful software improvements, architecture, and automation
Contribute to solutions for challenges ranging from low-level hardware issues to scaling high-level distributed applications, and everything in between
Champion DevOps and SRE principles through automation, thought leadership, and close collaboration within our engineering team
Enhance customer experience by improving case handling: strive for proactive responses, rich insights, and automated resolutions
Develop robust documentation to streamline the handling of recurring reliability issues, paving the way for junior SREs to take the helm confidently
Identify and implement scalable solutions to address technical challenges within our stack, setting new benchmarks for innovation

Remote Staff Site Reliability Engineer

Vultr

Remote

DevOps

Mid-level
