Director of Site Reliability Engineers at Cyware

Summary

Join Cyware as the Director of Site Reliability Engineers (SREs) and lead the team responsible for keeping all user-facing services and production systems running smoothly. You will guide and develop SREs, ensure systems are monitored, drive root cause analysis, and support on-call teams. The role involves leading automation efforts, defining and measuring SRE metrics, overseeing high availability and disaster recovery, and optimizing cloud infrastructure. Collaboration with engineering, security, and operations teams across time zones is crucial. The ideal candidate has extensive experience in SRE management and a strong technical background.

Requirements

US Citizenship
Bachelor's degree or higher in Computer Science, Engineering, IT, or a related discipline
7 to 10 years of total experience as an SRE
4 to 6 years of experience managing a team of SREs
Experience in knowledge sharing and mentoring team members
Self-awareness, handling conflict in the team, and providing and receiving feedback
Accountability: willing to proactively step in and do the right thing while providing candid and constructive feedback
Cloud: AWS/Azure/GCP
Linux: Solid understanding of Linux systems, sed/awk/grep/egrep, vi/Vim/Emacs, netstat, lsof, strace, ps/top/atop/dstat, GRUB boot config & system rescue, fstab/disk labels, ext3/ext4, iptables, sysstat (sar/vmstat/iostat, etc.), runlevels & startup scripts, sudo/chroot
Scripting: Bash/Python
Development Languages and Frameworks: Python/Django, Vue, React, Go
Fundamentals: Basic DNS & Networking, TCP/UDP, IP Routing, HA & Load Balancing Concepts
Application Protocols: SMTP, HTTP, HTTPS, FTP, IMAP, POP

Responsibilities

Guide and develop SREs, setting clear goals and fostering a high-performance culture
Ensure systems are monitored, drive root cause analysis, and support on-call teams to meet SLAs
Lead efforts to automate deployments, infrastructure provisioning, and operational tasks to minimize human error
Define and measure SRE metrics (SLIs, SLOs, SLAs) and drive continuous improvement
Oversee high availability (HA), disaster recovery (DR), and compliance monitoring
Manage and optimize cloud infrastructure using tools like Terraform, Kubernetes, and Jenkins
Ensure smooth deployments, operational readiness, and security compliance
Work across time zones to coordinate with engineering, security, and operations teams

Preferred Qualifications

Database Systems Fundamentals (MySQL/Postgres)
Redis
Nginx/Apache
Supervisor (supervisorctl)
Nagios
Yum
RPM
Git
Grafana
Prometheus
New Relic
ELK
Docker
Jenkins
Certifications: RHCSA/RHCE/AWS (SysOps)

Benefits

Time off
Paid holidays
Retirement plans
Insurance coverage
Professional development opportunities
Competitive compensation packages

Remote

DevOps

Director
