Summary

Join SEON's Site Reliability Engineering (SRE) team as a highly experienced and motivated SRE Manager to lead a team of Site Reliability Engineers. You will play a crucial role in keeping our products and services reliable and efficient while coordinating with cross-functional teams across geographical regions. The role is based in Budapest with a hybrid schedule, or remote within the European Union with occasional travel. You will lead and grow a high-performing SRE team, own incident management, drive implementation of SLAs and SLOs, champion automation, collaborate with engineering teams, and oversee system monitoring. You will also manage on-call rotations, drive continuous improvement, ensure compliance, mentor team members, and communicate effectively with stakeholders.

Requirements

Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
Proven success in leading high-performing SRE or DevOps teams in a large-scale, fast-paced environment
Extensive experience running high-availability web services at a large scale, with comprehensive knowledge of cloud-native architectures and advanced networking concepts
Strategic vision to balance immediate operational needs with long-term reliability and scalability objectives
Outstanding communication and interpersonal skills, with the ability to build strong relationships with team members and stakeholders
Strong technical background with hands-on experience in cloud computing, system architecture, automation, and monitoring
Excellent problem-solving skills with a focus on root cause analysis and proactive improvements
Exceptional organizational skills, with the ability to manage multiple priorities and projects simultaneously
Experience with tools and technologies such as AWS, Kubernetes, Terraform, Prometheus, Grafana, Jenkins, or similar

Responsibilities

Lead and grow a high-performing SRE team responsible for the reliability, performance, and scalability of production systems
Own the incident management process, postmortems, and root cause analysis to improve system resilience
Drive implementation of SLAs, SLOs, and error budgets across services to align operational goals with business objectives
Champion the use of automation to reduce manual work and improve deployment and recovery times
Collaborate with software engineering and platform engineering teams to ensure systems are designed for reliability and operational efficiency
Oversee system monitoring, alerting, and observability efforts using tools like Prometheus, Grafana, Datadog, or similar
Manage on-call rotations and ensure that documentation, runbooks, and playbooks are properly maintained
Identify and drive continuous improvement in system architecture, capacity planning, and deployment strategies
Ensure compliance with security, privacy, and regulatory requirements within the infrastructure
Provide mentorship, performance reviews, and career development opportunities for SRE team members
Communicate effectively with stakeholders at all levels, providing updates on team performance, project status, and incident resolutions
Advocate for the SRE team within the broader organization, representing its needs and concerns

Preferred Qualifications

Cloud Architect Certification in one of the public clouds (AWS, GCP, Azure)
Good knowledge of security controls for SOC 2 and ISO certifications

Benefits

This role offers flexibility. It can be based in Budapest with a hybrid schedule or anywhere in the European Union with a remote setup, including occasional travel to our other offices

Senior Manager, Site Reliability Engineering

SEON

Remote

DevOps

Manager
