Senior Site Reliability Engineer II

ActBlue

💵 $173k-$210k
📍 Remote - Worldwide

Summary

Join ActBlue's team as a Senior Site Reliability Engineer II and contribute to the reliability, scalability, and performance of our infrastructure. You will lead reliability initiatives, design and implement robust systems, partner with engineering teams, and improve the developer and operational experience through automation. Working across multiple systems and projects, you'll define our approach to observability and incident response, shaping the engineering culture around reliability. You will maintain our Kubernetes-based platform and GitOps workflows, share expertise through code reviews and tech talks, and mentor other engineers. This is a full-time, remote, salaried position with benefits including flexible work schedules, unlimited time off, and fully paid health insurance.

Requirements

  • 6–8 years of experience in site reliability engineering, DevOps, systems engineering, or a related discipline
  • Experience designing and operating distributed systems in production, especially in AWS or other cloud environments
  • Proficiency with container orchestration tools (especially Kubernetes) and infrastructure-as-code (e.g., OpenTofu/Terraform)
  • Hands-on experience with observability and monitoring tools (e.g., Datadog, Prometheus, Grafana)
  • Strong scripting or programming skills (e.g., Go, Python, Bash)
  • Familiarity with CI/CD pipelines and GitOps practices (we use Flux and Spacelift)
  • A collaborative approach to engineering and a willingness to share knowledge across teams
  • Comfort navigating ambiguity and solving complex operational challenges
  • A strong commitment to supporting ActBlue's mission and building an inclusive technology infrastructure

Responsibilities

  • Design and implement scalable, resilient, and observable systems in a cloud-native environment
  • Lead reliability-focused efforts within the team and across the Platform Engineering org, including traffic management, incident management, and service-level objectives (SLOs)
  • Partner with application teams to ensure reliability concerns are considered early and throughout the development lifecycle
  • Participate in and continuously improve our on-call rotation through better automation, documentation, and tooling
  • Contribute to incident response, postmortems, and follow-up efforts, with an emphasis on reducing recurring pain and improving system understanding
  • Maintain and evolve our Kubernetes-based compute platform, GitOps workflows, and infrastructure-as-code tooling
  • Share your expertise through code reviews, RFCs, tech talks, and documentation
  • Act as a technical mentor and collaborative partner to engineers in Platform and Product Engineering

Preferred Qualifications

  • Experience with traffic management (e.g., NGINX ingress, CDN configuration, failover strategies)
  • Familiarity with PCI or SOC2 compliance requirements and best practices
  • Exposure to or interest in reliability testing practices (e.g., chaos engineering, load testing)
  • Contributions to platform modernization or reliability-focused initiatives in past roles

Benefits

  • Flexible work schedules and an unlimited time-off policy
  • Fully paid and trans-inclusive health, dental, and vision insurance for employees and their families, plus a fully paid health reimbursement arrangement for out-of-pocket expenses and fully paid short- and long-term disability
  • Fully paid basic and AD&D life insurance and a voluntary supplemental life insurance option
  • Dependent and health care flexible spending account options
  • Employee Assistance Program (EAP) benefits for employees
  • Automatic 2% employer-paid 401(k) contribution, plus up to an additional 6% match on employee contributions
  • A minimum of three months of paid medical, family, and parental leave (for all new parents, adoptions included)
  • Commuter or home-office benefits, including a $1,000 home-office setup allowance for all new full-time remote employees
  • Additional perks, including quarterly snack deliveries and digital subscriptions to the Boston Globe and the New York Times
