Senior Systems Engineer

Spreedly
Summary
Join Spreedly as a Senior Systems Engineer and play a key role in scaling and strengthening our resilient payments platform. You will be central to initiatives advancing reliability and operational maturity across our infrastructure. Collaborate with various teams to simplify systems, enhance stability, and ensure safe, observable, and repeatable changes. Responsibilities include automating CI/CD pipelines, implementing IaC, ensuring system resiliency and observability, modernizing the platform, maintaining security and compliance, and providing operational support. You will also contribute to a culture of collaborative growth and knowledge sharing. This role requires a strong AWS background, DevOps experience, proficiency in IaC tooling, and experience with databases and middleware. Additional skills in payments industry compliance and project management are valued.
Requirements
- Strong background operating and scaling systems in AWS, with a solid understanding of core services such as EC2, IAM, VPC, S3, and CloudWatch
- Experience with DevOps and/or GitOps and an understanding of CI/CD methods
- Proficient with infrastructure as code and configuration management tooling (Terraform, Ansible, Packer, etc.)
- Comfortable working in and troubleshooting Linux environments, specifically Amazon Linux and Ubuntu distributions and their ecosystems
- Experience with databases (CockroachDB, Postgres) and middleware infrastructure (Kafka)
- Proficiency in modern CDN and edge platforms, including configuration and perform
- Experience with containers and container orchestration (Docker, ECS, EKS, Kubernetes)
- Able to deliver solutions in support of applications written in Ruby on Rails and Elixir
- Some programming proficiency in Python or Go. You might even be a software engineer with a focus on or passion for infrastructure!
- A desire to mentor other engineers and foster a collaborative environment to improve our software development processes
- Willingness to be a generalist and dig into new things you've never done before
- A pragmatic, action-oriented approach alongside a willingness to fail fast and pivot
- Ability to sort out immediate priorities from the quickly evolving needs of a rapidly growing organization
Responsibilities
- Support and evolve our build and deployment pipelines (AWS Developer Tools, GitHub Actions) to improve reliability, speed, and developer autonomy
- Implement, improve, and maintain IaC (Terraform, Ansible, Packer) to provision and manage infrastructure in a repeatable, auditable, and scalable manner
- Apply SRE principles to proactively monitor system health and meet strict availability targets. Use tools like Datadog, AWS native tooling, and OpenTelemetry to create actionable dashboards and alerts, enabling adherence to crucial SLOs
- Stabilize critical infrastructure by designing for fault tolerance at every layer through early failure detection, graceful degradation, and automated recovery mechanisms. Continuously reduce MTTD and MTTR through better alerting, runbooks designed for rapid execution, and streamlined recovery workflows
- Improve infrastructure maturity through clear, incremental changes that promote simplicity, reduce legacy complexity, and strengthen the integrity and standardization of the platform as it evolves
- Contribute to infrastructure that meets compliance standards by ensuring controls around access, data protection, and deployment integrity are built into the platform
- Build and maintain a clean, well-documented, and consistent platform. Favor clarity, shared ownership, and design choices that minimize operational overhead
- Take ownership of system issues through thoughtful troubleshooting and root cause analysis, improving the platform along the way wherever possible. Participate in an on-call rotation to support 24/7 API operations
- Strengthen shared expertise throughout the team and organization by contributing to a culture of reciprocal learning: sharing knowledge, seeking input, and collaborating openly across experience levels
Preferred Qualifications
- Experience with platforms such as Heroku and Fastly
- Payments industry experience and/or exposure to PCI, SOC2, or similar compliance environments
- Maturity in running projects end-to-end, including breaking down work, ensuring completion, and following up
Benefits
- Competitive salary + Equity
- Outstanding Medical and Dental benefits, including 100% employer-paid options
- Company-paid Life and Disability insurance
- Optional vision and supplemental insurance, plus various Flexible Spending Accounts (FSA)
- Open Paid Time Off policy + 12 weeks of paid leave for new parents
- Matching 401(k) plan (5% up to $5,000 yearly)
- Monthly home working/digital lifestyle stipend, new MacBook, and one-time accessory reimbursement
- $1,000 annual professional development stipend
- Access to a company-paid professional coaching service
- Visits to HQ in Durham, North Carolina for remote employees