Senior Site Reliability Engineer at Juniper Square

Summary

Join Juniper Square as a Senior Site Reliability Engineer (SRE) to scale, secure, and improve our cloud infrastructure using modern cloud-native technologies. You will collaborate with software engineers and the platform team, building and maintaining self-service tools. This role demands ownership, a bias for action, and problem-solving skills. You will own reliability and scalability initiatives, participate in on-call rotations, and design, deploy, and manage Kubernetes clusters. The position requires strong experience with Kubernetes, AWS services, PostgreSQL, CI/CD automation, and IaC. Juniper Square offers a variety of work arrangements and a competitive compensation package.

Requirements

5+ years of experience in SRE, DevOps, or Infrastructure Engineering with a proven track record of ownership and initiative
Strong experience with Kubernetes, Helm, and CNIs, including networking and security
Proficiency in AWS services such as RDS, Aurora, IAM, VPC, EKS, and EC2
Experience in PostgreSQL administration, including performance tuning and high availability in RDS/Aurora
Hands-on experience with GitHub Actions and ArgoCD for secure and scalable CI/CD automation
Strong background in Infrastructure as Code (IaC) with Crossplane and Terraform
Deep understanding of observability and monitoring with Datadog
Experience with Kyverno for Kubernetes policy-based security enforcement
Proficiency in Python and Bash scripting for automation and system management
Strong understanding of CI/CD security best practices and ability to implement controls for securing deployments
Self-starter mentality: actively seeks out and fixes problems without waiting for assignments
High ownership and accountability: takes initiative in driving improvements and following through to resolution
Strong problem-solving mindset: identifies bottlenecks, inefficiencies, and risks, then delivers scalable solutions
Excellent communication skills: documents processes in Confluence, collaborates cross-functionally, and influences engineering teams toward operational excellence

Responsibilities

Own reliability and scalability initiatives: identify, prioritize, and implement solutions before issues escalate
Participate in an on-call rotation, responding to incidents, performing root cause analysis, and driving long-term fixes
Design, deploy, and manage Kubernetes clusters using Helm charts, Cilium, and Karpenter to optimize performance and cost
Architect and maintain AWS infrastructure with a focus on RDS/Aurora PostgreSQL, networking, and scaling best practices
Implement GitHub Actions CI/CD pipelines, integrating security best practices and automation
Define and enforce policy-based security for Kubernetes using Kyverno
Automate infrastructure provisioning with Crossplane and Terraform to ensure consistency and scalability
Enhance observability and monitoring using Datadog to proactively detect and resolve issues
Improve security and reliability by identifying risks in CI/CD, cloud environments, and Kubernetes, then implementing necessary safeguards
Lead post-incident reviews, drive lessons learned into long-term improvements, and document best practices in Confluence

Preferred Qualifications

Deep experience with GitHub Actions for CI/CD automation, with a focus on security best practices
Extensive knowledge of Helm charts for managing Kubernetes applications
Strong experience in PostgreSQL, including optimization and high availability in RDS/Aurora
Experience with NoSQL databases and best practices for scaling and performance
Proven ability to influence engineering culture toward automation, self-service, and operational excellence
Experience with Karpenter for Kubernetes autoscaling
Previous experience with cost optimization strategies in AWS environments
Experience with Atlassian tools (Jira, Confluence) for tracking incidents and documentation
Strong experience with AI and a passion for applying it to SRE and DevOps work

Benefits

Health, dental, and vision care for you and your family
Life insurance
Mental wellness coverage
Fertility and growing family support
Flex Time Off in addition to company paid holidays
Paid family leave, medical leave, and bereavement leave policies
Retirement savings plans
Allowance to customize your work and technology setup at home
Annual professional development stipend

Remote

DevOps

Senior
