Summary

Join AppOmni as a Senior Site Reliability Engineer (SRE) and ensure the reliability, scalability, and performance of our systems and infrastructure. You will monitor system availability, automate deployment and maintenance, and proactively identify optimization areas. Collaborate with the development team to establish service-level objectives and lead incident response and postmortem analysis. This role requires excellent communication skills and 5+ years of hands-on experience with Python or Golang. AppOmni offers a hybrid work model with hub cities in San Francisco & San Jose (CA), Denver (CO), Lexington (KY), and New York City (NY). We are committed to supporting our employees' financial, professional, and personal well-being.

Requirements

Excellent technical and non-technical communication skills
Prior experience as an SRE or in a related discipline, responsible for maintaining high availability of a cloud-based application, troubleshooting performance bottlenecks, configuring monitoring and alerting, and conducting incident response in a blameless environment
A knack for reducing manual toil through automation and systematic thinking
Prior experience working with CI/CD tools and processes, including pipelines-as-code (GitHub Actions, CircleCI)
At least 5 years of hands-on experience with Python or Golang
A solid background in configuration management and infrastructure-as-code (Terraform)
Solid experience in monitoring/observability systems (Grafana, Prometheus, etc.)
Demonstrated knowledge of container orchestration (Kubernetes/GKE)
Experience managing Kubernetes platforms and resources, and using Kubernetes deployment tools and patterns (Helm, GitOps, Knative)

Responsibilities

Ensure the reliability, scalability, and performance of our systems and infrastructure
Monitor system availability
Implement automation for deployment and maintenance tasks
Proactively identify areas for optimization
Collaborate with the development team to establish and refine service-level objectives
Drive incident response and postmortem analysis to minimize service disruptions

Preferred Qualifications

Experience in FedRAMP or similar secure environments
Expertise working within highly controlled environments containing sensitive information
Experience designing and maintaining CI/CD pipelines using commercial solutions
Experience working on and within GCP and/or AWS

Benefits

Working remotely
New hire home office/computer equipment stipend
Generous paid time off
Paid company holidays
Paid floating holidays
Paid parental leave
Paid sick time
Paid family leave for applicable states
Health insurance - medical, dental, and vision with HSA option
LifeWorks Employee Assistance Program
Company-provided life insurance
AD&D
STD/LTD and additional supplemental life insurance options
401(k) and Roth retirement saving accounts
A monthly wellness benefit reimbursement
