Remote Lead Site Reliability Engineer


AppOmni

💵 $164k-$225k
📍 Remote - United States

Job highlights

Summary

Join our team as a Lead Site Reliability Engineer (SRE) and play a key role in improving the reliability, scalability, and performance of the AppOmni platform. As a leader in engineering enablement efforts, you will solve interesting challenges, build scalable infrastructure solutions, develop internal tools and frameworks, and ensure high availability of our platform's services.

Requirements

  • Bachelor's degree in Computer Science, Software Engineering, or related field. Significant equivalent work experience will also be considered
  • 7+ years of experience in SRE, Platform Engineering, Cloud Infrastructure, DevOps, or related disciplines with responsibilities for maintaining high availability of a cloud-based application. Prior experience in a leadership role is a significant plus
  • Strong expertise in AWS, Azure, and/or GCP cloud environments
  • Experience with containerization and orchestration using Kubernetes
  • 5+ years of hands-on experience with Bash, Python, and/or Golang
  • Hands-on experience with infrastructure as code (IaC) tools such as Terraform, Ansible, CloudFormation, Pulumi, or similar technologies
  • Strong knowledge of CI/CD pipeline tools and frameworks like GitHub Actions and GitLab CI to drive Developer Enablement (DevEx) improvements
  • Deep hands-on experience with monitoring, alerting, and incident management tools such as Prometheus, VictoriaMetrics, Grafana, Sentry, and PagerDuty
  • Experience in managing cloud-native services such as managed databases, queues, and caches
  • Demonstrable expertise in networking, cloud computing, security in cloud environments, and distributed systems
  • Excellent written and verbal technical and non-technical communication skills to convey details across distributed teams
  • Experience championing highly effective cross-functional technical discussions and demonstrating deep understanding of SDLC concepts

Responsibilities

  • Lead the design, implementation, and maintenance of reliable, scalable platforms to support the development and deployment of cloud-native applications
  • Monitor system performance and troubleshoot platform issues
  • Optimize alerting, logging, and resource utilization to ensure platform and application reliability
  • Develop, maintain, and optimize CI/CD pipelines for rapid and reliable software delivery
  • Implement automation frameworks to eliminate manual processes
  • Implement and manage infrastructure as code (IaC) to automate infrastructure provisioning and scaling
  • Lead capacity planning and platform performance optimization efforts
  • Participate in contingency and disaster recovery planning, demand forecasting, and system performance tuning
  • Champion best practices to enhance infrastructure agility, resiliency, and security
  • Be a part of our on-call rotation and incident management practices
  • Manage Kubernetes platforms and resources using deployment tools and patterns such as Helm, Knative, and GitOps

Benefits

  • Base Salary: The annual base salary compensation range in the U.S. for this role is: $164,000 - $225,500
  • Stock Options: Our vision is to not just grow as a company but to grow together. By offering stock options, we are inviting you to be an integral part of our journey forward
  • Benefits: The many benefits of employment with AppOmni include working remotely, new hire home office / computer equipment stipend, generous paid time off, paid company holidays, paid floating holidays, paid parental leave, paid sick time and paid family leave for applicable states, health insurance (medical, dental, and vision with HSA option), LifeWorks Member Assistance Plan, company-provided life insurance, AD&D, STD/LTD and additional supplemental life insurance options, 401(k) and Roth retirement savings accounts, and a monthly wellness benefit reimbursement
