Remote Lead Site Reliability Engineer at Henry Meds

Summary

The job is for a Lead Site Reliability Engineer (SRE) at Henry Meds, a fast-growing startup specializing in compounded medications. The role involves ensuring the reliability, scalability, and performance of complex systems and cloud infrastructure, collaborating with engineering and security teams, and setting the direction of DevOps culture.

Requirements

Experience in GCP working with stakeholders to develop and document resilient services across multiple edge and availability zones, with comprehensive documented disaster recovery plans, and regularly conducting drills and exercises to test and validate the effectiveness of these plans
Experience managing identity and access management to control resources and services in GCP, and working with stakeholders to develop security practices and procedures that ensure compliance with industry best practices and regulations
Experience managing the security and monitoring systems in our cloud that ensure our systems' health
Experience leading incident management processes, conducting post-mortems, and driving improvements to prevent future incidents
Experience setting up availability expectations, addressing performance issues, uncovering observability gaps, leading problem management, and driving capacity planning
Ability to manage cloud operations, including installing, maintaining, and monitoring network resources
Experience defining SLOs and SLIs, leading on-call support schedules, troubleshooting, building support playbooks, implementing monitoring, alerting, and logging standards, and conducting performance testing
Experience creating playbooks utilizing a chaos engineering mindset and resilience testing
Experience architecting Infrastructure as Code using Terraform

Responsibilities

Architect and create our observability and monitoring system
Create a disaster recovery plan and facilitate disaster recovery testing
Oversee teams who are responsible for the design, architecture, and development of operational infrastructure within our platform
Assist in hiring staff to perform daily operations and embed SRE operations across the department
Provide architectural and technical guidance and mentorship to SRE teams, fostering skill development, and building strong and capable SRE practices
Lead and prioritize multiple projects, create roadmaps, and drive implementation plans
Partner with product and engineering stakeholders to proactively identify operational needs and deliver solutions

Preferred Qualifications

10+ years of overall experience in a DevOps or Site Reliability Engineering environment
2+ years of leading Cloud SRE teams across AWS and Google Cloud Platform
5+ years of hands-on experience with infrastructure design and deployment utilizing Cloud PaaS and IaaS cloud offerings
5+ years of experience in cloud and system observability (Datadog, Grafana, Cloud Profiler) and alerting (OpsGenie, PagerDuty, GCP Cloud Monitoring)
5+ years of experience architecting and building infrastructure with a focus on redundancy, reliability, and disaster response and recovery
5+ years of configuration/management experience with cloud networking technologies (GCP IAM model, Terraform, gcloud-cli)
5+ years of cloud operations knowledge with automation solutions
5+ years of experience with cloud solutions (Google Cloud Platform), Cloud Run, containers, Terraform, GCS, C#, and TypeScript

Benefits

Platinum PPO Healthcare + Vision & Dental (Henry covers 99% for employees and 50% for their qualified dependents)
401(k) with matching contributions beginning your first day
Unlimited PTO
Fully remote position with occasional travel

Henry Meds is hiring a Lead Site Reliability Engineer, Remote - United States

Lead Site Reliability Engineer closed
