Staff Site Reliability Engineer
closed
ModMed
Summary
Join Modernizing Medicine (ModMed) as a Staff Cloud Engineer and play a key role in ensuring the reliability, performance, and scalability of our AWS cloud infrastructure. You will architect and manage secure cloud infrastructure, implement and refine monitoring solutions using DataDog, lead CI/CD pipeline development with Jenkins, and drive containerization and orchestration with Kubernetes. This role requires expertise in AWS, DataDog, Jenkins, and Kubernetes, along with strong problem-solving and communication skills. You will mentor junior team members and collaborate with cross-functional teams. ModMed offers a competitive benefits package including comprehensive medical, dental, and vision benefits, 401(k) matching, generous paid time off, life and disability insurance, professional development opportunities, and a hybrid work environment.
Requirements
- Bachelor's degree required in Computer Science, Information Technology, or related field, or equivalent experience
- A minimum of 8-10 years of experience in Site Reliability Engineering, Cloud Engineering, or a similar role, with a demonstrated track record of problem-solving in complex, cloud-based environments. This should include extensive experience designing, implementing, and managing scalable, highly available, and fault-tolerant systems
- Strong expertise in managing cloud environments (preferably in AWS), with hands-on experience in observability platforms such as DataDog
- Proficiency in automation and scripting languages (e.g., Python, Bash) and infrastructure as code (IaC) tools (e.g., Terraform, Ansible)
- Extensive experience with CI/CD tools, notably Jenkins, and familiarity with containerization and orchestration technologies like Kubernetes
- Solid understanding of networking, cloud security best practices, performance optimization, and cost management strategies
- Demonstrated commitment to implementing industry-standard site reliability principles and a proactive approach to cost management in daily operations
- Proven leadership skills and the ability to mentor junior team members, guide teams through complex operational challenges, and foster a culture of continuous improvement
- Excellent verbal and written communication skills, with the ability to work effectively in a team environment and communicate complex technical concepts to a non-technical audience
Responsibilities
- Architect and manage secure, scalable cloud infrastructure and services, focusing on automation, reliability, and proactive cost management to ensure efficient operations
- Implement and refine observability and monitoring solutions using DataDog, ensuring proactive issue identification and efficient resource utilization
- Lead CI/CD pipeline development, maintenance, and optimization with Jenkins, integrating AWS services to enhance development workflows and infrastructure automation
- Drive the containerization and orchestration of applications using Kubernetes, enhancing scalability, deployment efficiency, and cost-effectiveness
- Monitor application and infrastructure performance in AWS, applying tuning and optimizations to ensure optimal resource utilization and user experience while managing costs
- Design and manage disaster recovery and backup strategies on AWS, prioritizing data integrity, system availability, and cost efficiency
- Provide expert troubleshooting and problem-solving across various platforms and applications within AWS, aiming for minimal disruption and quick resolution
- Ensure strict adherence to AWS security standards and compliance with data protection regulations, with a keen eye on cost implications
- Keep abreast of new cloud technologies and trends, recommending and implementing improvements for competitive advantage and cost savings
- Mentor and support junior team members, fostering a culture of learning, collaboration, and cost-consciousness
- Work closely with cross-functional teams to understand requirements and deliver AWS-based solutions that meet business objectives efficiently and cost-effectively
Benefits
- Comprehensive medical, dental, and vision benefits, including a company Health Savings Account contribution
- 401(k): ModMed provides a matching contribution each payday equal to 50% of the amount you defer, on deferrals of up to 6% of your compensation. After one year of employment with ModMed, 100% of any matching contribution you receive is yours to keep
- Generous Paid Time Off and Paid Parental Leave programs
- Company-paid Life and Disability benefits, Flexible Spending Account, and Employee Assistance Programs
- Company-sponsored Business Resource & Special Interest Groups that provide engaged and supportive communities within ModMed
- Professional development opportunities, including tuition reimbursement programs and unlimited access to LinkedIn Learning
- Global presence and in-person collaboration opportunities; dog-friendly HQ (US); hybrid office-based roles, with remote availability for some roles
- Weekly catered breakfast and lunch, treadmill workstations, and Zen and wellness rooms within our BRIC headquarters