Staff Site Reliability Engineer

ModMed
Summary
Join Modernizing Medicine as a Staff Cloud Engineer and spearhead efforts to ensure the reliability, performance, and scalability of our AWS cloud infrastructure and services. You will play a pivotal role in automating processes, optimizing costs, and applying industry-standard site reliability principles. Leveraging your expertise in AWS, DataDog, Jenkins, and Kubernetes, you will lead the evolution towards more efficient and cost-effective cloud-native technologies. This role requires a Bachelor's degree and 8-10 years of relevant experience. You will work closely with cross-functional teams and mentor junior team members. Modernizing Medicine offers a competitive benefits package, including health insurance, retirement benefits, paid time off, and professional development opportunities.
Requirements
- Bachelor's degree required in Computer Science, Information Technology, or related field, or equivalent experience
- 8-10 years of experience in Site Reliability Engineering, Cloud Engineering, or a similar role, with a demonstrated track record of problem-solving in complex, cloud-based environments, including extensive experience designing, implementing, and managing scalable, highly available, and fault-tolerant systems
- Strong expertise in managing cloud environments (preferably in AWS), with hands-on experience in observability platforms such as DataDog
- Proficiency in automation and scripting languages (e.g., Python, Bash) and infrastructure as code (IaC) tools (e.g., Terraform, Ansible)
- Extensive experience with CI/CD tools, notably Jenkins, and familiarity with containerization and orchestration technologies like Kubernetes
- Solid understanding of networking, cloud security best practices, performance optimization, and cost management strategies
- Demonstrated commitment to implementing industry-standard site reliability principles and a proactive approach to cost management in daily operations
- Proven leadership skills and the ability to mentor junior team members, guide teams through complex operational challenges, and foster a culture of continuous improvement
- Excellent verbal and written communication skills, with the ability to work effectively in a team environment and communicate complex technical concepts to a non-technical audience
Responsibilities
- Architect and manage secure, scalable cloud infrastructure and services, focusing on automation, reliability, and proactive cost management to ensure efficient operations
- Implement and refine observability and monitoring solutions using DataDog, ensuring proactive issue identification and efficient resource utilization
- Lead CI/CD pipeline development, maintenance, and optimization with Jenkins, integrating AWS services to enhance development workflows and infrastructure automation
- Drive the containerization and orchestration of applications using Kubernetes, enhancing scalability, deployment efficiency, and cost-effectiveness
- Monitor application and infrastructure performance in AWS, applying tuning and optimizations to ensure optimal resource utilization and user experience while managing costs
- Design and manage disaster recovery and backup strategies on AWS, prioritizing data integrity, system availability, and cost efficiency
- Provide expert troubleshooting and problem-solving across various platforms and applications within AWS, aiming for minimal disruption and quick resolution
- Ensure strict adherence to AWS security standards and compliance with data protection regulations, with a keen eye on cost implications
- Keep abreast of new cloud technologies and trends, recommending and implementing improvements for competitive advantage and cost savings
- Mentor and support junior team members, fostering a culture of learning, collaboration, and cost-consciousness
- Work closely with cross-functional teams to understand requirements and deliver AWS-based solutions that meet business objectives efficiently and cost-effectively
Benefits
- Comprehensive medical, dental, and vision benefits, including a company Health Savings Account contribution
- 401(k): Each payday, ModMed matches 50% of your contributions on up to 6% of your compensation. After one year of employment with ModMed, 100% of any matching contribution you receive is yours to keep
- Generous Paid Time Off and Paid Parental Leave programs
- Company-paid Life and Disability benefits, Flexible Spending Account, and Employee Assistance Programs
- Company-sponsored Business Resource & Special Interest Groups that provide engaged and supportive communities within ModMed
- Professional development opportunities, including tuition reimbursement programs and unlimited access to LinkedIn Learning
- Global presence and in-person collaboration opportunities; dog-friendly HQ (US); hybrid office-based roles and remote availability for some roles
- Weekly catered breakfast and lunch, treadmill workstations, and Zen and wellness rooms within our BRIC headquarters
- Meals & Snacks: Enjoy complimentary office lunches & dinners on select days and healthy snacks delivered to your desk
- Insurance Coverage: Comprehensive health, accidental, and life insurance plans, including coverage for family members, all at no cost to employees
- Allowances: Annual wellness allowance to support your well-being and productivity
- Earned, casual, and sick leave to maintain a healthy work-life balance
- Bereavement leave for difficult times and extended medical leave options
- Paid parental leave, including maternity, paternity, adoption, surrogacy, and abortion leave
- Celebration leave to make your special day even more memorable, and company-paid holidays to recharge and unwind