United States
Director of Site Reliability and Cloud Infrastructure
Prompt Therapy Solutions Inc
Remote - Worldwide
Summary
Join our team as Director of Site Reliability and Cloud Infrastructure to develop, maintain, and enhance our infrastructure while ensuring its security, reliability, and scalability.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field
- Proven experience in a senior leadership role managing cloud infrastructure and site reliability, preferably within an AWS environment (EC2, S3, RDS, ELB, etc.)
- Hands-on experience with infrastructure as code (e.g., Terraform, CloudFormation) and automation tools (e.g., Ansible, Jenkins)
- Strong scripting skills (Python, Bash) and the ability to automate complex tasks
- Demonstrated success in scaling infrastructure and teams, particularly within high-availability and high-growth environments
- Solid understanding of networking, cloud security, and compliance standards (e.g., SOC 2, HIPAA)
- Strong incident management skills and the ability to lead post-incident reviews to drive improvements
- Excellent communication skills and the ability to collaborate effectively with cross-functional teams
- Experience in hiring, developing, and managing technical teams with a focus on career development and innovation
Responsibilities
- Develop and maintain scalable and automated infrastructure solutions, particularly on AWS
- Implement and manage monitoring, alerting, and logging systems to detect and address reliability and security risks
- Manage incident response and resolution processes to minimize downtime, prevent recurrence, and ensure robust disaster recovery practices
- Conduct system performance tuning, capacity planning, and optimization to manage resource utilization and load effectively
- Build and maintain strong relationships with cloud, security, and infrastructure vendors, ensuring their services meet performance, compliance, and security needs
- Lead contract negotiations and performance reviews for external vendors, ensuring alignment with internal standards and SLAs
- Hire, mentor, and lead a high-performing team of site reliability engineers (SREs), security experts, and infrastructure engineers
- Develop career growth plans and technical progression frameworks for team members, ensuring skills development in cloud technologies and SRE best practices
- Create a cohesive vision for cloud infrastructure, reliability, and security, aligning with the broader organizational goals
- Implement and maintain security best practices, including compliance with SOC 2, HIPAA, and other relevant standards
- Ensure the infrastructure is protected against threats and vulnerabilities
- Drive innovation in cloud infrastructure and security, continuously improving our processes and systems
- Build and maintain automation tools and scripts to streamline system updates, deployments, and monitoring
- Design and oversee CI/CD pipelines, ensuring seamless integration with development and operations teams
- Work closely with the development, operations, and product teams to ensure alignment on priorities and collaboration on large-scale projects
- Provide technical guidance and mentorship across teams, championing a culture of reliability, automation, and security
- Communicate progress, risks, and issues clearly to both technical and non-technical stakeholders
Preferred Qualifications
- Experience in a high-growth SaaS company, especially in healthcare or other regulated industries
- Familiarity with cloud cost optimization, scalability best practices, and disaster recovery strategies
- Demonstrated ability to lead through influence, setting technical direction and ensuring execution across teams
Benefits
- Competitive salaries
- Remote/hybrid environment
- Potential equity compensation for outstanding performance
- Flexible PTO
- Company-wide sponsored lunches
- Company-paid disability and life insurance benefits
- Company-paid family and medical leave
- Medical, dental, and vision insurance benefits
- Discounted pet insurance
- FSA/DCA and commuter benefits
- 401(k)
This job is filled or no longer available