Cloud Site Reliability Engineer

closed

Smile Digital Health

πŸ“Remote - Worldwide

Summary

Join Smile Digital Health's Cloud Hosting Services team as a Cloud SRE, supporting the building, operating, and automating of infrastructure services for SaaS-based solutions on Azure/AWS. You will bridge development and operations, applying a software engineering mindset to system administration. Responsibilities include collaborating with security teams, developing multi-tenant approaches, cost tracking, documentation, and maintaining relationships with cloud providers. You will also ensure SLAs are met, create automation tools, participate in on-call rotations, and provide customer support. Approximately 50% of your time will be spent on deployment and infrastructure, with the remaining time allocated to patching, documentation, customer interaction, and solution development. Smile Digital Health offers a remote work environment, flexible time off, competitive salary and benefits, and various professional development opportunities.

Requirements

  • Demonstrated expertise in cloud service providers and best practices for implementation and configuration, preferably managing Azure on behalf of multiple teams for a company that delivers SaaS products
  • Experience with Kubernetes, OpenShift, Kafka, and the Elastic Stack
  • Proven experience with Security and Compliance (SOC2, HIPAA, ISO27001) best practices and how to implement controls that support high-velocity software delivery teams
  • Proficiency in Terraform, Ansible or Chef
  • Expertise in troubleshooting, support escalation, on-call process optimization, and documenting knowledge
  • Passionate about Infrastructure as code, automation, and developing solutions that help developers move quickly and safely
  • Familiarity with infrastructure management and operations lifecycle concepts and ecosystem
  • Experience operating and maintaining production systems in a Linux and public cloud environment
  • Prior experience working with high-performance or distributed systems (though we strive to hire at a variety of experience levels)
  • Working knowledge of industry best practices with regard to information security
  • Previous experience building or maintaining a large-scale cloud service
  • Proven ability to prioritize and track multiple projects in parallel
  • Proven ability to be highly responsive and customer-focused

Responsibilities

  • Collaborate with our Security Operations teams to help define and implement best practices around Cloud Service Provider configuration for AWS, Azure and other cloud providers
  • Develop, implement, and coordinate a multi-tenant approach to service offerings such as DB, Container platform, Authentication, Certificates, and Product Registries
  • Develop and maintain cost/utilization tracking and attribution processes for all Cloud Service Providers
  • Create documentation around Cloud Service Provider offerings detailing use cases, best practices, and implementation details
  • Develop and maintain technical relationships with our core Cloud Service Providers
  • Implement and maintain a secure and scalable infrastructure platform for delivering Cloud Services applications
  • Ensure that internal and external SLAs are met or exceeded, and that system-centric KPIs are continuously monitored and improved
  • Create tools for automating deployment, monitoring and operations of the overall platform
  • Participate in an on-call rotation to provide application support, incident management, and troubleshooting
  • Provide ongoing maintenance and support of internal tools, and improve system health and reliability
  • Assist customers with on-premises deployments when needed
  • Maintain ongoing compliance with organizational policies, procedures, and practices (including but not limited to security policies), which is a requirement of the employment or contractual agreement
  • Comply with privacy, security, and confidentiality policies

Benefits

  • Remote Work Environment
  • Flexible Time Away From Work Policy including PTO, Personal and Sick Days
  • Competitive Salary and Health/Medical Benefits
  • RRSP/TFSA/401(k) Employee Contribution
  • Life and Disability
  • Employee Assistance Program
  • FHIR Study Program and Skillsoft Learning
  • Super HAPI Fun Club
This job is filled or no longer available