Senior Site Reliability Engineer

GlossGenius

πŸ“Remote - Canada

Summary

Join GlossGenius's Infrastructure team as a remote Senior Site Reliability Engineer based in Canada! You'll equip developers with the right tools and ensure the reliability and scalability of our production systems. Your work will directly shape both the internal developer experience and the robustness of our rapidly growing platform. You will build and maintain reliable, secure, and highly scalable infrastructure supporting over 75,000 service professionals. This role involves collaborating with product and engineering teams to drive operational excellence and scale our cloud-native deployment. You will be instrumental in shaping incident management practices and enhancing monitoring and alerting capabilities.

Requirements

  • 5+ years of experience working with cloud technologies as a Software Engineer (SWE), Systems Engineer, Production Engineer, Cloud Engineer, Site Reliability Engineer (SRE), or similar roles
  • Demonstrated experience working with cloud platforms (AWS, GCP, Azure, etc.) in production to serve real-world traffic
  • Demonstrated experience with infrastructure-as-code frameworks such as Terraform, Terraform CDK, or AWS CDK
  • Solid understanding of IP networking, DNS, CDN, load balancing, HTTP, and firewalls
  • Experience building and maintaining monitoring, logging, and alerting systems for large-scale, 24/7 platforms
  • Willingness to participate in team on-call responsibilities
  • Experience with container technologies such as Docker and container orchestration systems such as Kubernetes (e.g., AWS EKS, GCP GKE)
  • The ability to write high-quality code in a high-level programming language such as Go, TypeScript, Python, Kotlin, Java, or Ruby
  • Demonstrated ability to drive projects from concept to completion, with a strong focus on delivering outcomes

Responsibilities

  • Partner with Product and Engineering teams to support a reliable, secure, and scalable infrastructure platform that minimizes toil
  • Ensure GlossGenius scales its AWS cloud footprint efficiently
  • Build tools to help engineers quickly identify problems across the stack
  • Shape incident management practices and drive operational excellence
  • Enhance our monitoring and alerting capabilities and share your expertise with other teams as a subject matter expert in observability
  • Champion DevOps principles across the company, fostering a culture of collaboration and automation
  • Understand industry and company-wide trends to help assess and integrate new technologies
  • Collaborate with the broader engineering team to optimize application performance and encourage resilient, scalable system architecture
  • Take ownership of complex problems from inception to resolution, engaging with stakeholders and driving solutions that balance business needs with platform reliability, scalability, and security

Benefits

  • Flexible PTO
  • Competitive health & dental insurance options, with premiums covered by GG
  • Generous, fully-paid parental leave policy
  • Professional Development - employees receive a yearly stipend for approved learning and education-related expenses
  • Home office support
  • Team Bonding opportunities - as a distributed team, being able to build meaningful bonds both virtually and in person is incredibly important to us! We are constantly evaluating how we accomplish this, and teams are currently given opportunities to gather in person throughout the year
