Remote Engineering Manager, Infrastructure & DevOps

Coconut Software

πŸ“Remote - Canada

Job highlights

Summary

Join our team as an experienced Engineering Manager for our cloud Infrastructure, SRE (Site Reliability Engineering), and DevOps team. We're looking for a leader with excellent communication and collaboration skills to create the strategic plan for infrastructure and DevOps. As the manager of this team, you will apply your technical knowledge to define and implement CloudOps, DevOps, and SRE best practices.

Requirements

  • Proven experience in managing/leading a DevOps/SRE/Infrastructure team in a fast-paced environment
  • Expertise in cloud platforms and infrastructure management, preferably AWS and Kubernetes
  • Experience with provisioning, vendor management, and monitoring resources in a cloud-based environment
  • Experience configuring and managing data sources like MySQL, Postgres, Redis
  • System configuration experience with automation tools such as Puppet/Chef/Ansible
  • Proficiency in automation and CI/CD tools such as Spinnaker, CircleCI, Travis CI, or GitLab CI/CD
  • Experience with containerization and orchestration techniques and tools (e.g., Docker, Kubernetes)
  • Experience with infrastructure as code tools, such as Terraform
  • Experience leading & analyzing complex application, database, network, and OS issues for customer-facing systems in a high-uptime environment
  • Experience with monitoring and alerting tools (e.g., Datadog, Sentry, Opsgenie)
  • Experience with Perl/Python/Java/Bash scripting
  • Experience working with large enterprise customer bases
  • Experience reporting on key metrics, costs, and tooling to the organization and recommending improvements
  • Excellent problem-solving, collaboration, and communication skills
  • Effective at nurturing relationships and managing multiple stakeholders across different teams
  • Strong project management, leadership and cost management abilities

Responsibilities

  • Demonstrate Team Leadership
      • Lead by example - act in accordance with our CHEERS values
      • Mentor, coach and inspire a team of DevOps, Infrastructure and SRE professionals
      • Foster a collaborative and high-performance work environment
      • Hire and train team members
      • Be accountable for the Infrastructure and DevOps roadmap and the results the team attains
      • Work with the Principal DevOps Engineer to set strategic plans and priorities for this function
  • Oversee Infrastructure and Site Reliability
      • Work with your team to design, implement, and manage a cloud-based infrastructure ecosystem for scalability and reliability
      • Ensure best practices for infrastructure as code (IaC) and configuration management are applied
      • Work closely with the application development teams to ensure a manageable migration into a secure and reliable product environment, and to implement new tools
  • Automation and CI/CD
      • Develop and maintain automated deployment pipelines
      • Promote continuous integration and delivery (CI/CD) practices
      • Design and develop automation and processes to enable teams to deploy, manage, configure, scale and monitor applications
  • Monitoring and Alerting
      • Ensure robust monitoring to proactively identify and resolve issues
      • Configure and manage alerting systems for real-time status and incident response
  • Reliability Engineering
      • Define and measure service level objectives (SLOs) and service level indicators (SLIs) to ensure system reliability
      • Lead incident response and post-incident reviews to improve system resilience
      • Develop innovative and technical tooling to improve production stability and enable faster recovery
  • Security and Compliance
      • Collaborate with the security team to implement best practices for securing infrastructure and applications
      • Ensure compliance with industry standards and regulations
  • Resource Optimization
      • Optimize resource utilization to reduce costs while maintaining performance and reliability
      • Monitor and report on hosting and tooling costs
  • Documentation and Training
      • Maintain comprehensive documentation of systems, processes, and procedures
      • Provide training and knowledge sharing within the team

Benefits

  • Flexible work week ("Cabana Days")
  • Ability to do your job in a supported, but still flexible environment
  • Supported professional development, learning & career opportunities
  • Regular 1:1 coaching with your leader and regular connection to a passionate executive team
  • Work in a team big enough for growth but lean enough to make a real impact
  • Competitive Salaries - we pay fairly based on experience and expertise, not your ability to negotiate!
  • Health & Dental Benefits, Virtual Care, & Disability top-up - all starting from day 1!
  • Virtual mental health and EAP platform
  • WealthSimple GRSP & Matching
  • Annual Wellness Benefit ($1000 per year)
  • Opportunity to work remotely - anywhere in Canada!
  • Employee Options - everyone shares in our success!
  • Internet Subsidy on each paycheck
  • Tiki Bucks Incentive Program - everyone is entitled to earn bonuses!
