Senior Site Reliability Engineer
![Lightspeed Logo](https://cdn.jobscollider.com/logo/lightspeed-4a5b.webp)
Lightspeed
Summary
Join Lightspeed Hospitality's Site Reliability Engineering (SRE) team and contribute to the continuous improvement of software delivery processes for our flagship K-Series and regional L-Series & G-Series products. You will leverage automation to manage and monitor systems, design scalable cloud infrastructure, and adhere to DevOps best practices. The role requires strong AWS, Docker, Kubernetes, and Linux experience, along with proficiency in configuration management and Infrastructure as Code. You'll collaborate with various teams, provide timely assistance during incidents, and be on call periodically. Lightspeed offers a flexible work culture, remote work possibilities, opportunities for growth, and amazing benefits including equity.
Requirements
- Strong knowledge of Amazon Web Services
- Strong experience with Docker, Kubernetes & Linux Systems
- Experience with configuration management tools such as Chef, Puppet, Ansible, or Salt
- Experience with Infrastructure as Code practices (we use Terraform)
- Ability to read & write complex scripts using Shell
- Ability to read & understand programming languages: Python, Ruby, Go, …
- Good understanding of Agile development and continuous delivery best practices, software engineering tools, processes, methods and testing
- Ability to partner effectively with other teams
- Ability to plan, organize, prioritize and stay focused
- Good experience provisioning and managing infrastructure with high availability constraints
- Good experience with cloud cost optimization
- You are a problem solver who thinks critically and does not shy away from tackling complexity
- You have a strong will to learn, grow and get out of your comfort zone
- You have great energy and passion for technology
- You are able to express yourself flawlessly in English
- You have strong interpersonal skills
Responsibilities
- Initiate and contribute to the continuous improvement of our software delivery processes and practices in a multi-location, multidisciplinary team to empower and accelerate product development
- Use automation extensively to design, configure, manage, and monitor systems in support of our product development teams
- Design and architect operational solutions with the specific goal of increasing the standardization, automation, repeatability, cost-efficiency and consistency of operational tasks
- Work with developers and other SREs to design and build scalable, reliable and cost-efficient Cloud infrastructure
- Adhere to and advocate for best practices, including Infrastructure as Code, monitoring, high availability, disaster recovery, security, and DevOps methodologies
- Provide timely assistance and remediation solutions during critical situations and production incidents to help resolve service problems (you will be on call periodically)
Benefits
- Lots of autonomy, flexible work culture and possibility of remote work
- Development of high-traffic products used at global scale
- Exposure to modern and proven technology
- Opportunity to learn and expand your skill set
- Tons of growth opportunities into technical or people management roles
- Amazing benefits & perks, including equity for all Lightspeeders
- Opportunity to join a fast-paced, high-growth company
- Become a valued part of the diverse and inclusive Lightspeed P&T team