Summary
Join Podium as a Site Reliability Engineer to drive our products' success by building a stable, scalable, sustainable, and slick system. As a senior SRE, you will work with various technologies, engage with the engineering community, participate in an on-call rotation, and mentor junior engineers.
Requirements
- Bachelor's degree in a technical field or relevant work experience
- 4+ years of experience working with a production system in a software engineering or systems engineering role
- 3+ years deploying, operating, and debugging server software on Linux
- Curiosity and the desire to learn
- Ability to take a rotating on-call shift
Responsibilities
- Working with the following technologies: Kubernetes, Helm, Docker, AWS, Terraform, Datadog, Prometheus, Ansible, StrongDM, Python, Go, Ruby, GitLab, and GitLab CI
- Engaging with Podium's engineering community to identify potential areas of improvement or pain points and making Podium's systems safer and more pleasant to operate
- Participating in an on-call rotation for the services the team owns, triaging and addressing both production and development issues
- Working cross-functionally with different teams to ensure there is no downtime for our products
- Mentoring junior engineers on the team
Preferred Qualifications
- Experience with distributed systems and microservices
- Practical knowledge of system design
- Cloud computing, such as AWS, GCP, or Azure
- SOC2, HIPAA, PCI, or other regulatory or compliance standards
- Building and maintaining a CI/CD pipeline
- Extensive infrastructure experience