Site Reliability Engineer
Playson
Remote
Please let Playson know you found this job on JobsCollider. Thanks!
Job highlights
Summary
Join Playson, a leading online gaming supplier, as a Site Reliability Engineer/DevOps and become part of our dynamic Firex squad. You will manage day-to-day alerts, provide 24/7 on-call support, and proactively improve our EKS/K8s infrastructure. Responsibilities include deploying to EKS/K8s using Terraform and Helm/Flux, implementing new technologies, and collaborating with other teams. You'll need strong experience with issue processing, Kubernetes, AWS, and various monitoring and logging tools. We offer quarterly bonuses, a flexible work schedule, a remote work option, comprehensive medical insurance, financial support for life events, unlimited paid vacation and sick leave, and reimbursement for professional development.
Requirements
- Strong experience with issue processing (RCA, Postmortems)
- Proficiency in Kubernetes (deployment, scaling, troubleshooting)
- Familiarity with AWS, Terraform, Docker, CI/CD
- Experience with monitoring tools like Datadog, Prometheus, and Grafana, and with logging solutions like Elasticsearch, Logstash, and Kibana (ELK Stack) or AWS CloudWatch
- Strong understanding of networking concepts and protocols
- Proficiency in at least one scripting language (e.g., Python, Node.js, Go)
- Experience with GitOps continuous delivery tools like FluxCD/ArgoCD
- Proficiency in Git or other version control systems
- Familiarity with incident response and management tools like PagerDuty, Opsgenie, or VictorOps
- Ownership, proactiveness, persistence, and passion for maintaining a high-traffic online platform
Responsibilities
- Manage day-to-day alerts, system checks, and issue escalation as necessary
- Provide 24/7 on-call support for critical SaaS events
- Document issues and remediation steps
- Proactively create monitors within the EKS/K8s ecosystem
- Deploy to the EKS/K8s cluster using Terraform and Helm/Flux
- Enhance infrastructure health by implementing checks and scripts to address known issues
- Maintain and develop deployment code
- Implement/integrate new technologies into our Cloud Infrastructure
- Collaborate with other teams to provide top-notch support and assistance
- Prioritize customer focus in planning deployments/updates, ensuring minimal impact
- Conduct RCA and take necessary corrective actions to prevent issue recurrence
- Assign alert-related actions to the appropriate team after investigation
- Handle support requests for environment-specific actions
Benefits
- Quarterly Bonuses based on transparent and systematic evaluation
- Flexible Work Schedule
- Remote Work Option for Enhanced Flexibility
- Comprehensive Medical Insurance for you and your significant other
- Financial Support for Life Events
- Unlimited Paid Vacation
- Unlimited Paid Sick Leave
- Reimbursement for professional development courses and training