Senior Site Reliability Engineer at Playson

Summary

Join Playson, a leading iGaming supplier, as a Senior Site Reliability Engineer. You will manage day-to-day alerts, system checks, and issue escalations; provide 24x7 on-call support for critical SaaS events; document issues and remediation steps; and proactively create monitors within the EKS/K8s ecosystem. You will also deploy to EKS/K8s clusters, improve infrastructure health, maintain and develop deployment code, and integrate new technologies into our cloud infrastructure. The role involves collaborating with other teams, planning deployments with a customer focus, conducting root cause analysis (RCA), assigning alert-related actions, and handling support requests for environment-specific actions.

Playson offers quarterly bonuses, a flexible work schedule, remote work options, comprehensive medical insurance, financial support for life events, unlimited paid vacation and sick leave, and reimbursement for professional development.

Requirements

Proficiency in Kubernetes (deployment, scaling, troubleshooting)
Experience with GitOps continuous-delivery tools such as FluxCD or ArgoCD
Strong experience with issue processing (root cause analysis, postmortems)
Familiarity with AWS, Terraform, Docker, CI/CD
Experience with monitoring tools such as Datadog, Prometheus, and Grafana, and with logging solutions such as the ELK Stack (Elasticsearch, Logstash, Kibana) or AWS CloudWatch
Strong understanding of networking concepts and protocols
Proficiency in at least one scripting language (e.g., Python, Node.js, Go)
Proficiency in Git or other version control systems
Familiarity with incident response and management tools like PagerDuty, Opsgenie, or VictorOps
Ownership, proactiveness, persistence, and passion for maintaining a high-traffic online platform

Responsibilities

Manage day-to-day alerts, system checks, and issue escalation as necessary
Provide 24x7 on-call support for critical SaaS events
Document issues and remediation steps
Proactively create monitors within the EKS/K8s ecosystem
Deploy to EKS/K8s clusters using Terraform and Helm/Flux
Enhance infrastructure health by implementing checks and scripts to address known issues
Maintain and develop deployment code
Implement/integrate new technologies into our Cloud Infrastructure
Collaborate with other teams to provide top-notch support and assistance
Plan deployments and updates with a customer focus, ensuring minimal impact
Conduct RCA and take necessary corrective actions to prevent issue recurrence
Assign alert-related actions to the appropriate team after investigation
Handle support requests for environment-specific actions

Benefits

Quarterly Bonuses based on transparent and systematic evaluation
Flexible Work Schedule
Remote Work Option for Enhanced Flexibility
Comprehensive Medical Insurance for you and your significant other
Financial Support for Life Events
Unlimited Paid Vacation
Unlimited Paid Sick Leave
Reimbursement for professional development courses and training

Remote, DevOps, Senior
