Site Reliability Engineer
Avetta
Remote - Australia
Please let Avetta know you found this job on JobsCollider. Thanks!
Job highlights
Summary
Join Avetta as a Site Reliability Engineer in Australia! The ideal candidate will lead the management and monitoring of highly available replicated cloud systems, oversee NOC operations, and design escalation policies. They should have expertise in AWS technologies, stellar communication skills, and experience in managing teams and leading projects.
Requirements
- Minimum B.S. or B.A. in Computer Science
- Minimum of 5 years of experience as a Site Reliability Engineer, including some experience in managing teams and leading projects
- Stellar communication and interpersonal skills for effective collaboration with Development & Product teams
- Proficiency in monitoring the networking stack using distributed tracing and profiling tools
- Proficient in building dashboards with New Relic, Kibana, Grafana, Prometheus, and other observability platforms
- Proficient with AWS technologies
- Working knowledge of monitoring RESTful microservices and of basic HTTP
- Able to automate monitors and dashboards using REST APIs, GraphQL, and other modern programmatic methods
- Working knowledge of profiling tools for measuring CPU, memory, I/O, and disk usage, and for capturing process thread dumps
- Experience in managing, integrating, and automating alerting and escalation tools
- Must live in Australia and have unrestricted rights to work there
Responsibilities
- Lead the management and monitoring of highly available replicated cloud systems
- Oversee 24/7 Network Operations Center (NOC) operations, maintaining at least 99.9% annual uptime
- Define golden signals for all services in our core SaaS application
- Manage NOC engineer teams, including scheduling and responsibilities
- Design PagerDuty escalation policies across various teams
- Apply expertise in AWS technologies and build dashboards with leading observability platforms
- Automate monitors and dashboards using modern programmatic methods
- Provide regular reports to Engineering leadership and executive teams for continuous improvement
Preferred Qualifications
- Troubleshooting experience with modern container and networking technologies (Kubernetes, HAProxy, ALB)
- Familiarity with scripting languages like Bash, Python, and Go
- Ability to administer and tune load balancer technologies
- Experience in managing, monitoring, and benchmarking distributed file systems
- Proficiency in configuration management tools (SaltStack, Ansible, Terraform)