Summary

Join Airalo, the world’s first eSIM store, as a Site Reliability Engineer! This remote-first, full-time position offers a chance to work on highly reliable systems within a growing engineering team. You will develop and maintain efficient systems, define service level objectives, conduct post-incident reviews, and drive automation. The ideal candidate has 5+ years of experience in SRE or a similar role, along with hands-on experience in AWS services, Kubernetes, Terraform, and observability tooling. Airalo provides benefits including health insurance, a work-from-anywhere stipend, and annual wellness & learning credits.

Requirements

Bachelor’s degree in Computer Engineering or a similar discipline
5+ years of experience as a Site Reliability Engineer or in a similar role
3+ years of experience with AWS services, including strong knowledge of container orchestration
2+ years of Kubernetes experience
Deep understanding of observability principles and tools (logging, monitoring, tracing)
Experience with incident management and postmortem analysis
Experience with and interest in infrastructure as code (Terraform)
Experience with chaos engineering and other techniques for testing system resilience
Experience with CI/CD tools such as GitHub Actions
Proficiency in at least one programming language (Python, Go, Java, etc.) for automation and tooling
Comfortable with messaging systems (SNS, SQS, etc.)
Ability to work independently and collaboratively in a fast-paced environment
Team player and open to new ideas
Good communication skills and fluency in English

Responsibilities

Develop and maintain reliable, scalable, and efficient systems
Define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and improve system reliability
Conduct blameless post-incident reviews to identify root causes and implement preventive measures
Drive automation of operational tasks and incident response
Develop and maintain runbooks and playbooks for common operational tasks and incident response
Mitigate operational risks
Work with software engineers to design systems for reliability, scalability, and maintainability
Continuously evaluate and optimize system performance, capacity, and cost
Participate in on-call rotation and be available to troubleshoot and resolve critical issues

Preferred Qualifications

Prior experience with Scrum and other agile methods
Certification in relevant areas such as AWS Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), or similar
Experience with AI-driven SRE tools for anomaly detection and improvements
Contributions to open-source SRE projects or communities
Prior work experience in telecommunications
Knowledge of eSIM and GSMA-related technologies and services

Benefits

Health Insurance
Work-from-anywhere stipend
Annual wellness & learning credits
Annual all-expenses-paid company retreat in a gorgeous destination
Other benefits

Senior Site Reliability Engineer

Airalo

Remote

DevOps

Senior
