Summary

Join Deel's Production Sandbox Reliability team as a Junior to Mid-level Site Reliability Engineer, acting as the first line of defense for enterprise customer sandboxes. You will ensure these environments remain stable, up to date, and well monitored. This role involves hands-on infrastructure work and close collaboration with engineering and customer-facing teams. The ideal candidate thrives in high-ownership environments and wants to grow their SRE skills. Deel offers a dynamic, globally distributed team and a chance to make a meaningful impact on the future of work. The company's growth and recognition make it a career accelerator. Deel is committed to inclusivity and offers competitive compensation and benefits.

Requirements

1-3 years of experience in SRE, DevOps or Infrastructure Engineering roles
Experience with Node.js or Go
Familiarity with AWS cloud services (EKS, S3, RDS)
Hands-on experience with Kubernetes, including Helm and ArgoCD
Experience with observability stacks: Datadog, Grafana, Mimir, Loki, Tempo, Zabbix
Strong verbal and written communication skills – able to interface effectively with both technical and non-technical stakeholders
Self-starter mindset with an eye for operational excellence and continuous improvement

Responsibilities

Maintain reliability: Monitor the health of enterprise customer sandbox environments and ensure high availability, uptime, and stability across all services
Stay up-to-date: Regularly roll out updates to microservices inside each sandbox to ensure alignment with the latest versions
Alert response & escalation: Triage infrastructure and application alerts, perform initial investigation, and escalate incidents to the appropriate engineering teams with clear context
Improve observability: Enhance metrics, logs and tracing coverage using Datadog and the Grafana stack (Mimir, Loki, Tempo), identifying gaps and driving better alerting practices
Support incident workflows: Collaborate in post-incident reviews and ensure root cause analysis is followed up with actionable items and improvements by relevant teams
Communicate proactively: Act as the bridge between internal engineering teams and customer-facing teams, providing timely updates during incidents, maintenance and version upgrades
Participate in on-call rotation: Provide continuous coverage across APAC, EMEA and LATAM time zones as part of a rotating on-call schedule (follow the sun)

Benefits

Stock grant opportunities dependent on your role, employment status and location
Additional perks and benefits based on your employment status and country
The flexibility of remote work, including optional WeWork access

Site Reliability Engineer

Deel

Remote

DevOps

Mid-level
