Site Reliability Engineer at Zipdev

Summary

Join Zipdev's team of LatAm developers as a remote Site Reliability Engineer. You will collaborate with product teams to build, deploy, and monitor cloud services, ensuring the reliability of critical systems. This role involves developing code and frameworks for monitoring, troubleshooting complex systems, and collaborating with various teams. You will also engage in capacity planning, demand forecasting, and standardization efforts. The ideal candidate possesses a Bachelor's degree in Computer Science or equivalent experience, along with 3-4 years of SRE experience and proficiency in programming languages like Python or Go. The position offers remote work, paid time off, and various benefits.

Requirements

Bachelor’s degree in Computer Science or equivalent work experience as a System Administrator with programming skills
3-4 years of proven professional experience as a Site Reliability Engineer
Experience with one or more general-purpose programming/scripting languages including but not limited to: Python, Bash, Perl or Go
Fundamental knowledge of technologies across a broad range of disciplines: virtualization, storage, networking, servers, and security
Understanding of systems and application design, including the operational trade-offs of various designs
Demonstrable knowledge of Unix, TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures
Experience in analyzing logs and troubleshooting large-scale distributed systems
Excellent organization, time management, and communication skills
Currently living in Latin America

Responsibilities

Build systems and infrastructure to monitor complex, large-scale distributed systems
Identify stability/performance issues and collaborate with developers to triage critical issues in production systems
Represent the SRE organization in design reviews and operational readiness exercises for new and existing services
Devise ways to actively monitor system throughput, capacity and reliability
Debug complex systems and evolve a running environment without downtime
Engage in service capacity planning and demand forecasting, software performance analysis and system tuning
Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization

Preferred Qualifications

Experience instrumenting and monitoring production systems (ELK stack, Zabbix, Nagios, StatsD/Graphite, APM, etc.)
Experience with AWS infrastructure (EC2, S3, VPC, Security Groups, RDS) and related services
A working understanding of Docker, Vagrant, Ansible/Chef/Puppet

Benefits

Vacation: 10 business days a year
Holidays: 5 National Holidays a year
Company Holidays: 5 Company Holidays a year (Christmas Eve, Christmas Day, New Year's Eve, New Year's Day, Zipdev Day)
Parental Leave
Health Care Reimbursement
Active Lifestyle Reimbursement
Quarterly Home Office Reimbursement
Payroll Deduction Purchase Plans
Longevity Bonus
Continuous Learning Bonus
Access to Training and Professional Development Platforms
Work remotely Monday - Friday, 40 hours a week (no weekends)

Remote

DevOps

Mid-level
