Senior Site Reliability Engineer at Nylas

Summary

Join Nylas' Site Reliability Engineering (SRE) team and ensure the reliable and efficient operation of our products, serving billions of API calls daily. You will support the engineering team, maintain and scale infrastructure across AWS and GCP, configure alerts and dashboards, manage CI/CD pipelines, and participate in on-call rotations. This role requires extensive experience in production engineering, Linux, cloud services, and automation. Nylas offers excellent benefits, including extended healthcare, unlimited PTO, RRSP contributions, an education stipend, cell phone reimbursement, and fully paid parental leave.

Requirements

Experience: Minimum of 5 years in production engineering, with hands-on experience in managing and scaling Linux-based production servers
Communication and Empathy: Exceptional communication skills and a strong empathetic approach, understanding that effective teamwork and problem-solving require more than just technical skills
Linux Proficiency: Advanced proficiency in navigating the Linux command line
Logging and Observability: Demonstrated experience with platforms like New Relic, Coralogix, Grafana, and Prometheus
Configuration Management: Experience in automating systems using modern tools such as Chef, Ansible, or Puppet
Containerization and Orchestration: Proven track record of deploying and managing services using Kubernetes and Docker
Cloud Services: Practical experience with major cloud services like AWS, GCP, or Azure, focusing on deploying and maintaining scalable applications
Programming Skills: Capability to write reliable code in at least one programming language such as Python, GoLang, or JavaScript
Learning Agility: Ability to rapidly learn and adapt to new technologies and frameworks
Automation and Infrastructure: Passion for building modern, scalable infrastructure and automating routine tasks to improve efficiency and reliability

Responsibilities

Support our engineering team with best practices and provision new infrastructure as necessary
Maintain and scale a legacy system in AWS with Ansible, Python, MySQL, and Terraform
Maintain our new infrastructure in GCP with Kubernetes, Helm, ArgoCD, Terraform, GoLang, OpenSearch, Spanner, and Redis
Configure and adjust alerts and dashboards in New Relic and Coralogix, leveraging Fluent Bit and OpenTelemetry
Manage and improve our CI/CD pipelines using ArgoCD and Helm
Take part in an on-call rotation and assist in debugging and resolving incidents

Preferred Qualifications

Candidates with expertise in tuning alerts and synthetics and in creating comprehensive health dashboards and reports will be preferred

Benefits

Healthcare: Extended healthcare coverage for you and your family
Unlimited Paid Time Off (PTO): We take this very seriously as we care about the well-being of our employees
RRSP: 3% employer contribution
Education Stipend: $1,250 annual education & development benefit
Cell Phone: $60 monthly stipend toward cell phone costs
Fully Paid Parental Leave: 12 weeks parental leave (maternity & paternity)

Remote

DevOps

Senior
