Senior Site Reliability Engineer at Netlify

Summary

Join Netlify's Infrastructure SRE team as a Site Reliability Engineer and play a key role in designing, developing, and delivering solutions that enhance the scalability, availability, and efficiency of our platform. You will manage the full infrastructure lifecycle, participate in on-call rotations, automate tasks, conduct performance tuning and troubleshooting, and contribute to disaster recovery planning. The role requires several years of experience in SRE or DevOps, expertise in hyperscale cloud environments, and a strong understanding of network protocols and automation tools. Netlify offers a remote-first, globally distributed work environment with a focus on asynchronous communication and a commitment to diversity and inclusion. The company prioritizes a healthy work-life balance and offers competitive compensation and benefits.

Requirements

Several years of experience in SRE, DevOps, or related roles
Proven experience working in hyperscale cloud environments
Demonstrated ability to lead infrastructure projects
Strong understanding of network protocols and configurations
Experience with automation tools (e.g., Ansible, Terraform) and scripting languages (e.g., Python, Bash, Golang)
Experience automating component deployment across multiple environments using tools like Jenkins, CircleCI, or GitHub Actions
Proficiency in observability and log analysis techniques to detect and resolve system issues
Effective communication skills for both technical and non-technical stakeholders
Familiarity with compliance requirements and frameworks: PCI, ISO 27001, HIPAA, SOC

Responsibilities

Manage full infrastructure lifecycle from design to decommission, ensuring systems are reliable and efficient
Participate in an on-call rotation for the compute platform and related systems
Automate routine tasks and develop tools to improve system efficiency and reduce the time spent on manual intervention
Conduct system performance tuning and troubleshooting, as well as capacity planning, to ensure system reliability and efficiency
Participate in the creation and testing of disaster recovery plans
Monitor and maintain observability systems to ensure issues are identified and resolved proactively
Educate team members on security best practices and emerging threats

Benefits

Remote work, flexible hours
Competitive compensation and benefits
Equity plan
