Senior Site Reliability Engineer at Fastly

Summary

Join Fastly's Technical Operations team as a Senior Site Reliability Engineer and play a key role in ensuring the reliability, performance, and scalability of our infrastructure. You will develop and enhance automation, refine monitoring, champion platform stability, and drive continuous improvement. This role requires expertise in Linux/Unix systems, proficiency in Go or Python, and experience with large-scale infrastructure. The position offers a hybrid or remote work option within the US, with preferred locations in San Francisco, New York, and Denver. Competitive salary, comprehensive benefits, and opportunities for professional growth are included.

Requirements

Expertise in Linux/Unix systems, with hands-on experience tuning, troubleshooting, and operating systems at scale
Proficiency in software development using Go or Python, including experience writing robust, maintainable, and efficient code for infrastructure automation
Experience operating large-scale infrastructure in on-prem, cloud, or hybrid environments, with a focus on reliability, scalability, and automation
End-to-end system knowledge, from design and provisioning to deployment, monitoring, and long-term operations
Strong networking fundamentals, including TCP/IP, DNS, HTTP, and TLS, with practical experience debugging network issues
Proven ability to manage and scale highly available, distributed systems, or the demonstrated capability to quickly ramp up in such environments
Motivation to learn and adapt to a new tech stack, embracing unforeseen challenges with a problem-solving mindset

Responsibilities

Develop and enhance automation and tooling to reduce manual toil across the Fastly fleet
Refine monitoring and alarming to ensure focus on key metrics for optimal performance and customer stability
Champion platform stability and customer reliability through cross-team collaboration
Drive continuous improvement by learning from operational challenges and integrating insights into future plans
Address large-scale challenges and optimize systems for efficiency and performance

Preferred Qualifications

Experience with BGP and network routing in large-scale environments or deep knowledge of the Linux networking stack
Hands-on Kubernetes experience or expertise with other container orchestration platforms in production environments

Benefits

Medical, dental, and vision insurance
Family planning and mental health support, along with an Employee Assistance Program
Insurance (Life, Disability, and Accident)
A Flexible Vacation policy and up to 18 days of accrued paid sick leave
401(k) (including company match)
An Employee Stock Purchase Program
11 paid local holidays
11 paid company wellness days
