Site Reliability Engineer at 66degrees

Summary

Join 66degrees, a leading consulting firm specializing in AI-focused, data-led solutions, as a Site Reliability Engineer (SRE). Work with major cloud users to transform their businesses using Google Cloud Platform expertise and DevOps methodologies. Daily tasks involve resolving critical outages, designing and deploying cloud workloads, and building self-healing automation. You will use cutting-edge Google Cloud technologies and collaborate with clients, your team, and Google engineers. This role requires a proactive approach to client workloads, ensuring availability and a great customer experience. You will also contribute to initiatives such as documentation, open-sourcing, and operational improvements.

Requirements

3+ years of cloud and infrastructure experience, including demonstrated expertise with Linux, Windows, Kubernetes (k8s), databases, and networking services
Proficiency with Python required
Strong provisioning and configuration skills using Terraform
Experience with 24x7x365 monitoring, incident response, and on-call support
Experience in troubleshooting that spans systems, network, and code
Experience determining and negotiating error budgets, SLIs, SLOs, and SLAs with product owners
Demonstrated ability to work independently and as a member of a larger team, including cross-team activities
Experience working with Agile methodologies (Scrum, Kanban) across the SDLC
Proven experience balancing service reliability, metrics, sustainability, technical debt, and operational toil for live services running at scale
Strong communication skills, as this is a heavily customer-facing role
Bachelor’s degree in computer science, electrical engineering, or equivalent required

Responsibilities

Ensure near-zero downtime with monitoring and alerting, self-healing automation, and continuous improvement
Create highly automated, available, and scalable systems by applying software and infrastructure principles
Employ and advise clients on DevOps and SRE principles and practices, covering deployment pipelines, HA, service reliability, technical debt, and operational toil for live services running at scale
Take a proactive approach to our clients’ workloads by anticipating failures, automating tasks, ensuring availability, and providing a great customer experience
Work closely with clients, your team, and Google engineers to investigate and resolve infrastructure issues
Contribute to ad-hoc initiatives such as writing documentation, open-sourcing, and improving operations, making a huge impact at a rapid-growth Google Premier Partner

Preferred Qualifications

2+ years of Google Cloud experience and related certifications strongly preferred but not required
Other programming language experience is a plus

Remote

DevOps

Mid-level
