Senior Site Reliability Engineer at Collective[i]

Summary

Join Collective[i], a private 100% remote company, as a Senior Site Reliability Engineer and contribute to building a platform for prosperity that helps companies generate sales and people expand their professional connections.

Requirements

Proficiency with AWS, Terraform, Packer, Ansible, and container technologies
Expertise in AWS services
Experience with other cloud providers is a plus
Strong knowledge of Ubuntu 24.04, Bash, Python, systemd, Podman, Docker, and auditd
Familiarity with GitHub, GitHub Actions, GitHub Container Registry, and Copilot
Experience with monitoring and logging tools like DataDog, OpenTelemetry, and Graylog
Proficiency in working with databases and platforms such as Snowflake, Okta, Postgres, MongoDB, and ElasticSearch
Familiarity with security tools like Snyk, Tenable.io, and 1Password
Experience with SOC 2 or other compliance standards is highly desirable

Responsibilities

Manage AWS infrastructure across multiple accounts using Terraform, drawing on extensive experience in deployment and automation
Use Linux and open-source tooling as the foundation of your work, with proficiency across various Linux distributions, scripting languages, clustering technologies, database engines, and configuration management tools (Ansible preferred)
Develop and implement containerization strategies, ensuring well-crafted container builds. Must be capable of creating original containers and not just relying on third-party containers from public repositories
Apply Kubernetes knowledge selectively, understanding when and why it is appropriate to use. Note that we are not a Kubernetes-focused environment
Collaborate closely with development teams, providing support in building and optimizing distributed systems
Maintain expertise in Git workflows, including proficiency in CI/CD automation tools such as GitHub Actions
Implement and manage monitoring and logging solutions, with hands-on experience in tools like DataDog and OpenTelemetry
Reduce the need for log diving, incident response, root cause analysis, and late-night pages by proactively managing system stability and reliability

Remote

DevOps

Senior