Remote Senior Site Reliability Engineer
Collective[i]
$150k-$185k
Remote - United States
Job highlights
Summary
Join Collective[i], a private 100% remote company, as a Senior Site Reliability Engineer and contribute to building a platform for prosperity that helps companies generate sales and people expand their professional connections.
Requirements
- Proficiency with AWS, Terraform, Packer, Ansible, and container technologies
- Expertise in AWS services
- Experience with other cloud providers is a plus
- Strong knowledge of Ubuntu 24.04, Bash, Python, systemd, Podman, Docker, and auditd
- Familiarity with GitHub, GitHub Actions, GitHub Container Registry, and Copilot
- Experience with monitoring and logging tools like Datadog, OpenTelemetry, and Graylog
- Proficiency in working with databases and platforms such as Snowflake, Okta, Postgres, MongoDB, and Elasticsearch
- Familiarity with security tools like Snyk, Tenable.io, and 1Password
- Experience with SOC 2 or other compliance standards is highly desirable
Responsibilities
- Manage AWS infrastructure across multiple accounts using Terraform, drawing on extensive experience in deployment and automation
- Use Linux and open-source tooling as the foundation of your work, with proficiency across various Linux distributions, scripting languages, clustering technologies, database engines, and configuration management tools (Ansible preferred)
- Develop and implement containerization strategies, ensuring well-crafted container builds. You must be able to create original containers rather than relying solely on third-party containers from public repositories
- Assess and apply Kubernetes knowledge selectively, understanding when and why it is appropriate to use (note: we are not a Kubernetes-focused environment)
- Collaborate closely with development teams, providing support in building and optimizing distributed systems
- Maintain expertise in Git workflows, including proficiency in CI/CD automation tools such as GitHub Actions
- Implement and manage monitoring and logging solutions, with hands-on experience in tools like Datadog and OpenTelemetry
- Proactively manage system stability and reliability to reduce the need for log diving, incident response, root cause analysis, and late-night pages
This job is filled or no longer available