Site Reliability Engineer Lead

Input Output

💵 $150k-$175k
📍 Remote - United States

Summary

Join IOG, a blockchain technology company, as a Site Reliability Engineer Lead. You will lead a team, ensuring high-quality, stable environments for customers. Responsibilities include managing build and deployment cycles, supporting multi-tier applications, building automation tools, and improving monitoring systems. You will collaborate with agile teams and foster a DevOps culture. The role requires a Bachelor's degree or equivalent experience, 5+ years in SRE/DevOps, and 2+ years in a leadership role. Strong Linux, networking, and programming skills are essential. IOG offers remote work, laptop reimbursement, learning opportunities, competitive PTO, medical/dental/vision benefits, 401k, and a health savings account.

Requirements

  • Bachelor’s Degree or higher in Computer Science, Software Engineering, or related technical field, or equivalent practical experience
  • 5+ years of professional experience in SRE, DevOps, Platform Engineering, or Infrastructure roles
  • 2+ years in a technical leadership or senior engineering capacity
  • Proven track record of building and operating highly available, distributed, fault-tolerant systems
  • Strong foundation in Linux system internals, networking (TCP/IP, DNS, HTTP), and systems programming
  • Experience leading incident responses, writing post-mortems, and driving reliability improvements
  • Experience working with Agile, Kanban, or similar development methodologies
  • You work well both independently and as part of a team
  • You value cooperation and collaboration above all, and are not afraid to ask for clarification or help when needed
  • You are kind and respectful of others’ opinions, and you are open and act with integrity when engaging in academic or technical discussions
  • Strong scripting and programming skills: Bash, Python, Go, or Rust preferred
  • Extensive experience with Git: branching strategies, GitOps workflows, code review best practices
  • Experience with CI/CD systems, such as GitHub Actions, GitLab CI, Jenkins, Buildkite, or equivalent
  • Cloud platform proficiency: AWS, GCP, Azure — including compute, storage, networking, and IAM
  • Containerization and orchestration: deep experience with Docker, Kubernetes (k8s), and Helm
  • Infrastructure as Code (IaC): using Terraform, Pulumi, or similar tools
  • Configuration management: Ansible, Chef, or SaltStack (with preference for declarative approaches)
  • Monitoring, logging, and observability: Prometheus, Grafana, Loki, OpenTelemetry, Datadog, or similar
  • Security best practices: secrets management (Vault, SOPS), least privilege, security incident handling
  • Incident Management and Root Cause Analysis (RCA): strong ownership in production reliability
  • Automated testing and validation: unit testing, integration testing, chaos engineering exposure
  • Experience managing large-scale Linux-based systems: operational excellence in Ubuntu, Debian, or NixOS environments
  • Advocate of DevOps/SRE culture: focus on reducing toil, Service Level Objectives (SLOs), error budgets
  • Strong communication skills: written and verbal, capable of collaborating across distributed teams

Responsibilities

  • Managing build and deployment cycles across all development environments
  • Supporting the build, deployment, and configuration management for multi-tier applications
  • Participating in the building of tools and processes to support the infrastructure
  • Improving and maintaining tooling and scripts for automation purposes
  • Developing tooling for internal and external users to monitor and maintain production systems
  • Supporting our teams to write software that is simple and flexible to configure and deploy
  • Collaborating with agile teams to establish and maintain automated regression suite infrastructure and performance testing infrastructure
  • Building capabilities to allow development teams to be self-sufficient
  • Motivating and developing your team members and supporting their progression
  • Communicating openly with all members of your team, addressing issues head-on, and not shying away from difficult conversations
  • Empowering your team to deliver the best results by organizing clear processes and coordinating team efforts, which should be your top priority

Preferred Qualifications

  • Demonstrated experience in open-source contribution is highly desirable

Benefits

  • Remote work
  • Laptop reimbursement
  • New starter package to buy hardware essentials (headphones, monitor, etc.)
  • Learning & Development opportunities
  • Competitive PTO and Sick Leave plan
  • Medical, Dental, and Vision benefits coverage for the employee and dependents
  • 401k
  • Health Savings Account
  • Life Insurance
