Staff Software Engineer


Chainguard

💵 $192k-$235k
📍 Remote - United States

Summary

Join Chainguard, a leader in secure open-source software, as a Site Reliability Engineer. You will design, build, and scale robust cloud-based infrastructure, ensuring high availability and performance. Responsibilities include incident response, optimizing infrastructure, and promoting best practices. The ideal candidate possesses strong Linux experience, expertise in infrastructure-as-code tools, and software development skills. We offer a competitive salary, comprehensive benefits, and a collaborative work environment. If your experience closely aligns with our requirements, we encourage you to apply.

Requirements

  • Comfortable working and thriving within a Linux ecosystem
  • Experience supporting high availability distributed production systems
  • Experience with database administration and support
  • Experience treating infrastructure as code, using tools like Terraform, Ansible, Chef, Puppet, or SaltStack
  • Familiarity working in a public cloud platform (GCP, AWS, Azure)
  • Software development skills in at least one of the following languages: Python, Go, JavaScript, or Ruby
  • B.S. or M.S. in Computer Science or a related field, or equivalent work experience
  • Strong English language skills and the ability to work both independently and as an effective part of a globally distributed team
  • Ability to learn about the supply chain security space

Responsibilities

  • Practice continuous improvement by iterating on how services are deployed, configured, monitored, and maintained on our platform
  • Lead incident response, diagnosis, and follow-up on system outages and alerts
  • Help develop an operational focus and act as a thought leader for the rest of engineering
  • Maintain and optimize infrastructure for performance, scalability, and cost
  • Analyze system metrics and identify opportunities for improvement in reliability and efficiency

Preferred Qualifications

  • Experience scaling services in a performant and cost-effective manner
  • Experience implementing incident management and disaster recovery playbooks
  • Knowledge of microservices architecture and containerization (Docker/OCI, Kubernetes)
  • Familiarity across multiple public cloud platforms (GCP, AWS, Azure)
  • Experience operating a multi-tenant-capable software-defined network (SDN)
  • Linux systems troubleshooting and debugging skills
  • Solid understanding of data structures, algorithms, API design, and software design patterns
  • Interest in open source software projects and communities

Benefits

  • Equity/stock options
  • Unlimited PTO
  • Remote work with flexible coworking and team meetup opportunities
  • Home office and internet stipend
  • 100% health/dental/vision insurance coverage for you and your family
  • Monthly wellness budget

This job is filled or no longer available