Remote Senior Production Engineer

HashiCorp

📍 Remote - United States

Job highlights

Summary

Join the Terraform Platform Engineering group as a Senior Site Reliability Engineer to build and maintain the core services of HCP Terraform architecture. The team fosters operational maturity efforts, including documentation, training, and tooling related to service ownership, monitoring, and observability.

Requirements

  • Have 5+ years of production experience at scale working on any of the following: backend applications written in Ruby on Rails, databases, observability, and services written in Go
  • Experience working closely with teams building Ruby/Rails and Go services
  • Experience building and supporting the production services for a large-scale SaaS application
  • Experience building and scaling distributed, highly available systems
  • Informed opinions from experience about service ownership best practices, incident response and resolution, and platform resiliency
  • Strive for quality through maintainable code and comprehensive testing from development to deployment
  • Communicate clearly while remaining empathetic and kind
  • Have an eagerness to learn through humility and reflection
  • Have experience debugging performance bottlenecks for live services and systems
  • Working knowledge of industry best practices related to information security

Responsibilities

  • Dive into problems with an eye to both immediate remediation and the follow-through changes and automation that will prevent future occurrences
  • Troubleshoot production incidents that often span multiple teams, services, and codebases
  • Help develop and evangelize SRE best practices, techniques, and tools to the engineers building our services
  • Model our incident response process, leading by example during incidents and in blameless retrospectives
  • Maintain day-to-day vigilance with regard to operational security while helping to enhance the intrinsic security of the overall production system
  • Collaborate across teams to improve our tools based on experience gained from running our own software in production
  • Participate in a 24/7 on-call rotation that supports our production services
