Site Reliability Engineer

One

💵 $175k-$205k
📍 Remote - United States

Job highlights

Summary

Join One's mission to help customers achieve financial progress as a Site Reliability Engineer (SRE). You will ensure the availability and reliability of critical services, collaborate with engineering teams, and drive the incident management process.

Responsibilities

  • Working proactively with engineering teams to help them set SLOs and implement best practices for logging and telemetry collection
  • Designing, implementing, and maintaining the tools and systems that support service reliability, monitoring, and alerting
  • Participating in a 12x7 on-call rotation supporting the health of our services
  • Driving the incident management process and supporting a blameless post-mortem culture
  • Participating in application design consulting and capacity planning
  • Defining and formalizing SRE practices and helping guide the overall reliability engineering direction
  • Providing formal and informal mentorship to engineers at One
  • Continuously optimizing systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability
  • Combining software and systems knowledge to engineer high-volume distributed systems in a reliable, scalable, and fault-tolerant manner
