Site Reliability Engineer II

Earnest

💵 $155k-$175k
📍 Remote - United States

Job highlights

Summary

Join Earnest, a company dedicated to making higher education more accessible and affordable. As a Site Reliability Engineer II, you will play a crucial role in ensuring the reliability and scalability of our systems. You will be responsible for setting up monitoring, developing IaC, implementing tools to measure SLOs/SLAs/SLIs, and automating infrastructure. This role requires 3+ years of experience in SRE or a similar field, hands-on experience with cloud providers (AWS), containerization, CI/CD, and observability tools. The position offers a competitive salary, remote work flexibility with monthly in-office collaboration, and a comprehensive benefits package including health insurance, retirement plan, paid time off, and more. Earnest fosters a culture of growth, humility, and ownership, making it an ideal environment for driven and collaborative individuals.

Requirements

  • 3+ years of professional experience in Site Reliability Engineering or a similar role, with a focus on infrastructure, automation, and system reliability
  • Hands-on experience with cloud providers (AWS), containerization (Kubernetes, Docker), CI/CD pipelines, and observability tools (e.g., Prometheus, Grafana or New Relic/Splunk)
  • Willing to travel to the Oakland office monthly to engage with team members and strengthen collaboration

Responsibilities

  • Set up and maintain comprehensive monitoring, create and refine playbooks, build dashboards, and adopt industry-standard practices to enhance the reliability and resilience of our site and systems
  • Develop and manage IaC to ensure reliable, scalable, and high-performance systems, reducing configuration drift and enabling rapid recovery
  • Implement and maintain both in-house and SaaS-based tools to measure SLOs, SLAs, and SLIs, ensuring we meet our reliability targets and provide transparency into system health
  • Identify opportunities for automation across the infrastructure to minimize manual interventions, streamline operations, and improve response times
  • Participate in on-call rotations, respond to incidents, conduct root cause analyses, and contribute to post-incident reviews to drive improvements
  • Work closely with cross-functional teams to enhance system design, support code deployments, and optimize system performance

Preferred Qualifications

  • Passionate about seeking opportunities to innovate and implement changes that enhance system reliability and client satisfaction
  • Champions self-service infrastructure solutions to empower development teams and accelerate deployment cycles
  • Embodies continuous improvement and is committed to driving projects beyond "good enough" toward operational excellence
  • Proactively identifies potential issues and implements preventive measures to ensure consistent system uptime
  • Able to clearly document processes and communicate with technical and non-technical stakeholders to ensure alignment

Benefits

  • Health, Dental, & Vision benefits plus savings plans
  • Mac computers + work-from-home stipend to set up your home office
  • Monthly internet and phone reimbursement
  • Employee Stock Purchase Plan
  • Restricted Stock Units (RSUs)
  • 401(k) plan to help you save for retirement plus a company match
  • Robust tuition reimbursement program
  • $1,000 travel perk on each Earnie-versary to anywhere in the world
  • Competitive annual PTO
  • Competitive parental leave
