Remote Staff Software Engineer, Site Reliability Engineer
Babylist
$159k-$239k
Remote - United States
Please let Babylist know you found this job on JobsCollider. Thanks!
Job highlights
Summary
Join Babylist as a Staff or Senior Software Engineer, Site Reliability, to ensure the stability, scalability, and reliability of our systems and services. You will work closely with all Babylist Engineering teams to support shared infrastructure and developer tools.
Requirements
- 6+ years of experience as a Site Reliability Engineer or in a similar role, with a strong background in maintaining highly available and scalable systems
- Experience supporting high-traffic consumer-facing websites, understanding the unique challenges and considerations in maintaining such systems
- Proficiency with Terraform is a must, as you will be a member of the team responsible for managing and building our AWS infrastructure using Infrastructure as Code (IaC) practices
- You possess strong experience working with AWS cloud-based infrastructure and services, ensuring their reliability, performance, and security
- Proficiency with Docker and Kubernetes is essential, as you will contribute to the design, deployment, and management of containerized applications in our environment
- You have a solid understanding of cloud-native systems design, including CDNs, load balancers, cloud networking, DNS, caching, and distributed systems
- Troubleshooting and debugging are second nature to you, allowing you to quickly identify and resolve issues across various environments
- Experience designing and supporting CI systems such as CircleCI, Jenkins, or GitHub Actions
- You are familiar with monitoring and alerting best practices, utilizing tools like Datadog, Cronitor, Sentry, and PagerDuty to ensure proactive identification and resolution of issues
- Proven experience in on-call management best practices, including effective incident response, escalation procedures, and post-incident reviews to drive continuous improvement and ensure system reliability
- You have excellent verbal and written communication skills, and the ability to collaborate effectively with cross-functional teams
Responsibilities
- Manage and build our AWS infrastructure using Infrastructure as Code (IaC) tools like Terraform
- Improve the speed and reliability of our Continuous Integration (CI) systems to support the entire Engineering Team, enabling faster and more efficient development and deployment processes
- Provide support to developers in troubleshooting issues across local development, staging, and production environments
- Establish, communicate, and support best practices for monitoring and alerting. This will involve setting up effective monitoring systems and defining actionable alerts for proactive incident management
Benefits
- Company paid medical, dental, and vision insurance
- Generous paid parental leave policy
- 401k with company match
- Flexible spending account