Crisis Text Line is hiring a
Senior Infrastructure Site Reliability Engineer (Worldwide)

Senior Infrastructure Site Reliability Engineer
🏢 Crisis Text Line
💵 $107k-$162k
📍Worldwide
📅 Posted on May 26, 2024

Summary

Crisis Text Line is a mental health support organization that provides free, 24/7 text-based crisis intervention services. Its engineering, product, and design teams are seeking a Site Reliability Engineer (SRE) to lead and maintain the cloud infrastructure on AWS Fargate, design and maintain CloudWatch alerting and monitoring configurations, mentor junior team members, collaborate with cross-functional teams, and automate repetitive tasks. The ideal candidate has experience in site reliability engineering or related roles, hands-on experience with AWS services, proficiency with infrastructure as code tools, and strong scripting and automation skills. Experience with container orchestration platforms, a solid understanding of networking concepts, security best practices, and DevOps principles, and strong problem-solving skills are also expected.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or related field (Master's degree preferred) or equivalent experience
  • Experience in site reliability engineering (SRE) or related roles, with a focus on cloud infrastructure management
  • Hands-on experience with AWS services, particularly AWS Fargate, CloudWatch, and related tools
  • Proficiency with infrastructure as code (IaC) tools such as Terraform or CloudFormation
  • Strong scripting and automation skills using languages such as Python, Bash, or PowerShell
  • Experience with container orchestration platforms such as Kubernetes or Amazon ECS
  • Solid understanding of networking concepts, security best practices, and DevOps principles
  • Strong problem-solving skills and the ability to work effectively in a fast-paced, collaborative environment

Responsibilities

  • Lead and maintain highly available, scalable, and secure infrastructure on AWS Fargate
  • Design and maintain CloudWatch alerting and monitoring configurations to proactively identify and resolve potential issues
  • Mentor and guide junior team members, sharing best practices and promoting a culture of excellence
  • Collaborate with cross-functional teams to define and implement best practices for infrastructure as code (IaC), continuous integration/continuous deployment (CI/CD), and site reliability engineering (SRE) methodologies
  • Lead incident response and resolution, including troubleshooting complex system issues and implementing preventive measures to minimize downtime
  • Automate repetitive tasks and processes to improve operational efficiency and reduce manual intervention
  • Conduct performance tuning and optimization of infrastructure components to ensure optimal resource utilization and cost efficiency
  • Stay up-to-date with emerging technologies and industry trends to drive innovation and continuous improvement

Benefits

  • 20 paid holidays, including federal holidays such as Juneteenth and Labor Day, Election Day, a holiday break from December 24 through January 1, 2 days for renewal, and 2 floating holidays
  • Flexible paid time off, including 15 vacation days, 3 personal days, and 7 sick days
  • Medical, dental, and vision benefits for the staff member and family at no cost to the employee
  • 403(b) retirement plan (the nonprofit equivalent of a 401(k)): 3% contribution by Crisis Text Line to support building financial wellness, regardless of personal contribution
  • 12 weeks paid parental leave (after 6 months of employment)
  • Student loan repayment (after 2 years of continuous full-time service)
  • Family support through a virtual childcare platform
  • Stipends/Allowances for mental health, internet service, professional development, and wellness