Site Reliability Engineer

KnowBe4

💵 $110k-$125k
📍 Remote - Worldwide

Job highlights

Summary

Join KnowBe4 as an Internal SRE and ensure the reliability, scalability, and performance of our internal systems and infrastructure. You will manage and maintain GitLab environments, design CI/CD pipelines, monitor system performance, and collaborate with development teams. This role requires expertise in AWS services, infrastructure-as-code, and observability tools. Proactive problem-solving, automation, and strong communication skills are essential. The position offers a competitive salary and a range of benefits, including company-wide bonuses, employee referral bonuses, adoption assistance, tuition reimbursement, and certification reimbursement.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or a related field
  • Equivalent work experience in SRE, DevOps, or infrastructure management may substitute for formal education
  • GitLab Administration: Experience managing and securing self-hosted GitLab environments
  • CI/CD Workflows: Expertise in designing and maintaining automated pipelines for continuous delivery
  • AWS Cloud Expertise: Strong knowledge of AWS services, including ECS, S3, API Gateway, DynamoDB, RDS, IAM, VPC, and Lambda
  • Infrastructure-as-Code: Proficiency in Terraform, Ansible, or similar tools
  • Monitoring and Observability: Experience with Prometheus, Grafana, Datadog, or other observability platforms
  • Automation and Scripting: Proficiency in Python, Bash, or other scripting languages to automate tasks
  • Incident Management: Ability to lead incident response efforts and conduct root cause analysis
  • Collaboration and Communication: Strong interpersonal skills to work effectively across teams and with stakeholders

Responsibilities

  • Manage and maintain GitLab environments to ensure high availability and security
  • Design and implement CI/CD pipelines to automate software delivery
  • Monitor and troubleshoot system performance issues, using observability tools like Prometheus, Grafana, or Datadog
  • Collaborate with development teams to align infrastructure efforts with project needs and timelines
  • Build and maintain infrastructure as code (IaC) solutions using tools like Terraform and Ansible
  • Manage AWS services, including ECS, S3, API Gateway, DynamoDB, RDS, IAM, and VPC
  • Participate in incident response, conducting root cause analysis and post-incident reviews
  • Automate manual tasks to improve operational efficiency and reduce technical debt

Benefits

  • Company-wide bonuses based on monthly sales targets
  • Employee referral bonuses
  • Adoption assistance
  • Tuition reimbursement
  • Certification reimbursement
  • Certification completion bonuses
  • Relaxed dress code

Disclaimer: Please check that the job is real before you apply. Applying might take you to another website that we don't own. Please be aware that any actions taken during the application process are solely your responsibility, and we bear no responsibility for any outcomes.
Please let KnowBe4 know you found this job on JobsCollider. Thanks! 🙏