Site Reliability Engineer

Tekmetric

📍 Remote - Worldwide

Summary

Join Tekmetric, a leading cloud-based platform for auto repair shops, and contribute to our innovative team. Design and implement scalable infrastructure, monitor system performance, and automate processes to ensure high availability and security. Collaborate with cross-functional teams and provide technical leadership. The ideal candidate possesses 5+ years of DevOps experience, strong cloud infrastructure knowledge (AWS or GCP), and expertise in automation, containerization, and monitoring tools. We offer competitive salaries, generous PTO, comprehensive health benefits, retirement plans, and professional development opportunities.

Requirements

  • Experience: 5+ years of experience in DevOps, Site Reliability Engineering (SRE), or a related field, with deep knowledge of cloud environments (preferably AWS or GCP)
  • Cloud Infrastructure: Hands-on experience with AWS (or similar cloud providers) and infrastructure as code (Terraform, etc.)
  • Automation: Strong experience with automation tools
  • Containerization: Expertise in working with containerized environments like Docker and orchestration tools such as Kubernetes
  • Monitoring and Logging: Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
  • Scripting: Proficiency in scripting languages like Python, Bash, or similar
  • CI/CD pipelines: Experience with designing and optimizing Continuous Integration and Continuous Deployment (CI/CD) pipelines
  • Collaboration and Communication: Strong communication skills and ability to work cross-functionally, solving complex technical challenges in a collaborative manner
  • Problem-solving mindset: Ability to troubleshoot and resolve critical issues in high-pressure environments, maintaining composure and professionalism

Responsibilities

  • Design and implement scalable infrastructure: Architect and maintain reliable, scalable, and secure cloud infrastructure that supports positive user experiences and measurable business growth
  • Monitor and optimize system performance: Develop and maintain monitoring, alerting, and incident response practices to ensure system reliability and performance at scale
  • Automate everything: Create automated pipelines for deployment, testing, and infrastructure management to improve speed, consistency, and reliability across the organization
  • Ensure high availability and disaster recovery: Implement and manage solutions for backup, disaster recovery, and failover processes to ensure business continuity
  • Security and compliance: Apply best practices in security, monitoring, and compliance, ensuring that systems meet necessary requirements and regulations
  • Collaboration: Work cross-functionally with development, data, product, and QA teams to improve application reliability and scalability
  • Leadership and mentorship: Provide technical leadership, mentorship, and guidance to junior DevOps team members, fostering a culture of continuous learning and improvement

Preferred Qualifications

  • Experience with Infrastructure as Code tools like Terraform
  • Familiarity with monitoring tools like Prometheus, Grafana, or the ELK stack
  • Exposure to compliance and security best practices in cloud environments
  • Experience coding in one or more programming languages, such as Go, Java, or JavaScript

Benefits

  • Enjoy the flexibility of remote work
  • Competitive base salaries that reflect your value
  • Generous Paid Time Off, because we know you do your best work when you're well-rested
  • Support for every stage of life, with paid maternity, parental bonding, and medical leave for you or your loved ones
  • Comprehensive health benefits, including Medical, Dental, Vision, and Prescription coverage. For employee-only coverage, we offer plans that cover 100% of premiums, and we cover 50% of costs for families
  • Prioritizing your mental health: get free, confidential counseling through our partnership with BetterHelp
  • 401(k) Retirement Savings Plan with 100% employer match on contributions up to 6% - so your future self will thank you
  • Flexible Spending Accounts (FSA) and Health Savings Accounts (HSA) to make your money go further
  • Life and Accidental Death & Dismemberment (AD&D) Insurance for added peace of mind
  • Wellness on your terms: get up to $60/month toward fitness, mental health, or almost anything that helps you feel your best
  • After one year of employment, enjoy a $300 home office setup bonus to help make your space work for you
  • Keep growing with support for continuing education - we're invested in your development
