Senior Site Reliability Engineer at Kontakt.io

Summary

Join Kontakt.io as a Senior Site Reliability Engineer (SRE) and ensure the scalability, availability, and security of our cloud-based, AI-driven healthcare platform. You will collaborate across teams to build highly resilient, automated systems that shape how healthcare organizations use real-time data. Your expertise in cloud infrastructure, automation, monitoring, and performance optimization will be crucial, and a passion for highly available systems and automation is essential. Help us build the future of smart care operations! We offer a competitive salary, stock options, flexible work options, and benefits.

Requirements

3+ years of experience as an SRE
Strong expertise in Kubernetes, Docker, and container orchestration
Experience managing cloud-native environments (AWS)
Experience with event-driven architectures, Kafka, or real-time data streaming
Knowledge of machine learning infrastructure
Previous experience in healthcare, compliance (HIPAA), and highly regulated environments
Proficiency in Infrastructure as Code (IaC) using Terraform
Deep knowledge of networking, DNS, load balancing, and security best practices
Experience with CI/CD pipelines (Jenkins, CI, or ArgoCD)
Hands-on experience with monitoring and logging tools (Prometheus, Grafana, ELK, OpenTelemetry)
Strong programming skills in Python, Golang, or Bash for automation

Responsibilities

Design and maintain highly available, fault-tolerant, and scalable cloud infrastructure
Implement SLOs, SLIs, and SLAs to track system reliability and optimize uptime
Participate in a 24/7 on-call rotation
Oversee production platform deployments
Monitor latency, traffic, errors, and system health using modern observability tools
Conduct root cause analysis (RCA) and post-mortems to continuously improve system resilience
Automate infrastructure provisioning using Terraform, Ansible, or Pulumi
Implement CI/CD pipelines to ensure seamless and safe deployments
Enable self-healing mechanisms using Kubernetes operators, auto-scaling, and fault detection
Ensure compliance with HIPAA, GDPR, and other healthcare data regulations
Define and execute disaster recovery (DR) and business continuity plans
Manage and optimize AWS environments for cost-efficiency and performance
Deploy and manage observability tools and build real-time alerting and response frameworks
Establish best practices for logging, debugging, and performance monitoring
Improve incident response automation through runbooks, AI-based anomaly detection, and predictive analytics

Benefits

Work on a mission-driven platform that improves healthcare operations and patient outcomes
B2B contract or an employment agreement
Competitive salary and stock option plan
Collaborate with top engineers, data scientists, and AI experts
Flexible remote or hybrid work options (office in Krakow)
Collaborative and self-organized environment
Private medical care, cafeteria system

Remote

DevOps

Senior
