Platform Engineer - Kubernetes

Portainer.io

📍 Remote - Argentina

Job highlights

Summary

Join Portainer as a Platform Engineer - Kubernetes. You will manage large-scale Kubernetes environments, maintain the platform's reliability and scalability, and participate in an on-call rotation to handle critical incidents.

Requirements

  • 6 years of total experience in IT and platform engineering
  • 4 years managing Kubernetes environments
  • Experience with Docker Swarm is an advantage
  • Experience in operations, virtualization, cloud infrastructure (AWS, Azure, GCP), and DevOps practices
  • Familiarity with ITIL-based practices for incident management and service requests
  • Expertise in Kubernetes, Docker, and container orchestration tools
  • Experience with monitoring and logging tools (Prometheus, Grafana, Loki, etc.)
  • Proficient in scripting and automation (Python, Bash, Terraform, Ansible)
  • Knowledge of CI/CD pipelines and GitOps practices
  • Knowledge of virtualization technologies (VMware)

Responsibilities

  • Manage and optimize large-scale Kubernetes clusters
  • Perform version updates, configuration changes, and troubleshoot issues
  • Assist with and maintain container orchestration using Kubernetes
  • Maintain and expand the platform solution to meet SLA/OLA requirements
  • Perform platform moves/adds/changes and monitor core platform metrics
  • Manage load across components and ensure normal operating parameters
  • Implement component updates for defect resolution and preventive maintenance
  • Create and maintain documentation for service levels, roles, and responsibilities
  • Conduct platform reviews and tooling deployments
  • Aid in the use of GitOps pipelines and assist in application deployment strategies
  • Provide guidance on namespace, cluster, access control, and isolation best practices
  • Implement blue/green deployment strategies and assist with performance issues
  • Develop automations for preventive maintenance and operational efficiency
  • Create and validate cluster recovery guides to ensure infrastructure recoverability
  • Be part of a team that provides 24/7 emergency engineering support with a 1-hour response SLA
  • Analyze alerts and perform root cause analysis to prevent recurrence

Benefits

  • Highly competitive salary
  • Stock options
  • Insurance
  • Ability to work anywhere in the world while still being part of a dynamic team taking on some of the most interesting challenges in the technology/infrastructure space

Please let Portainer.io know you found this job on JobsCollider. Thanks! 🙏