Staff Monitoring Engineer

Kaseya

📍 Remote - United States

Job highlights

Summary

Join the Kaseya growth rocket ship and see how we are #ChangingLives! As a Senior Linux Monitoring and Observability Engineer, you will participate in the architecture, management, and operation of monitoring infrastructure systems, work with large fleets of hosts and data at scale, and contribute to documentation and monitoring plans.

Requirements

  • 10+ years of experience with Linux administration (Ubuntu preferred, but other distributions are fine)
  • 10+ years of experience with monitoring platforms at scale
  • Comfortable working with observability tools for logs, metrics, or traces
  • Willingness to participate in a small team that consists of mostly remote members
  • Eager to learn things quickly
  • Ability to effectively manage time and push multiple projects at once
  • Intermediate knowledge of at least some of our architecture components
  • Comfortable working within Git-based deployment workflows, using related tooling
  • Solid configuration management skills (Ansible and Terraform primarily, Foreman/Puppet, Salt)
  • Experience with metric query languages (PromQL/MetricsQL)
  • Experience with scripting and system automation (Bash, Python, Go, Perl, Awk, etc.)
  • Experience with Kubernetes or other microservice-based environments
  • Work well in an energized environment
  • Excellent documentation skills (wikis, ticket details, blog posts, etc.)
  • Excellent diagnostic expertise including problem investigation, root-cause analysis, and resolution skills
  • Comfortable working with open source projects and supporting them directly as needed (bug fixes/new features)
  • The desire and ability to see a problem through to a complete solution

Responsibilities

  • Participate in the architecture, management, and operation of the monitoring infrastructure systems including Zabbix, Alerta, Kafka, OpenSearch, Jaeger, and VictoriaMetrics
  • Work with large fleets of hosts and data at scale across both bare-metal and virtualized architectures
  • Learn what makes distributed systems tick by collecting and analyzing metrics
  • Instrument relevant data with visualizations and dashboards (Grafana/OpenSearch Dashboards)
  • Contribute to documentation, monitoring plans and alert configuration strategies
  • Provide ad-hoc support to other engineering groups in the company via Teams/Jira tickets
  • Ensure system and service stability, scalability, security, and performance
  • Solve complex scaling challenges
