Remote Site Reliability Engineer

Catchpoint

πŸ“Remote - Turkey

Job highlights

Summary

Join Catchpoint as a Site Reliability Engineer to support the systems that run its global monitoring platform, working with operations and development teams to build and maintain automation and monitoring. The role requires an operational mindset and the ability to solve problems at global scale.

Requirements

  • Strong experience with, or knowledge of, administering application servers, web servers, and databases
  • Familiarity with automation and configuration management tools (preferably Terraform)
  • Good networking knowledge and experience with Internet architecture (BGP, peering, DNS)
  • 2+ years of incident resolution experience in a large-scale operations environment
  • Hands-on experience with cloud deployment, monitoring, and ops analysis tools such as Prometheus, Elasticsearch, Grafana, Kibana, Splunk, Terraform, and Jenkins
  • 3+ years of experience with Python, Bash, PowerShell, C, or similar languages
  • Virtualization experience required
  • BS degree in Computer Science or a related technical field involving coding, or equivalent practical experience

Responsibilities

  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health. Establish performance baselines, and define actions and automation by correlating data from multiple sources
  • Design, build, and maintain logging and telemetry systems that are used to manage all services
  • Design, code, test, and deliver software to automate manual operational work
  • Troubleshoot priority incidents, facilitate blameless post-mortems, and ensure permanent closure of incidents
  • Identify application patterns and analytics in support of better service level objectives
  • Deploy and maintain systems that run on multiple cloud providers (AWS, GCP, Azure, Alibaba, Tencent, Oracle, IBM) and physical systems around the world
  • Be part of an on-call rotation to support production systems
