DevOps Observability Engineer

ASCENDING

πŸ“Remote - United States

Summary

Join our team as a DevOps Observability Engineer and play a key role in our infrastructure transition to Microsoft Azure. This critical role focuses on building and maintaining robust monitoring, logging, and tracing solutions. You will ensure system performance and reliability, providing insights for operational excellence. Key responsibilities include designing and implementing observability solutions, supporting the Azure migration, using tools like Dynatrace and the ELK Stack, and automating alerts and reporting. You will analyze data to optimize performance, troubleshoot incidents, and collaborate with various teams. This is a long-term contract position (36+ months) that is fully remote (U.S. only) and requires U.S. citizenship.

Requirements

  • Experience: 5-7 years of progressive experience in DevOps roles
  • Dedicated Observability Experience: Minimum of 2 years of dedicated experience specifically in DevOps Observability, focusing on implementing and managing monitoring, logging, and tracing solutions
  • Cloud Proficiency: Strong hands-on experience with Microsoft Azure services, particularly those related to infrastructure, networking, and monitoring
  • Observability Tools: Expert-level proficiency with Dynatrace and the ELK Stack (Elasticsearch, Logstash, Kibana)
  • Scripting: Strong programming and scripting skills, particularly in Python, for automation and data manipulation
  • Problem-Solving: Excellent troubleshooting, analytical, and problem-solving abilities
  • Communication: Strong communication skills, both written and verbal, with the ability to convey complex technical information to diverse audiences

Responsibilities

  • Design and Implement Observability Solutions: Architect, implement, and manage comprehensive monitoring, logging, and tracing systems for both existing on-premises infrastructure and new Azure cloud environments
  • Azure Migration Support: Play a key role in the migration to Azure, specifically designing and deploying observability tools and practices within the Azure ecosystem
  • Tooling Expertise: Utilize and optimize tools such as Dynatrace, ELK Stack (Elasticsearch, Logstash, Kibana), and other relevant platforms to capture and visualize system metrics, logs, and traces
  • Automated Alerting & Reporting: Develop and configure automated alerts, dashboards, and reports to provide real-time insights into system health, performance bottlenecks, and potential issues
  • Performance Optimization: Analyze observability data to identify performance degradation, troubleshoot complex incidents, and recommend solutions for system optimization and stability
  • Scripting & Automation: Write and maintain automation scripts (primarily in Python) for integrating observability tools, automating data collection, and streamlining operational tasks
  • Incident Response & Root Cause Analysis: Support incident response efforts by providing critical data and analysis, facilitating rapid root cause identification and resolution
  • Collaboration & Best Practices: Collaborate closely with development, operations, and security teams to embed observability best practices throughout the software development lifecycle

Preferred Qualifications

  • Experience with other monitoring tools (e.g., Prometheus, Grafana, Splunk, Datadog)
  • Familiarity with containerization technologies (Docker, Kubernetes)
  • Experience with Infrastructure as Code (Terraform, Azure Resource Manager templates)
  • Background working with hybrid cloud environments (on-premises to cloud migration)
