Remote Site Reliability Analyst

Almanak

πŸ“Remote - Vietnam

Job highlights

Summary

The job is for a Site Reliability Analyst at Almanak Blockchain Labs, a data science company focused on decentralized networks. The role involves ensuring the reliability, scalability, and performance of systems, working closely with engineering and operations teams to optimize infrastructure, automate processes, and implement best practices.

Requirements

  • 2+ years of experience in site reliability engineering, DevOps, or a similar role with a focus on release management and platform scalability
  • Proficiency with Python, Google Cloud Platform (GCP), and infrastructure as code (IaC) tools, specifically Terraform, for provisioning and managing infrastructure
  • Solid understanding of CI/CD principles and experience with CI/CD tools to support efficient development and deployment processes
  • Experience with feature flag management tools (e.g., LaunchDarkly) and strategies for safe, incremental feature rollouts
  • Knowledge of deployment strategies such as canary releases and blue-green deployments, with the ability to implement these processes effectively
  • Ability to provide on-call support, troubleshoot and resolve issues promptly, ensuring platform reliability and service continuity
  • Strong documentation skills, with the ability to create clear and detailed guides for system configurations, deployment procedures, and incident response
  • Excellent problem-solving abilities, with a proactive approach to identifying and mitigating potential issues before they affect users

Responsibilities

  • Lead release management efforts, ensuring smooth and reliable software releases through well-managed deployment processes
  • Implement and manage feature flags using tools like LaunchDarkly, facilitating safe and controlled feature releases and A/B testing
  • Oversee the scaling of the platform to handle increased load, optimizing for performance and reliability
  • Conduct deployment management, including the coordination of canary deployments and blue-green deployment strategies, to minimize disruption and ensure high availability
  • Contribute to the setup and maintenance of CI/CD pipelines, leveraging automation to improve development workflows and deployment efficiency
  • Provide L1-L2 on-call support, quickly addressing and resolving incidents to maintain service quality and platform stability
  • Perform small bug fixes and respond to feature requests, contributing to the continuous improvement of the platform
  • Develop and maintain comprehensive documentation covering deployment processes, incident management, and system configurations

Preferred Qualifications

  • Experience with trunk-based development and its implementation in a CI/CD pipeline to support rapid and safe code integrations
  • Familiarity with scaling strategies for high-traffic applications, including load balancing and resource optimization techniques

Benefits

  • Competitive compensation, paid in either fiat or crypto
  • Flexible schedule & remote work
  • Co-working space, gear & education budgets
  • Impact: You'll work with some of the smartest people in the space and play a pivotal role in influencing the way some of the most popular crypto applications are built
  • The company invests in your working comfort as well as your personal growth

This job is filled or no longer available.