Skyward is hiring a
Site Reliability Engineer, Remote - Worldwide

Site Reliability Engineer (closed)

🏢 Skyward

💵 ~$82k-$120k
📍 Worldwide

Summary

Skyward is seeking a Site Reliability Engineer to support the modernization of enterprise systems for the Centers for Medicare and Medicaid Services (CMS). The role covers developing cloud management best practices, architecting automation solutions, driving incident management processes, collaborating with development teams, establishing and monitoring SRE metrics, mentoring team members, and staying current with AWS technologies and SRE best practices.

Requirements

  • A bachelor’s degree in computer science, engineering, or related field
  • 5+ years of experience in site reliability engineering, systems engineering, or related roles, with at least 3 years focused on AWS cloud environments
  • Deep expertise in AWS services, architecture, and best practices
  • Proficiency in infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation)
  • Solid understanding of CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI)
  • Expertise in scripting and programming languages (e.g., Python, Bash, Go)
  • Exceptional problem-solving skills and the ability to work under pressure
  • Excellent communication skills, with a knack for presenting complex data concepts in an understandable manner

Responsibilities

  • Join a team supporting the Centers for Medicare and Medicaid Services (CMS) effort to modernize access to enterprise systems, reducing manual effort, improving data accuracy, and enhancing transparency for stakeholders
  • Develop and enforce best practices for cloud management, including cost optimization, security policies, and deployment strategies
  • Architect and implement automation solutions for infrastructure provisioning, configuration, and deployment processes
  • Drive incident management processes, including post-mortem analysis and implementing preventative measures
  • Collaborate with development teams to embed reliability and performance considerations in the software development lifecycle
  • Establish and monitor SRE metrics, including service level indicators (SLIs) and service level objectives (SLOs), to measure and improve system reliability and performance

This job is filled or no longer available.