Site Reliability Engineer

ASCENDING
Summary
Join our client, a leading national healthcare provider, as a Senior Site Reliability Engineer on their Cloud Infrastructure team. You will collaborate with a Kubernetes Architect and software engineering teams to onboard applications onto Azure Kubernetes Service (AKS) clusters in the Azure cloud. This role requires 5+ years of experience deploying AKS clusters in Azure, significant Infrastructure as Code (IaC) experience, and expertise in Kubernetes. You will design and implement microservices-based architectures, use Kubernetes for container orchestration, and implement robust monitoring and alerting mechanisms. The position also involves mentoring junior team members and conducting technology evaluations. This is a 100% remote position within the continental US, open to US citizens only.
Requirements
- Strong Kubernetes (K8s) experience required
- 5+ years of experience in rearchitecting large-scale monolithic applications to cloud-native architectures
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field
- 3+ years of experience with cloud computing platforms, specifically Azure, including serverless architectures, containers, and orchestration
- Experience with payment industry standards, protocols, and security best practices
- Knowledge of load balancing algorithms (this will be asked about in the interview)
- Must have worked in an Azure environment, working heavily with Kubernetes (AKS)
- Must know security best practices for containers and clusters
- Must know Terraform well
- Strong experience with Infrastructure as Code
Responsibilities
- Stay up to date with emerging technologies, frameworks, and industry trends related to payment systems and cloud computing
- Design and implement microservices-based architecture using domain-driven design principles
- Utilize Kubernetes for container orchestration and management, ensuring scalability, reliability, and high availability of the payment system
- Implement robust monitoring, logging, and alerting mechanisms to ensure system performance and availability
- Develop highly resilient and highly available components for the payment system
- Conduct technology evaluations and provide recommendations for new tools, technologies, and frameworks that can enhance our payment infrastructure
- Mentor and provide technical guidance to junior team members, fostering a culture of continuous learning and professional growth