Senior Site Reliability Engineer
Lumin Digital
$170k-$200k
Remote - United States
Please let Lumin Digital know you found this job on JobsCollider. Thanks!
Job highlights
Summary
Join Lumin Digital as a Senior Site Reliability Engineer (SRE) and ensure the availability, scalability, and performance of our digital banking platform. You will leverage your deep understanding of development and operations, utilizing automation to enhance reliability. Collaborate with Software Engineers to implement best practices, ensuring Service Level Objectives (SLOs) are met. Responsibilities include developing and managing CI/CD pipelines, monitoring and troubleshooting issues, collaborating with development and security teams, and engaging in capacity planning. You will also provide performance metrics and implement monitoring and alerting strategies. This role involves a 24x7 on-call rotation.
Requirements
- Strong problem-solving skills with an operations mindset and an ability to anticipate issues in large-scale systems
- Proficiency with configuration management tools such as Chef, Ansible, or Puppet
- Knowledge of standard networking protocols and components (HTTP, DNS, TCP/IP, ICMP)
- Expertise in AWS or other cloud hosting environments, with a security-focused approach to data integrity and availability
- Hands-on experience with containerization and orchestration technologies, including Docker and Kubernetes
- Advanced understanding of Terraform, CI/CD architecture, and the ability to automate workflows
- Ability to respond to incidents during off hours
Responsibilities
- Develop and manage CI/CD pipelines, ensuring efficient deployment and system updates
- Monitor and troubleshoot application and infrastructure issues across all environments, proactively ensuring SLOs and uptime requirements are met
- Collaborate with development and security teams to integrate best practices and ensure system resilience
- Engage in capacity planning and demand forecasting to anticipate performance bottlenecks and proactively scale the environment
- Manage change and configuration, ensuring stability and consistency across deployments
- Provide metrics to track system performance and identify areas for improvement
- Implement monitoring and alerting strategies that promote automation, self-healing, and effective incident response
- Participate in a 24x7 on-call rotation to support system reliability and availability
- Perform other duties as assigned
Benefits
$170,000 - $200,000 a year