Remote Principal Site Reliability Engineer


SimSpace

💵 $204k-$275k
📍 Remote - United States

Job highlights

Summary

Join SimSpace as a Principal Site Reliability Engineer to design and implement reliable, scalable, and highly available systems and infrastructure for our cloud-based applications.

Requirements

  • In-depth experience in software development and/or infrastructure engineering, with a focus on site reliability and/or system administration
  • Strong experience in cloud computing, particularly with Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP)
  • Must have extensive experience in containerization technologies such as Docker and Kubernetes
  • Strong experience in a scripting language such as Python, Perl, or Ruby
  • Proficiency in Terraform or CloudFormation for managing infrastructure using infrastructure-as-code (IaC) principles
  • Proficiency in a configuration management tool such as Puppet, Chef, or Ansible
  • Deep understanding of networking concepts such as TCP/IP, DNS, load balancing, and firewalls

Responsibilities

  • Develop and implement strategies for monitoring and alerting on system health, performance, and security
  • Develop and implement strategies for incident management, problem management, and change management
  • Create and maintain automation tools and scripts for configuration management, deployment, and maintenance of cloud-based infrastructure
  • Conduct performance and capacity planning to ensure the systems are operating at optimal levels
  • Implement and manage the disaster recovery plan, ensuring that the systems are backed up and can be recovered in case of an outage
  • Collaborate with development and operations teams to ensure that application and infrastructure changes are properly tested, deployed, and maintained
  • Evaluate new technologies and tools, and make recommendations for their adoption based on their impact on system performance, reliability, and scalability
  • Develop and maintain documentation of system configurations, processes, and procedures
  • Provide technical leadership and mentoring to other engineers on the team
This job is filled or no longer available