Senior Manager - Site Reliability Engineering

SimSpace

💵 $180k-$250k
📍 Remote - United States

Summary

Join SimSpace as a Senior Manager, Site Reliability Engineering and lead SRE and platform engineering initiatives across our infrastructure. You will guide and mentor teams, building reliable, scalable systems on-premises and in cloud environments. Collaborate with development teams to ensure system reliability, performance, and security, providing oversight and technical leadership for platform operations. Drive the evolution of CI/CD pipelines and enhance engineering productivity. Collaborate with security teams to implement DevSecOps practices and ensure compliance. Mentor team members in SRE principles and promote their professional growth. Continuously improve operational processes to enhance system reliability and organizational efficiency. This role requires strong leadership, communication, and technical expertise in SRE, DevOps, and security automation.

Requirements

  • Experienced engineering manager with a strong background in site reliability engineering, platform operations, and distributed systems
  • Experience managing SaaS systems and shipping them as a SaaP offering on customer hardware platforms
  • Proven track record of successfully leading and scaling SRE teams in high-growth technology environments
  • Deep technical expertise in system reliability, observability, and automation with the ability to solve complex infrastructure challenges
  • Strong communication and leadership skills with the ability to influence engineering practices across multiple teams
  • Passionate about building resilient systems and empowering development teams through excellent platform experiences
  • Knowledge of SRE principles, DevOps practices, and security automation in modern software delivery
  • Comfortable with agile methodologies and able to adapt platform capabilities to evolving business requirements
  • Ability to mentor technical teams and drive adoption of reliability and automation best practices
  • Results-driven with a focus on measurable improvements in system reliability, deployment frequency, and mean time to recovery
  • 8+ years of experience in infrastructure, platform engineering, or SRE roles, with at least 3 years in management
  • Expert knowledge of SRE principles, infrastructure automation, and modern deployment practices
  • Proven track record of leading SRE or platform engineering teams to deliver highly available, scalable systems
  • Strong leadership and communication skills, with the ability to drive technical consensus and cultural change
  • Experience with DevOps and DevSecOps methodologies, including security automation and compliance integration
  • Deep understanding of observability, monitoring, and incident management practices
  • Extensive experience with container orchestration (Kubernetes), infrastructure as code, VMware, and cloud platforms
  • Demonstrated ability to design, build, and operate large-scale distributed systems with high reliability requirements
  • Experience with security automation, vulnerability management, and compliance frameworks in DevSecOps environments
  • Proven experience building and operating CI/CD platforms, developer tooling, and internal platform services
  • Experience managing infrastructure across both cloud and on-premises environments including packaging and shipping SaaP to customers
  • Bachelor's degree in computer science, engineering, or a related field (or equivalent experience)
  • U.S. Citizenship is required for this role

Responsibilities

  • Lead and manage SRE teams responsible for the reliability, scalability, and performance of our SimSpace cyber range infrastructure and services
  • Partner with development and product teams to establish SLIs, SLOs, error budgets, and reliability practices that balance feature velocity with system stability, and contribute to SLA definitions
  • Work with Customer Success teams to ensure the Core Platform meets customers' needs and requirements
  • Drive the evolution of our GitHub and ArgoCD CI/CD pipelines, deployment strategies, and developer tooling to enhance engineering productivity and code quality
  • Collaborate with security teams to implement DevSecOps practices, automated security scanning, and compliance monitoring throughout the development lifecycle
  • Develop and maintain infrastructure automation, monitoring strategies, and incident response procedures to ensure high availability and rapid recovery
  • Work with engineering leadership to establish platform standards, architectural patterns, and best practices for cloud-native and hybrid environments
  • Mentor and coach team members in SRE principles, automation practices, and system design, promoting their professional growth and technical expertise
  • Ensure compliance with security frameworks, industry standards, and regulatory requirements in all platform operations
  • Continuously improve operational processes, tooling, and practices to enhance system reliability, developer experience, and organizational efficiency

Benefits

  • Compensation. Base salary range: $180,000 – $250,000, reflecting our confidence in your expertise and impact, with the opportunity for annual bonuses tied to company performance and individual contributions
  • Health & Wellness. Comprehensive medical, dental, and vision benefits, plus savings plans—coverage starts on day one!
  • Mental Health Support. Access to company-paid counseling, coaching, and resources for you and your family through Spring Health
  • Financial Well-Being. Plan for your future with a 401(k) retirement savings plan featuring a company match
  • Flexible Time Off. Take the time you need with unlimited vacation and dedicated health & wellness days. SimSpace provides flexible solutions to meet the diverse work-life needs of team members
  • Parental Leave. Paid leave plans to support you and your loved ones during life’s most important moments
  • Ownership Opportunities. Equity stock options at hire, with annual performance-based grants. Become an invested stakeholder in our shared success
  • Referral Rewards. Earn $1,500–$3,500 for every qualified hire through our employee referral program
  • Peloton Interactive Wellness Program. Fully and partially subsidized membership plans and equipment discounts to help you reach your personalized fitness goals
  • Continuous Learning. Access a LinkedIn Learning membership to prioritize your personal and professional development
  • Social Connections. Monthly reimbursements for meaningful connections with teammates through our SocialSpace Community
  • Extra Perks. Legal plan coverage, pet insurance, wellness reimbursements, and more to simplify life’s details
