Technical Manager, Site Reliability Engineering

Coalfire

💵 $94k-$163k
📍 Remote - Worldwide

Job highlights

Summary

Join Coalfire's Management and Operations team as a leader in cloud technology solutions. You will ensure client SLAs are met, collaborate with cross-functional teams, and drive operational excellence. Responsibilities include managing a team, leading escalations, improving tools and processes, and developing governance models. This role requires extensive experience in cloud architecture, AWS, IaC, and team management. Coalfire offers a flexible work model, competitive perks, paid parental leave, flexible time off, and comprehensive insurance options.

Requirements

  • Proven experience supporting clients in a managed services environment
  • Demonstrated expertise in ticket management and meeting SLA requirements
  • Strong relationship management skills, with a track record of building and maintaining client trust
  • Extensive experience with AWS, Azure, or GCP services; cloud certifications are highly preferred
  • Deep expertise with Terraform, Ansible, GitLab, and CI/CD technologies
  • Proven experience managing technical teams of 5-8 members
  • Exceptional communication, organizational, and problem-solving skills in fast-paced environments
  • Strong documentation skills, including the creation of technical diagrams and comprehensive written descriptions
  • Ability to work autonomously and collaboratively, demonstrating a professional attitude and demeanor
  • Critical thinker capable of balancing security requirements with mission objectives
  • Proven ability to derive application and platform infrastructure requirements
  • Minimum of 7 years of experience in systems engineering and architecture, including requirements definition, architecture development, use case creation, and systems integration and testing
  • Minimum of 7 years of experience in cloud architecture, design, implementation, operations, and automation, specifically in AWS
  • Minimum of 7 years of hands-on experience with Infrastructure-as-Code and orchestration/automation tools, including Terraform and Ansible

Responsibilities

  • Ensure client Service Level Agreements (SLAs) are consistently met, particularly regarding availability, response times, and service posture
  • Collaborate with Site Reliability Engineers and cross-functional teams to identify, prioritize, and escalate bugs or issues, ensuring appropriate fixes and responses are driven effectively
  • Lead and coordinate Coalfire and client teams during escalations, with a primary focus on fast-tracking issue resolution
  • Drive operational excellence and benefits realization through continuous measurement and improvement of key performance indicators (KPIs)
  • Lead initiatives to improve existing tools and processes while providing actionable feedback on new practices and procedures
  • Guide the creation and maturation of internally managed Infrastructure as Code (IaC) solutions to enhance project efficiency and reduce system variability
  • Develop and advocate for governance models that promote consistent use of cloud technologies, aligning with institutional strategies, policies, and regulatory standards such as FedRAMP
  • Ensure operational support team members are well-versed in client business objectives, architectures, cloud adoption strategies, and operating models
  • Prepare and coach the team for regulatory compliance audit interviews with third-party auditors, with a focus on control implementations during client assessments
  • Manage and mentor a team of 6-8 individual contributors, focusing on career development, goal setting, project management, quality assurance, and daily guidance
  • Contribute to the definition, planning, and documentation of key Managed Services projects and initiatives while tracking progress and outcomes against established goals
  • Support hiring and personnel development to meet current needs and enable scalable growth aligned with client expansion
  • Work with other TSMs to support the practice and drive technical growth and innovation

Preferred Qualifications

  • Previous experience in a consulting role
  • Previous experience managing or operating a 24x7x365 highly available environment for a SaaS vendor
  • Demonstrated expertise in implementing system encryption (SSL, PKI, FIPS 140-2) and system hardening using CIS Benchmarks and DISA STIG standards
  • Experience with cloud-based networking technologies and other network technologies such as Palo Alto, Cisco ASAv, and similar platforms
  • Proficiency in serverless, microservices, and modern application architectures
  • Advanced skills in creating diagrams using tools like Visio and Lucidchart, along with experience using Jira for project management
  • Experience leading cloud infrastructure audits and gathering artifacts to demonstrate compliance with at least one regulatory framework
  • Familiarity with frameworks such as FedRAMP, FISMA, SOC, ISO, HIPAA, HITRUST, and PCI

Benefits

  • Flexible work model
  • Paid parental leave
  • Flexible time off
  • Certification and training reimbursement
  • Digital mental health and wellbeing support membership
  • Comprehensive insurance options
  • Annual incentive
  • Commission
  • Recognition programs
