Site Reliability Engineer

Nextiva

📍 Remote - Spain

Summary

Join Nextiva as a Site Reliability Engineer to enhance, support, and troubleshoot our SaaS platform. You will provide critical support for compliance-driven environments, contributing to initiatives related to FedRAMP authorization, security hardening, and industry-specific compliance standards. This role requires triaging, troubleshooting, and fixing production problems; designing, developing, and improving monitoring systems; identifying and automating tasks; and writing software to improve system reliability. You will collaborate with compliance and security teams and ensure platform reliability for regulated customer environments. The ideal candidate is a generalist comfortable working between development and systems, with experience in compliance-focused environments and strong communication skills.

Requirements

  • Bachelor's degree in Computer Science or related field, or equivalent work experience
  • Bilingual Spanish and English
  • 0–2 years of software development experience
  • 0–2 years of Linux system administration experience
  • 0–2 years of performance engineering experience
  • Experience working with RESTful APIs
  • Experience troubleshooting complex systems
  • Experience working with source control systems (e.g., Git)
  • Familiarity with containerization and orchestration (e.g., Docker, Kubernetes)
  • Familiarity with front-end technologies
  • Familiarity with application performance monitoring tools
  • Familiarity with relational databases and SQL
  • Familiarity with microservices and distributed system design
  • Ability to clearly communicate technical concepts
  • Working knowledge of general SRE concepts and DevOps principles
  • Understanding of, or experience supporting, regulated environments and public sector clients

Responsibilities

  • Triage, troubleshoot, and fix production problems in every layer of the stack
  • Design, develop, improve, and tune logging, monitoring, and alerting systems
  • Identify manual tasks, document fixes via runbooks, and drive automation
  • Write software to improve the reliability and recoverability of production systems
  • Perform and automate system administration tasks
  • Participate in on-call rotation supporting production systems
  • Collaborate with compliance and security teams to meet standards for FedRAMP, HIPAA, and other regulatory frameworks
  • Ensure platform reliability and availability for regulated customer environments, including healthcare and government sectors
  • Support infrastructure and deployments aligned with the needs of SLED (state, local, and education) and federal clients

Preferred Qualifications

  • Experience in or exposure to compliance-focused environments (e.g., FedRAMP, HIPAA, CJIS, SOC 2)
  • Datadog
  • Atlassian Suite (Jira, Confluence, Bitbucket)
  • Java/Spring
  • Python
  • JavaScript/React
  • SQL
  • Ansible
  • Jenkins
  • Tomcat
  • Git
  • Redis
  • RabbitMQ
  • Splunk/Kibana
  • Terraform

Benefits

  • Comprehensive medical coverage, including dental care
  • Life and disability insurance
  • PTO and paid sick time as per the CBA, plus paid parental leave
  • Private pension plan available
  • Employee Assistance Program and comprehensive wellness initiatives
  • Access to ongoing learning and development opportunities and career advancement
