Site Reliability Engineer 2

Granicus

πŸ“Remote - India

Summary

Join Granicus as a Mid-Level Site Reliability Engineer (SRE 2) and contribute to the reliability, availability, and performance of our services. You will collaborate with software engineers, provide on-call production support, monitor systems, automate processes, manage incidents, and improve system reliability and scalability. This role requires strong analytical and problem-solving skills, excellent communication, and a proactive approach to addressing issues. You will work with cutting-edge technologies and have opportunities for professional growth. The position involves rotational shifts, including weekends, and requires flexibility in working hours. Granicus is a remote-first company with a globally distributed workforce.

Requirements

  • Good understanding of Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud)
  • Experience with scripting languages such as Python, Bash, or Ruby
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Splunk), version control systems (e.g., Git), and CI/CD pipelines
  • Strong analytical and problem-solving skills with a proactive approach to identifying and addressing issues
  • Excellent verbal and written communication skills, with the ability to work effectively in a team environment
  • Eagerness to learn new technologies and improve existing skills. Openness to receiving feedback and applying it to improve performance
  • Minimum of four years of experience in SRE, DevOps, and production support
  • Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience
  • Responsible for Granicus information security by appropriately preserving the Confidentiality, Integrity, and Availability (CIA) of Granicus information assets in accordance with the company's information security program
  • Responsible for ensuring the privacy of our employees, our customers, and their data, and for completing all required privacy training in a timely manner, in accordance with company policies

Responsibilities

  • Provide on-call production support in shifts, according to the team's on-call roster
  • Work on tickets raised by customers and by internal engineering/implementation teams when not on call for production support. For example, a client may request a data correction on the database server that cannot be made through the web interface
  • Work on the SRE team's backlog items
  • Continuously monitor the health and performance of our services, systems, and infrastructure. Respond to alerts and incidents promptly to ensure high availability
  • Develop and maintain automation scripts and tools to streamline operations and reduce manual intervention
  • Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence
  • Participate in the design and implementation of system improvements to enhance reliability, scalability, and performance
  • Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes
  • Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing within the team
  • Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth
  • Implement and adhere to security best practices to protect our systems and data

Preferred Qualifications

  • Relevant certifications such as AWS Certified Solutions Architect, Google Cloud Professional DevOps Engineer, or similar
  • Experience from internships or academic projects involving system administration, cloud services, or software development
