Cloud SRE

NICE

📍 Remote - India

Summary

Join NiCE, a global leader in software solutions, as a Site Reliability Engineer (SRE). You will play a crucial role in enhancing the reliability and availability of our SaaS platform. Key responsibilities include improving observability and monitoring, providing reliability consulting and automation, managing incidents and problems, sharing knowledge and mentoring colleagues, and ensuring process and documentation compliance. This position requires 4+ years of experience in programming/scripting, public/private cloud environments, SRE/DevOps/Observability, and AWS. Experience with Agile, Jira, GitHub, monitoring, automation, and dashboarding is also essential. NiCE offers a hybrid work model (NICE-FLEX) with opportunities for professional growth and development within a collaborative and creative environment.

Requirements

  • 4+ years of programming/scripting experience in any of the following: Go, Python, .NET (C#), Node.js
  • 4+ years of experience working within public or private cloud environments
  • 4+ years of SRE/DevOps/Observability or related experience
  • 4+ years of AWS experience
  • Experience with Agile, Jira, GitHub, monitoring, automation, dashboarding

Responsibilities

  • Create new dashboards and metrics to provide comprehensive observability into the health and performance of development teams' applications, including SLI/SLO metrics
  • Work with development teams to ensure proper monitoring is set up and enabled for their services
  • Identify evolutionary improvements to the observability and monitoring solutions
  • Consult with development teams on SRE services and best practices to help them improve the reliability of their applications
  • Create automation and tooling to reduce toil and manual intervention
  • Assist other teams in data and performance analysis to identify the root causes of issues and recommend automation actions
  • Review the work of other SREs and provide training and guidance to help them improve their skills
  • Communicate effectively with both technical and non-technical peers and customers
  • Follow established processes when performing work, and help create and document processes as necessary
  • Document troubleshooting steps and results in appropriate locations for historical access
  • Ensure compliance with policies, procedures, and standards
  • Implement or coordinate remediation required by audits and assessments, documenting it as necessary
  • Estimate the time required to complete activities and projects

Preferred Qualifications

  • Kubernetes, plus certification
  • Grafana
  • AWS
  • Azure
  • DevOps experience

Benefits

NICE-FLEX hybrid model: 2 days in the office and 3 days of remote work each week
