Remote SRE Engineer

Nintex

📍Remote - Malaysia

Job highlights

Summary

Join Nintex as an engineer and make a tangible impact with every line of code, working on complex infrastructure components for distributed systems like Kubernetes.

Requirements

  • You provide guidance on infrastructure architecture and contribute to high-quality and successful product releases
  • Strong understanding of Kubernetes
  • You contribute to your team and domain through successfully leading and consistently delivering on projects of ambiguous scope, high complexity, and critical business impact
  • You contribute to relevant guilds, practice forums and other initiatives to improve Nintex’s DevOps and SRE discipline
  • You have an in-depth understanding of distributed systems architecture, as well as monitoring and observability practices and tools
  • You quickly resolve priority infrastructure issues and help other technical team members or Product Managers understand how to avoid them in the future
  • You provide detailed estimates for work items you propose or are assigned
  • You assist in decision-making around tooling, automation practices, and testing solutions
  • You stay up to date with technology trends and use this knowledge to help your team and the broader Engineering practice
  • You run Nintex infrastructure with IaC tools (such as Terraform) and GitHub Actions for automation, containerize our environments (Kubernetes), and leverage cloud technologies to meet our goals
  • You build monitoring that alerts on symptoms rather than outages using tools like Prometheus, Grafana, Alertmanager and PagerDuty
  • You debug production issues across services and all levels of the stack
  • You share what you learn through issues, runbooks, documentation, and brownbag sessions
  • You foster effective collaboration between product, engineering, and design teams

Responsibilities

  • You are comfortable working with technical complexity and building reusable infrastructure that can streamline development and operational pipelines
  • You pioneer best practices to boost quality and productivity
  • You work with composite technologies within ambiguous projects to deliver successful outcomes
  • You mentor other engineers and support both your team and Product members
  • You specialize in systems (operating systems, storage, networking) while implementing best practices for availability, reliability, and scalability, with an interest in distributed systems such as Kubernetes
  • You write scripts, tools and utilities that support and integrate with delivery pipelines and you integrate telemetry where appropriate
  • You are called into incidents and bring trusted knowledge in your platform domain
  • You debug and fix infrastructure issues on production environments quickly using the relevant tools and guidelines to prevent recurrence
  • You build, promote and support infrastructure patterns and practices within Nintex
  • You lead or contribute to post-mortems for incidents, including root cause analysis and identification of preventative and remedial actions
  • You continuously monitor our platform performance and take immediate action to improve it
  • You review and advise on appropriate design patterns to solve automation and infrastructure problems without creating technical debt
  • You design and build complex infrastructure components for distributed systems such as Kubernetes
  • You initiate and lead the refactoring of complex parts of the infrastructure
  • You identify optimization opportunities anywhere in the development or operations functions and contribute to the implementation of proposed solutions
  • You suggest and contribute improvements to the Nintex platform and to observability tools and practices
  • You are continually on the lookout for opportunities to reduce errors through automating and standardizing processes
  • You bring infrastructure components into managed implementations like Infrastructure as Code (IaC), configuration management, and container usage
  • Each solution that you design and implement adheres to the relevant guidelines in support of security, disaster recovery, scalability, availability, reliability, and durability
  • You are an active part of the incident management process, including the on-call rotation and unblocking technical and operational decisions related to the Nintex platform
  • You contribute to script libraries, infrastructure provisioning templates, reporting mechanisms or other shared repositories that will increase the productivity and success of Nintex teams
  • You act as a reliability champion
  • You support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews

Benefits

  • Global Gratitude and Recharge Days
  • Flexible, paid time off policy
  • Employee wellness programs and counseling resources
  • Meaningful peer recognition and awards
  • Paid parental leave
  • Invention/patenting assistance
  • Community impact, paid volunteer time, and opportunities
  • Intercultural learning and celebration
  • Multiple tools through which to learn and grow, and an incredible global community

Please let Nintex know you found this job on JobsCollider. Thanks! 🙏