Senior Production Support Engineer

Tala

๐Ÿ“Remote - Philippines

Summary

Join Tala, a fintech company on a mission to empower the financially underserved, as a Production Support Engineer. You will play a crucial role in ensuring the high availability of our platform across global markets. This involves incident response, collaboration with various teams, continuous improvement of monitoring systems, and identifying areas for product enhancement based on customer feedback. You will work closely with global teams, track and report on issues, and contribute to improving our documentation. The ideal candidate possesses extensive experience in technology, incident response, and working with remote teams in a global setting.

Requirements

  • 3+ years of experience working in a technology environment, including experience with microservices architecture
  • 3+ years of experience in incident response or a similar role
  • Experience working with a remote team in a global environment
  • Knowledge of monitoring platforms such as AWS CloudWatch, SumoLogic, APM tools (NewRelic, Instana), mobile crash reporting (Crashlytics data), and BI tools (Looker, Snowflake)
  • Knowledge of relational databases and BI query languages, sufficient to construct queries during investigations
  • Experience working with tools like Postman, or scripting API queries
  • Excellent debugging and documentation skills
  • Ability to coordinate incident response and communicate effectively with stakeholders from a variety of teams across different time zones
  • Ability to remain calm under pressure while resolving a production incident

Responsibilities

  • Ownership of the risk event process for the PH & East Asia timezone: you'll help coordinate teams responding to an incident, communicate effectively, oversee the post-mortem, and verify that follow-up action items are completed
  • Ownership of escalations from the in-country CXCL guild: debugging and identifying problems, resolving when possible and escalating to appropriate teams when necessary
  • Tracking and reporting on pending issues, and regular updates on open items
  • Continuous improvement of our monitoring dashboards and alerts
  • In collaboration with the CX team, identify patterns in customer and product issues and propose improvements
  • In collaboration with Production Support Engineers globally, share product learnings and knowledge, and exchange ideas
  • Identify and communicate repeating themes around risk events and propose improvements to prevent recurrence of the same issues
  • Keep track of metrics related to production performance and identify areas of improvement
  • Continuous improvement of our documentation library to allow faster onboarding of new team members and more efficient response times

Preferred Qualifications

Candidates with an SRE background are welcome

Benefits

Remote-first approach
