Summary

Join Feedzai's Platform Engineering Performance & Reliability team as a Platform Engineer and contribute to the optimization and scalability of our cloud-based risk management platform. You will work with a talented team to build and maintain distributed systems, automate infrastructure, and resolve production issues. This role requires experience in cloud services, programming (Go, Python), and system design. You will be responsible for capacity planning, collaboration with product teams, and incident response. The ideal candidate is passionate about distributed systems, performance, and reliability. Feedzai offers a fast-paced, collaborative environment with opportunities for continuous learning.

Requirements

A bachelor's degree in Computer Science, Information Systems, or the equivalent combination of education, experience, and training
Programming skills (Go, Python or similar languages)
3+ years of experience with data structures, algorithms, programming, and asynchronous and multithreaded designs
3+ years of experience with building scalable and distributed cloud services
3+ years operating production environments
2+ years of experience in cross-team collaboration in a supporting role
Self-driven & motivated, with a strong work ethic and a passion for problem solving
Systematic problem-solving approach, coupled with effective verbal and written communication skills
Experience being on call

Responsibilities

Provide capacity allocation recommendations that account for cost, resilience, and performance
Work with product teams to support best practices and drive improvements in system performance and reliability before and after systems go live
Develop software in Go, Python, or similar languages
Automate all aspects of cloud infrastructure and incident response
Develop playbooks related to actionable alerts
Participate in incident response, root cause investigation, and resolution
Maintain and develop our infrastructure as code (IaC) to manage and operate end-to-end lifecycle operations (monitoring, alerting, security, cost optimization, configuration, backup, etc.) in production environments
Use your experience and problem-solving skills to help prevent and investigate production issues

Preferred Qualifications

Experience with monitoring and observability stacks such as Grafana and Prometheus
Experience with Kubernetes, cloud platforms, and HashiCorp tools is valued
Knowledge of or experience with AWS or GCP

Site Reliability Engineer

Feedzai

Remote

DevOps

Mid-level
