Site Reliability Engineer II

Sinch
Summary
Join Sinch as a Site Reliability Engineer and be part of a fully remote team based in France, managing the global infrastructure for Sinch Mailjet services. You will monitor KPIs, collaborate with engineers on resource allocation, automate processes, and plan for scalability. Responsibilities include partnering with product engineering teams, building and supporting cloud-based infrastructure, automating tasks, developing and monitoring SLOs, troubleshooting issues, ensuring datastore health, and contributing to team growth. The ideal candidate possesses a background in infrastructure, operations, or software engineering, experience with cloud providers (GCP), and proficiency in configuration management and monitoring tools. Strong technical skills, problem-solving abilities, and excellent communication are essential. Sinch values learning and offers opportunities for professional growth.
Requirements
- Background in infrastructure, operations, or software engineering
- Experience with cloud providers such as GCP
- Proficiency in configuration management tools such as Terraform and Ansible
- Hands-on proficiency with modern monitoring tools like Prometheus and Grafana
- Experience with distributed data stores such as PostgreSQL, Cassandra, and Elasticsearch
- Knowledge of distributed event store and stream-processing platforms such as Apache Kafka
- Strong technical skills across various infrastructure technologies
- Proven ability to break down complex tasks into manageable ones
- Strong communication skills and a history of building solid relationships with peers and leadership
- Experience operating and maintaining production systems in a Linux and public cloud environment
- Demonstrated ability to mentor and guide team members
- Eligibility to work in France through one of the following: French citizenship, EU/EEA citizenship, or a valid work permit for France
Responsibilities
- Partner with product engineering teams to identify systems requirements
- Build and support our cloud-based microservices infrastructure
- Automate routine processes and remediation tasks
- Develop, monitor, and track Service Level Objectives (SLOs) for the systems under management
- Proactively troubleshoot, resolve, and plan for issues that typically come from support staff, other engineering teams, and our automated monitoring system
- Ensure our datastores are healthy and operate at optimal performance levels
- Contribute to the growth and culture of our engineering team
Preferred Qualifications
- Experience with Python and Bash is beneficial
Benefits
At Sinch, we value learning, embrace change, and offer opportunities for personal and professional growth
