Site Reliability Engineer

Devexperts
Summary
Join Devexperts as a Site Reliability Engineer (SRE) and contribute to the development and support of large-scale trading platforms. You will collaborate with developers, deploy and maintain software, configure infrastructure, build CI/CD pipelines, and make key decisions regarding scalability and reliability. This role requires strong experience in Linux/UNIX administration, scripting (Bash, Python, Groovy), CI/CD pipelines (TeamCity), Infrastructure as Code tools (Ansible, Terraform), container orchestration (K8S/OpenShift/HashiCorp), and monitoring tools. Devexperts offers a comprehensive benefits package including flexible work arrangements, health and recreation benefits, facility perks, community events, professional training, and social benefits.
Requirements
- Experience as an SRE or DevOps engineer
- Experience supporting JVM applications (garbage collection, memory leaks)
- Experience with software development
- Strong experience with OS-level administration on Linux and/or UNIX
- Hands-on scripting experience with Bash, Python, and/or Groovy
- Experience with configuring TeamCity CI/CD pipelines
- Experience with Infrastructure as Code tools (Ansible, Terraform)
- Experience with Docker container orchestration (K8S/OpenShift/HashiCorp)
- Ability to read and analyze errors
- In-depth knowledge of the TCP/IP stack and the ISO/OSI model
- Experience with monitoring and logging tools (Zabbix, Elasticsearch or OpenSearch, Grafana, Kibana, etc.)
- Experience working with Apache, Nginx, HAProxy, Envoy, etc.
- Strong ability to solve problems using code and scripting
- English proficiency at B2 level or higher
Responsibilities
- Work closely with developers on prototyping and designing new features as part of the infrastructure
- Deploy, install, configure and maintain sophisticated Trading/Finance and related software
- Configure bare metal & cloud instances by using Infrastructure as Code
- Build & maintain CI/CD pipelines
- Make key decisions for scalability, reliability and accessibility
- Install and manage both in-house and well-known third-party monitoring systems
- Design, deploy and configure cloud-based servers and networks: provision servers and storage, configure firewalls, VPN, monitoring, etc.
- Administer UNIX/Cloud infrastructure: installation, configuration and maintenance
- Work with Nexus and Git repositories
Preferred Qualifications
- Experience with SQL-like query languages
- Experience with Ansible (AWX)
- Knowledge of Java programming language
- Experience using trading, exchange, or risk management software
- Experience with Atlassian software (JIRA, Confluence, FishEye, etc.)
Benefits
- Possibility of hybrid/remote work mode
- Flexible working hours
- Work From Anywhere Program
- 14 days of paid vacation
- Fully paid additional wellness days (3 days per year)
- Supplementary private health insurance
- Fitness reimbursement
- Meal allowance
- Modern office with new equipment
- Parking spaces/transport reimbursement
- Free drinks and snacks
- Teambuilding activities
- Corporate parties
- Speakers' club
- Free admission to corporate external events
- Possibility of joining conferences and professional fairs
- English language courses
- Local language courses for foreign employees
- Unlimited access to self-learning platforms
- Certification opportunities
- Mentorship Program
- Parental bonus
- BES Insurance
- Referral bonus
- Blood donation paid leave
- Gifts for employees
- Gifts for children