Summary
Join Xebia, a global leader in cloud-based solutions, as a Data Engineer responsible for at-scale infrastructure design, data processing, and software development using Azure and other technologies.
Requirements
- 3+ years' experience with Azure (Data Factory, Databricks, SQL, Data Lake, Power BI, DevOps, Delta Lake, Cosmos DB)
- 5+ years' experience with data engineering or backend/full-stack software development
- Strong SQL skills
- Python scripting proficiency
- Experience with data transformation tools such as Databricks and Spark
- Data manipulation libraries (such as Pandas, NumPy, PySpark)
- Experience in structuring and modelling data in both relational and non-relational forms
- Ability to evaluate and propose a relational or non-relational approach as appropriate
- Normalization / denormalization and data warehousing concepts (star, snowflake schemas)
- Experience with CI/CD tooling (GitHub, Azure DevOps, Harness, etc.)
- Git
- Good verbal and written communication skills in English
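To illustrate the kind of data-modelling work the requirements above describe, here is a minimal sketch of denormalizing a star schema with pandas; the table names, columns, and values are hypothetical and purely illustrative.

```python
import pandas as pd

# Hypothetical star-schema tables: a sales fact table and a product dimension.
fact_sales = pd.DataFrame({
    "product_id": [1, 2, 1],
    "quantity": [3, 1, 2],
})
dim_product = pd.DataFrame({
    "product_id": [1, 2],
    "product_name": ["widget", "gadget"],
})

# Denormalize: join the dimension onto the fact table, then aggregate
# for a downstream report.
report = fact_sales.merge(dim_product, on="product_id", how="left")
total_by_name = report.groupby("product_name")["quantity"].sum()
print(total_by_name.to_dict())  # {'gadget': 1, 'widget': 5}
```

The same join-then-aggregate pattern carries over directly to PySpark DataFrames at scale.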
Responsibilities
- Responsible for at-scale infrastructure design, build and deployment with a focus on distributed systems
- Building and maintaining architecture patterns for data processing, workflow definitions, and system-to-system integrations using Big Data and Cloud technologies
- Evaluating and translating technical designs into workable technical solutions/code and technical specifications on par with industry standards
- Driving the creation of reusable artifacts
- Establishing scalable, efficient, automated processes for data analysis, data model development, validation, and implementation
- Working closely with analysts/data scientists to understand the impact on downstream data models
- Writing efficient and well-organized software to ship products in an iterative, continual release environment
- Contributing to and promoting good software engineering practices across the team
- Communicating clearly and effectively to technical and non-technical audiences
- Defining data retention policies
- Monitoring performance and advising on any necessary infrastructure changes
Preferred Qualifications
- Experience with Azure Event Hubs, Azure Blob Storage, Azure Synapse, Spark Streaming
- Experience with data modelling tools, preferably DBT
- Experience with Enterprise Data Warehouse solutions, preferably Snowflake
- Familiarity with ETL tools (such as Informatica, Talend, DataStage, Stitch, Fivetran, etc.)
- Experience in containerization and orchestration (Docker, Kubernetes etc.)
- Cloud (Azure, AWS, GCP) certification
Additional Information
Work from the European Union region and a valid work permit are required.