Data Engineer

Owkin

πŸ“Remote - France

Summary

Join Owkin, an agentic AI company, as a Data Engineer on a six-month fixed-term contract (CDD) supporting the development and maintenance of data pipelines. You will help design, optimize, and maintain ETL/ELT pipelines in Airflow, ensuring reliability, scalability, and compliance. Responsibilities include organizing data systems, reporting on pipeline performance, and contributing to scientific data processing workflows. The role demands attention to detail, strong collaboration skills, and the ability to manage multiple priorities. You will streamline production workflows, apply best practices for data governance and security, and collaborate with engineers, data scientists, and researchers. The position is based in our Paris office or remote within France.

Requirements

  • Proficiency in Python and SQL
  • Familiarity with Airflow for workflow orchestration
  • Familiarity with cloud-based data storage and cloud-native processing concepts
  • Familiarity with containerization technologies such as Docker and Kubernetes
  • Knowledge of data governance and security fundamentals
  • Ability to work with structured and unstructured datasets in predefined formats

Responsibilities

  • Design, operate, and optimize scalable ETL/ELT pipelines in Airflow to process and transform datasets
  • Support the structuring, organization, and maintenance of data systems (data lakes, data warehouses, and analytics platforms) within established frameworks and predefined architectures
  • Ensure timely and accurate reporting of data pipeline performance and operational issues
  • Follow data governance, security, and compliance standards in all data processing activities
  • Contribute, under supervision, to the management and monitoring of containerized data infrastructures built on Docker, Kubernetes, and cloud platforms
  • Contribute to operational tasks related to scientific data processing and quality control
  • Develop, operate, and optimize Python- and SQL-based data processing workflows following team guidelines
  • Collaborate with data scientists, business developers, software engineers, and biomedical researchers to define data processing requirements and deliver high-quality data solutions
  • Contribute to the standardization, monitoring, and productionization of data workflows, ensuring efficiency and scalability in scientific data processing

Benefits

  • Flexible work arrangements
  • Friendly and informal working environment
  • Opportunity to work with an international team with strong technical and scientific backgrounds