Intermediate Data Engineer

Dev.Pro

πŸ“Remote - Poland

Job highlights

Summary

Join Dev.Pro, a US-based software development company, as a Data Engineer. You will play a key role in a project connecting the art market with a digital financial ecosystem, working with blockchain technology and world-class brands. This is a dynamic startup environment where you'll build processes and products from scratch. The role requires extensive experience in data engineering, including ETL/ELT pipelines, data warehousing, and cloud platforms (ideally GCP). You'll be responsible for data migration, pipeline development, and ensuring data quality. Dev.Pro offers a comprehensive benefits package including paid time off, health insurance, and professional development opportunities.

Requirements

  • 4+ years of experience in data engineering, encompassing data extraction, transformation, and migration
  • Advanced experience with data extraction from unstructured files and legacy systems
  • Proven expertise in migrating data from file-based storage systems to cloud storage solutions, ideally on Google Cloud Platform
  • Proficiency with relational databases, specifically MariaDB or MySQL, as well as cloud-native solutions such as Google Cloud Storage, Google BigQuery, and optionally Snowflake or Amazon Redshift
  • Strong programming skills in Python, with a focus on data manipulation, automation, and re-implementing custom tools
  • Extensive experience with ETL/ELT pipeline development and workflow orchestration tools (e.g., Apache Airflow, Luigi, Google Dataflow, Prefect)
  • Hands-on experience with batch processing frameworks and real-time data processing frameworks
  • Experience developing data pipelines programmatically, including implementing batch processing workloads
  • In-depth understanding of data modeling, data warehousing, and best practices for designing scalable data architectures
  • Practical experience developing or re-engineering data mastering tools for data cleaning, standardization, and preparation
  • Expertise in RDBMS features such as stored procedures, triggers, partitioning, indexes, and schema changes
  • Ability to handle Personally Identifiable Information (PII) within pipelines and data storage systems
  • Experience with NoSQL databases, such as MongoDB, Cassandra, or HBase
  • Experience with monitoring tools such as Prometheus, Grafana, and CloudWatch to oversee data pipelines and systems
  • Knowledge of best practices in database management, performance optimization, data security, and ensuring consistency across distributed systems
  • Ability to critically evaluate data architecture and provide strategic recommendations for infrastructure improvements
  • Upper-Intermediate+ English level

Responsibilities

  • Take full responsibility for the data warehouse and pipeline, including planning, coding, reviews, and delivery to the production environment
  • Migrate data from existing file storage systems to the Google Cloud Platform, including Google Cloud Storage and BigQuery
  • Design, develop, and maintain ETL/ELT pipelines to support data migration and integration
  • Collaborate with team members to re-implement existing custom data mastering tools, with a focus on improving data cleaning and standardization capabilities
  • Conduct thorough evaluations of the existing technology stack and provide data-driven recommendations for improvements, including re-evaluating database solutions and orchestration tools
  • Develop a new scraper system to extract and aggregate data from diverse external sources, ensuring integration with existing platforms
  • Ensure the integrity, consistency, and quality of data through optimized processes and validation protocols
  • Work closely with a small, dynamic team to ensure that project milestones are met effectively, with an emphasis on scalability, reliability, and sustainability of solutions

Preferred Qualifications

  • Familiarity with JavaScript for maintaining or enhancing legacy systems and cross-functional integration
  • Experience with Elasticsearch for indexing and querying large datasets
  • Proficiency with analytical tools such as Tableau, Power BI, Looker, or similar platforms for data visualization and insights generation
  • Interest or background in the art industry, particularly digital asset management and tokenization
  • Demonstrated ability to collaborate in cross-functional teams and contribute to multidisciplinary projects
  • Experience with PostgreSQL and an understanding of its application in data engineering environments
  • Knowledge of industry-specific data services, including the key metrics and business processes relevant to the domain
  • Experience with MLOps tools and practices to streamline machine learning deployment and operations
  • Basic understanding of existing machine learning models and algorithms

Benefits

  • Get 30 paid rest days per year to use as holidays, vacation, or other leave on your requested dates
  • Get 5 sick leave days, up to 60 days of medical leave, and up to 6 days of leave per year for family reasons (e.g., a wedding, funeral, or the birth of a child)
  • Get a health insurance package fully compensated by Dev.Pro
  • Join fun online activities and team-building events
  • Get continuous remote HR and payroll support, plus overtime coverage
  • Join English/Polish lessons
  • Grow your expertise with mentorship support and DP University
