Principal Data Engineer

Movable Ink

πŸ“Remote - Canada

Summary

Join Movable Ink as a Principal Data Engineer and help shape our Data Warehouse and Hybrid Data Lake infrastructure. In this pivotal role you will own the infrastructure and code for data pipelines that handle data at scale, collaborating with teams across the company to enable data-driven decisions. You will design, implement, and optimize ingestion pipelines while ensuring data accuracy and integrity, mentor junior team members, and ensure compliance with regulatory requirements. The ideal candidate brings extensive experience in data engineering, cloud-based data warehouses, and a range of data pipeline technologies.

Requirements

  • 12+ years of professional experience in data engineering, software engineering, database administration, business intelligence, or related fields, with 8+ years as a Data Engineer focused on cloud-based Data Warehouses (Redshift, Snowflake, Firebolt, BigQuery)
  • Deep experience working with multi-petabyte, mission-critical databases, optimizing for high availability, performance, and reliability, informed by a strong understanding of database internals
  • Expert proficiency with Python and SQL, and significant experience building robust data pipelines with these languages
  • Expert proficiency in deploying and managing data pipeline orchestration frameworks such as Apache Airflow or Prefect (see the orchestration sketch after this list)
  • Significant experience with Infrastructure-as-Code (Terraform) and automating cloud infrastructure management
  • Significant experience with stream processing technologies such as Apache Flink, Apache Kafka, or Apache Pulsar
  • Significant experience in building telemetry, monitoring, and alerting solutions for large-scale data pipelines
  • Significant experience in implementing Hybrid Data Lake / Data Warehouse architectures, with a focus on Apache Iceberg or similar technologies
  • Significant experience in designing and implementing solutions that comply with regulatory requirements such as GDPR and CCPA
  • Experience in Agile/Scrum environments, working with technical managers and product owners to break down high-level requirements into actionable work
  • Excellent communication skills, with the ability to effectively collaborate across technical and business teams
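
To make the orchestration expectation concrete, here is a minimal sketch of a daily extract-then-load DAG, assuming Apache Airflow 2.4+; the DAG id, task names, and schedule are illustrative placeholders, not details of this role's actual pipelines:

    # Minimal Airflow DAG sketch: a daily extract-then-load pipeline.
    # All names (daily_event_ingestion, extract_events, load_events) are
    # hypothetical and used only for illustration.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_events(**context):
        # Placeholder extract step: in practice this would pull the day's
        # batch from a source system API or object store.
        print(f"extracting events for {context['ds']}")


    def load_events(**context):
        # Placeholder load step: in practice this would COPY the extracted
        # batch into a cloud warehouse such as Redshift or Snowflake.
        print(f"loading events for {context['ds']}")


    with DAG(
        dag_id="daily_event_ingestion",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        extract = PythonOperator(task_id="extract_events", python_callable=extract_events)
        load = PythonOperator(task_id="load_events", python_callable=load_events)

        extract >> load  # extract must finish before load starts

Retries with a delay and catchup=False are the kind of reliability defaults this role's pipeline work builds on.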

Responsibilities

  • Partner with internal operations teams to identify, collect, and integrate data from various business systems, ensuring comprehensive and accurate data capture
  • Design, implement, and maintain robust batch and real-time data pipelines, leveraging tools like Apache Airflow, Apache Flink, and Terraform for IaC
  • Build and optimize Hybrid Data Lake / Data Warehouse infrastructure with solutions like Apache Iceberg for scalable and cost-effective storage (see the Iceberg sketch after this list)
  • Ensure data pipelines adhere to best practices and are optimized for performance, scalability, and reliability
  • Conduct thorough testing of data pipelines to validate data accuracy and integrity
  • Monitor data pipelines, implement telemetry and alerting (see the telemetry sketch after this list), troubleshoot any issues that arise, and proactively improve system reliability
  • Establish and track SLAs for data processing and delivery, ensuring timely and reliable access to data for all users
  • Mentor less experienced team members and establish patterns and practices that increase the quality, accuracy, and efficiency of the solutions the team produces
  • Design and implement Change Data Capture (CDC) solutions to support real-time data replication and point-in-time data queries
  • Work with other teams to ensure secure data access and compliance with regulatory requirements (e.g., GDPR, CCPA, etc.)
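
As a sketch of the Iceberg-backed lakehouse work above, the snippet below creates a partitioned Iceberg table and runs a point-in-time query via snapshot time travel. It assumes Spark 3.3+ with the Iceberg runtime jar on the classpath; the catalog, bucket, and table names are hypothetical:

    # Hypothetical PySpark session wired to an Iceberg catalog named "lake".
    # Requires the iceberg-spark-runtime jar on the Spark classpath.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("iceberg_sketch")
        .config("spark.sql.extensions",
                "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
        .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.lake.type", "hadoop")
        .config("spark.sql.catalog.lake.warehouse", "s3://example-bucket/warehouse")
        .getOrCreate()
    )

    # Create a day-partitioned Iceberg table (hidden partitioning transform).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS lake.analytics.events (
            event_id BIGINT,
            user_id BIGINT,
            event_type STRING,
            occurred_at TIMESTAMP
        )
        USING iceberg
        PARTITIONED BY (days(occurred_at))
    """)

    # Point-in-time query against a historical snapshot (Spark 3.3+ syntax);
    # this is the mechanism behind CDC-style point-in-time reads.
    spark.sql(
        "SELECT COUNT(*) FROM lake.analytics.events "
        "TIMESTAMP AS OF '2024-06-01 00:00:00'"
    ).show()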
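
For the telemetry and SLA-tracking responsibilities, here is a minimal freshness-metric sketch, assuming the prometheus_client library and a Prometheus Pushgateway reachable at pushgateway:9091; the metric and job names are illustrative:

    # Push a data-freshness gauge so an alert can fire when lag breaches SLA.
    # The Pushgateway address, metric name, and job name are hypothetical.
    from datetime import datetime, timezone

    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

    registry = CollectorRegistry()
    freshness = Gauge(
        "pipeline_data_freshness_seconds",
        "Seconds since the newest record landed in the target table",
        registry=registry,
    )


    def report_freshness(latest_loaded_at: datetime) -> None:
        # Lag between now and the newest loaded record; an alerting rule on
        # this gauge can page the team when the SLA threshold is crossed.
        lag = (datetime.now(timezone.utc) - latest_loaded_at).total_seconds()
        freshness.set(lag)
        push_to_gateway("pushgateway:9091", job="daily_event_ingestion",
                        registry=registry)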
