Principal Data Engineer


Tiger Analytics

πŸ“Remote - United States

Summary

Join Tiger Analytics as a Principal Data Engineer (Azure) and build scalable data ingestion pipelines using Azure cloud technologies. You will work with various stakeholders, design high-performance data processing for structured and unstructured data, and ensure data harmonization. The role requires experience with Azure Data Factory (ADF), PySpark, Databricks, and other big data technologies. You will collaborate with multiple teams to deliver analytical solutions. The position offers significant career development opportunities in a fast-growing environment. Compensation packages are highly competitive.

Requirements

  • Experience implementing a Data Lake with technologies such as Azure Data Factory (ADF), PySpark, Databricks, ADLS, and Azure SQL Database
  • Working knowledge of Azure Synapse Analytics, Event Hub & Stream Analytics, Cosmos DB, and Purview
  • A passion for writing high-quality code that is modular, scalable, and free of bugs, with strong programming, unit testing, and debugging skills in SQL, Python, or Scala/Java
  • Enthusiasm for collaborating with stakeholders across the organization and taking complete ownership of deliverables
  • Experience using big data technologies such as Hadoop, Spark, Airflow, NiFi, Kafka, Hive, Neo4j, and Elasticsearch
  • A solid understanding of file formats such as Delta Lake, Avro, Parquet, JSON, and CSV (see the sketch after this list)
  • Good knowledge of designing and building REST APIs, with hands-on experience on Data Lake or Lakehouse projects
  • Experience supporting BI and Data Science teams in consuming data in a secure and governed manner
  • Experience working on Agile projects and following DevOps processes with tools such as Git, Jenkins, and Azure DevOps
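
As an illustration only, a minimal PySpark sketch of reading the file formats listed above from ADLS might look like the following. The storage account, container, and paths are hypothetical, and the Delta and Avro readers assume a Databricks-style runtime where those connectors are already available.

```python
# Minimal sketch: reading common file formats from ADLS with PySpark.
# The storage account, container, and paths below are hypothetical examples.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-examples").getOrCreate()

base = "abfss://raw@examplestorageaccount.dfs.core.windows.net"

# Columnar / table formats (Delta and Avro assume the connectors are installed,
# as they are on Databricks).
delta_df = spark.read.format("delta").load(f"{base}/orders_delta")
parquet_df = spark.read.parquet(f"{base}/orders_parquet")
avro_df = spark.read.format("avro").load(f"{base}/orders_avro")

# Text-based formats
json_df = spark.read.json(f"{base}/orders_json")
csv_df = (
    spark.read.option("header", True)
    .option("inferSchema", True)
    .csv(f"{base}/orders_csv")
)

delta_df.printSchema()
```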

Responsibilities

  • Design and build scalable, metadata-driven data ingestion pipelines for batch and streaming datasets (a minimal sketch follows this list)
  • Conceptualize and execute high-performance data processing for structured and unstructured data, and data harmonization
  • Schedule, orchestrate, and validate pipelines
  • Design exception handling and log monitoring for debugging
  • Ideate with your peers on tech stack and tooling decisions
  • Interact and collaborate with multiple teams (Consulting, Data Science & App Dev) and various stakeholders to meet deadlines and bring analytical solutions to life
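
As a rough sketch of what a metadata-driven batch ingestion loop with basic exception handling and logging could look like in PySpark: the table names, paths, and metadata structure below are hypothetical, and in practice the metadata would typically live in a control table (for example in Azure SQL Database) read by ADF or Databricks.

```python
# Minimal sketch of a metadata-driven batch ingestion loop with basic
# exception handling and logging. All names and paths are hypothetical.
import logging
from pyspark.sql import SparkSession

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ingestion")

spark = SparkSession.builder.appName("metadata-driven-ingestion").getOrCreate()

# Each entry describes one dataset to ingest; in a real pipeline this would
# come from a control table rather than being hard-coded.
ingestion_metadata = [
    {
        "source_path": "abfss://raw@examplestore.dfs.core.windows.net/sales",
        "format": "parquet",
        "target": "bronze.sales",
    },
    {
        "source_path": "abfss://raw@examplestore.dfs.core.windows.net/events",
        "format": "json",
        "target": "bronze.events",
    },
]

for entry in ingestion_metadata:
    try:
        df = spark.read.format(entry["format"]).load(entry["source_path"])
        # Append into a Delta table (the "bronze" database is assumed to exist).
        df.write.format("delta").mode("append").saveAsTable(entry["target"])
        log.info("Loaded %d rows into %s", df.count(), entry["target"])
    except Exception:
        # Log and continue so one bad source does not fail the whole run.
        log.exception("Failed to ingest %s", entry["source_path"])
```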

Preferred Qualifications

  • Certifications such as Data Engineering on Microsoft Azure (DP-203) or Databricks Certified Developer (DE) are a valuable addition
  • Experience working on Data Lake and Lakehouse projects
  • Experience building REST services and implementing service-oriented architectures


