Senior Data Engineer - I


DATAMAXIS

πŸ“Remote - India

Summary

Join our team as a Databricks Data Engineer and play a crucial role in our Fintech data lake project. You will be involved in the design and development of enterprise data solutions in Databricks, ensuring robustness and scalability. Collaborate with the Data Architect to build and maintain data pipeline architectures. Process large, complex ERP datasets to meet diverse requirements. Continuously optimize data solutions, implementing testing and tooling techniques to enhance quality. Improve the performance, reliability, and maintainability of data pipelines. Implement and maintain PySpark and Databricks SQL workflows. Participate in release management using Git and CI/CD practices. Develop business reports using SplashBI. This is a full-time remote position.

Requirements

  • 5+ years of experience working in data warehousing systems
  • 3+ years of strong hands-on programming expertise in the Databricks landscape, including Spark SQL and Workflows, for data processing and pipeline development
  • 3+ years of strong hands-on data transformation/ETL skills using Spark SQL, PySpark, and Unity Catalog within a Databricks Medallion architecture (see the illustrative sketch after this list)
  • 2+ years of work experience in one of the cloud platforms: Azure, AWS, or GCP
  • Experience using Git version control, and well versed in CI/CD best practices to automate the deployment and management of data pipelines and infrastructure
  • Must have report development experience using Power BI, SplashBI, or another enterprise reporting tool
  • Bachelor's degree in Computer Science, Engineering, Finance, or equivalent experience
  • Good communication skills
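
A purely illustrative sketch of the Medallion-style transformation work referred to above; the catalog, schema, table, and column names are hypothetical and not taken from this posting:

    # Illustrative only: all names below are hypothetical examples.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks notebooks

    # Read a raw (bronze) ERP extract registered in Unity Catalog
    bronze = spark.table("erp_lake.bronze.ap_invoices")

    # Basic cleansing and standardization for the silver layer
    silver = (
        bronze
        .dropDuplicates(["invoice_id"])
        .withColumn("invoice_date", F.to_date("invoice_date"))
        .filter(F.col("amount").isNotNull())
    )

    # Write back as a managed Delta table in the silver schema
    silver.write.mode("overwrite").saveAsTable("erp_lake.silver.ap_invoices")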

Responsibilities

  • Participate in the design and development of enterprise data solutions in Databricks, from ideation to deployment, ensuring robustness and scalability
  • Work with the Data Architect to build and maintain robust and scalable data pipeline architectures on Databricks using PySpark and SQL
  • Assemble and process large, complex ERP datasets to meet diverse functional and non-functional requirements
  • Contribute to continuous optimization efforts, implementing testing and tooling techniques to enhance data solution quality
  • Focus on improving the performance, reliability, and maintainability of data pipelines
  • Implement and maintain PySpark and Databricks SQL workflows for querying and analyzing large datasets
  • Participate in release management using Git and CI/CD practices
  • Develop business reports using the SplashBI reporting tool, leveraging data from the Databricks gold layer (see the sketch after this list)
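
A similarly illustrative sketch of a gold-layer workflow whose output a reporting tool such as SplashBI could query; the table and measure names are hypothetical:

    # Illustrative only: the gold table and measures are hypothetical examples.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks notebooks

    # Aggregate silver data into a reporting-ready gold table
    spark.sql("""
        CREATE OR REPLACE TABLE erp_lake.gold.monthly_supplier_spend AS
        SELECT supplier_id,
               date_trunc('month', invoice_date) AS invoice_month,
               SUM(amount)                       AS total_spend
        FROM erp_lake.silver.ap_invoices
        GROUP BY supplier_id, date_trunc('month', invoice_date)
    """)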

Preferred Qualifications

  • Nice to have: hands-on experience building data ingestion pipelines from ERP systems (preferably Oracle Fusion) to a Databricks environment, using Fivetran or alternative data connectors
  • Experience in a fast-paced, ever-changing and growing environment
  • Understanding of metadata management, data lineage, and data glossaries is a plus


