Lead Data Engineer

Reveal HealthTech

📍 Remote - India

Summary

Join Reveal HealthTech as a Lead Data Engineer to design, implement, and maintain scalable data pipelines using AWS services. The ideal candidate works independently, is detail-oriented, and is skilled at data validation and debugging.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent work experience)
  • Proficiency in data modeling and architecture, leveraging AWS services to architect scalable and efficient data solutions
  • Proven experience as a Data Engineer or in a similar role
  • Strong proficiency in AWS services relevant to data engineering (e.g., AWS Glue, Lambda, S3, Redshift, DynamoDB, Amazon Athena, OpenSearch)
  • Experience designing and building data pipelines from diverse source systems
  • Solid understanding of data lake and warehousing concepts and best practices
  • Proficiency in programming languages used for data manipulation and transformation (e.g., Python, SQL)
  • Experience with relational databases like PostgreSQL (PSQL)
  • Understanding of database systems, including schema design, SQL querying, and performance optimization
  • Experience implementing data quality checks and validation processes
  • Experience using AWS CloudFormation and CI/CD for deployment
  • Familiarity with data quality validation tools and frameworks (e.g., Monte Carlo)
  • Ability to work independently and as part of a team in a fast-paced environment
  • Excellent problem-solving skills and attention to detail

Responsibilities

  • Designing and implementing scalable, robust, and maintainable data pipelines using AWS services such as Glue, EMR, S3, Redshift, DynamoDB, Amazon Athena, and OpenSearch
  • Implementing data quality checks and validation processes to ensure accuracy, completeness, and integrity of data
  • Designing effective data models and architectures to optimize data processing and facilitate downstream Data Science and Machine Learning workflows
  • Utilizing data quality validation tools (e.g., Monte Carlo) to automate and streamline validation processes
  • Collaborating with cross-functional teams (e.g., Data Science, Software Engineering) to understand data requirements and implement efficient data processing solutions
  • Monitoring and optimizing the performance of data pipelines
  • Debugging issues and resolving data-related problems in a timely manner
  • Documenting data pipeline architectures, processes, and procedures
  • Staying up to date with new technologies, tools, and best practices in the data engineering field

Preferred Qualifications

  • AWS certification (e.g., AWS Certified Solutions Architect, AWS Certified Data Analytics Specialty)
  • Familiarity with serverless computing (e.g., AWS Lambda) and containerization (e.g., Docker)
  • Experience with large language models (LLMs) and machine learning is highly preferred
