Senior Data Engineer

H1
Remote - India
Summary
Join H1, a company aiming to improve global healthcare access through data and AI, as a Senior Data Engineer. You will design, develop, and maintain scalable data systems and pipelines, working with big data technologies like Apache Spark on AWS. This role involves collaboration with various teams, optimizing data processing, and ensuring data quality. You will also mentor other engineers and contribute to best practices. The ideal candidate has extensive experience in data engineering, strong programming skills, and a passion for building high-quality solutions. H1 offers a comprehensive benefits package including health insurance, paid time off, retirement options, and flexible work arrangements.
Requirements
- 6+ years of experience in data engineering, working with large-scale data systems and pipelines
- Proficiency in a programming language such as Python or Java
- Strong SQL skills, including the ability to write optimized, complex queries over large datasets using features such as GROUP BY, HAVING, window functions, and multi-table joins
- Experience with big data tools like Apache Spark, particularly on cloud platforms, with a preference for AWS EMR
- Experience with Docker or other containerization technologies
Responsibilities
- Design, develop, and maintain scalable data extraction frameworks that ingest structured and unstructured data from diverse sources
- Build and optimize robust ETL/ELT pipelines using big data technologies, especially Apache Spark on cloud platforms (preferably AWS EMR)
- Improve the efficiency, reliability, and performance of data processing systems through thoughtful design and continuous optimization
- Transform, clean, and normalize complex datasets for downstream use, ensuring high standards of data quality and consistency
- Partner with senior engineers to evolve H1's data architecture and infrastructure in support of product and platform scalability
- Lead data integration efforts across multiple systems, ensuring accuracy and seamless collaboration across teams
- Monitor and troubleshoot data flows and pipelines, proactively identifying and resolving performance issues
- Maintain clear documentation of systems, workflows, and processes to promote transparency and operational excellence
- Participate in code reviews and promote a culture of engineering excellence, mentorship, and continuous improvement
- Collaborate closely with cross-functional teams to align technical execution with business goals
Preferred Qualifications
- You have an understanding of Large Language Models (LLMs) and their applications
- It's a bonus if you're familiar with model training and fine-tuning, particularly in NLP (Natural Language Processing) contexts
- You possess a basic knowledge of networking, security, and encryption protocols such as HTTP, HTTPS, and TLS
- You're able to work collaboratively across teams and communicate effectively with both technical and non-technical stakeholders
- You have strong analytical and problem-solving skills with a focus on data quality and performance optimization
- You have a passion for writing clean, efficient code and following best practices
Benefits
- Full suite of health insurance options, in addition to generous paid time off
- Pre-planned company-wide wellness holidays
- Retirement options
- Health & charitable donation stipends
- Impactful Business Resource Groups
- Flexible work hours & the opportunity to work from anywhere