Summary
Join Prolific, a leader in human data infrastructure for AI development, as a Data Engineer. You will build and optimize scalable data pipelines, ensuring high-quality data access for various teams. Collaborate with data scientists, analysts, and engineers to enhance data workflows and drive impactful insights. This role requires expertise in Python, SQL, data warehousing, pipeline tools, API development, and data modeling. You'll be responsible for data pipeline development, warehouse management, system architecture, and data quality assurance. Prolific offers a competitive salary, benefits, and remote work opportunities within a mission-driven culture.
Requirements
- 2+ years of hands-on experience deploying production-quality code, with proficiency in Python and related data-processing packages
- Deep understanding of SQL and analytical data warehouses (Snowflake, Redshift preferred) with proven experience implementing ETL/ELT best practices at scale
- Hands-on experience with data pipeline tools (Airflow, dbt) and a strong ability to optimise for performance and reliability
- Ability to design and develop robust data APIs and services that expose data to applications, bridging analytical and operational systems
- Strong data modelling skills and familiarity with the Kimball methodology to create efficient, scalable data structures
- Commitment to continuously improving product quality, security, and performance through rigorous testing and code reviews
- Meticulous approach to creating and maintaining architecture and systems documentation
- Ability to work across teams to understand and address diverse data needs while maintaining data integrity
- Desire to keep pace with advancements in data engineering practices and technologies
- Exceptional analytical skills to troubleshoot complex data issues and implement effective solutions
- Capability to ship medium-sized features independently while contributing to the team's overall objectives
Responsibilities
- Build and maintain robust data pipelines from internal databases and SaaS applications, ensuring timely and accurate data delivery
- Maintain our data warehouse with high-quality, well-structured data that supports analytics and business operations
- Design and implement scalable data infrastructure that accommodates our growing data volume and complexity
- Create and maintain APIs and microservices that expose data to applications, enabling seamless integration between data systems and business applications
- Establish processes and tools to monitor data quality, identify issues, and implement fixes promptly
- Create and maintain comprehensive documentation of data flows, models, and systems for knowledge sharing
- Work closely with analytics, research, and product teams to ensure their data needs are addressed effectively
- Implement and advocate for data engineering best practices across the organisation
- Plan and execute system expansion as needed to support the company's growth and evolving analytical needs
- Continuously optimise data pipelines and warehouse performance to improve efficiency and reduce costs
- Ensure all data systems adhere to security best practices and compliance requirements
Preferred Qualifications
- Experience with Snowflake and Redshift
Benefits
- Competitive salary
- Remote working