Data Engineer Intern

Sayari
$41k-$52k
Remote - United States
Summary
Join Sayari's Data Engineering team as a Data Engineer Intern specializing in web crawling! This remote, paid internship (20-30 hours/week) focuses on maintaining and improving Sayari's web crawling framework, emphasizing scalability and reliability. You'll collaborate with Product and Software Engineering teams to ensure crawling deployments meet product requirements and integrate efficiently with ETL pipelines. The internship involves investigating and implementing web crawlers for new sources, maintaining existing infrastructure, improving metrics and reporting, and contributing to Sayari's data product development. This role offers valuable experience in large-scale web crawling and data engineering.
Requirements
- Experience with Python
- Experience managing web crawling at scale with any framework (Scrapy is a plus)
- Experience working with Kubernetes
- Experience working collaboratively with git
- Experience working with selectors such as XPath, CSS, and JMESPath
- Experience with browser developer tools (Chrome/Firefox)
Responsibilities
- Investigate and implement web crawlers for new sources
- Maintain and improve existing crawling infrastructure
- Improve metrics and reporting for web crawling
- Help improve and maintain ETL processes
- Contribute to the development and design of Sayari's data product
Preferred Qualifications
- Experience with Apache projects such as Spark, Avro, NiFi, and Airflow
- Experience with datastores such as Postgres and/or RocksDB
- Experience working on a cloud platform like GCP, AWS, or Azure
- Working knowledge of API frameworks, primarily REST
- Understanding of or interest in knowledge graphs
- Experience with *nix environments
- Experience with reverse engineering
- Proficient in bypassing anti-crawling techniques
- Experience with JavaScript
Benefits
- This is a remote, paid internship with an expected workload of 20-30 hours a week
- $20 - $25 an hour