Summary
Join Jungle Scout's Data Engineering team as a Software Engineer, Data Collection, focusing on building and scaling web scraping systems that process over 30TB of e-commerce data daily. You will design and implement high-performance, fault-tolerant systems and pipelines, ensuring data integrity and contributing to core data platform components. This role involves solving complex engineering problems, managing large-scale event processing, and optimizing scraping efficiency. Collaboration is key, with responsibilities including code reviews, design reviews, and partnering with stakeholders. The position is remote-first, within the PST to EST time zones across Canada (excluding Quebec).
Requirements
- Experienced in large-scale scraping. You have hands-on experience building and scaling web scraping or crawling systems
- Cloud-native mindset. You've worked in environments like AWS, GCP, or Azure and are comfortable navigating cloud infrastructure
- Skilled in data architecture. You understand pub/sub systems, data pipelines, and how to move and process large volumes of data
- Committed to clean code. You thrive in collaborative environments and care about writing maintainable, production-grade code
- Excited by scale. You enjoy working with TBs of real-world data and solving performance and reliability challenges
- Fluent in Python or TypeScript. You bring strong programming skills and are comfortable writing SQL
- Obsessed with quality. You care deeply about data integrity, observability, and building systems you can trust
- Clear communicator & team player. You explain technical concepts well and enjoy working closely with others to drive impact
- Experience with distributed systems and data pipelines
- Strong programming skills in Python or TypeScript
- Expertise in SQL for querying and transforming data
- Experience with pub/sub and streaming systems (Kafka, Kinesis, etc.)
- Hands-on experience in a cloud-native environment (AWS preferred)
- Familiarity with CI/CD pipelines, code reviews, and automated testing
- Experience with infrastructure-as-code (e.g., AWS CDK or Terraform)
- Solid understanding of distributed systems and performance optimization
Responsibilities
- Solve complex engineering problems by building high-volume, fault-tolerant scraping services that enhance our data extraction infrastructure
- Manage large-scale event processing by working with systems handling hundreds of millions of events daily
- Design and improve algorithms to optimize scraping efficiency and accuracy
- Implement observability and monitoring tools to track data integrity and maintain high data quality standards
- Contribute to core data platform components that support both new and existing data-driven product features
- Engage in collaborative engineering practices including technical design reviews, code reviews, and pair programming
- Partner with stakeholders to help shape the data roadmap and influence strategic decisions
Preferred Qualifications
- Experience building large-scale web scraping or DaaS systems
- Experience monitoring and operating high-throughput production systems
- Experience building serverless services using Lambda or containerized applications
- Background with NoSQL or document stores (e.g., DynamoDB, Redis, Elasticsearch)
- Familiarity with data lake formats like Iceberg, Hudi, or Delta Lake
- Experience with Airflow or other orchestration tools
- Exposure to Chrome extension development (especially helpful for scraping)
- Worked on systems with well-defined SLAs, uptime, and reliability targets
Benefits
- The BEST team. You'll work alongside the smartest, most passionate, and kindest humans day in and day out, making work fun
- A growth culture! We have tons of opportunities for you to elevate your skills and take you to that next step; we are here to help you find the ones that matter most to you through exposure and training
- Ability to make impact! Although it's a highly collaborative culture, team members are empowered to work autonomously and take extreme ownership of their work. You'll have the opportunity to truly make a difference and impact our customers
- Competitive compensation packages! We structure our compensation packages to reward our team members' contributions to our company's success: you'll have a bonus tied to performance and will be invested in our long-term success with equity
- Flexible Time Off. With our generous PTO and recognition of local holidays, escape to the beach, recharge mentally, or use your Volunteer Time Off (VTO) to give back through volunteering
- Comprehensive Health Benefits & Retirement Program. We offer comprehensive healthcare and retirement matching plans for eligible employees
- Paid Parental Leave Policy. Jungle Scout values the importance of family and offers a paid parental leave that provides the support and flexibility you need to embrace this special time in your life. We also offer a ramp-back period for a seamless transition for you and your family