Engineering Member, Human Data
poolside
Job highlights
Summary
Join poolside, a remote-first AI company, as a Member of Engineering (Human Data) and lead the development and management of high-quality data labeling pipelines for large language models. You will build and manage an internal labeling team, collaborate with external vendors for crowdsourced data, and design scalable processes for data annotation. This critical role ensures our AI models are trained on top-tier data. You will optimize labeling pipelines, set up quality assurance processes, and work cross-functionally with researchers and engineers. The position requires experience in data labeling, managing vendors, and understanding data quality metrics. Poolside offers a fully remote work environment, generous vacation time, health insurance, and other benefits.
Requirements
- Experience with designing and managing data labeling processes, with a strong emphasis on crowdsourcing solutions
- 2+ years of experience in a technical role such as Data Engineer, Data Scientist, Technical Project Manager, or similar, ideally in machine learning/data-focused environments
- Familiarity with managing vendors and crowdsourcing platforms to handle large-scale data labeling efforts
- Strong understanding of data quality metrics such as accuracy, precision, recall, and F1 score
- Proven ability to develop complex pipelines with multiple stages, particularly for data annotation and machine learning training
- Ability to collaborate with technical teams and ensure labeling processes align with overall model development needs
- Mandatory experience with crowdsourcing platforms (e.g., Scale AI, Toloka, or similar) for data labeling
- Strong problem-solving skills and ability to work independently in a fast-paced environment
Responsibilities
- Design, develop, and implement scalable data labeling pipelines that integrate into model training workflows
- Manage and expand the internal data labeling team to meet the company's growing needs
- Collaborate with external vendors to source and manage crowdsourced data labeling efforts, ensuring timely and high-quality delivery
- Monitor and improve labeling processes by conducting experiments, ensuring data quality, and optimizing performance across labeling projects
- Set up metrics and QA processes to evaluate the quality of labeled data and continuously improve output
- Work cross-functionally with researchers and engineers to align labeling pipelines with model training needs
- Identify new tools and technologies to streamline labeling processes and increase efficiency
Preferred Qualifications
- Experience with cloud platforms and tools such as AWS, GCP, Kubernetes, and CI/CD systems is a plus
Benefits
- Fully remote work & flexible hours
- 37 days/year of vacation & holidays
- Health insurance allowance for you and dependents
- Company-provided equipment
- Wellbeing, always-be-learning and home office allowances
- Frequent team get-togethers
- Great, diverse & inclusive people-first culture