Summary
Join our team as a Data Scientist and play a critical role in developing and training large language models (LLMs) focused on code understanding and multimodal data processing. We are looking for someone who is highly motivated and detail-oriented.
Requirements
- Proficiency in one or more programming languages (e.g., Python, JavaScript) and a solid understanding of coding concepts
- Experience with data annotation tools and platforms
- Familiarity with machine learning concepts, especially in relation to training large language models
- Strong attention to detail and the ability to understand complex data structures
- Ability to identify patterns and inconsistencies in data to ensure high-quality annotations
- Excellent written and verbal communication skills, with the ability to clearly document processes and provide feedback
- Collaborative mindset with a willingness to learn and adapt to new challenges
Responsibilities
- Accurately label and annotate data related to coding tasks, such as identifying functions, variables, syntax, and code structure
- Annotate multimodal datasets that include text, images, video, and other data types, ensuring that annotations are precise and consistent
- Collaborate with AI researchers and engineers to refine annotation guidelines and develop new annotation strategies
- Perform quality checks on annotated data to ensure high standards are met
- Provide feedback on annotation tools and processes to improve efficiency and accuracy
- Assist in the development and testing of internal tools for data annotation
- Suggest and implement improvements to the annotation process to streamline workflows and enhance productivity
- Work closely with cross-functional teams, including AI researchers, software engineers, and product managers, to understand project requirements and deliver high-quality annotated datasets
- Participate in regular meetings to discuss progress, challenges, and areas for improvement
Preferred Qualifications
- 0-2 years of experience
- Engineering or Bachelor's degree in Science or equivalent
- Degree from a Tier 1 or Tier 2 college or university
- Distinction or first class in Bachelor's degree
Benefits
- Amazing work culture (super collaborative and supportive work environment; 5 days a week)
- Awesome colleagues (surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
- Competitive compensation
- Flexible working hours
- Full-time remote opportunity