Imbue is hiring a
Software Engineer, Data

closed
Imbue

💵 $170k-$350k
📍 Remote - United States

Summary

This role is for a Data Infrastructure Engineer at Imbue, where the main focus will be on improving the quality of data used in machine learning systems. The work will involve incorporating new sources of high-quality text data, developing models for text classification and OCR, collecting multimodal data, designing unique data generation pipelines, and integrating multiple annotation service providers.

Requirements

  • Detail-oriented. Data mistakes are easy to make and hard to catch
  • Passionate about data. You should be happy to look at and deeply engage with the raw data
  • An excellent software engineer. We care about engineering best practices
  • Familiar with Python

Responsibilities

  • Incorporate new sources of high-quality text data into our existing data pipelines
  • Develop models for accurately classifying and extracting meaningful text from raw HTML
  • Create a high-quality OCR pipeline for pulling pretraining text from images and scans
  • Collect a ludicrous amount of multimodal data (e.g., transcripts for thousands of years of video)
  • Design unique data generation pipelines that leverage existing data (e.g., converting code from one language to another)
  • Integrate multiple annotation service providers into a sensible interface for researchers

Benefits

  • Work on the most important part of our system
  • Work at a place that deeply cares about data quality
  • Work directly on creating software with human-like intelligence
  • Very generous compensation
  • Flexible working hours
  • Work remotely
  • Time and budget for learning and self-improvement
  • Compensation packages are highly variable based on a variety of factors. If your salary requirements fall outside of the stated range, we still encourage you to apply. The range for this role is $170,000–$350,000 in cash and $10,000–$2,000,000 in equity.
This job is filled or no longer available
