AI Data Operations Labeling Engineer at N-Power Medicine

Summary

Join N-Power Medicine as an AI Data Operations/Labeling Engineer and play a critical role in building and optimizing our data labeling pipeline. You will ensure the creation of high-quality labeled datasets from diverse healthcare sources, working closely with AI scientists, engineers, and subject matter experts to power the development and validation of advanced AI/ML models. This is a hybrid or remote role; a Bay Area location is preferred but not required. You will design and implement efficient data labeling workflows, manage data labeling platforms, and collaborate with stakeholders to create labeling guidelines. You will also track labeler performance, implement quality control measures, and develop data pipelines for data preprocessing and integration. The role requires strong proficiency in Python and SQL, along with experience in data engineering and AI.

Requirements

BS/MSc in engineering, computer science, applied mathematics, physics, statistics, or a related field of study
5+ years of relevant experience in AI-oriented engineering or data science, including proven experience with data labeling platforms and tools
Strong proficiency in Python and SQL
Strong project management and organizational skills, with an understanding of data workflows
Solid understanding of data engineering principles and practices, particularly within distributed data processing environments
Experience with scalable data processing frameworks
Ability to see problems and solutions from a ‘holistic’ point of view and communicate how specific solutions bring business value
Excellent ability to see the big picture while decomposing complex solutions into incremental steps
Insight into how to create sustainable, reusable, and properly modular code
Excellent written, verbal, interpersonal, and communication skills
Generous, curious, and humble

Responsibilities

Design and implement efficient data labeling workflows that integrate with data processing pipelines
Write appropriate statistical design and analysis plans for the curation of AI labeling datasets, so that labeling effort translates into improvements in model analytical performance or validation
Select and manage data labeling platforms and tools
Collaborate with stakeholders and SMEs to craft detailed labeling guidelines
Work with the TPM to recruit, train, and manage human labelers (internal or external)
Track labeler performance and provide feedback
Implement robust quality control measures to ensure labeling accuracy and consistency
Develop data pipelines for data preprocessing and labeled data integration, utilizing scalable data processing frameworks
Manage data storage and versioning for labeled datasets
Develop and refine tools to automate labeling tasks and improve efficiency
Integrate labeling platforms with other AI/ML tools
Create and implement quality assurance procedures for labeling
Collaborate closely with AI data scientists and engineers to understand labeling requirements
Understand the principles of AI models that explicitly learn from human feedback and assist humans in evaluating AI model output accuracy
Apply safeguards and protections in line with HIPAA and applicable privacy laws, and adhere to relevant compliance, quality, security, privacy, legal, and ethical standards for the use of AI

Preferred Qualifications

Experience with and/or knowledge of the healthcare data domain in at least one of: electronic medical records (EHR/EMR), clinical imaging, clinical workflows, electronic data capture (EDC), and/or clinical research/trials
Polyglot coder with hands-on development experience across additional modern programming languages (e.g., R, JavaScript, C++, …)
Familiarity and documented experience with one or more deep learning frameworks (e.g., PyTorch, TensorFlow, MLflow)
Hands-on experience with modern, web-based compute and storage services (e.g., Databricks, AWS, Google Colab, Microsoft Azure)
Familiarity with the software development life cycle (SDLC)
Experience in creating detailed labeling guidelines and quality assurance processes
Ability to effectively manage and motivate human labelers

Benefits

Equity at hire
Discretionary annual bonus
Company benefits
401K plan
Other great company “perks”
Hybrid or remote role

Remote

Data

Mid-level
