Senior Data Scientist at PandaDoc

Summary

Join PandaDoc's AI team and build a powerful AI platform to drive innovation across its products. Develop scalable AI solutions empowering small and medium-sized businesses. Design, evaluate, and optimize AI models, datasets, and pipelines, turning cutting-edge AI into practical solutions. This role involves creating evaluation frameworks for LLMs, developing high-quality datasets, optimizing LLMs for specific business use cases, deploying LLMs in production, and working with various LLMs for efficient task handling. You will contribute to AI-powered automation systems and collaborate with other teams. This position requires expertise in metadata extraction, RAG systems, knowledge graphs, and semantic search.

Requirements

Metadata extraction – experience building pipelines that extract structured information from documents
Retrieval-Augmented Generation (RAG) – designing and optimizing RAG systems, including chunking strategies, retrieval performance, and integration with LLMs
Knowledge graphs – understanding of how to represent and query document relationships to support better retrieval and reasoning
Semantic and conversational search – familiarity with search systems, especially Elasticsearch or similar tools, and how to build intelligent search experiences, including conversational interfaces
Python Expertise – 5+ years of experience using Python for data science
GenAI Experience – Hands-on experience with LLMs, RAG, knowledge graphs and fine-tuning
LLM Evaluation & Optimization – Proven ability to measure and improve model performance
Dataset & Data Preparation Skills – Expertise in building and curating datasets for AI training
Production Deployment – Experience deploying LLMs and implementing guardrails for responsible AI usage
Trend Awareness – Stays updated on the latest AI research, tools, and frameworks, identifying ways to apply them effectively
Customer-Centric Mindset – Passion for building AI solutions that provide real value to end users
Fast-Paced & Independent – Thrives in a high-speed environment with autonomy to solve problems
Product-Driven Approach – Focuses on building AI solutions that drive tangible business impact

Responsibilities

Model Evaluation & Testing – Create evaluation frameworks to assess LLMs and GenAI applications, measuring recall, effectiveness, and accuracy
Dataset Creation & Data Preparation – Develop high-quality datasets for model training, fine-tuning, and inference
Prompt Engineering & Optimization – Establish evaluation pipelines for prompt performance, ensuring reliability and efficiency
Model Fine-Tuning – Optimize LLMs for specific business use cases to improve their effectiveness
LLM Deployment & Guardrails – Deploy LLMs in production and implement systems to maintain model constraints and safe usage
Multi-Model Support – Work with various LLMs (e.g., GPT-4, Claude, fine-tuned models) for efficient routing and task handling
AI-Driven Automation – Contribute to AI-powered automation systems that enhance customer workflows and streamline routine tasks
Collaboration Across Teams – Work closely with PMs, designers, and engineers to define AI requirements and bring solutions to life


Remote

Data

Senior
