Senior Data Scientist

PandaDoc

📍Remote

Summary

Join PandaDoc's AI team to build an AI platform that drives innovation across its products and delivers scalable AI solutions for small and medium-sized businesses. You will design, evaluate, and optimize AI models, datasets, and pipelines, turning cutting-edge AI into practical solutions. The role involves creating evaluation frameworks for LLMs, developing high-quality datasets, optimizing LLMs for specific business use cases, deploying LLMs in production, and routing tasks efficiently across multiple LLMs. You will also contribute to AI-powered automation systems and collaborate with other teams. The position requires expertise in metadata extraction, RAG systems, knowledge graphs, and semantic search.

Requirements

  • Metadata extraction – experience building pipelines that extract structured information from documents
  • Retrieval-Augmented Generation (RAG) – designing and optimizing RAG systems, including chunking strategies, retrieval performance, and integration with LLMs
  • Knowledge graphs – understanding of how to represent and query document relationships to support better retrieval and reasoning
  • Semantic and conversational search – familiarity with search systems, especially Elasticsearch or similar tools, and how to build intelligent search experiences, including conversational interfaces
  • Python Expertise – 5+ years of experience using Python for data science
  • GenAI Experience – Hands-on experience with LLMs, RAG, knowledge graphs and fine-tuning
  • LLM Evaluation & Optimization – Proven ability to measure and improve model performance
  • Dataset & Data Preparation Skills – Expertise in building and curating datasets for AI training
  • Production Deployment – Experience deploying LLMs and implementing guardrails for responsible AI usage
  • Trend Awareness – Stays updated on the latest AI research, tools, and frameworks, identifying ways to apply them effectively
  • Customer-Centric Mindset – Passion for building AI solutions that provide real value to end users
  • Fast-Paced & Independent – Thrives in a high-speed environment with autonomy to solve problems
  • Product-Driven Approach – Focuses on building AI solutions that drive tangible business impact

Responsibilities

  • Model Evaluation & Testing – Create evaluation frameworks to assess LLMs and GenAI applications, measuring recall, effectiveness, and accuracy
  • Dataset Creation & Data Preparation – Develop high-quality datasets for model training, fine-tuning, and inference
  • Prompt Engineering & Optimization – Establish evaluation pipelines for prompt performance, ensuring reliability and efficiency
  • Model Fine-Tuning – Optimize LLMs for specific business use cases to improve their effectiveness
  • LLM Deployment & Guardrails – Deploy LLMs in production and implement systems to maintain model constraints and safe usage
  • Multi-Model Support – Work with various LLMs (e.g., GPT-4, Claude, fine-tuned models) for efficient routing and task handling
  • AI-Driven Automation – Contribute to AI-powered automation systems that enhance customer workflows and streamline routine tasks
  • Collaboration Across Teams – Work closely with PMs, designers, and engineers to define AI requirements and bring solutions to life
