Program Manager

Deepgram
Summary
Join Deepgram, a leading voice AI platform, as a Program Manager (Data Operations) to lead the design and execution of voice data programs. You will build these programs from scratch, translating abstract modeling goals into concrete data strategies and pipelines. You will own the creation of tools, guidelines, and quality safeguards, which requires an understanding of frontier research strategies and gives you direct influence on how data shapes products. You will work across Research, Engineering, and Product teams, balancing day-to-day execution with systems thinking for scalability. The ideal candidate thrives in ambiguity, prefers action over perfection, and is passionate about voice and speech technology. Deepgram offers a collaborative environment focused on customer satisfaction and rapid growth.
Requirements
- 4+ years of experience in technical data operations, ML programs, or hands-on AI data workflows
- Demonstrated ability to design and build scalable processes, not just manage existing ones
- Strong documentation, communication, and project leadership skills
- Ability to work cross-functionally across technical and non-technical teams
- Experience managing or working closely with vendors, freelancers, or distributed teams
- Deep curiosity about how high-quality data shapes the performance of real-world ML products
Responsibilities
- Design, launch, and own end-to-end data workflows, from raw audio ingestion to production-ready datasets
- Build and evolve labeling specs, style guides, and instructional documentation for global annotation teams
- Identify opportunities for better tooling, automation, and workflow optimization, and lead their implementation
- Translate product goals and model requirements into data creation strategies, deciding what to build, how to build it, and why it matters for product impact
- Prototype and deploy data tools and infrastructure (e.g., Label Studio, custom Python scripts)
- Collaborate with Research and Engineering to align data collection with model training architecture and downstream product impact
- Track advancements in speech AI research and evolving market use cases to inform labeling approaches and data design priorities
- Partner with QA and Evaluation leads to deliver high-quality, human-in-the-loop datasets and benchmarks
- Manage and mentor data vendors, freelancers, and, as the team grows, potentially internal individual contributors (ICs)
- Track throughput, data quality, and vendor performance
- Drive continuous improvement in speed, cost-efficiency, and quality across all data operations
- Curate and refine datasets to align with specific product goals, linguistic coverage, or research hypotheses
Preferred Qualifications
- Familiarity with Python and data visualization tools
- Experience designing datasets for multilingual, low-resource, or domain-specific applications
- Startup or high-ambiguity experience
- Experience building or configuring annotation platforms such as Label Studio