Staff Data Engineer

Thoughtful AI

📍 Remote - United States

Summary

Join Thoughtful's mission to revolutionize healthcare as a Staff Data Engineer. You will help scale and strengthen the data platform that underpins the company's AI-powered Revenue Cycle Automation platform, working with technologies such as Aurora RDS, AWS Glue, and Apache Iceberg. Responsibilities include building and maintaining data pipelines, optimizing performance and cost-efficiency, extending data ingestion patterns, and collaborating with engineering, product, and operations teams. You will also contribute to best practices and data governance. The ideal candidate has 8-10+ years of relevant experience, strong knowledge of the data lakehouse ecosystem, and proficiency in Python, Spark, and Athena/Trino/PrestoDB. Thoughtful offers competitive compensation, equity participation, comprehensive health benefits, and generous time off.

Requirements

  • 8-10+ years of experience building and maintaining data pipelines in production environments
  • Strong knowledge of the data lakehouse ecosystem, with an emphasis on AWS data services - particularly Glue, S3, Athena/Trino/PrestoDB, and Aurora
  • Proficiency in Python, Spark, and Athena/Trino/PrestoDB for data transformation and orchestration
  • Experience managing infrastructure with OpenTofu/Terraform or other Infrastructure-as-Code tools
  • Solid understanding of data modeling, partitioning strategies, schema evolution, and performance tuning
  • Comfortable working with cloud-native data pipelines and batch processing (streaming experience is a plus but not required)

Responsibilities

  • Develop and maintain data pipelines and transformations across the stack, from ingesting transactional data into the data lakehouse to refining it through the layers of the medallion architecture
  • Tune performance, storage layout, and cost-efficiency across our data storage and query engines
  • Help design and implement new data ingestion patterns and improve platform observability and reliability
  • Partner with engineering, product, and operations teams to deliver well-structured, trustworthy data for diverse use cases
  • Help establish and evolve best practices for our data infrastructure, from pipeline design to OpenTofu-managed resource provisioning
  • Help design and implement a data governance strategy to secure our data lakehouse

Preferred Qualifications

  • Systems thinker - you understand the tradeoffs in data architecture and design for long-term stability and clarity
  • Outcome-driven - you focus on building useful, maintainable systems that serve real business needs
  • Strong collaborator - you're comfortable working across teams and surfacing data requirements early
  • Practical and hands-on - able to dive into logs, schemas, and IAM policies when needed
  • Thoughtful contributor - committed to improving code quality, developer experience, and documentation across the board

Benefits

  • Competitive compensation
  • Equity participation: Employee Stock Options
  • Health benefits: Comprehensive medical, dental, and vision insurance
  • Time off: Generous leave policies and paid company holidays
