Staff Data Engineer

Thoughtful AI
Summary
Join Thoughtful's mission to revolutionize healthcare as a Staff Data Engineer. You will help scale and strengthen the data platform behind their AI-powered Revenue Cycle Automation platform, working with technologies such as Aurora RDS, AWS Glue, and Apache Iceberg. Responsibilities include building and maintaining data pipelines, optimizing performance and cost-efficiency, extending data ingestion patterns, and collaborating with engineering, product, and operations teams. You will also contribute to best practices and data governance. The ideal candidate has 8-10+ years of relevant experience, strong knowledge of the data lakehouse ecosystem, and proficiency in Python, Spark, and Athena/Trino/PrestoDB. Thoughtful offers competitive compensation, equity participation, comprehensive health benefits, and generous time off.
Requirements
- 8-10+ years of experience building and maintaining data pipelines in production environments
- Strong knowledge of the data lakehouse ecosystem, with an emphasis on AWS data services - particularly Glue, S3, Athena/Trino/PrestoDB, and Aurora
- Proficiency in Python, Spark, and Athena/Trino/PrestoDB for data transformation and orchestration
- Experience managing infrastructure with OpenTofu/Terraform or other Infrastructure-as-Code tools
- Solid understanding of data modeling, partitioning strategies, schema evolution, and performance tuning
- Comfortable working with cloud-native data pipelines and batch processing (streaming experience is a plus but not required)
Responsibilities
- Develop and maintain data pipelines and transformations across the stack, from ingesting transactional data into the data lakehouse to refining it through the layers of the medallion architecture
- Tune performance, storage layout, and cost-efficiency across our data storage and query engines
- Help design and implement new data ingestion patterns and improve platform observability and reliability
- Partner with engineering, product, and operations teams to deliver well-structured, trustworthy data for diverse use cases
- Help establish and evolve best practices for our data infrastructure, from pipeline design to OpenTofu-managed resource provisioning
- Help design and implement a data governance strategy to secure our data lakehouse
Preferred Qualifications
- Systems thinker - you understand the tradeoffs in data architecture and design for long-term stability and clarity
- Outcome-driven - you focus on building useful, maintainable systems that serve real business needs
- Strong collaborator - you're comfortable working across teams and surfacing data requirements early
- Practical and hands-on - you're able to dive into logs, schemas, and IAM policies when needed
- Thoughtful contributor - you're committed to improving code quality, developer experience, and documentation across the board
Benefits
- Competitive compensation
- Equity participation: Employee Stock Options
- Health benefits: Comprehensive medical, dental, and vision insurance
- Time off: Generous leave policies and paid company holidays