Data Architect

Suvoda
Summary
Join Suvoda as a Data Architect to lead the design and delivery of our data architecture and pipelines, integrating clinical and financial data under GxP and SOX compliance. You will modernize our data platform on AWS, build efficient ingestion and ELT pipelines, and develop dimensional and data-vault models that support analytics, reporting, and secure GraphQL data services. The role calls for deep AWS data-architecture expertise, strong data engineering skills, and experience championing data governance, mentoring engineering teams, and guiding cross-functional work in regulated environments.
Requirements
- At least 10 years designing and delivering data-intensive systems, with at least 3 years in a principal-level or architecture role
- Direct experience with clinical-trial, life-sciences, or financial-ledger data
- Knowledge of ledger-based payments processing and reconciliation
- Deep expertise with AWS analytics stack: S3, Glue/Glue Studio & Catalog, Athena, Redshift, Lake Formation, Lambda, EMR or EKS-hosted PySpark
- Experience building GraphQL or REST data services that front a lakehouse or warehouse
- Mastery of data modeling (dimensional, data vault, lakehouse patterns) and performance tuning for large-scale analytical workloads
- Hands-on experience with Python and PySpark, SQL, and infrastructure as code (Terraform/CDK)
- Familiarity with regulated GxP/SOX environments and secure-by-design principles (encryption, tokenization, IAM, PII/PHI segregation)
- Experience guiding cross-functional teams in an Agile/DevOps/SRE culture
Responsibilities
- Define and communicate the reference architecture for a secure, compliant cloud data platform on AWS, harmonizing clinical and financial data under GxP and SOX controls
- Design curated, versioned data zones using S3, AWS Glue Catalog, and Iceberg/Delta-style file formats to optimize performance and cost
- Modernize ingestion and ELT processes by establishing real-time (AWS DMS, PySpark Structured Streaming) and batch (Glue Jobs, PySpark) pipelines, replacing legacy ETL with declarative, testable workflows for clinical and payment data
- Lead the design and implementation of dimensional and data-vault models, unifying subject, site, supply chain, and payment domains; govern semantic layers in Athena/Redshift Spectrum and Tableau/QuickSight for analytics and AI enablement
- Develop and maintain secure, versioned GraphQL endpoints and data services to provide streamlined access to curated datasets for strategic reporting
- Champion data governance and data quality by defining data contracts, schema evolution strategies, lineage, and automated validation, ensuring auditability across the SDLC
- Mentor and influence teams of 20+ engineers; lead architecture reviews, collaborative design sessions, and internal workshops
- Serve as a technical advisor to product, security, and compliance stakeholders
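To make the "data contracts and automated validation" responsibility concrete, here is a minimal sketch of a contract check in plain Python. The contract, field names, and record shape (a hypothetical curated payments-zone record) are illustrative assumptions, not Suvoda's actual schemas; in practice this kind of rule would typically run inside a pipeline framework rather than standalone.

```python
# Illustrative data-contract validation: each field declares an expected
# type and nullability, and records are checked against the contract.
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    dtype: type
    nullable: bool = False


# Hypothetical contract for a curated "payments" zone record.
PAYMENT_CONTRACT = {
    "payment_id": FieldRule(str),
    "site_id": FieldRule(str),
    "amount_cents": FieldRule(int),
    "currency": FieldRule(str),
    "note": FieldRule(str, nullable=True),
}


def validate(record: dict, contract: dict) -> list[str]:
    """Return a list of contract violations for one record."""
    errors = []
    for name, rule in contract.items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        if value is None:
            if not rule.nullable:
                errors.append(f"null in non-nullable field: {name}")
        elif not isinstance(value, rule.dtype):
            errors.append(f"wrong type for {name}: {type(value).__name__}")
    return errors


good = {"payment_id": "p1", "site_id": "s9", "amount_cents": 1200,
        "currency": "USD", "note": None}
bad = {"payment_id": "p2", "site_id": "s9", "amount_cents": "1200",
       "currency": "USD"}

print(validate(good, PAYMENT_CONTRACT))  # → []
```

Failing records surface as explicit violation lists (here, `bad` has a string amount and a missing field), which is what makes this style of check auditable: every rejection is traceable to a named rule in the contract.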