Staff Data Engineer


SandboxAQ

πŸ“Remote - United States

Summary

Join SandboxAQ, a high-growth company delivering AI solutions, as a Staff Data Engineer for Operations. You will architect and manage the scalable data infrastructure for CardiAQ™, a noninvasive cardiovascular disease diagnostic device. This role involves designing and managing data pipelines on AWS, overseeing core data systems, creating data ingestion and transformation processes, and automating data processing pipelines. You will work closely with scientists and ML engineers to ensure data quality and meet healthcare regulations. This is an opportunity to build and own a critical data system within a dynamic startup environment and make a significant impact on the future of healthcare.

Requirements

  • Bachelor's or Master's Degree in Computer Science, Computer Engineering, Mathematics, Physics, Statistics, or other relevant technical discipline
  • 8+ years of experience as a Data Engineer, focusing on data operations
  • Extensive experience working with large datasets in a cloud environment, specifically AWS
  • Strong proficiency in AWS data tools and services (e.g., S3)
  • Experience with data pipeline development and batch processing
  • Experience with programming languages and frameworks used in data engineering, such as Python, Scala, Spark, Hadoop, or similar
  • Proficiency with SQL and databases
  • Comfortable working in a collaborative, fast-paced team with a strong mission
  • Proactive, self-driven, and excited to learn new technologies and approaches
  • Demonstrates strong ownership through initiative, accountability, and attention to detail

Responsibilities

  • Design and manage scalable data pipelines on AWS to support AI-driven medical research
  • Oversee core data systems, ensuring high uptime, data quality, and efficient batch workflows
  • Create and refine data ingestion, transformation, and storage processes for research use
  • Automate data processing pipelines to improve reliability, speed, and reproducibility
  • Build and maintain automated ML training and reporting workflows to support model development and monitoring
  • Use AWS services (like S3 and related tools) to build cost-effective, reliable data solutions
  • Develop and enforce consistent data practices, architecture, and documentation
  • Work closely with scientists, ML engineers, and medical experts to meet data needs
  • Build data systems that meet healthcare regulations and prioritize privacy

Preferred Qualifications

  • Exposure to multichannel sensor data or time series data (e.g., ECG/EKG, medical device data)
  • Experience with MLOps and exposure to machine learning workflows

Benefits

  • Competitive salaries
  • Stock options depending on employment type
  • Generous learning opportunities
  • Medical/dental/vision
  • Family planning/fertility
  • PTO (summer and winter breaks)
  • Financial wellness resources
  • 401(k) plans

