Data Engineer

SandboxAQ

💵 $125k-$175k
📍 Remote - United States, Canada

Summary

Join SandboxAQ, a high-growth company delivering AI solutions, as a generalist software engineer. Build and operate scalable data pipelines for AI and simulation in chemistry and life sciences. Develop tools to ingest, process, and serve large amounts of data, improving developer experience and velocity. Collaborate with R&D and product teams, design data models, and implement data processing and storage systems. Manage infrastructure for SandboxAQ's simulation and data platform. Review code and participate in design and architectural discussions. This role requires experience with Python, databases, cloud infrastructure, data pipelines, and CI/CD.

Requirements

  • 3+ years of experience with Python, with strong knowledge of software design principles
  • Understanding of database principles and best practices
  • Experience with large-scale analytic databases (BigQuery preferred)
  • 2+ years of experience with infrastructure-as-code management of public cloud providers (GCP preferred); familiarity with Terraform
  • 2+ years of experience building data pipelines or data processing systems at scale
  • Experience with orchestration tools like Airflow
  • Experience writing and optimizing database queries; graph database experience is a plus
  • Strong understanding of web and network fundamentals and experience with designing, building, and testing web APIs
  • Knowledge of CI/CD best practices and building CI/CD pipelines
  • Excellent communication and collaboration skills, with the ability to effectively influence a cross-functional team

Responsibilities

  • Build and operate scalable data pipelines for data ingestion, processing, analytics, and storage
  • Optimize performance and cost-effectiveness of data pipelines and storage
  • Maintain data warehouse and data lake solutions
  • Collaborate closely with R&D teams to build and operate data tooling to meet project goals
  • In collaboration with domain experts, design and implement data models for scientific data and APIs to store and manipulate data across file storage, graph databases, and relational databases
  • Contribute to the design and implementation of complex, security-sensitive data processing and storage systems with complex tenancy and data isolation requirements
  • Collaborate closely with the product team and internal stakeholders in all phases of software development to validate the solutions you propose and implement
  • In collaboration with the rest of the engineering team, build and manage infrastructure for SandboxAQ's simulation and data platform
  • Review code and participate in design and architectural discussions

Preferred Qualifications

  • Experience with GraphQL and familiarity with Strawberry or similar
  • Experience with FastAPI
  • Experience with CircleCI

Benefits

  • Medical/dental/vision
  • Family planning/fertility
  • PTO (summer and winter breaks)
  • Financial wellness resources
  • 401(k) plans
  • Competitive salaries
  • Stock options depending on employment type
  • Generous learning opportunities
