Lead Data Engineer

closed
Machinify, Inc.

$220k-$250k
Remote - United States

Summary

Join Machinify, a leading provider of AI-powered healthcare software, as a Staff/Lead Data Engineer. You will build and own critical data pipelines, working cross-functionally with engineering, product, and data science teams. This role requires deep experience in data engineering, ETL orchestration (preferably Apache Airflow), distributed computing (Spark), SQL, Python, and cloud platforms (AWS & GCP). You will map customer data, manage data quality, and ensure pipeline SLAs. The salary range is $220k-$250k, and the total compensation package includes equity, excellent healthcare, flexible time off, and other benefits.

Requirements

  • Deep experience as a hands-on Data Engineer building production data pipelines
  • Experience managing the delivery of complex data
  • Experience in ETL orchestration and workflow management tools with a strong preference for Apache Airflow
  • Experience in Spark or other distributed computing frameworks
  • SQL and Python
  • Advanced SQL performance tuning
  • Kubernetes and building Docker images
  • AWS & GCP
  • Experience working with APIs to collect or ingest data
  • Experience managing SLAs for all pipelines in allocated areas of ownership
  • Streaming technologies such as Kafka and Spark Streaming
  • ELK stack, Grafana, etc.

Responsibilities

  • Independently understand all aspects of a business problem, including those unrelated to your area of expertise, weigh the pros and cons of different approaches, and suggest the ones most likely to succeed
  • Work with a cross-functional organization including engineering, delivery, subject-matter experts, product managers, and platform engineers to deliver a scalable framework
  • Map customer data into Machinify's canonical form. Identify and ingest non-canonical fields, and generalize the process so that it needs minimal customization
  • Proactively design and adapt the canonical form to suit changing query patterns and needs
  • Ultimately own data availability and quality for the Data Science organization

Benefits

  • Excellent healthcare
  • Flexible time off
  • Meaningful equity
