Senior Data Engineer

Fluent

💵 $79k-$108k
📍 Remote - Canada

Summary

Join Fluent, a commerce media solutions provider, as a Senior Data Engineer building and supporting scalable data pipelines with PySpark and Spark Structured Streaming. Leverage your Databricks and Spark expertise to create enterprise-grade data products that power Fluent’s business lines. Partner with Data Architects, Data Scientists, and Product Managers to translate enterprise data models into performant physical models and build real-time pipelines, and help elevate code quality, observability, and architecture design. This fully remote role (Ontario) requires occasional travel to NYC or Toronto. You will implement monitoring and observability, collaborate cross-functionally, and work with AWS services, while staying current on emerging trends in the Databricks and data engineering ecosystem.

Requirements

  • 5+ years of experience in Data Engineering, including strong Spark (PySpark) and SQL expertise
  • 3+ years of hands-on experience building pipelines on Databricks (Workflows, Notebooks, Delta Lake)
  • Deep understanding of Apache Spark's distributed processing model and internals
  • Strong experience with streaming data architectures and event-driven processing using Kafka
  • Familiarity with Databricks metrics, observability, and monitoring features
  • Understanding of Unity Catalog and Lakehouse architecture
  • Knowledge of idempotent processing patterns and robust data modeling (see the sketch after this list)
  • Proficiency in Git-based, CI/CD-driven development workflows
  • Strong debugging, optimization, and performance tuning skills
  • Proven experience building large-scale data pipelines handling massive volumes of data
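
The idempotency requirement above is commonly met on Delta Lake with a MERGE on a natural key, so that reprocessing the same batch leaves the table in the same state. Below is a minimal sketch; the table path, `event_id` key, and function name are illustrative assumptions, not Fluent's actual schema.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, DataFrame

def upsert_events(spark: SparkSession, updates: DataFrame, target_path: str) -> None:
    # Idempotent write: MERGE on the natural key so replayed or duplicate
    # batches update rows in place instead of appending duplicates.
    # Table path and key column are illustrative.
    target = DeltaTable.forPath(spark, target_path)
    (
        target.alias("t")
        .merge(updates.alias("u"), "t.event_id = u.event_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
```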

Responsibilities

  • Design, build, and support scalable real-time and batch data pipelines using PySpark and Spark Structured Streaming
  • Develop pipelines following the Bronze → Silver → Gold (medallion) architecture using Delta Lake and Enterprise Data Model best practices (see the pipeline sketch after this list)
  • Integrate with Kafka for event-driven ingestion and stream processing
  • Orchestrate workflows with Databricks Workflows/Jobs and Databricks Asset Bundles (DABs)
  • Implement monitoring and observability (Databricks metrics, dashboards, and alerts) to ensure pipeline reliability and performance (see the listener sketch after this list)
  • Collaborate cross-functionally in agile sprints with Product Managers, Data Scientists, and downstream data consumers
  • Partner closely with Data Architects to translate Enterprise Data Models into performant physical data models
  • Write clean, modular, and version-controlled code in Git-based CI/CD environments; perform rigorous peer reviews
  • Implement robust logging, error handling, and data quality validation throughout pipelines
  • Utilize AWS services (S3, IAM, Secrets Manager) for storage and infrastructure
  • Evangelize engineering best practices through brown bags, tech talks, and documentation
  • Stay current on emerging trends within the Databricks and data engineering ecosystem
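
To make the streaming responsibilities above concrete, here is a minimal sketch of a Kafka → Bronze → Silver flow in Spark Structured Streaming with Delta Lake. The broker address, topic, and storage paths are placeholder assumptions; a real pipeline would also enforce a schema, deduplicate, and validate before promoting data to Silver.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land raw Kafka events as-is; the checkpoint makes restarts replay-safe.
bronze_query = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    .writeStream.format("delta")
    .option("checkpointLocation", "/chk/bronze")       # placeholder path
    .outputMode("append")
    .start("/lake/bronze/events")
)

# Silver: parse the raw payload into typed columns for downstream consumers.
silver_query = (
    spark.readStream.format("delta").load("/lake/bronze/events")
    .select(F.col("value").cast("string").alias("payload"), "timestamp")
    .writeStream.format("delta")
    .option("checkpointLocation", "/chk/silver")       # placeholder path
    .outputMode("append")
    .start("/lake/silver/events")
)
```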
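For the monitoring responsibility, Spark exposes streaming progress through a StreamingQueryListener (available in PySpark 3.4+ and recent Databricks runtimes). The sketch below just prints metrics; a production setup would forward them to Databricks dashboards and alerts instead.

```python
from pyspark.sql import SparkSession
from pyspark.sql.streaming import StreamingQueryListener

spark = SparkSession.builder.appName("listener-sketch").getOrCreate()

class ProgressLogger(StreamingQueryListener):
    # Minimal listener: print throughput per micro-batch. A real pipeline
    # would ship these metrics to dashboards and alerting instead.
    def onQueryStarted(self, event):
        print(f"query started: {event.id}")

    def onQueryProgress(self, event):
        p = event.progress
        print(f"{p.name or p.id}: {p.numInputRows} rows, "
              f"{p.processedRowsPerSecond:.1f} rows/s")

    def onQueryIdle(self, event):
        pass

    def onQueryTerminated(self, event):
        print(f"query terminated: {event.id}")

spark.streams.addListener(ProgressLogger())
```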

Preferred Qualifications

  • Familiarity with schema management tools such as Schema Registry
  • Experience with data validation frameworks (Great Expectations, Deequ)
  • Exposure to real-time ML systems and feature pipelines
  • Prior experience in startups or small agile teams
  • Exposure to test-driven development in data engineering

Benefits

  • Competitive compensation
  • Ample career and professional growth opportunities
  • New Headquarters with an open floor plan to drive collaboration
  • Health, dental, and vision insurance
  • Pre-tax savings plans and transit/parking programs
  • 401(k) with competitive employer match
  • Volunteer and philanthropic activities throughout the year
  • Educational and social events
  • The amazing opportunity to work for a high-flying performance marketing company!
