Lead Data Engineer

Bertoni Solutions
Summary
Join our multinational team as a Lead Data Engineer and help us translate technology into our clients' success. We are seeking a highly skilled individual with expertise in PySpark, SQL, and Python, and a solid understanding of ETL and data warehousing principles. You will design, build, and maintain scalable data pipelines, collaborate with cross-functional teams, and ensure data quality and integrity. This is a 6-month contract position (with possible extension) that is 100% remote for nearshore candidates located in Central or South America. The role offers opportunities for professional development and career growth within a collaborative and inclusive environment. The contract is for independent contractors and does not include PTO, tax deductions, or insurance.
Requirements
- 8+ years of overall experience working with cross-functional teams (machine learning engineers, developers, product managers, analytics teams)
- 3+ years of hands-on experience developing and managing data pipelines using PySpark
- Strong programming skills in Python and SQL
- Deep understanding of ETL processes and data warehousing fundamentals
- Self-driven, resourceful, and comfortable working in dynamic, fast-paced environments
- Advanced written and spoken English is a MUST HAVE for this position (B2, C1, or C2 level only)
- Proven leadership experience in current or previous projects
- MUST BE located in Central or South America, as this is a nearshore position (please note that we are not able to consider candidates requiring relocation or those located offshore)
Responsibilities
- Design and develop scalable data pipelines using PySpark to support analytics and reporting needs
- Write efficient SQL and Python code to transform, cleanse, and optimize large datasets
- Collaborate with machine learning engineers, product managers, and developers to understand data requirements and deliver solutions
- Implement and maintain robust ETL processes to integrate structured and semi-structured data from various sources
- Ensure data quality, integrity, and reliability across pipelines and systems
- Participate in code reviews, troubleshooting, and performance tuning
- Work independently and proactively to identify and resolve data-related issues
- If applicable, contribute to Azure-based data solutions, including ADF, Synapse, ADLS, and other services
- Support cloud migration initiatives and DevOps practices, if relevant to the role
- Provide guidance on best practices and mentor junior team members when needed
Preferred Qualifications
- Databricks certification
- Experience with Azure-native services, including: Azure Data Lake Storage (ADLS), Azure Data Factory (ADF), Azure Synapse Analytics / Azure SQL DB / Fabric
- Familiarity with Event Hub, IoT Hub, Azure Stream Analytics, Azure Analysis Services, and Cosmos DB
- Basic understanding of SAP HANA
- Intermediate-level experience with Power BI
- Knowledge of DevOps, CI/CD pipelines, and cloud migration best practices
Benefits
Opportunities for professional development and career growth