Data Engineer

Mercedes-Benz.io
Remote - Portugal
Summary
Join Mercedes-Benz.io's Central Operations and Services Area as a Data Engineer to build and maintain a cutting-edge data platform. You will develop data pipelines, build data processing infrastructure, and leverage Azure cloud services. Responsibilities include data modelling, performance optimization, and collaboration with data analysts and scientists. The ideal candidate has strong software engineering experience, proficiency in PySpark and SQL, and expertise in Azure cloud services. The role offers opportunities for professional development in a collaborative, open-minded work environment.
Requirements
- 3+ years of hands-on software engineering experience developing and maintaining data pipelines
- Strong experience with software engineering best practices (OOP, TDD, CI/CD, version control)
- Proficiency with PySpark and SQL, including the ability to optimize queries and understand Spark internals
- Proficient in Python for data processing and automation tasks
- Strong expertise with Azure cloud services, including Databricks, Data Factory, DevOps, and Data Lake Storage
- Solid experience in dimensional data modelling and data warehousing principles
- Knowledge of Kubernetes and Docker for containerized applications
- Strong problem-solving and analytical thinking
- Excellent communication skills to interact with stakeholders and team members effectively
- Ability to work collaboratively in a fast-paced, team-oriented environment
Responsibilities
- Develop & Maintain Data Pipelines: Design, build, and optimize scalable and reliable data pipelines for cleaning, integrating, and transforming large datasets
- Infrastructure Development: Create and maintain data processing infrastructure to ensure high-quality data supply for digital analysts and data scientists
- Cloud Integration: Leverage Azure cloud services, including Data Factory, Databricks, and Data Lake Storage, to deploy robust data solutions
- Data Modelling: Develop dimensional data models and maintain data warehousing solutions to support analytics and reporting needs
- Performance Optimization: Write efficient, maintainable, and well-tested Python and SQL code, ensuring optimal performance for large-scale datasets
- Spark Expertise: Work with Apache Spark to write and optimize PySpark code and SQL queries, understanding Spark internals for maximum efficiency
- Collaboration: Partner with data analysts, data scientists, and stakeholders to understand data requirements, ensure quality, and support data-driven decision-making
- Governance & Quality: Ensure data governance, monitor data quality, and implement best practices for data management
- Deployment & Monitoring: Support deployment and monitoring of machine learning pipelines and data solutions in production environments
Preferred Qualifications
- Experience with Google Analytics data is a plus
Benefits
- Health insurance for you and your family
- Life insurance
- Proactive self-development through international trainings and conferences
- Language training courses
- Brand Connection Perks