Senior Data Engineer
closed
NewRich Network
Summary
Join the NewRich Network as a part-time Data Engineer with the option to transition to full-time after three months. This remote position focuses on building data integrations using AWS and an ELK stack. You will design and build applications for data analysis and transformation on large datasets within a Spark-based AWS environment. Collaboration with various teams is crucial, including product management, data research, and engineering. The role involves developing new features, improving existing integrations, and ensuring compliance with data governance and security policies. The ideal candidate has extensive experience in data engineering, strong communication skills, and proficiency in the relevant technologies.
Requirements
- 4+ years of experience in data engineering
- Excellent communication and interpersonal skills are a MUST
- 3+ years of experience working on Apache Spark applications using Python (PySpark) or Scala
- Experience creating Spark jobs that operate on at least 1 billion records
- Strong knowledge of ETL architecture and standards
- Software development experience working with Apache Airflow, Spark, MongoDB, MySQL
- Strong SQL knowledge
- Strong command of Python
- Experience creating data pipelines in a production system
- Proven experience building, operating, and maintaining fault-tolerant, scalable data processing integrations on AWS
- Ability to identify and resolve problems associated with production grade large scale data processing workflows
- Experience crafting and maintaining unit tests and continuous integration
- Strong capacity to handle numerous projects is a must
Responsibilities
- Design and build applications that perform data analysis, transformations, aggregations, and other augmentations on large sets of data in a Spark-based AWS environment (EMR, S3, Glue, Redshift, Athena)
- Evaluate various pipeline models, tools, and environments and implement these to push data from our sources through your transformations and finally to our customers
- Work with product management and data research teams to prototype and test new ideas, then take them to production
- Work in a fast-paced, innovate-and-test environment
- Collaborate with data architects, enterprise architects, solution consultants, and product engineering teams to gather customer data integration requirements, conceptualize solutions, and build the required technology stack
- Collaborate with enterprise customers' engineering teams to identify data sources, profile and quantify the quality of those sources, develop tools to prepare data, and build data pipelines that integrate customer and third-party data sources
- Develop new features and improve existing data integrations within the customer data ecosystem
- Encourage the team to think out-of-the-box and overcome engineering obstacles while incorporating new innovative design principles
- Collaborate with a Project Manager to bill and forecast time for product owner solutions
- Build data pipelines
- Reconcile missing data
- Acquire datasets that align with business needs
- Develop algorithms to transform data into useful, actionable information
- Build, test, and maintain data pipeline architectures
- Collaborate with management to understand company objectives
- Create new data validation methods and data analysis protocols
- Ensure compliance with data governance and security policies
Preferred Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related discipline
- Experience using Docker or Kubernetes is a plus
- Passion for crafting intelligent data pipelines that teams love to use
Benefits
Monthly salary ranges from 9,000,000 to 12,000,000 COP