Data Engineer

MAS Global Consulting

πŸ“Remote - Worldwide

Summary

Join MAS Global, a leading digital engineering services company, as a Data Engineer. You will design, develop, and manage scalable data pipelines using advanced technologies like Java, Apache Spark, and AWS. Responsibilities include optimizing job execution, troubleshooting production issues, implementing event-driven architectures, and performing data validation. You will collaborate with cross-functional teams and manage AWS resources. The ideal candidate possesses strong experience in Java, Apache Spark, AWS services, SQL, and databases like Cassandra, Hive, and Snowflake. A Bachelor's degree in a related field is required.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or a related field
  • Strong experience in Java, Apache Spark, and other big data technologies
  • Proficient in AWS services (SNS, SQS, Lambda, EventBridge, EMR) and deploying applications using Terraform
  • Expertise in SQL and experience with databases like Cassandra, Hive, and Snowflake
  • Familiarity with machine learning concepts and integration into data pipelines
  • Experience with CI/CD automation and version control systems (e.g., Git)
  • Strong understanding of distributed systems, data governance, and security best practices
  • Familiarity with monitoring and troubleshooting tools such as Splunk and Dynatrace
  • Excellent problem-solving, communication, and collaboration skills

Responsibilities

  • Design, develop, and manage scalable data pipelines using advanced technologies like Java, Apache Spark, and AWS to process large-scale data and ensure seamless data integration and transformation
  • Optimize job execution and performance for both real-time and batch data workflows, ensuring efficient processing and minimal downtime
  • Troubleshoot and resolve production issues to maintain system reliability and performance across platforms
  • Implement and manage event-driven architectures using AWS services (SNS, SQS, Lambda, EventBridge) to deliver processed data to downstream applications
  • Perform advanced data validation, reconciliation checks, and error tracking mechanisms to ensure data accuracy, integrity, and security
  • Develop complex SQL queries for data manipulation, including deduplication, filtering, and encryption to secure sensitive financial data
  • Collaborate with cross-functional teams to design and deliver data-driven solutions that meet business needs and align with organizational goals
  • Manage AWS resources and deploy infrastructure-as-code using Terraform to maintain consistent environments across development, staging, and production
  • Maintain and manage operational storage in Cassandra DB and Hive tables for large-scale querying and error logging
  • Develop and maintain automated workflows using Shell scripts, Control-M, and AWS EMR to enable smooth pipeline execution and job scheduling
  • Utilize Snowflake as the central data warehouse for analytics, integrating structured and unstructured data into a unified reporting framework
  • Develop and maintain Java-based test suites using JUnit to validate pipeline integrity and conduct unit and integration testing
  • Integrate machine learning models into data pipelines to enhance customer experiences with personalized recommendations
  • Monitor and troubleshoot distributed systems, including Spark job failures, Lambda timeouts, and performance issues
  • Enforce data governance practices to maintain data security and prevent unauthorized access or data leaks

Preferred Qualifications

  • Experience with Scala, Python, or other programming languages
  • Familiarity with financial data processing and transaction systems
  • Experience with testing frameworks such as Mockito for unit testing
