Alexa Translations is hiring a Data Engineering Lead (Worldwide)

Data Engineering Lead
🏢 Alexa Translations
💵 $100k-$150k
📍Worldwide
📅 Posted on Jun 29, 2024

Summary

The role is for a Data Engineering Lead at Alexa Translations, responsible for managing a team of data specialists, developing data collection strategies, and ensuring high-quality bilingual data sets for the company's AI engine. The ideal candidate has a background in data engineering, fluency in English (French is an advantage), and experience with data crawling tools, ETL processes, and big data platforms.

Requirements

  • Bachelor's degree (Master's Degree is an advantage) in Computer Science, Data Science, or a related field
  • At least 3 years of experience working in a data engineering department, preferably as a Senior Data Engineer in a fast-paced environment and complex business setting
  • Experience in data collection, cleaning, and management, preferably in a linguistic or translation-related field
  • Fluency in English (French is an advantage), with excellent written and verbal communication skills in both languages
  • Strong analytical and problem-solving skills, with the ability to work with large and complex data sets
  • Extensive hands-on experience with data crawling tools (e.g., Scrapy) and techniques, including the ability to develop customized crawlers (a minimal crawler sketch follows this list)
  • Demonstrated experience building and maintaining reliable, scalable ETL on big data platforms, as well as experience working with varied forms of data infrastructure, including relational (SQL) databases and Spark
  • Experience in data warehousing, including dimensional modeling concepts, and proficiency in scripting languages such as Python and Perl
  • Deep knowledge of data mining techniques and of relational and non-relational databases
  • Sound understanding of building complex ETL pipelines using either open-source tools such as Mage, Luigi, and Airflow, or cloud-based solutions such as AWS Glue
  • Highly proficient in the use of MS Office products (Word, Excel, PowerPoint)
  • Ability to perform complex data analyses with large data volumes
  • Strong knowledge of Linux, OS tools, and file-system-level troubleshooting
  • Substantial experience working with big data infrastructure tools such as Python, SQS, and Redshift
  • A suitable candidate will also be proficient in Scala, Spark, Spark Streaming, and AWS
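
For illustration, here is a minimal sketch of the kind of customized crawler referenced above, assuming Scrapy and a hypothetical site that publishes aligned English/French segments; the seed URL, CSS selectors, and field names are placeholders, not Alexa Translations' actual setup:

    # Minimal illustrative Scrapy spider for collecting aligned EN/FR segments.
    # The seed URL and all selectors are hypothetical placeholders.
    import scrapy

    class BilingualPairSpider(scrapy.Spider):
        name = "bilingual_pairs"
        start_urls = ["https://example.com/glossary"]  # placeholder seed URL

        def parse(self, response):
            # Each table row is assumed to hold one aligned sentence pair.
            for row in response.css("tr.segment"):
                yield {
                    "en": row.css("td.en::text").get(default="").strip(),
                    "fr": row.css("td.fr::text").get(default="").strip(),
                    "source_url": response.url,
                }
            # Follow pagination, if the site exposes a "next" link.
            next_page = response.css("a.next::attr(href)").get()
            if next_page:
                yield response.follow(next_page, callback=self.parse)

A spider like this could be run with "scrapy runspider bilingual_spider.py -O pairs.jsonl" to dump candidate pairs as JSON Lines for downstream cleaning and alignment checks.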

Responsibilities

  • Leading and managing a team of data specialists responsible for crawling domain-specific bilingual data for training the AI engine
  • Developing and implementing data collection strategies to ensure the acquisition of high-quality bilingual data sets
  • Overseeing the cleaning and preprocessing of crawled data to remove noise and ensure accuracy
  • Collaborating with other teams, such as engineering and linguistics, to understand data requirements and optimize data collection processes
  • Monitoring data quality and performance metrics, identifying areas for improvement and implementing solutions
  • Staying up-to-date with industry trends and best practices in data collection, cleaning, and management
  • Designing, deploying, and maintaining the business’s data platforms
  • Owning and extending the business’s data pipeline through the collection, storage, processing, and transformation of large data sets
  • Participating in design discussions and providing insights and guidance on database technology and data modeling best practices
  • Developing and managing scalable data processing platforms used for exploratory data analysis and real-time analytics
  • Building a metadata system where all available data is maintained and cataloged
  • Developing ETL processes that convert raw data into formats usable by the data analyst team in dashboards and charts
  • Retrieving and analyzing data using SQL and Excel, among other data management systems
  • Building data loading services to import data from numerous disparate sources, including APIs, logs, and relational and non-relational databases
  • Developing reliable data pipelines that translate raw data into powerful, useful data points (a minimal pipeline sketch follows this list)
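
As an illustration of the pipeline work described above, here is a minimal Apache Airflow DAG sketch with placeholder extract, clean, and load steps (Airflow is one of the orchestration tools named in the requirements); the DAG id, schedule, and task bodies are assumptions, not the company's actual workflow:

    # Minimal illustrative Airflow (2.4+) DAG: extract -> clean -> load.
    # Task bodies are placeholders; the DAG id and schedule are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract(**context):
        # Placeholder: pull newly crawled bilingual segments from a staging area.
        pass

    def clean(**context):
        # Placeholder: deduplicate, strip markup, and drop misaligned pairs.
        pass

    def load(**context):
        # Placeholder: write cleaned pairs to the warehouse (e.g., Redshift).
        pass

    with DAG(
        dag_id="bilingual_etl",           # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",                # Airflow 2.4+ scheduling argument
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_clean = PythonOperator(task_id="clean", python_callable=clean)
        t_load = PythonOperator(task_id="load", python_callable=load)

        t_extract >> t_clean >> t_load

Splitting the stages into separate tasks keeps each step independently retryable and observable, which is the usual reason to use an orchestrator rather than a single script.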

Preferred Qualifications

  • Working knowledge of CAT tools such as memoQ, SDL, and Memsource
  • Desire to continue to grow professional capabilities with ongoing training and educational opportunities
  • Experience with AWS cloud applications and services
