AI Intern

G-P

πŸ“Remote - Ireland

Summary

Join G-P's diverse, remote-first team as an AI Intern and contribute to cutting-edge AI-driven solutions for monitoring, optimizing, and securing our data platform. This internship offers the chance to apply AI/ML techniques to real-world Big Data and cloud challenges, working at the intersection of artificial intelligence, data engineering, and distributed systems. You will design and build AI solutions, apply machine learning to forecasting and resource allocation, develop failure prediction models, and create a Flask-based monitoring dashboard. Along the way you will expand your skills and positively impact lives globally. G-P offers competitive compensation and benefits.

Requirements

  • Programming proficiency in Python
  • Understanding of Machine Learning concepts and frameworks (e.g., PyTorch, TensorFlow)
  • Version control with Git

Responsibilities

  • Design and build AI-driven solutions that detect anomalies and deviations in data and raise real-time alerts, enabling quick responses and mitigating risk. For example, automatically identify and flag potential occurrences of sensitive information stored in plain text across diverse datasets (a first sketch of this kind of scanner follows this list)
  • Work closely with data engineers and analysts to implement scalable AI-driven solutions, optimize model performance, and enhance data quality monitoring
  • Apply machine learning and AI techniques to forecast query execution time from factors such as query complexity, data volume, and system load, improving query scheduling and prioritization on large-scale data platforms (see the second sketch below)
  • Use ML models to recommend optimal resource allocation for data pipelines based on past usage trends: by analyzing historical usage patterns and current system state, the system anticipates future resource needs and optimizes allocation decisions. Integrating these recommendations into the platform improves pipeline efficiency, cost-effectiveness, and scalability (see the third sketch below)
  • Develop a failure prediction model from historical pipeline failure patterns so that issues can be mitigated proactively. This monitoring solution continuously analyzes performance metrics, log data, and system events to detect anomalies that signal impending failures; integrating its predictions enables proactive issue resolution and improves pipeline reliability (see the fourth sketch below)
  • Build a Flask-based web application within Databricks to display the real-time status of key data products: a centralized dashboard with visual indicators (e.g., green/red status, last refresh time) for monitoring data pipeline health and freshness (see the final sketch below)
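
To give a concrete flavor of the first responsibility, here is a minimal Python sketch of a rule-based sensitive-data scanner. The patterns, record IDs, and sample text are illustrative assumptions, not G-P's actual detection logic; a production system would likely combine vetted PII libraries or NER models with rules like these.

```python
import re

# Illustrative patterns only; a real scanner would use a vetted PII
# library or an ML/NER model rather than this hypothetical rule set.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\+?\d[\d ()-]{7,14}\d"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def scan_text(record_id: str, text: str) -> list[dict]:
    """Return one finding per match so alerts can be raised downstream."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({
                "record_id": record_id,
                "type": label,
                # Log only a prefix so the alert itself does not leak PII.
                "snippet": match.group()[:8] + "...",
            })
    return findings

if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or +353 1 234 5678 for details."
    for finding in scan_text("row-42", sample):
        print(finding)
```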
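
Second, query runtime forecasting can be framed as supervised regression. The sketch below trains a gradient-boosted model on synthetic data; the three features (complexity, data volume, system load) come from the responsibility above, but the data-generating process and model choice are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2_000

# Hypothetical features: query complexity score, scanned gigabytes,
# and concurrent system load at submission time.
X = np.column_stack([
    rng.integers(1, 50, n),       # complexity (e.g. joins + stages)
    rng.lognormal(3.0, 1.0, n),   # data volume in GB
    rng.uniform(0.0, 1.0, n),     # cluster load fraction
])
# Synthetic ground truth: runtime grows with all three factors.
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] * (1.0 + X[:, 2]) + rng.normal(0, 5, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_train, y_train)
print(f"R^2 on held-out queries: {model.score(X_test, y_test):.3f}")

# Predicted seconds for a new query; a scheduler could use this to
# prioritize short queries ahead of long ones.
print(model.predict([[12, 80.0, 0.4]]))
```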
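
Third, resource-allocation recommendations can start far simpler than deep learning: the toy function below sizes a worker pool from a high quantile of recent peak usage. The per-worker capacity, headroom factor, and usage history are hypothetical stand-ins for the platform's own metrics.

```python
import math
import statistics

def recommend_workers(hourly_peak_usage: list[float],
                      headroom: float = 1.2,
                      per_worker_capacity: float = 10.0) -> int:
    """Size a pipeline's worker pool from its recent peak usage.

    Uses a high quantile rather than the mean so occasional spikes do
    not starve the pipeline, plus a safety headroom factor.
    """
    p95 = statistics.quantiles(hourly_peak_usage, n=20)[18]  # ~95th percentile
    needed = p95 * headroom / per_worker_capacity
    return max(1, math.ceil(needed))  # round up, keep at least one worker

# Hypothetical last ten hourly peaks for one pipeline (arbitrary units).
history = [42.0, 51.3, 38.9, 60.2, 47.5, 55.1, 49.8, 58.4, 44.0, 52.6]
print(recommend_workers(history))  # -> 8 for this history
```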
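
Fourth, failure prediction is naturally a binary-classification problem. In this sketch the features (retry count, runtime deviation, error rate in logs) and the labels are synthetic stand-ins for real pipeline run history and recorded outcomes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 5_000

X = np.column_stack([
    rng.poisson(0.3, n),        # retries in the last hour
    rng.normal(0.0, 1.0, n),    # runtime z-score vs. history
    rng.exponential(0.05, n),   # error lines per 1k log lines
])
# Synthetic rule: more retries, slower runs, noisier logs -> failure.
risk = 0.8 * X[:, 0] + 0.6 * np.clip(X[:, 1], 0, None) + 5.0 * X[:, 2]
y = (risk + rng.normal(0, 0.5, n) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")

# Probability of failure for the next run; alert above a threshold.
p_fail = clf.predict_proba([[2, 1.8, 0.12]])[0, 1]
print(f"predicted failure risk: {p_fail:.2f}")
```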
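
Finally, the status dashboard: a minimal Flask app with one route rendering a green/red freshness table. The product names, refresh times, and six-hour SLA are hard-coded placeholders; a real version inside Databricks would query pipeline metadata instead.

```python
from datetime import datetime, timedelta, timezone
from flask import Flask

app = Flask(__name__)

# Hypothetical data products and their last successful refresh times.
PRODUCTS = {
    "customer_dim": datetime.now(timezone.utc) - timedelta(minutes=20),
    "payroll_fact": datetime.now(timezone.utc) - timedelta(hours=7),
}
FRESHNESS_SLA = timedelta(hours=6)

@app.route("/")
def status():
    now = datetime.now(timezone.utc)
    rows = []
    for name, refreshed in PRODUCTS.items():
        healthy = now - refreshed <= FRESHNESS_SLA
        colour = "green" if healthy else "red"
        rows.append(
            f"<tr><td>{name}</td>"
            f"<td style='color:{colour}'>{'OK' if healthy else 'STALE'}</td>"
            f"<td>{refreshed:%Y-%m-%d %H:%M} UTC</td></tr>"
        )
    return ("<table><tr><th>Product</th><th>Status</th>"
            "<th>Last refresh</th></tr>" + "".join(rows) + "</table>")

if __name__ == "__main__":
    app.run(port=8080)
```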

Preferred Qualifications

  • Familiarity with cloud platforms (e.g., AWS services such as S3 and EC2)
  • Experience with distributed data processing frameworks (e.g., Spark)
  • Familiarity with large language models (LLMs) and generative AI

Benefits

Competitive compensation and benefits
