AI Intern

G-P

πŸ“Remote - Ireland

Summary

Join G-P's diverse, remote-first team as an AI Intern and contribute to cutting-edge AI-driven solutions for monitoring, optimizing, and securing our data platform. This internship offers the chance to apply AI/ML techniques to real-world Big Data and cloud challenges, working at the intersection of artificial intelligence, data engineering, and distributed systems. You will design and build AI solutions, apply machine learning to forecasting and resource allocation, develop failure prediction models, and create a Flask-based monitoring dashboard. Along the way you will expand your skills and positively impact lives globally. G-P offers competitive compensation and benefits.

Requirements

  • Programming proficiency in Python
  • Understanding of Machine Learning concepts and frameworks (e.g., PyTorch, TensorFlow)
  • Version control with Git

Responsibilities

  • Design and build AI-driven solutions that detect anomalies and deviations in data and raise real-time alerts, enabling quick responses and mitigating risk. For example, automatically identify and flag potential occurrences of sensitive information stored in plain text across diverse datasets (a first sketch of this kind of scanner follows this list)
  • Work closely with data engineers and analysts to implement scalable AI-driven solutions, optimize model performance, and enhance data quality monitoring
  • Apply machine learning and AI techniques to forecast query execution time from factors such as query complexity, data volume, and system load, improving query scheduling and prioritization on large-scale data platforms (see the second sketch below)
  • Use ML models to recommend optimal resource allocation for data pipelines based on past usage trends: by analyzing historical usage patterns and current system state, the system anticipates future resource needs and optimizes allocation decisions. Integrating these recommendations into the platform improves pipeline efficiency, cost-effectiveness, and scalability (see the third sketch below)
  • Develop a failure prediction model from historical pipeline failure patterns so that issues can be mitigated proactively. This monitoring solution continuously analyzes performance metrics, log data, and system events to detect anomalies that signal impending failures; integrating its predictions enables proactive issue resolution and improves pipeline reliability (see the fourth sketch below)
  • Build a Flask-based web application within Databricks to display the real-time status of key data products: a centralized dashboard with visual indicators (e.g., green/red status, last refresh time) for monitoring data pipeline health and freshness (see the final sketch below)
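
To give a concrete flavor of the first responsibility, here is a minimal Python sketch of a rule-based sensitive-data scanner. The patterns, record IDs, and sample text are illustrative assumptions, not G-P's actual detection logic; a production system would likely combine vetted PII libraries or NER models with rules like these.

```python
import re

# Illustrative patterns only; a real scanner would use a vetted PII
# library or an ML/NER model rather than this hypothetical rule set.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\+?\d[\d ()-]{7,14}\d"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def scan_text(record_id: str, text: str) -> list[dict]:
    """Return one finding per match so alerts can be raised downstream."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({
                "record_id": record_id,
                "type": label,
                # Log only a prefix so the alert itself does not leak PII.
                "snippet": match.group()[:8] + "...",
            })
    return findings

if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or +353 1 234 5678 for details."
    for finding in scan_text("row-42", sample):
        print(finding)
```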
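
Second, query runtime forecasting can be framed as supervised regression. The sketch below trains a gradient-boosted model on synthetic data; the three features (complexity, data volume, system load) come from the responsibility above, but the data-generating process and model choice are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2_000

# Hypothetical features: query complexity score, scanned gigabytes,
# and concurrent system load at submission time.
X = np.column_stack([
    rng.integers(1, 50, n),       # complexity (e.g. joins + stages)
    rng.lognormal(3.0, 1.0, n),   # data volume in GB
    rng.uniform(0.0, 1.0, n),     # cluster load fraction
])
# Synthetic ground truth: runtime grows with all three factors.
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] * (1.0 + X[:, 2]) + rng.normal(0, 5, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_train, y_train)
print(f"R^2 on held-out queries: {model.score(X_test, y_test):.3f}")

# Predicted seconds for a new query; a scheduler could use this to
# prioritize short queries ahead of long ones.
print(model.predict([[12, 80.0, 0.4]]))
```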
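
Third, resource-allocation recommendations can start far simpler than deep learning: the toy function below sizes a worker pool from a high quantile of recent peak usage. The per-worker capacity, headroom factor, and usage history are hypothetical stand-ins for the platform's own metrics.

```python
import math
import statistics

def recommend_workers(hourly_peak_usage: list[float],
                      headroom: float = 1.2,
                      per_worker_capacity: float = 10.0) -> int:
    """Size a pipeline's worker pool from its recent peak usage.

    Uses a high quantile rather than the mean so occasional spikes do
    not starve the pipeline, plus a safety headroom factor.
    """
    p95 = statistics.quantiles(hourly_peak_usage, n=20)[18]  # ~95th percentile
    needed = p95 * headroom / per_worker_capacity
    return max(1, math.ceil(needed))  # round up, keep at least one worker

# Hypothetical last ten hourly peaks for one pipeline (arbitrary units).
history = [42.0, 51.3, 38.9, 60.2, 47.5, 55.1, 49.8, 58.4, 44.0, 52.6]
print(recommend_workers(history))  # -> 8 for this history
```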
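
Fourth, failure prediction is naturally a binary-classification problem. In this sketch the features (retry count, runtime deviation, error rate in logs) and the labels are synthetic stand-ins for real pipeline run history and recorded outcomes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 5_000

X = np.column_stack([
    rng.poisson(0.3, n),        # retries in the last hour
    rng.normal(0.0, 1.0, n),    # runtime z-score vs. history
    rng.exponential(0.05, n),   # error lines per 1k log lines
])
# Synthetic rule: more retries, slower runs, noisier logs -> failure.
risk = 0.8 * X[:, 0] + 0.6 * np.clip(X[:, 1], 0, None) + 5.0 * X[:, 2]
y = (risk + rng.normal(0, 0.5, n) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")

# Probability of failure for the next run; alert above a threshold.
p_fail = clf.predict_proba([[2, 1.8, 0.12]])[0, 1]
print(f"predicted failure risk: {p_fail:.2f}")
```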
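
Finally, the status dashboard: a minimal Flask app with one route rendering a green/red freshness table. The product names, refresh times, and six-hour SLA are hard-coded placeholders; a real version inside Databricks would query pipeline metadata instead.

```python
from datetime import datetime, timedelta, timezone
from flask import Flask

app = Flask(__name__)

# Hypothetical data products and their last successful refresh times.
PRODUCTS = {
    "customer_dim": datetime.now(timezone.utc) - timedelta(minutes=20),
    "payroll_fact": datetime.now(timezone.utc) - timedelta(hours=7),
}
FRESHNESS_SLA = timedelta(hours=6)

@app.route("/")
def status():
    now = datetime.now(timezone.utc)
    rows = []
    for name, refreshed in PRODUCTS.items():
        healthy = now - refreshed <= FRESHNESS_SLA
        colour = "green" if healthy else "red"
        rows.append(
            f"<tr><td>{name}</td>"
            f"<td style='color:{colour}'>{'OK' if healthy else 'STALE'}</td>"
            f"<td>{refreshed:%Y-%m-%d %H:%M} UTC</td></tr>"
        )
    return ("<table><tr><th>Product</th><th>Status</th>"
            "<th>Last refresh</th></tr>" + "".join(rows) + "</table>")

if __name__ == "__main__":
    app.run(port=8080)
```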

Preferred Qualifications

  • Familiarity with cloud platforms (e.g., AWS services such as S3 and EC2)
  • Experience with distributed data processing frameworks (e.g., Spark)
  • Familiarity with large language models (LLMs) and generative AI

Benefits

Competitive compensation and benefits
