Visual Representation Of Technical Diagrams Intern
Bosch
Summary
Join Bosch and shape the future of technology! This 6-month internship focuses on researching optimal methods for processing and representing technical diagrams for integration with Large Language Models (LLMs). You will explore challenges in multi-modal LLMs, propose novel approaches for diagram comprehension, and design a pipeline for creating a synthetic dataset linking diagrams to textual descriptions. Rigorous testing on real-world diagrams and comparative analysis of state-of-the-art models are key components. You will also investigate the potential of an abstract markup syntax for universal representation of technical diagrams. The internship offers flexible work arrangements, allowing for remote work from Germany or on-site in Abstatt.
Requirements
- Master's studies in Computer Science, Data Science, or a comparable field
- Proficiency in Python
- Current enrollment at a university
Responsibilities
- Research optimal methods for processing and representing technical diagrams available solely as graphics for integration with Large Language Models (LLMs)
- Explore the challenges and limitations of current multi-modal LLMs when handling visual data and propose novel approaches for diagram comprehension
- Design and implement a pipeline for creating a synthetic dataset that links technical diagrams to textual descriptions in both directions (diagram-to-text and text-to-diagram)
- Investigate how variations in synthetic dataset quality and structure influence model training and performance
- Conduct rigorous testing of the trained model on real-world technical diagrams from diverse domains, such as engineering, biology or software development
- Develop metrics to assess the model's ability to interpret, explain, and generate meaningful outputs based on diagram inputs
- Perform a comparative analysis of state-of-the-art multi-modal models to determine their strengths and limitations in handling technical diagrams, and identify key factors (e.g., model architecture, training data) that influence performance on diagram-related tasks
- Investigate the potential of an abstract markup syntax to serve as a universal intermediate representation for technical diagrams, and evaluate how such a syntax could improve model interpretability, training efficiency, and generalization across domains
- Prototype and test possible syntaxes, assessing their practicality and alignment with existing diagramming standards
Preferred Qualifications
- Some prior experience in building RAG systems or chatbots
- A self-driven, proactive and solution-oriented individual with an independent work ethic
- Strong passion for developing the best possible solutions
- Fluent in English
Benefits
Flexible work arrangements, allowing for remote work from Germany or on-site in Abstatt