Visual Representation Of Technical Diagrams Intern

Bosch

πŸ“Remote - Germany

Summary

Join Bosch and shape the future of technology! This 6-month internship focuses on researching optimal methods for processing and representing technical diagrams for integration with Large Language Models (LLMs). You will explore challenges in multi-modal LLMs, propose novel approaches for diagram comprehension, and design a pipeline for creating a synthetic dataset linking diagrams to textual descriptions. Rigorous testing on real-world diagrams and comparative analysis of state-of-the-art models are key components. You will also investigate the potential of an abstract markup syntax for universal representation of technical diagrams. The internship offers flexible work arrangements, allowing for remote work from Germany or on-site in Abstatt.

Requirements

  • Master's studies in Computer Science, Data Science, or a comparable field
  • Proficiency in Python
  • Current enrollment at a university

Responsibilities

  • Research optimal methods for processing and representing technical diagrams available solely as graphics for integration with Large Language Models (LLMs)
  • Explore the challenges and limitations of current multi-modal LLMs when handling visual data and propose novel approaches for diagram comprehension
  • Design and implement a pipeline for creating a synthetic dataset that links technical diagrams to textual descriptions in both directions (diagram-to-text and text-to-diagram)
  • Investigate how variations in synthetic dataset quality and structure influence model training and performance
  • Conduct rigorous testing of the trained model on real-world technical diagrams from diverse domains, such as engineering, biology or software development
  • Develop metrics to assess the model's ability to interpret, explain, and generate meaningful outputs based on diagram inputs
  • Perform a comparative analysis of state-of-the-art multi-modal models to determine their strengths and limitations in handling technical diagrams, and identify key factors (e.g., model architecture, training data) that influence performance on diagram-related tasks
  • Investigate the potential of an abstract markup syntax to serve as a universal intermediate representation for technical diagrams, and evaluate how such a syntax could improve model interpretability, training efficiency, and generalization across domains
  • Prototype and test possible syntaxes, assessing their practicality and alignment with existing diagramming standards

Preferred Qualifications

  • Some prior experience building RAG systems or chatbots
  • A self-driven, proactive and solution-oriented individual with an independent work ethic
  • Strong passion for developing the best possible solutions
  • Fluent in English

Benefits

Flexible work arrangements, allowing for remote work from Germany or on-site in Abstatt
