Master Thesis in Visual Representation of Technical Diagrams for LLMs
Bosch
Remote - Germany
Summary
Join Bosch and shape the future by contributing to a master's thesis focused on processing and representing technical diagrams for integration with Large Language Models (LLMs). You will research optimal methods for diagram comprehension, design a pipeline for creating a synthetic dataset linking diagrams to textual descriptions, and rigorously test the trained model on real-world diagrams. The project involves comparative analysis of state-of-the-art multi-modal models and investigating the potential of an abstract markup syntax for technical diagrams. The thesis is a 6-month position, with flexible work options available in Germany. Enrollment at a university is required.
Requirements
- Master's studies in Computer Science, Data Science, or a comparable field
- Proficiency in Python
- A self-driven, proactive and solution-oriented individual with an independent work ethic
- Strong passion for developing the best possible solutions
- Fluent in English
- Enrollment at university
Responsibilities
- Research optimal methods for processing and representing technical diagrams available solely as graphics for integration with Large Language Models (LLMs)
- Explore the challenges and limitations of current multi-modal LLMs when handling visual data and propose novel approaches for diagram comprehension
- Design and implement a pipeline for creating a synthetic dataset that links technical diagrams to textual descriptions in both directions (diagram-to-text and text-to-diagram)
- Investigate how variations in synthetic dataset quality and structure influence model training and performance
- Conduct rigorous testing of the trained model on real-world technical diagrams from diverse domains, such as engineering, biology or software development
- Develop metrics to assess the model's ability to interpret, explain, and generate meaningful outputs based on diagram inputs
- Perform a comparative analysis of state-of-the-art multi-modal models to determine their strengths and limitations in handling technical diagrams, and identify the key factors (e.g., model architecture, training data) that influence performance on diagram-related tasks
- Investigate the potential of an abstract markup syntax to serve as a universal intermediate representation for technical diagrams, and evaluate how such a syntax could improve model interpretability, training efficiency, and generalization across domains
- Prototype and test possible syntaxes, assessing their practicality and alignment with existing diagramming standards
Preferred Qualifications
- Some prior experience in building RAG systems or chatbots
Benefits
- Flexible work from home in Germany
- Work at the Bosch location in Abstatt