
Thesis (f/m/x): Text-Object Relation Extraction in Engineering Drawings Using LLMs
Job Description
Req ID:  4254
Place of work:  Sankt Augustin
Starting date:  immediately
Career level:  Student research project and final thesis
Type of employment:  Part time
Duration of contract:  limited to 6 months

Remuneration: Remuneration is in accordance with the Collective Agreement for the Public Sector - Federal Government (TVöD-Bund)

Enter the fascinating world of the German Aerospace Center (Deutsches Zentrum für Luft- und Raumfahrt e. V.; DLR) and help shape the future through research and innovation! We offer an exciting and inspiring working environment driven by the expertise and curiosity of our 12,000 employees from 100 nations and our unique infrastructure. Together, we develop sustainable technologies and thus contribute to finding solutions to global challenges. Would you like to join us in addressing this major future challenge? Then this is your place!

The Institute for the Protection of Terrestrial Infrastructures is dedicated to enhancing the resilience of critical infrastructure. Digital twins enable monitoring of and response to crises and attacks, as well as analysis and optimization of the resilience of these infrastructures.

 

What to expect

The Institute for the Protection of Terrestrial Infrastructures aims to ensure a stable supply for society. Digital twins enable monitoring of and response to crises and attacks, as well as analysis and optimization of the resilience of terrestrial infrastructures. A major bottleneck in building such twins is the conversion of engineering documentation, especially Piping & Instrumentation Diagrams (P&IDs) and electrical schematics, into structured, machine-readable data. In this project, you will research, prototype and evaluate methods that leverage textual annotations (labels, notes, part numbers) by combining large language model (LLM) features with the object detectors used to predict graphs.

 

Your tasks

  • implement state-of-the-art OCR on P&IDs and electrical drawings and build a curated (synthetic + real) dataset for training and evaluation.
  • create a semi-automatic annotation tool to tag text strings and their associated graphical symbols, producing ground-truth relation tables (text ↔ object).
  • compare and evaluate several approaches to fusing the text features with symbol detection:
    • design rule-based methods that use geometric proximity, alignment and domain-specific cues (e.g., “text placed above a symbol usually describes it”).
    • extract semantic embeddings from the detected text with a pre-trained LLM and combine them with CNN features of the symbols.
    • build a classifier that receives text features + object features as input and predicts a binary “related / not related” output.
  • document, scientifically analyze, and present the results of the investigation.
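
To give a feel for the rule-based baseline mentioned above, here is a minimal sketch in Python. It is purely illustrative and not a prescribed implementation: the `Box` class, the function names, the 0.8 discount for text placed above a symbol, and the 100-pixel distance threshold are all assumptions made for this example, and the bounding boxes are invented.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in image coordinates (y grows downward)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def relation_score(text, symbol, above_discount=0.8):
    """Lower is better: center distance, discounted if the text sits above the symbol."""
    tx, ty = text.center
    sx, sy = symbol.center
    distance = hypot(tx - sx, ty - sy)
    overlaps_horizontally = text.x_min < symbol.x_max and symbol.x_min < text.x_max
    is_above = text.y_max <= symbol.y_min
    if overlaps_horizontally and is_above:
        # Domain cue from the task list: text above a symbol usually describes it.
        distance *= above_discount
    return distance


def match_texts_to_symbols(texts, symbols, max_distance=100.0):
    """Return a relation table of (text label, symbol label) pairs."""
    relations = []
    for text in texts:
        if not symbols:
            break
        best = min(symbols, key=lambda symbol: relation_score(text, symbol))
        if relation_score(text, best) <= max_distance:
            relations.append((text.label, best.label))
    return relations


# Invented example boxes, standing in for detector and OCR output.
symbols = [Box("valve", 100, 100, 140, 140), Box("pump", 300, 100, 360, 160)]
texts = [
    Box("V-101", 95, 80, 145, 95),
    Box("P-201", 305, 70, 355, 90),
    Box("NOTE 3", 600, 600, 660, 615),
]
relations = match_texts_to_symbols(texts, symbols)
print(relations)
```

In this toy layout the two tag strings are paired with the symbols directly beneath them, while the distant note stays unassigned. A baseline of this kind could serve as a reference point when evaluating the learned text and symbol feature fusion.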

 

Your profile

  • ongoing studies in Computer Science, Mechatronics, Electrical Engineering, or a related field
  • willingness to work on complex topics
  • basic knowledge in the field of Machine Learning
  • good knowledge of programming with Python
  • ability to work independently and good communication and team skills
  • proficiency in English (written and spoken) for documentation and presentations


Desired qualifications

  • experience with computer vision and deep learning libraries such as OpenCV, PyTorch or TensorFlow
  • knowledge of LLMs
  • experience in scientific writing or presenting research results

We offer

DLR stands for diversity, appreciation and equality for all people. We promote independent work and the individual development of our employees both personally and professionally. To this end, we offer numerous training and development opportunities. Equal opportunities are of particular importance to us, which is why we want to increase the proportion of women in science and management in particular. Applicants with severe disabilities will be given preference if they are qualified.

We look forward to getting to know you!

 

If you have any questions about this position (Vacancy-ID 4254), please contact:

 

Tobias Koch 
Tel.: +49 2241 20148 55