Uses graph theory and NLP approaches to train a classifier for assigning text documents to relevant classes.
Classification is based on distances between concepts in a concept network.
Has the potential to generalise to unseen classes with unseen concepts, since the classifier is trained on conceptual distances rather than on the classes themselves.
Conclusion
Core idea: linking structured data models with unstructured textual data.
Concept network is an intermediary between data schema (structured layer) and unstructured sources (which are themselves represented as an unstructured concept network).
Classifier uses the proposed layered network structure to calculate concept distances for text documents, then uses those distances to determine whether a document and a class are related.
Classifier assigns a text record to a class if the concepts in the record are semantically close enough to the concepts of the class in the concept networks.
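A minimal sketch of the "close enough" idea, not the paper's implementation. It assumes a toy undirected concept network, measures distance as shortest-path hops, and scores a record by the mean distance from each record concept to its nearest class concept. The network contents, the function names (`concept_distance`, `mean_min_distance`, `is_related`) and the `max_distance` cut-off are all hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical toy concept network (undirected adjacency lists).
CONCEPT_NETWORK = {
    "pump": ["leak", "motor"],
    "leak": ["pump", "seal"],
    "seal": ["leak"],
    "motor": ["pump", "bearing"],
    "bearing": ["motor"],
    "invoice": ["payment"],
    "payment": ["invoice"],
}


def concept_distance(graph, source, target):
    """Shortest-path length in hops between two concepts, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:  # breadth-first search
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def mean_min_distance(graph, record_concepts, class_concepts):
    """Mean, over record concepts, of the distance to the nearest class concept."""
    nearest = []
    for rc in record_concepts:
        dists = [concept_distance(graph, rc, cc) for cc in class_concepts]
        dists = [d for d in dists if d is not None]
        if dists:
            nearest.append(min(dists))
    if not nearest:
        return None  # no record concept can reach any class concept
    return sum(nearest) / len(nearest)


def is_related(graph, record_concepts, class_concepts, max_distance=1.5):
    """Assumed decision rule: related if the mean nearest distance is small enough."""
    score = mean_min_distance(graph, record_concepts, class_concepts)
    return score is not None and score <= max_distance
```

For example, `is_related(CONCEPT_NETWORK, ["leak", "bearing"], ["pump", "seal"])` compares a record mentioning a leak and a bearing against a class described by pump and seal concepts. A fixed cut-off is the simplest stand-in here; a trained classifier would learn the decision boundary from the distances instead.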
The paper tests its approach on a dataset of work order records.
Classifier targets relatedness (whether a record and a class are related), not necessarily the class label itself.
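One way to picture a relatedness target, sketched under assumptions: each training example is a (record, class) pair whose feature is a concept distance and whose label says whether the pair is related. Because the model only ever sees distances, the same rule can be applied to a class that was absent from training. The records, classes, distance values and the single-threshold "classifier" below are hypothetical stand-ins, not the paper's data or model.

```python
# Hypothetical precomputed concept distances for (record id, class name) pairs.
DISTANCES = {
    ("r1", "pumps"): 1.0,
    ("r1", "electrical"): 3.0,
    ("r2", "pumps"): 3.5,
    ("r2", "electrical"): 1.5,
    ("r3", "hvac"): 1.2,  # "hvac" never appears in the training classes
}

RECORDS = [
    {"id": "r1", "true_class": "pumps"},
    {"id": "r2", "true_class": "electrical"},
]
CLASSES = ["pumps", "electrical"]


def build_pairs(records, classes, distances):
    """One example per (record, class) pair: (distance feature, related label)."""
    pairs = []
    for record in records:
        for class_name in classes:
            feature = distances[(record["id"], class_name)]
            label = 1 if class_name == record["true_class"] else 0
            pairs.append((feature, label))
    return pairs


def fit_threshold(pairs):
    """Pick the cut-off maximising training accuracy (related if distance <= cut-off)."""
    best_cut, best_acc = None, -1.0
    for cut in sorted({feature for feature, _ in pairs}):
        correct = sum((feature <= cut) == bool(label) for feature, label in pairs)
        acc = correct / len(pairs)
        if acc > best_acc:  # strict '>' keeps the smallest cut-off on ties
            best_cut, best_acc = cut, acc
    return best_cut


def predict_related(distance, cut):
    """Binary relatedness decision for a single (record, class) distance."""
    return distance <= cut


cut = fit_threshold(build_pairs(RECORDS, CLASSES, DISTANCES))
unseen_class_related = predict_related(DISTANCES[("r3", "hvac")], cut)
```

On this toy data the fitted cut-off is 1.5, so the pair (`r3`, `hvac`) is judged related even though no `hvac` example was used in training. That is the property the relatedness framing is meant to provide.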
Concepts are extracted from the data rather than defined by the user.
This means we can derive context from the training set instead of defining it ourselves.