Abstract:
Deep Learning (DL) methods have been extensively used in many Natural Language Processing (NLP) tasks, including semantic relation extraction. However, the performance of these methods depends on the type and quality of the information used as features. In NLP, linguistic information such as pre-trained word embeddings, part-of-speech (POS) tags, and synonyms is increasingly used to improve the performance of DL algorithms, and several state-of-the-art relation extraction algorithms now rely on it. However, no effort has been made to understand exactly what impact linguistic information from different levels of abstraction (morphological, syntactic, semantic) has on these algorithms in a semantic relation extraction task, which we believe may bring insights into how deep learning algorithms generalize language constructs compared to how humans process language. To do this, we performed several experiments using a recurrent neural network (RNN) and analyzed how linguistic information (part-of-speech tags, dependency tags, hypernyms, frames, verb classes) and different word embeddings (tokenizer, word2vec, GloVe, and BERT) affect model performance. From our results, we observed that the different word embedding techniques did not produce significant differences in performance. Regarding the linguistic information, hypernyms did improve model performance, but the improvement was small, so it may not be cost-effective to use a semantic resource to achieve this degree of improvement. Overall, our model performed remarkably well compared to existing models from the literature, given the simplicity of the deep learning architecture used, and in some experiments it outperformed several models presented in the literature.
We conclude that this analysis provides a better understanding of whether deep learning algorithms require linguistic information across distinct levels of abstraction to achieve human-like performance in a semantic task.