Annotating non-canonical texts: an exploratory study of Grande sertão: veredas using Universal Dependencies
Publisher
Universidade Federal de Minas Gerais
Type
Conference paper
Abstract
This paper reports on an exploratory study of a sample of 175 sentences from the renowned Brazilian novel Grande Sertão: Veredas [Portuguese for Great Backlands: Paths; English translation: The Devil to Pay in the Backlands], which were annotated for part of speech (POS) and syntactic relations following the Universal Dependencies guidelines. The study aimed to explore the feasibility of annotating non-canonical text to create treebanks for Brazilian Portuguese. We computed the accuracy and precision of the model to identify which categories were annotated more and less successfully. The results show that the model performed slightly better for POS than for dependency relations, and that the categories requiring the most manual revision are those related to the orality phenomena that Guimarães Rosa represents in his novel. The study shows the potential of annotating non-canonical text to enhance existing models with categories that are underrepresented in the treebanks.
Subject
Natural language processing (Computer science), Corpus linguistics
Keywords
Universal Dependencies, Non-canonical text, Brazilian Portuguese
External link
https://aclanthology.org/2022.udfestbr-1.1.pdf