Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/ESBF-8ZKMCP
Full metadata record
DC Field | Value | Language
dc.contributor.advisor1 | Alberto Henrique Frade Laender | pt_BR
dc.contributor.advisor-co1 | Adriano Alonso Veloso | pt_BR
dc.contributor.referee1 | Adriano Alonso Veloso | pt_BR
dc.contributor.referee2 | Gisele Lobo Pappa | pt_BR
dc.contributor.referee3 | Renato Martins Assuncao | pt_BR
dc.contributor.referee4 | Luiz Enrique Zarate | pt_BR
dc.creator | Diego Marinho de Oliveira | pt_BR
dc.date.accessioned | 2019-08-09T20:06:57Z | -
dc.date.available | 2019-08-09T20:06:57Z | -
dc.date.issued | 2012-10-26 | pt_BR
dc.identifier.uri | http://hdl.handle.net/1843/ESBF-8ZKMCP | -
dc.description.abstract | The task of named entity recognition is to locate and classify elements in unstructured text using natural language processing techniques appropriate to the application domain. In the Web context, this task is critical for identifying entities such as people, organizations, and places, among others. Recently, microblogs such as Twitter and Tumblr have become a phenomenon on the Web, posing a new challenge for entity recognition. On Twitter, for example, a large volume of messages is exchanged in a short time, which complicates both this task and the extraction of information about a particular subject. Moreover, the Twitter environment is highly dynamic and driven by data streams, and thus requires tools and methods suited to its characteristics. There are, however, few works in the literature that address this issue, leaving a wide area of research open for named entity recognition in this environment. This master's thesis therefore proposes an alternative approach to this task called FS-NER (Filter Stream Named Entity Recognition). FS-NER is based on filters applied independently and quickly, making it highly scalable and well suited to named entity recognition in the Twitter environment. To evaluate the effectiveness of the proposed approach, we carried out an exhaustive set of experiments using Twitter messages. In these experiments, we used three distinct collections: one containing messages in English, one in Portuguese, and a third in several languages. The results show that, despite the simplicity of the filters used, the proposed approach outperformed the alternative approach based on Conditional Random Fields, with a mean improvement of 3% in the F1 metric. Moreover, the approach is orders of magnitude faster and therefore better suited to the data stream paradigm typical of Twitter. | pt_BR
dc.description.resumo | The task of entity recognition consists of locating and classifying elements in unstructured text using natural language processing techniques appropriate to the application domain. Recently, microblogs such as Twitter have become a phenomenon on the Web, posing a new challenge for entity recognition. This work therefore proposes an alternative approach called FS-NER (Filter Stream Named Entity Recognition), which is based on filters applied independently and quickly, and is highly scalable and suited to entity recognition in the Twitter environment. The results show that, despite the simplicity of the filters used, the FS-NER approach outperformed the alternatives based on Conditional Random Fields, with a mean improvement of 3% in the F1 metric. In addition, this approach is an order of magnitude faster and therefore better suited to Twitter's data stream paradigm. | pt_BR
dc.language | Portuguese | pt_BR
dc.publisher | Universidade Federal de Minas Gerais | pt_BR
dc.publisher.initials | UFMG | pt_BR
dc.rights | Open Access | pt_BR
dc.subject | Social networks | pt_BR
dc.subject | Microblogs | pt_BR
dc.subject | Twitter | pt_BR
dc.subject | Conditional random fields | pt_BR
dc.subject | Entity recognition | pt_BR
dc.subject.other | Computing | pt_BR
dc.subject.other | Online social networks | pt_BR
dc.subject.other | Twitter | pt_BR
dc.title | Uma abordagem baseada em fluxo de filtros para o reconhecimento de entidades em mensagens do twitter | pt_BR
dc.type | Master's Dissertation | pt_BR
Appears in Collections: Dissertações de Mestrado

Files in This Item:
File | Description | Size | Format
disserta__o___diegomoliveira.pdf | | 1.97 MB | Adobe PDF

