On the fractal patterns of language structures

Leonardo Costa Ribeiro; Américo Tristão Bernardes; Heliana Ribeiro de Mello

doi:https://doi.org/10.1371/journal.pone.0285630

Files

On the fractal patterns of language structures.pdf (1.85 MB)

Date

2023-05-18

Author(s)

Leonardo Costa Ribeiro

Américo Tristão Bernardes

Heliana Ribeiro de Mello

Publisher

Universidade Federal de Minas Gerais

Type

Journal article

Abstract

Natural Language Processing (NLP) uses Artificial Intelligence algorithms to extract meaningful information from unstructured texts, i.e., content that lacks metadata and cannot easily be indexed or mapped onto standard database fields. Its applications range from sentiment analysis and text summarization to automatic language translation. In this work, we use NLP to identify similar structural linguistic patterns across several languages. We apply the word2vec algorithm, which represents words as vectors in a multidimensional space that preserves the meaning relationships between them. From a large corpus, we built this vector representation in a 100-dimensional space for English, Portuguese, German, Spanish, Russian, French, Chinese, Japanese, Korean, Italian, Arabic, Hebrew, Basque, Dutch, Swedish, Finnish, and Estonian. We then calculated the fractal dimensions of the structure that represents each language. The structures are multifractals with two different dimensions, which we use, together with the token-dictionary size rate of each language, to place the languages in a three-dimensional space. Finally, analyzing the distances between languages in this space, we conclude that closeness there tends to be related to distance in the phylogenetic tree that depicts the lines of evolutionary descent of the languages from a common ancestor.
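
The abstract does not say which estimator was used for the fractal dimensions, so the following is only a minimal sketch under stated assumptions. It swaps in the correlation dimension (the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r) as one common way to estimate the dimension of a point cloud such as a set of word vectors. The function names `token_dictionary_ratio` and `correlation_dimension` are hypothetical, the reading of "token-dictionary size rate" as tokens per distinct word form is an assumption, and the toy points stand in for real word2vec vectors.

```python
import math
from itertools import combinations


def token_dictionary_ratio(tokens):
    """Tokens per distinct word form.

    Assumption: this is one plausible reading of the abstract's
    'token-dictionary size rate'; the paper may define it differently.
    """
    return len(tokens) / len(set(tokens))


def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r.

    C(r) is the fraction of point pairs whose Euclidean distance is
    below r. This is an illustrative estimator, not necessarily the
    one used by the authors.
    """
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    xs, ys = [], []
    for r in radii:
        c = sum(d < r for d in dists) / len(dists)
        if c > 0:  # skip radii with no pairs, since log(0) is undefined
            xs.append(math.log(r))
            ys.append(math.log(c))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Toy check: evenly spaced points on a straight line embedded in 3-D.
# A line is one-dimensional, so the estimate should come out close to 1.
line = [(i / 399, 0.0, 0.0) for i in range(400)]
radii = [0.02 * 1.5 ** k for k in range(5)]
dim = correlation_dimension(line, radii)

# Toy token list standing in for a tokenized corpus.
ratio = token_dictionary_ratio("the cat saw the dog and the bird".split())
```

With real data, `points` would be the 100-dimensional word vectors of one language. A multifractal structure like the one the abstract describes would show up as different slopes over different ranges of `r`, so a single global fit like this one would not be enough on its own.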

Subject

Structural linguistics

URI

https://hdl.handle.net/1843/82153

Department

FALE - FACULDADE DE LETRAS
FCE - DEPARTAMENTO DE CIÊNCIAS ECONÔMICAS

Collections

Journal Article
