A fuzzy data reduction cluster method based on boundary information for large datasets

dc.creatorGustavo Rodrigues Lacerda Silva
dc.creatorPaulo Neto
dc.creatorLuiz Torres
dc.creatorAntônio Braga
dc.date.accessioned2025-05-13T14:59:17Z
dc.date.accessioned2025-09-09T00:44:57Z
dc.date.available2025-05-13T14:59:17Z
dc.date.issued2019
dc.identifier.doihttps://doi.org/10.1007/s00521-019-04049-4
dc.identifier.issn1433-3058
dc.identifier.urihttps://hdl.handle.net/1843/82228
dc.languageeng
dc.publisherUniversidade Federal de Minas Gerais
dc.relation.ispartofNeural computing and applications
dc.rightsRestricted Access
dc.subjectComputing
dc.subject.otherData reduction techniques can be considered a useful strategy to handle the heterogeneity and massiveness of big datasets by reducing the high data volume into a manageable size. One way to use data reduction in big datasets is to apply sampling approaches. Usually, these methods extract some piece of information from big datasets without resorting to high-performance computing.
dc.titleA fuzzy data reduction cluster method based on boundary information for large datasets
dc.typeJournal article
local.citation.epage10
local.citation.spage1
local.citation.volume31
local.description.resumoThe fuzzy c-means (FCM) algorithm computes the membership degree of each data point with respect to the cluster centers. This computation requires the distance matrix between the cluster centers and the data points. The main bottleneck of the FCM algorithm is computing the membership matrix for all data points. This work presents a new clustering method, bdrFCM (boundary data reduction fuzzy c-means). Our algorithm is based on the original FCM proposal, adapted to detect and remove the boundary regions of clusters. Our implementation efforts target two aspects: processing large datasets in less time and reducing the data volume while maintaining the quality of the clusters. A significant volume of real application data (> 10^6 records) was used, and we found that the bdrFCM implementation scales well to datasets with millions of data points.
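The membership computation described in the abstract can be illustrated with a minimal sketch. The function `fcm_memberships` follows the standard FCM membership update; the function `drop_boundary_points` and its `threshold` parameter are hypothetical and only illustrate the general idea of discarding low-confidence (boundary) points. They are an assumption made here, not the bdrFCM rule from the paper, which this record does not describe.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update:
    u[i][j] = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)),
    where d_ij is the Euclidean distance from point i to center j."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        zero = [j for j, d in enumerate(dists) if d == 0.0]
        if zero:
            # The point coincides with one or more centers:
            # share full membership among those centers.
            row = [1.0 / len(zero) if j in zero else 0.0
                   for j in range(len(centers))]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                   for d_j in dists]
        memberships.append(row)
    return memberships

def drop_boundary_points(points, memberships, threshold=0.8):
    """Hypothetical reduction rule (an assumption, not the paper's method):
    keep only points whose highest membership reaches the threshold."""
    return [p for p, row in zip(points, memberships) if max(row) >= threshold]

# Toy data: the last point lies midway between the two centers.
points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.5, 0.5)]
centers = [(0.0, 0.0), (1.0, 1.0)]
u = fcm_memberships(points, centers)
reduced = drop_boundary_points(points, u)
```

In this toy example the point equidistant from both centers receives equal memberships and falls below the assumed threshold, so it is the one removed. The nested loop over points and centers also shows why the membership matrix dominates the cost on large datasets.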
local.publisher.countryBrazil
local.publisher.departmentENG - DEPARTAMENTO DE ENGENHARIA ELETRÔNICA
local.publisher.initialsUFMG
local.url.externahttps://link.springer.com/article/10.1007/s00521-019-04049-4
