CUDA-based parallelization of power iteration clustering for large datasets

dc.creator: Gustavo Rodrigues Lacerda Silva
dc.creator: Rafael Ribeiro de Medeiros
dc.creator: Brayan Rene Acevedo Jaimes
dc.creator: Carla Caldeira Takahashi
dc.creator: Douglas Alexandre Gomes Vieira
dc.creator: Antônio de Pádua Braga
dc.date.accessioned: 2025-04-04T13:43:35Z
dc.date.accessioned: 2025-09-09T00:16:28Z
dc.date.available: 2025-04-04T13:43:35Z
dc.date.issued: 2017
dc.identifier.doi: https://doi.org/10.1109/ACCESS.2017.2765380
dc.identifier.issn: 2169-3536
dc.identifier.uri: https://hdl.handle.net/1843/81294
dc.language: eng
dc.publisher: Universidade Federal de Minas Gerais
dc.relation.ispartof: IEEE Access
dc.rights: Open Access
dc.subject: Mathematical optimization
dc.subject: Database
dc.subject.other: Graphics processing units, Clustering algorithms, Kernel, Eigenvalues and eigenfunctions, Clustering methods, Instruction sets, Symmetric matrices
dc.subject.other: Scalable machine learning algorithms, GPU, power iteration clustering
dc.subject.other: Large Datasets, Parallelization, Clustering Algorithm, Real Applications, Graphics Processing Unit, Clustering Quality, Good Scalability, Clustering Method, Image Segmentation, Massive Data, Row Vector, Graphical User Interface, Intel Xeon, Aerial Images, GB Memory, Spectral Method, Order Of Complexity, Spectral Clustering, Affinity Matrix, Code Version, Graphics Processing Unit Memory, Shared Memory, Spectral Clustering Method, Parallel Implementation, Dominant Eigenvalue, Hardware Configuration, Set Of Kernels, Projection Matrix
dc.title: CUDA-based parallelization of power iteration clustering for large datasets
dc.type: Journal article
local.citation.epage: 1
local.citation.spage: 1
local.citation.volume: 5
local.description.resumo: This paper presents GPIC, a graphics processing unit (GPU) accelerated algorithm for power iteration clustering (PIC). The algorithm is based on the original PIC proposal, adapted to take advantage of the GPU architecture while maintaining the algorithm’s original properties. The proposed method was compared against the serial implementation and achieved a considerable speedup in tests with synthetic and real data sets. A large real-world data set (>10^7 records) was used, and we found that the GPIC implementation scales well to data sets with millions of data points. Our implementation efforts target two goals: processing large data sets in less time and maintaining the same quality of clustering results as the original PIC version.
local.publisher.country: Brazil
local.publisher.department: ENG - DEPARTMENT OF ELECTRONIC ENGINEERING
local.publisher.initials: UFMG
local.url.externa: https://ieeexplore.ieee.org/document/8078163
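
For readers unfamiliar with the serial power iteration clustering (PIC) baseline that the abstract refers to, the following is a minimal pure-Python sketch of the general technique, not the authors' GPIC code and not a GPU implementation. The Gaussian affinity, the `sigma` and `eps` defaults, the function names, and the largest-gap labeling of the one-dimensional embedding (used here as a simple stand-in for the k-means step usually applied in PIC) are all assumptions made for illustration. It also assumes every point has non-zero affinity to at least one other point.

```python
import math

def affinity_matrix(points, sigma=1.0):
    """Gaussian affinity with a zero diagonal (an assumed similarity choice)."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            A[i][j] = A[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return A

def pic_embedding(A, max_iter=50, eps=1e-5):
    """Truncated power iteration on the row-normalized affinity matrix."""
    degrees = [sum(row) for row in A]
    W = [[a / d for a in row] for row, d in zip(A, degrees)]
    total = sum(degrees)
    v = [d / total for d in degrees]  # degree-based starting vector
    prev_delta = None
    for _ in range(max_iter):
        new_v = [sum(w * x for w, x in zip(row, v)) for row in W]
        norm = sum(abs(x) for x in new_v)
        new_v = [x / norm for x in new_v]  # L1 normalization
        delta = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        # Stop early once the change per iteration has stabilized,
        # before the vector collapses to the trivial constant eigenvector.
        if prev_delta is not None and abs(prev_delta - delta) < eps:
            break
        prev_delta = delta
    return v

def label_by_gaps(v, k):
    """Split the sorted 1-D embedding at its k-1 largest gaps (k-means stand-in)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    positions = sorted(range(1, len(order)),
                       key=lambda p: v[order[p]] - v[order[p - 1]],
                       reverse=True)
    cuts = set(positions[:k - 1])
    labels = [0] * len(v)
    current = 0
    for pos, idx in enumerate(order):
        if pos in cuts:
            current += 1
        labels[idx] = current
    return labels

def pic_cluster(points, k, sigma=1.0):
    return label_by_gaps(pic_embedding(affinity_matrix(points, sigma)), k)

if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (4.9, 5.0)]
    print(pic_cluster(pts, k=2))
```

The dense matrix-vector product inside the loop dominates the cost of this sketch, which is the kind of step a GPU implementation would be expected to parallelize; how GPIC actually organizes its kernels is described in the paper itself, not here.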

Files

Original bundle

Name: CUDA-based parallelization of power iteration clustering for large datasets.pdf
Size: 3.59 MB
Format: Adobe Portable Document Format

License bundle

Name: License.txt
Size: 1.99 KB
Format: Plain Text