Sincronização de threads em hardware SIMD

Teo Milanez Brandao

Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/ESBF-9GXNJJ

Full metadata record

DC Field	Value	Language
dc.contributor.advisor1	Fernando Magno Quintao Pereira	pt_BR
dc.contributor.advisor-co1	Renato Antonio Celso Ferreira	pt_BR
dc.contributor.referee1	Sandro Rigo	pt_BR
dc.contributor.referee2	Omar Paranaiba Vilela Neto	pt_BR
dc.creator	Teo Milanez Brandao	pt_BR
dc.date.accessioned	2019-08-09T14:06:33Z	-
dc.date.available	2019-08-09T14:06:33Z	-
dc.date.issued	2013-08-23	pt_BR
dc.identifier.uri	http://hdl.handle.net/1843/ESBF-9GXNJJ	-
dc.description.abstract	Performance is constrained by power consumption in modern computer architectures. One way to reduce power consumption, and hence increase performance, is to eliminate redundant operations between assembly instructions. This redundancy elimination, however, is difficult, because it involves solving a costly on-line problem: the shortest common supersequence. Previous work has proposed many different heuristics to solve this problem at either the architecture level or the compiler level. The sheer number of algorithms and the vast search space make comparing them a herculean task. In this dissertation, we take on this task, providing the most extensive comparative analysis of these heuristics in the literature. We compare the heuristics along several dimensions, including the amount of thread-level or data-level parallelism that they deliver. Our results show that relatively simple heuristics, such as the so-called MinPcSp, can outperform very convoluted algorithms. From this comparison we draw insights to design, implement and test new heuristics for sharing redundant work between parallel threads. Our new algorithms improve on previous work in non-trivial ways. When testing these algorithms on industrial-strength benchmarks, we observed that some of them reduce the number of instructions to be processed by a factor of 3.	pt_BR
dc.description.resumo	Performance is limited by power consumption in computer architectures, and one way to reduce power consumption and increase performance is to eliminate redundant operations. This is difficult, however, because it involves solving a costly problem: the shortest common supersequence. Previous work has proposed different heuristics to solve the problem at the architecture or compiler level. In this dissertation, we carry out the most extensive comparative analysis of these heuristics seen in the literature. We compare the heuristics along several dimensions, including the amount of thread-level and data-level parallelism. Our results show that a simple heuristic such as MinPcSp can outperform complicated algorithms. We also implement new heuristics, which improve on previous work in non-trivial ways. When testing these algorithms on large benchmarks, we observed that some can reduce the number of instructions processed by a factor of 3.	pt_BR
dc.language	Portuguese	pt_BR
dc.publisher	Universidade Federal de Minas Gerais	pt_BR
dc.publisher.initials	UFMG	pt_BR
dc.rights	Open Access	pt_BR
dc.subject	Hardware resource sharing	pt_BR
dc.subject	Instruction redundancy	pt_BR
dc.subject	SIMD	pt_BR
dc.subject	Thread synchronization	pt_BR
dc.subject	Data-level parallelism	pt_BR
dc.subject.other	Computer architecture	pt_BR
dc.subject.other	Computing	pt_BR
dc.title	Sincronização de threads em hardware SIMD	pt_BR
dc.type	Master's Dissertation	pt_BR
Appears in Collections:	Dissertações de Mestrado
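
The abstract identifies the shortest common supersequence as the problem underlying redundancy elimination between threads. As a rough illustration only, the sketch below computes a shortest common supersequence of two instruction traces with the textbook offline dynamic program. This is not one of the heuristics studied in the dissertation (those must work on-line, without seeing the full traces in advance); the function name and the two example traces are hypothetical.

```python
def shortest_common_supersequence(a, b):
    """Return one shortest sequence that contains both a and b as subsequences."""
    n, m = len(a), len(b)
    # dp[i][j] = length of a shortest common supersequence of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                dp[i][j] = m - j          # only b remains
            elif j == m:
                dp[i][j] = n - i          # only a remains
            elif a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]   # one shared slot serves both
            else:
                dp[i][j] = 1 + min(dp[i + 1][j], dp[i][j + 1])

    # Walk the table forward to rebuild one optimal merged sequence.
    merged = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            merged.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] <= dp[i][j + 1]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


# Hypothetical opcode traces of two threads (illustrative, not from the dissertation).
thread_a = ["load", "add", "mul", "store"]
thread_b = ["load", "mul", "sub", "store"]
merged = shortest_common_supersequence(thread_a, thread_b)
print(merged, len(merged))
```

Run separately, the two traces occupy eight instruction slots; the merged sequence needs five, because the three instructions common to both traces are issued once for both threads.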

Files in This Item:

File	Description	Size	Format
teomilanez.pdf		1.08 MB	Adobe PDF