Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/BUOS-APSR9L
Type: Doctoral Thesis
Title: Network-based methods for analyzing the genetics of human complex diseases
Authors: Gilderlanio Santana de Araújo
First Advisor: Eduardo Martin Tarazona Santos
First Co-advisor: Maíra Ribeiro Rodrigues
Abstract: Interpreting the high volume of genomic data is a challenge for epidemiologists, anthropologists and geneticists who aim to understand the genetic basis of population and phenotype variation (diseases/traits). To this end, several computational methods and tools have been developed to extract knowledge and data patterns from public omic data for a diverse set of populations in genetic studies, and also to address transparency and reproducibility, two subjects of high importance in science. The genetic architecture of parental populations, such as African, European and Asian, has been studied to understand the process of population structure and the origins of genetic diseases. It is well known that allele frequency and allele differentiation based on genotype data reveal hallmarks of differential demographic history in worldwide populations; some alleles are risk variants that confer disease risk, and their frequencies may lead to varying susceptibility to complex diseases. In this context, this thesis makes two main contributions: the first is a network-based approach to integrate and visualize data from the NHGRI-EBI GWAS Catalog and the 1000 Genomes Project, and the second is a scientific workflow approach to disclose scientific knowledge. First, we present DANCE (Disease-Ancestry Network), a new web tool to improve the understanding of the genetic architecture of diseases from a cross-ethnic perspective. DANCE integrates, summarizes and visualizes molecular profiles of genetic disease associations in a network-based approach. It was implemented as a web-based tool to explore genetic associations and risk allele differentiation across global populations, supporting a broad set of population genetic analyses, such as GWAS replication and admixture mapping. Our networks are bipartite: nodes are either phenotypes (diseases/traits) or SNPs, and a phenotype is connected to a SNP if there is a known association in current GWAS studies.
In a graphical projection, the population variability of risk-allele frequencies is represented as a color gradient based on the pairwise FST values between populations, where higher values indicate SNPs that are highly differentiated between the two populations. In addition, this study presents the EPIGEN Scientific Workflow (EPIGEN-SW), which aims to improve transparency and reproducibility in genetic and epidemiological studies. EPIGEN-SW is implemented as a web tool and facilitates access to computational resources through an integrative and interactive approach based on flowcharts, master scripts and auxiliary scripts. Both approaches are implemented as web tools and made freely available to the scientific community. DANCE is available online at www.ldgh.com.br/dance and the EPIGEN-Brazil Scientific Workflow is available at www.ldgh.com.br/scientificworkflow.
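The bipartite phenotype-SNP network and the FST-based color gradient described in the abstract can be illustrated with a minimal Python sketch. This is not DANCE's implementation: the abstract does not say which FST estimator or color scale the tool uses, so the sketch assumes Wright's FST for a single biallelic SNP with equally weighted populations and a white-to-red gradient. The function names, trait labels, SNP labels and allele frequencies are all hypothetical placeholders.

```python
from collections import defaultdict


def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP, given the risk-allele
    frequency in two populations (assumed estimator, equal weights)."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # allele fixed or absent in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def fst_to_hex(fst):
    """Map FST in [0, 1] to a color from white (0) to red (1).
    The color scale is an assumption made for illustration."""
    fst = min(max(fst, 0.0), 1.0)
    level = round(255 * (1 - fst))
    return f"#ff{level:02x}{level:02x}"


def build_bipartite(associations):
    """Build an undirected bipartite graph as an adjacency map.
    Nodes are ("phenotype", name) or ("snp", name); edges only ever
    join a phenotype to a SNP."""
    graph = defaultdict(set)
    for phenotype, snp in associations:
        graph[("phenotype", phenotype)].add(("snp", snp))
        graph[("snp", snp)].add(("phenotype", phenotype))
    return graph


# Hypothetical associations and risk-allele frequencies (placeholders).
associations = [("trait_1", "snp_a"), ("trait_1", "snp_b"), ("trait_2", "snp_b")]
frequencies = {"snp_a": (0.2, 0.8), "snp_b": (0.5, 0.5)}

graph = build_bipartite(associations)
snp_colors = {snp: fst_to_hex(wright_fst(p1, p2))
              for snp, (p1, p2) in frequencies.items()}
```

In this sketch a SNP with identical frequencies in both populations is drawn white, and the color moves toward red as the two frequencies diverge.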
Subject: Bioinformatics
Language: Portuguese
Publisher: Universidade Federal de Minas Gerais
Publisher Initials: UFMG
Rights: Open Access
URI: http://hdl.handle.net/1843/BUOS-APSR9L
Issue Date: 27-Jun-2017
Appears in Collections: Doctoral Theses

Files in This Item:
File                          Description   Size       Format
gilderlaniosaraujo_tese.pdf                 18.94 MB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.