Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/ESBF-BALHJU
Type: Master's Dissertation
Title: A New Descriptor Sampling for Biological Activity Prediction (original title: Uma Nova Amostragem de Descritores para Predição de Atividade Biológica)
Authors: João Vitor Soares Tenório
First Advisor: Loïc Pascal Gilles Cerf
First Referee: João Paulo Ataide Martins
Second Referee: Raquel Cardoso de Melo
Third Referee: Renato Martins Assuncao
Abstract: Computer-aided drug design (CADD) uses predictive models to design and improve compounds that have biological activity and can be used as drugs. LQTA-QSAR is a CADD technique in which the descriptors used to train the predictive model are sampled by placing the conformational ensemble profiles (CEP) of the compounds in a 3D grid and computing the interaction between the CEP and a probe at the points of that grid. The problem with this sampling is that, when the probe passes through points inside the CEP, descriptors with unrealistic values are sampled. This dissertation proposes a new sampling that takes the shape of the CEP into account and prevents the probe from passing through points inside or too close to the CEP. Experiments were carried out on sets of compounds used as drugs for the treatment of various diseases. The proposal improved the precision of the predictive models in the six scenarios evaluated. The largest percentage increase obtained was 44%.
Abstract: Machine learning methods are being used to solve different problems in bioinformatics and chemometrics. One such problem is computer-aided drug design (CADD), which uses predictive modeling to design and improve compounds that have biological activity and can be used as drugs. One of the techniques used in CADD is the study of quantitative structure-activity relationships (QSAR), which makes it possible to develop a predictive model relating the properties of the compounds to their biological activities; this model is typically a linear regression. LQTA-QSAR is a 4D-QSAR technique in which the descriptors used for predictive model training are sampled by aligning the conformational ensemble profiles (CEP) of the compounds in a 3D grid and calculating the interaction between the CEP and a probe (which can be an atom, ion, or functional group) at each point of this grid. The problem with this sampling is that the probe crosses the CEP: when the probe falls into or close to an atom of the CEP, some descriptors present unrealistic values. To overcome this problem, this thesis proposes a new approach for sampling descriptors, which uses surface expansions defined by the convex hull to construct layers around the CEP through which the probe must pass. This sampling prevents the probe from passing through points inside or too close to the CEP. To validate the proposal, several experiments were carried out on sets of compounds that can be used as drugs for the treatment of several diseases. The results showed that the proposal was able to build predictive models with greater precision than the original method in the six scenarios evaluated. The highest percentage increase was 44%. We also proposed a workflow in which linear regression was replaced by a regression tree, which yields models that are easier to interpret. Experiments with this new workflow were also carried out in six scenarios; in one case the precision was superior to that of the linear models, and in the other cases it was lower but still satisfactory.
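The sampling problem described in the abstract can be illustrated with a small sketch. It is not the dissertation's method: the convex-hull layers are replaced here by a plain minimum-distance cutoff, the interaction is a generic 12-6 Lennard-Jones term, and the atom coordinates, parameters (`epsilon`, `sigma`, `min_dist`) and function names are all hypothetical. It only shows why grid points inside or very near the profile give extreme descriptor values, and how excluding them removes those values.

```python
import math

def lennard_jones(r, epsilon=0.1, sigma=3.4):
    """12-6 Lennard-Jones energy at distance r (illustrative parameters)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def probe_energy(point, atoms):
    """Sum of pairwise energies between a probe at `point` and every atom."""
    return sum(lennard_jones(math.dist(point, a)) for a in atoms)

def grid_points(lo, hi, step):
    """Regular 3D grid covering the cube [lo, hi]^3."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return [(x, y, z) for x in axis for y in axis for z in axis]

def filter_points(points, atoms, min_dist):
    """Keep only grid points at least `min_dist` away from every atom.

    Assumption: a distance cutoff stands in for the convex-hull layers
    described in the abstract.
    """
    return [p for p in points
            if all(math.dist(p, a) >= min_dist for a in atoms)]

# Hypothetical two-atom "profile", placed off the grid nodes so r is never 0.
atoms = [(0.2, 0.1, 0.0), (1.7, 0.1, 0.0)]
points = grid_points(-4.0, 4.0, 1.0)
kept = filter_points(points, atoms, min_dist=3.0)

worst_all = max(probe_energy(p, atoms) for p in points)
worst_kept = max(probe_energy(p, atoms) for p in kept)

print(len(points), len(kept))
print(worst_all, worst_kept)
```

On the full grid the largest energy is dominated by the node closest to an atom, while the filtered grid keeps all energies within a narrow range, which is the effect the proposed sampling aims for.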
Subject: Machine Learning
Bioinformatics
Computing
QSAR (Biochemistry)
Chemometrics
Language: Portuguese
Publisher: Universidade Federal de Minas Gerais
Publisher Initials: UFMG
Rights: Open Access
URI: http://hdl.handle.net/1843/ESBF-BALHJU
Issue Date: 26-Oct-2018
Appears in Collections: Master's Dissertations

Files in This Item:
File: joaovitorsoares_tenorio.pdf
Size: 4.32 MB
Format: Adobe PDF

