Identification and understanding of Kinase Activating Missense Mutations

dc.creator: Carlos Henrique Miranda Rodrigues
dc.date.accessioned: 2019-08-11T17:59:29Z
dc.date.accessioned: 2025-09-08T22:50:10Z
dc.date.available: 2019-08-11T17:59:29Z
dc.date.issued: 2017-07-25
dc.description.abstract: Protein phosphorylation and dephosphorylation play vital roles in a variety of cellular processes, and the balance between them must be closely regulated. Disturbances in this balance, through the introduction of dominant activating missense mutations in protein kinases, are known to be driver events in many cancers. Despite this, identifying potential activating mutations has proven difficult and has been limited to evolutionary and sequence-based comparisons with previously characterised mutations. This study aims to fill this gap by proposing a novel machine learning method, named Kinact, for predicting activating missense mutations in protein kinases. Experimental data on 384 point mutations in 42 different protein kinases were collected from the Kin-Driver, ClinVar and Ensembl databases. The resulting data set was then manually curated, and 258 mutations were mapped onto solved 3D structures from the Protein Data Bank. Each protein was classified into one group of the Kinase Classification, and a set of in silico analyses was performed on sequence and structure data. The most descriptive features were then used as input for training and testing supervised learning algorithms, and predictive classification models were generated that rely on sequence-level attributes alone, on structure-level attributes alone, and on both in combination. The best performing model was obtained when a combination of structural and sequence-based features was used as evidence during the learning task, achieving a precision of up to 90% and an Area Under the ROC Curve of 0.96 under 10-fold cross-validation, and a precision of 81% and an Area Under the ROC Curve of 0.89 on blind tests. We show that the best performing Kinact model significantly outperforms (p-value < 0.01) SIFT and PolyPhen-2, the gold-standard methods used by clinical geneticists, which achieved an Area Under the ROC Curve of 0.49 and 0.63, respectively, on the training data set and 0.67 and 0.53, respectively, on the blind test. Kinact also integrates high-performance open source web visualization tools to assist further research on how mutations affect protein kinase activity. The method is freely available as a user-friendly web server at <http://biosig.unimelb.edu.au/kinact/>.
dc.identifier.uri: https://hdl.handle.net/1843/BUOS-ARRG8V
dc.language: English
dc.publisher: Universidade Federal de Minas Gerais
dc.rights: Open Access
dc.subject: Bioinformatics
dc.subject.other: Kinase
dc.subject.other: Activating mutations
dc.subject.other: Machine learning
dc.subject.other: Bioinformatics
dc.title: Identification and understanding of Kinase Activating Missense Mutations
dc.type: Master's dissertation
local.contributor.advisor1: Douglas Eduardo Valente Pires
local.contributor.referee1: Lucas Bleicher
local.contributor.referee1: Gisele Lobo Pappa
local.contributor.referee1: Sandro Carvalho Izidoro
local.publisher.initials: UFMG

Files

Original bundle

Name: identification_and_understanding_of_kinase_carlos_rodrigues.pdf
Size: 9.23 MB
Format: Adobe Portable Document Format