Microarray-based breast cancer classification using logistic regression and beyond

Publisher

Universidade Federal de Minas Gerais

Description

Type

Doctoral thesis

Committee members

Vasco Ariston de Carvalho Azevedo
Jose Miguel Ortega
Carlos Ernesto Ferreira Starling
Braulio Roberto Gonçalves Marinho Couto

Abstract

Cancer is a major global health problem, with millions of new cases and millions of cancer-related deaths each year. Breast cancer is the most common cancer among women; most cases are reported in developed countries, but the majority of deaths occur in developing countries.

In this PhD project, a novel genome-wide model was developed to classify breast cancer samples. The proposed logistic regression-based model uses a stabilizing term that allows values to be assigned to the parameters, a feature that distinguishes it from other methods and circumvents the need for variable pruning. Applying this methodology to samples from the GSE65194, GSE20711 and GSE25055 data sets in NCBI's Gene Expression Omnibus (GEO), we obtained a minimum performance of 80% for both sensitivity and specificity.

Genes associated with parameters i* holding extreme values were searched in the literature for a relation with breast cancer. For some of them the literature holds no evidence of an association with breast cancer, but, based on the rationale followed during this PhD project, they were flagged for investigation as yet-undiscovered candidates with potential diagnostic and/or therapeutic utility in breast cancer. We examined the patterns and features of gene regulatory networks (GRNs) composed of transcription factors (TFs) in MCF-7 breast cancer cell lines to relate breast cancer to particular genes whose associated i* parameters take extreme positive values, and thereby to identify breast cancer prediction genes. The topological analysis of these networks, the direct correlation observed between some of the flagged genes and TFs relevant to breast cancer, and the S-score system, which many have used to confirm the tumour suppressor or oncogenic profile of genes in specific cancer types, allowed us to reveal potential breast cancer prediction genes that we suggest be prioritized for further clinical studies. These results establish the proof of concept for the proposed model.

A large number of oncolytic viruses have been proposed for cancer therapy, including Seneca Valley Virus. SEMA6A, a gene flagged by the new logistic regression model detailed in this thesis, encodes a cell receptor. Given that the cancer cell tropism of SVV-001 might be governed by binding to specific receptors on the surface of cancer cells, we hypothesize that this protein could be the door through which Seneca Valley Virus V001 enters breast cancer cells. The results obtained make the formation of the Semaphorin-6A V001 complex probable, indicating the oncolytic virus Seneca Valley Virus as a new therapeutic option to be considered and further studied for breast cancer treatment.
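
As a rough illustration of the kind of workflow the abstract describes, the sketch below fits a logistic regression with a penalty term on synthetic data that has far more features than samples, reports sensitivity and specificity on held-out samples, and ranks features by the magnitude of their fitted parameters. It is not the thesis's model: the abstract does not give the form of the stabilizing term, so an L2 (ridge) penalty is assumed here, and the data, the function names (fit_ridge_logistic, sensitivity_specificity) and every setting (lam, lr, n_iter, sample and gene counts) are hypothetical.

```python
import numpy as np


def fit_ridge_logistic(X, y, lam=1.0, lr=0.1, n_iter=500):
    """Gradient descent on mean log-loss + (lam / 2n) * ||w||^2.

    The L2 penalty is an ASSUMED stand-in for the thesis's stabilizing term.
    """
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(n_iter):
        z = np.clip(X @ w + b, -30.0, 30.0)  # clip to avoid overflow in exp
        prob = 1.0 / (1.0 + np.exp(-z))
        w -= lr * (X.T @ (prob - y) / n + lam * w / n)
        b -= lr * np.mean(prob - y)
    return w, b


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels in {0, 1}."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return float(tp / (tp + fn)), float(tn / (tn + fp))


# Synthetic stand-in for expression data: 60 samples, 200 "genes",
# of which only the first 5 differ between classes (hypothetical setup).
rng = np.random.default_rng(0)
n_samples, n_genes = 60, 200
y = (np.arange(n_samples) % 2).astype(float)
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :5] += 2.0

# Fit on the first 40 samples, evaluate on the remaining 20.
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]
w, b = fit_ridge_logistic(X_train, y_train, lam=1.0)

y_pred = ((X_test @ w + b) > 0).astype(float)
sens, spec = sensitivity_specificity(y_test, y_pred)
print("sensitivity:", sens, "specificity:", spec)

# Features whose parameters take the most extreme values.
top_genes = np.argsort(np.abs(w))[::-1][:5]
print("indices of the 5 largest |w|:", top_genes)
```

Because the penalty keeps every parameter finite even when features outnumber samples, no feature has to be discarded before fitting, which is one plausible reading of how a stabilizing term can avoid variable pruning.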

Subject

Bioinformatics

Keywords

Bioinformatics
