UNIVERSIDADE FEDERAL DE MINAS GERAIS
INSTITUTO DE CIÊNCIAS BIOLÓGICAS
LABORATÓRIO DE GENÉTICA CELULAR E MOLECULAR
GRADUATE PROGRAM IN BIOINFORMATICS

Doctoral Thesis

Validation of a method for predicting protein-protein interaction networks and its application in Corynebacterium pseudotuberculosis to identify essential proteins

BELO HORIZONTE
2015

Edson Luiz Folador

Thesis defense presented as a partial requirement for the degree of Doctor in Bioinformatics in the Graduate Program in Bioinformatics of the Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais.

Advisor: Prof. Dr. Vasco Ariston de Carvalho Azevedo
Co-advisor: Prof. Dr. Rafaela Salgado Ferreira

I dedicate this work chiefly to my parents who, having barely completed primary school, in all their wisdom always encouraged me to study; and, through them, I dedicate it to all the scientists who never finished secondary school because they lacked the means to leave their places of origin. I also dedicate it to my children Jiuliane and Eduardo and, through them, to all those who spent years far from the comfort and shelter of a family home in order to defend their dissertations and theses. I dedicate it to my wife Adriana and to our son Arthur, who are now my motivation to keep moving forward.

ACKNOWLEDGMENTS

First and foremost, I thank my advisor, Professor Vasco de Azevedo, not only for his guidance but above all for having believed in me and in my proposed work at a very particular moment, for supporting me, and for giving me the autonomy to carry out the proposed project. I will not forget the opportunity you gave me at a time when every other opportunity was being taken from me.
Likewise, I thank Professor Rafaela Salgado Ferreira for her biological and methodological support during the supervision. Without naming names, so as not to be unfair, I also thank all the members of the research groups of LGCM (UFMG) and LPDNA (UFPA) and the international collaborators, dry-lab and wet-lab alike, who directly or indirectly contributed in the most varied ways to the completion of this work. I also thank the entire technical and administrative staff of UFMG and UFPA for all the support provided.

"Imagination is more important than knowledge." Albert Einstein

Summary

Corynebacterium pseudotuberculosis (Cp) belongs to the CMNR group (Corynebacterium, Mycobacterium, Nocardia, Rhodococcus). It is a gram-positive, facultative intracellular pathogenic bacterium that has fimbriae but is non-motile, does not form capsules, and does not sporulate, and it occurs as the biovars ovis and equi. Biovar equi infects horses and cattle. Biovar ovis mainly infects sheep and goat herds and is the etiological agent of caseous lymphadenitis (CLA). Cp is prevalent in many countries, causing significant economic losses through poor carcass quality and reduced production of meat, wool, and milk. Methods for the diagnosis and treatment of CLA are still not sufficiently effective, because Cp responds poorly to therapy and is able to persist in the environment and in the host, so it is important to understand the biology of this pathogen at the systems level. In this respect, knowing the proteins and their interactions is fundamental to understanding the molecular mechanisms of the cell, and protein-protein interaction networks are a good tool for this kind of study. To generate the interaction network for Cp, we took care to validate a methodology for predicting interactions against publicly available experimental and curated data.
As a result, besides increasing network coverage, we obtained an area under the curve (AUC) between 0.93 and 0.96, with the 0.70 cutoff corresponding to a specificity of 0.95 and a sensitivity of 0.90. With the methodology validated, interaction networks were generated for nine strains of Cp biovar ovis; ~99% of the interactions were mapped from the genus Corynebacterium, and 15,495 interactions are conserved among the strains. Validation by shortest path and degree distribution suggests that the predicted networks have the characteristics of biological networks. Additionally, we compared the clustering coefficient, correlation, and R² values against randomly generated networks and submitted the generated networks to the Shapiro-Wilk normality test. All results showed that the predicted interaction networks do not have a random distribution, suggesting that they were not formed by spurious interactions and that there is a biological influence on their prediction. With the networks validated, we selected the top 15% of proteins with the highest number of interactions and identified 181 essential proteins. Only the DNA repair protein RecN had no homology in the Database of Essential Genes (DEG), and three others had homology in only one DEG organism: catalase (KatA), endonuclease III (Nth), and trigger factor (Tig), suggesting that they may be good targets for diagnosis or drug development.

Abstract

Corynebacterium pseudotuberculosis (Cp) belongs to the CMNR group (Corynebacterium, Mycobacterium, Nocardia, Rhodococcus). It is a gram-positive, facultative intracellular pathogenic bacterium that has fimbriae, is non-motile, does not form capsules, and does not sporulate, and it occurs as the biovars ovis and equi. Biovar equi infects horses and cattle. Biovar ovis mainly infects herds of sheep and goats and is the etiological agent of caseous lymphadenitis (CLA).
Cp is prevalent in many countries, causing significant economic losses due to poor carcass quality and decreased production of meat, wool, and milk. Methods for the diagnosis and treatment of CLA are not yet effective enough, because Cp shows a low therapeutic response and is able to persist in the environment, making it an important organism to study and understand at the systems level. In this regard, knowing the proteins and their interactions is crucial to understanding the molecular mechanisms of the cell, and protein-protein interaction networks are an important tool for this type of study. Aiming to generate the Cp interaction network, we took care to validate a methodology for predicting interactions against publicly available experimental and curated data. As a result, in addition to increasing the coverage of the network, we obtained an area under the curve (AUC) between 0.93 and 0.96, with the 0.70 cutoff giving a specificity of 0.95 and a sensitivity of 0.90. With the validated methodology, interaction networks were generated for nine Cp biovar ovis strains; ~99% of the interactions were mapped from the genus Corynebacterium, and 15,495 interactions are conserved among the strains. The shortest path and degree distribution analyses suggest that the predicted networks have the characteristics of biological networks. Additionally, we compared the values of the clustering coefficient, correlation, and R² against randomly generated networks and submitted the generated networks to the Shapiro-Wilk normality test. All results show that the predicted interaction networks do not have a random distribution, suggesting that the networks were not formed by spurious interactions and that there is a biological bias in their prediction. With the validated networks, we selected the top 15% of proteins with the most interactions and identified 181 essential proteins.
Only the DNA repair protein RecN had no homology in the Database of Essential Genes (DEG), and three others had homology in just one DEG organism: catalase (KatA), endonuclease III (Nth), and trigger factor (Tig), suggesting they may be good targets for diagnosis and drug development.

List of Figures

Figure 1 - Organisms from which the interactions were mapped
Figure 2 - Partial C. pseudotuberculosis DNA repair RecN interactions network
Figure 3 - Homology distribution of Cp essential proteins aligned against hosts
Figure 4 - Cp1002 shortest path analysis
Figure 5 - Cp267 shortest path analysis
Figure 6 - Cp3995 shortest path analysis
Figure 7 - Cp4202 shortest path analysis
Figure 8 - CpC231 shortest path analysis
Figure 9 - CpFRC shortest path analysis
Figure 10 - CpI19 shortest path analysis
Figure 11 - CpP54B96 shortest path analysis
Figure 12 - CpPAT10 shortest path analysis
Figure 13 - CpPAT10 degree distribution analysis
Figure 14 - Cp1002 degree distribution analysis
Figure 15 - Cp267 degree distribution analysis
Figure 16 - Cp3995 degree distribution analysis
Figure 17 - Cp4202 degree distribution analysis
Figure 18 - CpC231 degree distribution analysis
Figure 19 - CpFRC degree distribution analysis
Figure 20 - CpI19 degree distribution analysis
Figure 21 - CpP54B96 degree distribution analysis
Figure 22 - Random interaction network 01
Figure 23 - Random interaction network 02
Figure 24 - Random interaction network 03
Figure 25 - Random interaction network 04
Figure 26 - Random interaction network 05
Figure 27 - Random interaction network 06
Figure 28 - Random interaction network 07
Figure 29 - Random interaction network 08
Figure 30 - Random interaction network 09
Figure 31 - Network formed by the interaction of RNA polymerase and ribosomal proteins, represented by their encoding gene
Figure 32 - Network formed by the interaction of Opp proteins, represented by their encoding genes
Figure 33 - Network formed by the interaction of Cob proteins, represented by their encoding genes
Figure 34 - Network formed by the interaction of iron uptake proteins, represented by their encoding genes
Figure 35 - Network formed by the interaction of proteins involved in cell division and peptidoglycan biosynthesis, both represented by their encoding genes
Figure 36 - Cp267 PPI network
Figure 37 - Cp3995 PPI network
Figure 38 - Cp4202 PPI network
Figure 39 - CpC231 PPI network
Figure 40 - CpFRC PPI network
Figure 41 - CpI19 PPI network
Figure 42 - CpP54B96 PPI network
Figure 43 - CpPAT10 PPI network
Figure 44 - Cp1002 PPI network
Figure 45 - Partial interaction network of the proteins encoded by the phoPR genes

List of Tables

Table 1 - Overview of the public data sources
Table 2 - Number of proteins and interactions for each biovar ovis strain
Table 3 - Statistical comparison of the Cp ovis predicted networks against random networks

List of Abbreviations

AUC Area Under the Curve
BLAST Basic Local Alignment Search Tool
CAPES Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
CENAPAD Centro Nacional de Processamento de Alto Desempenho
CLA Caseous lymphadenitis
CMNR Corynebacterium, Mycobacterium, Nocardia, Rhodococcus
CNPq Conselho Nacional de Desenvolvimento Científico e Tecnológico
Cp Corynebacterium pseudotuberculosis
DEG Database of Essential Genes
DIP Database of Interacting Proteins
DNA Deoxyribonucleic acid
Fapemig Fundação de Amparo à Pesquisa do Estado de Minas Gerais
LGCM Laboratório de Genética Celular e Molecular
LPDNA Laboratório do Polimorfismo do DNA
pDB Public databases
PPI Protein-protein interaction
RNA Ribonucleic acid
ROC Receiver Operating Characteristic
STRING Search Tool for the Retrieval of Interacting Genes/Proteins
tRNA Transfer RNA
UFMG Universidade Federal de Minas Gerais
UFPA Universidade Federal do Pará

Contents

SUMMARY
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
PRESENTATION
COLLABORATORS
CONTEXT
THESIS STRUCTURE
1 - INTRODUCTION
1.1 - GENOMICS: APPLICATION TO A BACTERIAL PROTEIN-PROTEIN INTERACTION
1.1.1 - Structural Genomics
1.1.1.1 - Genome Sequencing
1.1.1.2 - Genome Assembly
1.1.1.3 - Genome Annotation (Automatic and Manual Steps)
1.1.1.4 - Comparative Genomics
1.1.2 - Functional Genomics
1.1.2.1 - Transcriptomics
1.1.2.2 - Methodology of Study: Advantages and Disadvantages
1.1.2.3 - Microarray versus RNA-Seq
1.1.2.4 - Real-time PCR
1.1.2.5 - Applied Biotechnology: Looking into the Future
1.1.3 - Proteomics
1.1.3.1 - Gel-based Proteomics
1.1.3.2 - Gel-free Proteomics
1.1.3.3 - Proteomics in Applied Microbiology and Biotechnology
1.1.3.4 - Application to a Bacterial Protein-Protein Interaction
1.1.4 - References
1.2 - IN SILICO PROTEIN-PROTEIN INTERACTIONS: AVOIDING DATA AND METHOD BIASES OVER SENSITIVITY AND SPECIFICITY
1.2.1 - Introduction
1.2.2 - Computational methods used for protein-protein interaction prediction
1.2.2.1 - Docking-based method
1.2.2.2 - Text mining-based method
1.2.2.3 - Similarity of amino acid sequence-based method
1.2.2.3.1 - Phylogenetic profile-based method
1.2.2.3.2 - Phylogenetic tree-based method
1.2.2.3.3 - Gene colocalization-based method
1.2.2.3.4 - Interolog mapping-based method
1.2.2.4 - Protein domain-based method
1.2.2.5 - Machine learning-based method
1.2.3 - Conclusion
1.2.4 - References
1.3 - CORYNEBACTERIUM PSEUDOTUBERCULOSIS
2 - METHODOLOGY
2.1 - AN IMPROVED INTEROLOG MAPPING-BASED COMPUTATIONAL PREDICTION OF PROTEIN-PROTEIN INTERACTIONS WITH INCREASED NETWORK COVERAGE
2.1.1 - Introduction
2.1.2 - Materials and methods
2.1.3 - Result and discussion
2.1.4 - Conclusions
2.1.5 - References
2.1.6 - Supplementary material
3 - RESULTS
3.1 - IN SILICO PROTEIN-PROTEIN INTERACTION ANALYSIS REVEALS CONSERVED ESSENTIAL PROTEINS IN NINE CORYNEBACTERIUM PSEUDOTUBERCULOSIS BIOVAR OVIS STRAINS
3.1.1 - Abstract
3.1.2 - Introduction
3.1.3 - Materials and methods
3.1.3.1 - Data sources
3.1.3.2 - The Interolog Mapping
3.1.3.3 - In silico PPI network validation
3.1.3.4 - Essential proteins
3.1.4 - Results and discussion
3.1.4.1 - The C. pseudotuberculosis PPI network prediction
3.1.4.2 - In silico PPI network validation
3.1.4.3 - Essential proteins
3.1.5 - Conclusions
3.1.6 - Author Contributions
3.1.7 - Funding
3.1.8 - Supplementary Material
3.1.8.1 - Shortest path and degree distribution analysis
3.1.8.2 - In silico PPI network validation
3.1.8.2.1 - References
3.1.8.3 - Analyses of protein clusters
3.1.8.3.1 - Complex analysis
3.1.8.3.2 - Ribosomal and RNA polymerase cluster
3.1.8.3.3 - Oligopeptide transport system cluster
3.1.8.3.4 - Cobalamin biosynthesis cluster
3.1.8.3.5 - Iron uptake and intracellular regulation cluster
3.1.8.3.6 - Cell division and peptidoglycan biosynthesis
3.1.8.3.7 - References
3.1.8.4 - Cp267 PPI network
3.1.8.5 - Cp3995 PPI network
3.1.8.6 - Cp4202 PPI network
3.1.8.7 - CpC231 PPI network
3.1.8.8 - CpFRC PPI network
3.1.8.9 - CpI19 PPI network
3.1.8.10 - CpP54B96 PPI network
3.1.8.11 - CpPAT10 PPI network
3.1.8.12 - Cp1002 PPI network
3.1.8.13 - List of top 15% proteins with higher degree against DEG
3.1.8.14 - Alignment output for 181 essential proteins against five hosts
3.1.8.15 - Essential proteins homology against hosts
3.2 - LABEL-FREE PROTEOMIC ANALYSIS TO CONFIRM THE PREDICTED PROTEOME OF CORYNEBACTERIUM PSEUDOTUBERCULOSIS UNDER NITROSATIVE STRESS MEDIATED BY NITRIC OXIDE
3.2.1 - Background
3.2.2 - Methods
3.2.3 - Results
3.2.4 - Discussion
3.2.5 - Conclusions
3.2.6 - References
4 - GENERAL DISCUSSION
5 - CONCLUSION AND PERSPECTIVES
BIBLIOGRAPHY
ANNEXES
I - C. PSEUDOTUBERCULOSIS PHOP CONFERS VIRULENCE AND MAY BE TARGETED BY NATURAL COMPOUNDS
I.I - Introduction
I.II - Materials and methods
I.III - Result and discussion
I.IV - Conclusion
I.V - References
II - OTHER RESULTS
II.I - Genome Sequence of Lactococcus lactis subsp. lactis NCDO 2118, a GABA-Producing Strain
II.I.I - References
II.II - Genome Sequence of Corynebacterium pseudotuberculosis MB20 bv. equi Isolated from a Pectoral Abscess of an Oldenburg Horse in California
II.II.I - References
II.III - Genome Sequence of Corynebacterium ulcerans Strain 210932
II.III.I - References
II.IV - Genome Sequence of Corynebacterium ulcerans Strain FRC11
II.IV.I - References
II.V - Proteome scale comparative modeling for conserved drug and vaccine targets identification in Corynebacterium pseudotuberculosis
II.V.I - Abstract
II.V.II - Background
II.V.III - Materials and methods
II.V.III.I - Genomes selection
II.V.III.II - Pan-modelome construction
II.V.III.III - Identification of intra-species conserved genes/proteins
II.V.III.IV - Analyses of essential and non-host homologous (ENH) proteins
II.V.III.V - Analyses of essential and host homologous (EH) proteins
II.V.III.VI - Prediction of druggable pockets
II.V.III.VII - Virtual screening and docking analyses
II.V.IV - Results and discussion
II.V.IV.I - Modelome and common targets in C. pseudotuberculosis species
II.V.IV.II - Identification of ENH and EH proteins as putative drug and/or vaccine targets
II.V.IV.III - Prioritization parameters of drug and/or vaccine targets
II.V.IV.IV - Virtual screening and molecular docking analyses of ENH targets
II.V.IV.V - Essential host homologous as putative targets
II.V.V - Conclusion
II.V.VI - Authors' contributions
II.V.VII - Conflict of interest
II.V.VIII - Acknowledgements
II.V.IX - References
II.VI - Curriculum Vitae
II.VI.I - Personal data
II.VI.II - Academic education/degrees
II.VI.III - Complementary training
II.VI.IV - Professional experience
II.VI.V - Research lines
II.VI.VI - Projects
II.VI.VII - Bibliographic production
II.VI.VIII - Presentations and lectures
II.VI.IX - Unregistered computer program
II.VI.X - Advising and supervision
II.VI.XI - Events
II.VI.XII - Event organization
II.VI.XIII - Participation in final-project examination committees
II.VI.XIV - Participation in judging committees
II.VI.XV - Other relevant information

Presentation

Collaborators

This work was supported by the Centro Nacional de Processamento de Alto Desempenho (CENAPAD-MG), located at the Federal University of Minas Gerais (UFMG). It was carried out at the Laboratory of Cellular and Molecular Genetics (LGCM) at UFMG and at the Laboratory of DNA Polymorphism (LPDNA) at the Federal University of Pará (UFPA), in collaboration with the following researchers:
• Prof. Dr. Vasco Ariston de Carvalho Azevedo, Researcher and Professor at LGCM/UFMG, Brazil;
• Prof. Dr. Rafaela Salgado Ferreira, Researcher and Professor at the Department of Biochemistry and Immunology, UFMG, Brazil;
• Prof. Dr. Artur Luiz da Costa da Silva, Researcher and Professor at LPDNA/UFPA, Brazil;
• Prof. Dr. Debmalya Barh, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal, India;
• Prof. Dr. Richard Röttger and Dr. Jan Baumbach, Department of Mathematics and Informatics, University of Southern Denmark, Campusvej 55, Odense, Denmark;
• Dr. Preetam Ghosh, Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
This work was funded by the following agencies: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (Fapemig).

Context

By 2014, when work on this thesis began, 21 Corynebacterium pseudotuberculosis genomes had been sequenced under the coordination of the research groups of the Laboratory of Cellular and Molecular Genetics (LGCM) at the Federal University of Minas Gerais (UFMG) and the Laboratory of DNA Polymorphism (LPDNA) at the Federal University of Pará (UFPA). Of these genomes, 15 were complete and publicly available: nine from biovar ovis and six from biovar equi. Aiming to develop comparative genomics projects and a large pathogenomics project, the groups were also sequencing new biovar equi strains of C. pseudotuberculosis, while older assemblies were being improved and resequenced with newer technologies. The availability of several genomes of C. pseudotuberculosis and other organisms allowed the group to develop, in 2013, the first protein-protein interaction network study based on the conserved pathogen-host interactome (Barh et al., 2013). Given the group's interest in strengthening its projects on interaction networks, it was proposed to generate the internal protein-protein interaction networks of the bacterium C. pseudotuberculosis. Because biovar ovis had the largest number of available genomes (nine) and is also more clonal, it was selected for the prediction of the protein-protein interaction networks, with the aim of later comparing these networks with those of biovar equi.
Constraints such as cost and time made it impractical to carry out this work experimentally for the nine available proteomes, so the interaction networks were developed in silico. The literature review revealed several computational methods for predicting interaction networks, each taking a different type of biological data as input. A feature common to these methods was that the literature lacked details of their implementation and of any large-scale validation demonstrating the effectiveness of their predictions. Therefore, before applying one of these methods to predict interactions in C. pseudotuberculosis biovar ovis, care was taken to select a method that offered good coverage of the predicted interactions and, at the same time, a good balance between sensitivity and specificity. In addition, care was taken to validate the method against large-scale experimental and curated data, in order to quantify its rates of correct and incorrect predictions. With this context in mind, and given that three-dimensional protein structures are scarce for C. pseudotuberculosis and other non-model organisms, a method was selected that could use the most abundant data available for C. pseudotuberculosis, namely its genomes and proteomes. Thus, considering the physical resources and the expertise available in the laboratory for implementing the project, the method known as interolog mapping was selected to predict the protein-protein interaction networks of C. pseudotuberculosis biovar ovis, since it could be validated with publicly available experimental and curated data.

Thesis Structure

This thesis is organized in article format and is divided into five chapters.
Although it is in article format, the thesis follows the classical structure of scientific writing: it first presents an introduction to the main topics covered, followed by the methodology and the results obtained, and ends with the general discussion, conclusion and perspectives. A brief overview of the five chapters follows:

a. The first chapter presents the introduction to the thesis. Because this thesis concerns the development and validation of a methodology for predicting protein-protein interactions, followed by the application of this methodology to predict the interactions of Corynebacterium pseudotuberculosis, the introduction is divided into three sections, two focusing on protein-protein interaction networks and the last on the organism studied:
• The first section, subtitled “Application to a Bacterial Protein-Protein Interaction”, was published in February 2015 by SM Online Publishers LLC and is part of the chapter entitled “Genomics” of the book “A Textbook of Biotechnology”.
• The second section, entitled “In silico protein-protein interactions: avoiding data and method biases over sensitivity and specificity”, was published in May 2015 in the journal Current Protein & Peptide Science.
• The third section presents an introduction to C. pseudotuberculosis and the main characteristics of this organism.
b. The second chapter presents the methodology. The article on the validation of the method, entitled “An improved interolog mapping-based computational prediction of protein-protein interactions with increased network coverage”, was published in the journal Integrative Biology in November 2014; the validated metrics made it possible to predict in silico the protein-protein interaction networks of C. pseudotuberculosis.
c.
The third chapter presents the results obtained during this thesis, which concern the application of the validated methodology to predict the interaction networks of C. pseudotuberculosis. This chapter is divided into two studies:
• The first study, entitled “In silico protein-protein interaction analysis reveals conserved essential proteins in nine Corynebacterium pseudotuberculosis serovar ovis strains”, was submitted to the journal Integrative Biology in August 2015.
• The second study, entitled “Label-free proteomic analysis to confirm the predicted proteome of Corynebacterium pseudotuberculosis under nitrosative stress mediated by nitric oxide”, was published in December 2014 in the journal BMC Genomics.
d. The fourth chapter presents a general discussion covering all the content developed in this thesis.
e. The fifth chapter presents the conclusions and the perspectives for future work.

During the development of this thesis, other studies were carried out in collaboration with other members of the research groups. These published studies are listed in the annex of this thesis, also in article format.

For organizational purposes, whenever a published article appears in the thesis, it is presented in full in its respective chapter, as published by the journal. Because figures, tables, references and supplementary materials have their own formatting and numbering in each article, these items appear only within the respective article, in the chapter that describes it, and are not included in the thesis's list of figures or tables. Likewise, to avoid mixing the references of the published articles, which differ in presentation and organization from journal to journal, they appear exclusively at the end of each article or of its supplementary material, when there is one.
1 - Introduction

1.1 - Genomics: Application to a Bacterial Protein-Protein Interaction

Flavia Figueira Aburjaile, Mariana P. Santana, Marcos Vinicius Canario Viana, Wanderson Marques Silva, Edson Luiz Folador, Artur Silva and Vasco Azevedo

This book chapter briefly reviews structural genomics, functional genomics (transcriptomics) and proteomics, highlighting the experimental analysis methods of each area. It also reviews the basic concepts of protein-protein interaction networks, with a brief discussion of possible biotechnological applications. An interaction network is composed of nodes, which in the context of this work represent proteins, and edges, each of which links two nodes and represents an interaction. Regardless of the method used, a complex protein-protein interaction network can be built pair by pair, enabling the study and understanding of an organism at the systems biology level. Besides providing better knowledge of the organism, an interaction network can guide new laboratory research and new biotechnological applications, and can assist in selecting proteins for drug development, including drugs that inhibit specific interactions.

The section “Application to a Bacterial Protein-Protein Interaction”, which is part of the chapter entitled “Genomics” of the book “A Textbook of Biotechnology”, was published in February 2015 by SM Online Publishers LLC and is available at http://www.smgebooks.com/a-textbook-of-biotechnology/index.php.com, ISBN 978-0-9962745-3-1.
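The node-and-edge description given above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the book chapter: the protein identifiers (`P1` to `P4`) and the function names `build_network` and `degrees` are hypothetical, and the network is assumed to be undirected and assembled pair by pair from a list of interacting proteins.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected interaction network from pairwise interactions.

    Each protein is a node; each (a, b) pair adds one edge between two nodes.
    Self-interactions are ignored in this sketch (an assumption).
    """
    network = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def degrees(network):
    """Return the number of interaction partners (degree) of each node."""
    return {protein: len(partners) for protein, partners in network.items()}


# Hypothetical interacting pairs, used only for illustration.
pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
network = build_network(pairs)
node_degrees = degrees(network)

for protein in sorted(node_degrees):
    print(protein, node_degrees[protein], sorted(network[protein]))
```

The degree of a node, that is, its number of interaction partners, is the simplest quantity from which a topological analysis of such a network can start.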
1.1.1 – Structural Genomics
1.1.1.1 – Genome Sequencing
1.1.1.2 – Genome Assembly
1.1.1.3 – Genome Annotation (Automatic and Manual Steps)
1.1.1.4 – Comparative Genomics
1.1.2 – Functional Genomics
1.1.2.1 - Transcriptomics
1.1.2.2 – Methodology of Study: Advantages and Disadvantages
1.1.2.3 – Microarray X RNA-Seq
1.1.2.4 – Real time PCR
1.1.2.5 – Applied Biotechnology: Looking into the future
1.1.3 – Proteomics
1.1.3.1 – Gel-based Proteomics
1.1.3.2 – Gel-free Proteomics
1.1.3.3 – Proteomics in Applied Microbiology and Biotechnology
1.1.3.4 – Application to a Bacterial Protein-Protein Interaction
1.1.4 – References

1.2 - In silico protein-protein interactions: avoiding data and method biases over sensitivity and specificity

Edson Luiz Folador, Alberto Fernandes de Oliveira Junior, Sandeep Tiwari, Syed Babar Jamal, Rafaela Salgado Ferreira, Debmalya Barh, Preetam Ghosh, Artur Silva, Vasco Azevedo

The study of protein-protein interaction networks provides a systemic view of an organism's cellular mechanisms, making it possible to understand the organism at the molecular level. Of the many experimental and computational methods available for identifying interacting pairs, here we concentrate on describing the computational ones. Leaving aside the implementation details of each method, we highlight mainly the nature of the biological data used for prediction and how these data bias the sensitivity and specificity of the methods, so that the reader can weigh the strengths and weaknesses of each. Secondarily, we report the organisms in which the methods have been used and indicate where more detailed information on how each method works can be found. In addition, according to the data used as input for prediction, each method was classified as primary or non-primary.
A method was considered primary if it can identify protein-protein interactions not yet identified in any organism, and non-primary if it depends on existing interactions between two proteins in order to predict other interactions. The article corresponding to this section was published in 2015 in the journal Current Protein & Peptide Science, DOI 10.2174/1389203716666150505235437.

1.2.1 - Introduction
1.2.2 – Computational methods used for protein-protein interaction prediction
1.2.2.1 – Docking-based method
1.2.2.2 – Text mining-based method
1.2.2.3 – Similarity of amino acid sequence-based method
1.2.2.3.1 – Phylogenetic profile-based method
1.2.2.3.2 – Phylogenetic tree-based method
1.2.2.3.3 – Gene colocalization-based method
1.2.2.3.4 – Interolog mapping-based method
1.2.2.4 – Protein domain-based method
1.2.2.5 – Machine learning-based method
1.2.3 – Conclusion
1.2.4 - References

1.3 - Corynebacterium pseudotuberculosis

Corynebacterium pseudotuberculosis (Cp) belongs to the CMNR group of bacteria (Corynebacterium, Mycobacterium, Nocardia, Rhodococcus) (Butler, Ahearn and Kilburn, 1986). It is a facultative intracellular, Gram-positive pathogenic bacterium that has fimbriae but is non-motile, does not form capsules and does not sporulate (Selim, 2001). Cp occurs as two biovars: ovis and equi (Songer et al., 1988). Biovar equi infects mainly horses and cattle, whereas biovar ovis is the etiological agent of caseous lymphadenitis (CLA), a chronic disease that mainly affects sheep and goat herds; infection in humans is associated with occupational exposure during the handling of herds (Hémond et al., 2009; Ivanović et al., 2009). A study carried out in the state of Minas Gerais, Brazil, showed that 78.9% of the animals tested were seropositive for CLA (Seyffert et al., 2010).
The study of Cp is also important because of its prevalence in many countries worldwide (Windsor, 2011), such as the state of Grenada and the island of Carriacou in the West Indies (Hariharan et al., 2014), Korea (Jung et al., 2015), France (Trost et al., 2010), Patagonia in Argentina (Cerdeira et al., 2011), Brazil and Australia (Ruiz et al., 2011), Israel (Silva et al., 2011), Africa (Hassan et al., 2012), northern California (Lopes et al., 2012), Scotland (Pethick et al., 2012; Voigt et al., 2012), Spain (Colom-Cadena et al., 2014), Algeria (Mira et al., 2014), the Selangor region of Malaysia (Osman et al., 2015), Egypt (Oreiby et al., 2014), Turkey (SakmanoğLu et al., 2015) and, more recently, Ethiopia (Abebe and Sisay Tessema, 2015). Caseous lymphadenitis (CLA) causes significant economic losses in several countries owing to poor carcass quality and reduced production of meat, wool and milk (Dorella et al., 2006; Baird and Fontaine, 2007), in addition to animal mortality caused by suppurative meningoencephalitis (Santarosa et al., 2015). By 2014, the research groups of the Laboratory of Cellular and Molecular Genetics (LGCM) at the Federal University of Minas Gerais (UFMG) and the Laboratory of DNA Polymorphism (LPDNA) at the Federal University of Pará (UFPA) had sequenced and made publicly available 15 Cp genomes, nine from biovar ovis strains and six from biovar equi. Even with all this genetic information available, the methods developed for the diagnosis and treatment of CLA are still not sufficiently effective, because Cp responds poorly to the available drugs and is able to persist in the environment (Williamson and Nairn, 1980; Dorella et al., 2006; Oreiby et al., 2014).
Considering its resistance and the losses it causes, Cp is an important organism to investigate. It demands further research from the scientific community to improve our knowledge of its molecular mechanisms and pathogenicity, which in turn makes it possible to consider different hypotheses and strategies for developing new drugs. For these reasons, beyond genes, transcripts and proteins, it is necessary to know how these molecules interact with one another inside the cell and with the environment to perform their biological functions (Barabási and Oltvai, 2004; Sharan et al., 2005; Flórez et al., 2010; Garma et al., 2012; Gonzalez and Kann, 2012). In this respect, knowing the proteins and their interactions is fundamental to understanding the molecular mechanisms of the cell at the systems level (Wetie et al., 2013; Peng et al., 2014). Protein-protein interaction (PPI) networks give us a systemic view of an organism's biology at the cellular level and support a variety of analyses. Besides identifying interactions and protein clusters, which helps to understand the organism better, topological analysis of the interaction network can identify important proteins with potential use as drug targets (Li et al., 2012; Cui and He, 2014; Li et al., 2014; Mulder et al., 2014; Wetie et al., 2014). Computational analyses of an interaction network can help to develop new hypotheses about the organism and to design new laboratory experiments driven by these hypotheses (Braun and Gingras, 2012; Zhang, Xu and Xiao, 2013).
For pathogenic organisms, understanding the protein-protein interaction network makes it possible to identify important proteins and consequently offers opportunities to develop new drugs, vaccines or other biotechnological products (Mosca et al., 2013; Zoraghi and Reiner, 2013; Häuser et al., 2014; Lage, 2014; Li et al., 2014). Given the veterinary importance of C. pseudotuberculosis and the potential of interaction networks, and aiming both to provide resources that help other researchers understand this organism at the molecular level and to identify essential proteins with potential use in diagnosis or as drug targets, in this work a methodology was validated and subsequently applied to predict the protein-protein interaction networks of nine strains of C. pseudotuberculosis biovar ovis.

2 - Methodology

2.1 - An improved interolog mapping-based computational prediction of protein–protein interactions with increased network coverage

Edson Luiz Folador, Syed Shah Hassan, Ney Lemke, Debmalya Barh, Artur Silva, Rafaela Salgado Ferreira and Vasco Azevedo

There are several computational methods for predicting protein-protein interactions, each with advantages and disadvantages, and each methodology must be carefully validated to demonstrate its viability, especially with respect to sensitivity and specificity. Each computational method requires a particular type of biological data as input for prediction; nucleotide and amino acid sequences are the most abundant types, mainly owing to the emergence of next-generation sequencing technologies. Interolog mapping is a method that uses amino acid sequences as input for predicting interactions.
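As a rough illustration of how interactions can be carried over from one organism to another through sequence-derived orthology, the Python sketch below maps known interacting pairs onto a target proteome. It is a minimal sketch under stated assumptions, not the validated pipeline of the published article: the identifiers, the `orthologs` table and the function `map_interologs` are hypothetical, and the orthology assignments are taken as given rather than computed from amino acid sequences.

```python
def map_interologs(known_interactions, orthologs):
    """Transfer interactions from a source organism to a target organism.

    known_interactions: iterable of (source_a, source_b) protein pairs.
    orthologs: dict mapping a source protein to a set of target proteins
               (hypothetical, assumed to be precomputed).
    Returns a set of frozensets, each holding one predicted target pair.
    """
    predicted = set()
    for source_a, source_b in known_interactions:
        # A missing key yields an empty tuple, so the pair is skipped
        # unless BOTH partners have at least one ortholog in the target.
        for target_a in orthologs.get(source_a, ()):
            for target_b in orthologs.get(source_b, ()):
                if target_a != target_b:
                    predicted.add(frozenset((target_a, target_b)))
    return predicted


# Hypothetical source-organism interactions and ortholog assignments.
known = [("srcA", "srcB"), ("srcB", "srcC"), ("srcC", "srcD")]
orthologs = {"srcA": {"tgt1"}, "srcB": {"tgt2"}, "srcC": {"tgt3"}}

predicted = map_interologs(known, orthologs)
for pair in sorted(sorted(p) for p in predicted):
    print(pair)
```

In this toy table `srcD` has no ortholog in the target, so the `srcC`-`srcD` interaction is not transferred; only pairs whose two partners are both mapped reach the predicted network.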
This method is based on the biological premise that, if a pair of proteins interacts in an organism “a” and this pair of proteins has orthologs in an organism “b”, the interaction will also occur in organism “b”. Because several protein-protein interaction databases are publicly available, the challenge in using this method lies in ensuring that only orthologous protein pairs are mapped to the organism of interest. Before using this method to build the interaction networks of C. pseudotuberculosis, we took care to validate it by comparing the predicted interactions with experimental and curated interactions (Xenarios et al., 2000; Orchard et al., 2012). As a result of the validation, besides obtaining greater coverage of the interaction network, we identified a cutoff that best represents the balance between sensitivity and specificity. The article on this work was published in the journal Integrative Biology in September 2014, DOI 10.1039/c4ib00136b, and is also available at http://pubs.rsc.org/en/content/articlehtml/2014/ib/c4ib00136b.

2.1.1 - Introduction
2.1.2 - Materials and methods
2.1.3 - Result and discussion
2.1.4 – Conclusions
2.1.5 – References
2.1.6 - Supplementary material

3 - Results

3.1 - In silico protein-protein interaction analysis reveals conserved essential proteins in nine Corynebacterium pseudotuberculosis biovar ovis strains

Edson Luiz Folador, Paulo Vinícius Sanches Daltro de Carvalho, Wanderson Marques Silva, Syed Shah Hassan, Rafaela Salgado Ferreira, Artur Silva, Jan Baumbach, Vasco Azevedo

With a methodology with validated metrics for predicting interaction networks in hand, we applied it to predict nine interaction networks, one for each of nine strains of C. pseudotuberculosis biovar ovis. The ovis biovar of C.
pseudotuberculosis is an extremely clonal organism (Soares et al., 2013), and all the predicted networks had similar characteristics, with the great majority of interactions conserved among the nine strains. The networks were validated by considering the shortest path (Jeong et al., 2001; Wang et al., 2010; Taylor and Wrana, 2012) and the degree distribution (Barabási and Oltvai, 2004). The resulting networks have a scale-free topology, with a degree distribution that approximates a power law, showing that they have the characteristics of biological networks. In addition, when the predicted interaction networks were compared with randomly generated networks, the values of the clustering coefficient, correlation and R² were markedly different. Furthermore, the Shapiro-Wilk normality test rejected the hypothesis that the predicted interactions follow a normal distribution (Shapiro and Wilk, 1965). All the validations suggest that the networks were not formed by spurious or random interactions and that there is a biological bias in the network, probably due to the biological pressure exerted on the interactions and clusters (Galeota et al., 2015). This biological bias is confirmed by the cluster analysis, whose support in the literature reinforces the integrity of the predicted network. All five clusters analyzed were described in the literature, reinforcing the consistency of the predicted networks and indicating that the interactions may indeed occur in C. pseudotuberculosis. A good example is the iron acquisition mechanism, recently reviewed (Sheldon and Heinrichs, 2015), for which the interaction networks contribute to a better understanding of its dynamics in C. pseudotuberculosis.

Finally, by analyzing the interaction degree of the proteins, 181 essential proteins were identified in the interaction networks of C.
pseudotuberculosis, of which only the DNA repair protein (RecN) did not have its essentiality confirmed in the Database of Essential Genes (DEG) (Luo et al., 2014). Among these proteins, 41 showed no homology to host proteins and are good candidates for therapeutic or diagnostic purposes. This makes interaction networks a valuable tool for researchers to better understand the cellular mechanisms of the organism studied and to identify proteins or interactions as potential drug targets (Pelay-Gimeno et al., 2015). The article on this work will soon be submitted to the journal Integrative Biology or to another journal of similar standing, such as one in the BMC series, whose preliminary assessment indicated that the article can be considered for publication.

In silico protein-protein interaction analysis reveals conserved essential proteins in nine Corynebacterium pseudotuberculosis serovar ovis strains

Edson Luiz Folador¹, Paulo Vinícius Sanches Daltro de Carvalho¹, Wanderson Marques Silva¹, Syed Shah Hassan¹, Rafaela Salgado Ferreira², Artur Silva³, Jan Baumbach⁴, Michael Gromiha⁵, Preetam Ghosh⁶, Debmalya Barh⁷, Richard Röttger⁴, Vasco Azevedo¹,*

¹Department of General Biology, Institute of Biological Sciences (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
²Department of Biochemistry and Immunology, Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
³Institute of Biological Sciences, Federal University of Para, Belém, PA, Brazil
4Department for Mathematics and Informatics, University of Southern Denmark, Campusvej 55, Odense, Denmark
5Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Tamilnadu, India
6Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
7Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal, India

3.1.1 - Abstract

Corynebacterium pseudotuberculosis is a gram-positive bacterium that belongs to the CMNR group (Corynebacterium, Mycobacterium, Nocardia, Rhodococcus) and occurs as two serovars, equi and ovis. The serovar ovis is the etiological agent of caseous lymphadenitis, a chronic infection that affects sheep and goats and causes economic losses due to carcass condemnation and decreased production of meat, wool and milk. The current protocols for diagnosis and treatment are not fully effective, and further research is required for a better understanding of C. pseudotuberculosis pathogenesis. In this context, the protein-protein interaction network serves as a tool that gives researchers a systemic view of an organism. We mapped the orthologous interactions from public databases to nine strains of C. pseudotuberculosis. The validations suggest that the interactions are not spurious and that the networks possess the basic characteristics of biological networks. Based on literature support, the clustering analyses further reinforce the biological reliability of the predicted networks. For each strain we predicted on average 16,669 interactions, ~99% of which were mapped from the genus Corynebacterium, resulting in 15,495 interactions conserved among the nine C. pseudotuberculosis strains. By analyzing these networks we identified 181 conserved essential proteins, of which 41 are non-host homologous and are good targets for diagnosis or drug development.
Keywords: protein-protein interaction, biological network, systems biology, essential proteins, interolog mapping, Corynebacterium pseudotuberculosis, caseous lymphadenitis.

3.1.2 - Introduction

Corynebacterium pseudotuberculosis (Cp) belongs to the suprageneric CMNR group (Corynebacterium, Mycobacterium, Nocardia, Rhodococcus) of bacteria (Butler, Ahearn e Kilburn, 1986). It is a Gram-positive intracellular pathogen that is fimbriated, non-motile and non-capsulated (Selim, 2001) and occurs as two serovars: ovis and equi (Songer et al., 1988). The serovar equi infects mainly horses and cattle, while the serovar ovis is the etiological agent of caseous lymphadenitis (CLA), a chronic infectious disease that affects mainly sheep and goat populations and can also infect humans through occupational exposure (Hémond et al., 2009; Ivanović et al., 2009). Furthermore, CLA is prevalent in several countries around the world (Jung et al.; Seyffert et al., 2010; Trost et al., 2010; Cerdeira et al., 2011; Ruiz et al., 2011; Silva et al., 2011; Windsor, 2011; Hassan et al., 2012; Lopes et al., 2012; Pethick et al., 2012; Voigt et al., 2012; Colom-Cadena et al., 2014; Hariharan et al., 2014; Mira et al., 2014; Oreiby et al., 2014; Osman et al., 2015) and causes significant economic losses due to low carcass quality and decreased production of meat, wool and milk (Dorella et al., 2006; Baird e Fontaine, 2007), while also causing animal mortality due to suppurative meningoencephalitis (Santarosa et al., 2015). The available methods for CLA diagnosis and treatment are not effective enough, and further research is required to tackle the threats posed by C. pseudotuberculosis.
Hence, it becomes important to know how the genes, transcripts, proteins and other molecules inside the bacterial cell interact with each other and with the external environment to perform their biological functions (Barabási e Oltvai, 2004; Sharan et al., 2005; Flórez et al., 2010; Garma et al., 2012; Gonzalez e Kann, 2012). From this perspective, the study of proteins and their interactions allows for a better understanding of the molecular mechanisms of cells at a systems level (Wetie et al., 2013; Peng et al., 2014). Protein-protein interactions (PPIs) form a complex network that can be represented as a graph in which the nodes represent proteins and the undirected edges connecting them represent the interactions between the proteins (Wang et al., 2010; De Las Rivas e Fontanillo, 2012). Computational analysis of PPIs supports the development of new hypotheses and the design of laboratory experiments driven by those hypotheses (Braun e Gingras, 2012; Zhang, Xu e Xiao, 2013). A PPI network provides a systematic view of the biology of an organism at the cellular level; hence, essential proteins and potential drug targets can be identified by topological analysis (Li et al., 2012; Cui e He, 2014; Li et al., 2014; Mulder et al., 2014; Wetie et al., 2014), enabling the development of new drugs against pathogenic microorganisms (Mosca et al., 2013; Zoraghi e Reiner, 2013; Häuser et al., 2014; Lage, 2014).

In this paper, we predict and validate the PPI networks of nine strains of C. pseudotuberculosis serovar ovis (Cp). Additionally, to better understand the organism and its pathogenicity, we perform a cluster analysis and identify the conserved essential proteins in the PPI networks, suggesting potential drug or diagnostic targets to be verified experimentally.

3.1.3 – Materials and methods

3.1.3.1 - Data sources

The prediction of the PPI networks is based on protein sequence similarity and on information about already known PPIs.
The protein sequences for the nine Cp strains were downloaded from NCBI, while known PPIs and their respective protein sequences were retrieved from three publicly available databases (Table 1).

Table 1 - Overview of the public data sources.

| Data | Proteins | Interactions | Reference |
|---|---|---|---|
| DIP | 23,680 | 70,630 | (Xenarios et al., 2000) |
| String | 5,214,234 | 673,123,356 | (Franceschini et al., 2013) |
| Intact | 60,846 | 314,019 | (Hermjakob et al., 2004) |
| Cp1002 | 2,090 | n/a | (Rezende et al., 2012) |
| Cp267 | 2,148 | n/a | (Lopes et al., 2012) |
| Cp3995 | 2,142 | n/a | (Pethick et al., 2012) |
| Cp4202 | 2,051 | n/a | (Pethick et al., 2012) |
| CpC231 | 2,091 | n/a | (Ruiz et al., 2011) |
| Cpfrc41 | 2,110 | n/a | (Trost et al., 2010) |
| CpI19 | 2,095 | n/a | (Silva et al., 2011) |
| CpP54B96 | 2,084 | n/a | (Hassan et al., 2012) |
| CpPAT10 | 2,079 | n/a | (Cerdeira et al., 2011) |

Note: The interactions in the String database are represented in both the A -> B and B -> A directions, which gives 336,561,678 distinct interactions. The Cp proteomes were downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/. The interactions for the nine Cp strains (marked n/a) are predicted in this work.

3.1.3.2 - The Interolog Mapping

The interolog mapping method was used to map the homologous pairs of interacting proteins from public databases to Cp biovar ovis. This method has already been applied successfully to predict interactions in organisms such as Mycobacterium tuberculosis (Liu et al., 2012), Leishmania (Rezende et al., 2012) and mouse (Lo et al., 2015). In a previous validation of the method against experimental interactions from the DIP database (Xenarios et al., 2000) curated by the IMEx consortium (Orchard et al., 2012), we obtained an area under the curve (AUC) of 0.93, a specificity of 0.95, a sensitivity exceeding 0.83 and a precision of 0.99; a detailed flow diagram of the method is presented in (Folador et al., 2014).
The latest available version of NCBI BLASTp was used to perform the reciprocal alignment of the proteins from the nine Cp strains against the proteins from public databases for which interactions are known (Camacho et al., 2009). To eliminate false alignments that would only slow down the prediction process, the BLASTp e-value parameter was set to 1e-5 for proteins from the DIP and Intact databases and to 1e-9 for proteins from the String database. All other BLASTp parameters were kept at their default values. To map the homologous proteins, we used each of the nine Cp ovis proteomes as query and the proteomes of the public databases as subject. In a second step we inverted the search direction, i.e., we switched subject and query. In the remainder of the analysis, we consider only those protein alignments that yield a hit in both directions (a reciprocal hit). For each reciprocal hit, we took the lower of the two identity x coverage products obtained from the BLASTp alignments, according to the following formula:

RH(a) = min( identity x coverage(a→A), identity x coverage(a←A) )

Here 'a' represents a protein of Cp and 'A' its homologous counterpart in the known interaction. To each known interaction for which homologous proteins exist in Cp we assign an interaction conservation score. Thus, an interaction pair (IP) is represented by:

IP = ( RH(a), RH(b) )

Here, the Cp proteins "a" and "b" are reciprocal hits of the public database proteins "A" and "B", respectively. Moreover, "A" and "B" are the public database identifiers used to map the interaction pair "a" and "b" to Cp ovis. The smaller of the two RH values is taken as the interaction score pair (ISP), which is given by the following formula:

ISP(ab) = min( RH(a), RH(b) )

ISP(ab) therefore equals the lowest identity x coverage product among the four alignments that compose the interaction pair.
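The scoring defined by the RH and ISP formulas can be sketched in a few lines of Python. This is only an illustrative sketch, not the PL/PgSQL routines used in this work: the `Alignment` class, the function names and the numeric values are hypothetical, and identity and coverage are assumed to be expressed as fractions between 0 and 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One BLASTp alignment (hypothetical container); values are fractions in [0, 1]."""
    identity: float
    coverage: float


def rh(forward: Alignment, reverse: Alignment) -> float:
    """Reciprocal-hit score: the lower identity x coverage product of the two directions."""
    return min(forward.identity * forward.coverage,
               reverse.identity * reverse.coverage)


def isp(rh_a: float, rh_b: float) -> float:
    """Interaction score pair: the lower of the two reciprocal-hit scores."""
    return min(rh_a, rh_b)


# Hypothetical alignment values, for illustration only.
rh_a = rh(Alignment(0.90, 0.95), Alignment(0.90, 0.80))  # lower product is 0.90 x 0.80
rh_b = rh(Alignment(0.80, 0.75), Alignment(0.85, 0.90))  # lower product is 0.80 x 0.75
score = isp(rh_a, rh_b)                                   # equals rh_b, about 0.60
```

Because the score is a minimum over four products, a single weak alignment in either direction for either protein is enough to lower the score of the whole interaction pair.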
To map homologous protein pairs from public databases, we considered as conserved only those interactions with an ISP(ab) greater than 0.5625 (which corresponds, for example, to 75% identity combined with 75% coverage). Furthermore, to map only high-confidence and experimental interactions, we considered only interactions of the String database with a confidence score greater than 700. To ensure the accuracy of the predictions, we validated the networks both statistically and with literature support.

3.1.3.3 - In silico PPI network validation

In addition to using our previously reported and validated methodology (Folador et al., 2014), we verified whether the nine Cp PPI networks have the typical characteristics of biological networks. We submitted the PPI networks to the Cytoscape plugin NetworkAnalyzer (Assenov et al., 2008) and analyzed the PPI distribution, the node degree distribution (Barabási e Oltvai, 2004) and the shortest path (Jeong et al., 2001; Wang et al., 2010; Taylor e Wrana, 2012). To verify whether the predicted interactions are spurious, we compared the clustering coefficient, correlation and R-Squared regression values of the predicted networks against those of random networks containing 16,000 interactions built for the Cp267 strain. As an additional validation, to check whether the networks have a random distribution, the predicted networks were subjected to distribution analysis with the Shapiro-Wilk normality test (Shapiro e Wilk, 1965), available in the R statistical package (Royston, 1982). Finally, the clusters in the predicted networks were identified with the Markov Cluster Algorithm (MCL) (Van Dongen, 2000), implemented in the ClusterMaker plug-in (Morris et al., 2011) for the Cytoscape software (Shannon et al., 2003), with the MCL inflation parameter set to 3.0. To reinforce that these interactions do occur in Cp, a literature search was performed to verify the existence of these clusters in phylogenetically close organisms.
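The topological comparison described above can be illustrated with a small, dependency-free Python sketch. It is not the NetworkAnalyzer implementation and may differ from it in detail: here the clustering coefficient is assumed to be the mean of the local coefficients (nodes with fewer than two neighbours count as zero), the power-law check is simplified to a least-squares line through the log-log degree distribution, and the random network is drawn uniformly from all distinct protein pairs. All function names and the toy edges are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets; self-interactions are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with < 2 neighbours count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for x, y in combinations(nbrs, 2) if y in adj[x])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def loglog_fit(adj):
    """Least-squares line through (log k, log count(k)); returns (slope, r, r_squared)."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    xs = [math.log(k) for k in counts]
    ys = [math.log(c) for c in counts.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("need at least two distinct degrees and two distinct counts")
    r = sxy / math.sqrt(sxx * syy)
    return sxy / sxx, r, r * r


def random_edges(nodes, m, seed=0):
    """Draw m distinct undirected pairs uniformly from all possible pairs."""
    nodes = list(nodes)
    if m > len(nodes) * (len(nodes) - 1) // 2:
        raise ValueError("more edges requested than distinct pairs available")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < m:
        u, v = rng.sample(nodes, 2)
        chosen.add((min(u, v), max(u, v)))
    return sorted(chosen)


# Toy network (hypothetical): a triangle a-b-c plus one pendant protein d.
toy_adj = build_adjacency([("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")])
toy_cc = average_clustering(toy_adj)  # (1/3 + 1 + 1 + 0) / 4
```

For a comparison in the spirit of the text, one would call `random_edges` with the protein identifiers of one strain and the desired number of interactions, then compare `average_clustering` and `loglog_fit` between the predicted and the random network.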
3.1.3.4 - Essential proteins

In Saccharomyces cerevisiae, the degree of a node was observed to correlate with the lethality of removing the corresponding protein from the network (Jeong et al., 2001; Estrada, 2006). High degree and centrality measures are means of identifying essential proteins (Betul e Eric, 2013; Tang et al., 2014), which is explained by the disruption that the knockout of such a protein could cause in the interaction network (Han et al., 2004). With the modeled interaction network, we performed a topological analysis to identify the Cp essential proteins by selecting the 15% of proteins with the highest degree, referred to as hub proteins. Next, to validate the essential hub proteins, we searched for homologous sequences among the bacterial protein sequences from DEG (Zhang, Ou e Zhang, 2004; Luo et al., 2014) (v11.2, updated on July 3, 2015). For the alignment of Cp proteins against DEG, the BLASTp parameters were set to: e-value = 1e-5, low complexity filter = false and matrix = BLOSUM62. Finally, BLASTp was used to align the essential proteins of Cp against the proteins from five hosts: Ovis aries (taxid: 9940), Capra hircus (taxid: 9925), Bos taurus (taxid: 9913), Equus caballus (taxid: 9796) and Homo sapiens (taxid: 9606).

3.1.4 - Results and discussion

3.1.4.1 - The C. pseudotuberculosis PPI network prediction

Among the 18,890 proteins present in the nine Cp strains, 10,370 participated in interactions, accounting for a total of 150,019 predicted interactions (16,669 on average per Cp strain). The contribution of each public database to the formation of the networks is shown in Table 2.
Table 2 - Number of proteins and interactions for each serovar ovis strain

| Strain | Proteins | Proteome | Interactions | DIP | Intact | String |
|---|---|---|---|---|---|---|
| Cp1002 | 1,156 | 2,090 | 16,710 | 103,514 | 121,035 | 39,276,922 |
| Cp267 | 1,164 | 2,148 | 16,728 | 102,140 | 120,193 | 39,415,241 |
| Cp3995 | 1,141 | 2,142 | 16,600 | 100,868 | 119,895 | 39,454,010 |
| Cp4202 | 1,148 | 2,051 | 16,712 | 99,881 | 118,356 | 38,973,203 |
| CpC231 | 1,151 | 2,091 | 16,647 | 95,314 | 116,142 | 38,866,646 |
| Cpfrc41 | 1,165 | 2,110 | 16,897 | 106,993 | 126,679 | 41,393,479 |
| CpI19 | 1,158 | 2,095 | 16,715 | 96,181 | 117,188 | 38,957,265 |
| CpP54B96 | 1,149 | 2,084 | 16,537 | 95,231 | 114,476 | 38,776,672 |
| CpPAT10 | 1,138 | 2,079 | 16,473 | 94,058 | 115,149 | 38,730,691 |

Proteins: number of proteins participating in the interaction network of each strain. Proteome: number of proteins in each strain. Interactions: number of predicted interactions used to compose the network. DIP: number of interactions mapped from DIP. Intact: number of interactions mapped from Intact. String: number of interactions mapped from String.

Despite the large number of interaction pairs predicted from each public database individually, only a small percentage was used to generate the Cp ovis interactome. The number of interaction pairs used is reduced for the following three reasons: (i) despite the cut-off defined for the BLASTp alignments, the majority of the interactions had an ISP(ab) lower than 0.5625 and were therefore not considered homologous; (ii) in addition, only the interactions with a String score >= 700 (Franceschini et al., 2013) were mapped; and (iii) when redundant interactions were found, only the one with the highest ISP(ab) was used. The latter condition occurs when the interaction is mapped from more than one public database or is mapped multiple times due to the existence of homologous interactions in the same database. Hence, little more than 50% of the total proteins of each Cp strain composed the interaction networks, demonstrating the need for further research to learn about all the interactions among the proteins of this organism.
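The three filtering rules just listed can be sketched as follows. This is a hypothetical Python illustration rather than the database routines used in this work: the `Candidate` record and its field names are assumptions, and the String threshold is applied as ">= 700" as stated in rule (ii), whereas the methods section words it as "greater than 700".

```python
from dataclasses import dataclass
from typing import Optional

ISP_CUTOFF = 0.5625       # conserved only if ISP(ab) is greater than this value
STRING_MIN_SCORE = 700    # assumption: inclusive threshold, as in rule (ii)


@dataclass(frozen=True)
class Candidate:
    """One candidate interaction mapped to Cp (hypothetical record layout)."""
    protein_a: str
    protein_b: str
    source: str                       # "DIP", "Intact" or "String"
    isp: float
    string_score: Optional[int] = None


def select_interactions(candidates):
    """Apply rules (i)-(iii): ISP cut-off, String score, best ISP per redundant pair."""
    best = {}
    for c in candidates:
        if c.isp <= ISP_CUTOFF:
            continue  # rule (i): not considered homologous
        if c.source == "String" and (
            c.string_score is None or c.string_score < STRING_MIN_SCORE
        ):
            continue  # rule (ii): low-confidence String interaction
        key = frozenset((c.protein_a, c.protein_b))  # A-B and B-A are the same pair
        if key not in best or c.isp > best[key].isp:
            best[key] = c  # rule (iii): keep the highest ISP among redundant mappings
    return best


# Hypothetical candidates, for illustration only.
selected = select_interactions([
    Candidate("a", "b", "DIP", 0.60),
    Candidate("b", "a", "String", 0.72, 900),   # same pair, higher ISP: replaces DIP entry
    Candidate("a", "c", "String", 0.80, 650),   # dropped by rule (ii)
    Candidate("c", "d", "Intact", 0.50),        # dropped by rule (i)
])
```

Keying the dictionary on an unordered pair also collapses the two directions in which String stores each interaction.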
Only a small fraction of the interactions was mapped and, considering that the predicted interactions came from organisms whose interactions are already known (interolog mapping), we indirectly realize that we still have a lot to learn about Cp ovis and other phylogenetically close organisms before all interactions become known. Phylogenetically close organisms are the most similar, and hence their genotypes and phenotypes will probably also be similar. As this work uses interolog mapping to predict the interactions, we verified from which organisms the Cp ovis interactions came. The vast majority of interactions were mapped from phylogenetically close organisms, and the genus Corynebacterium accounted for ~99% of the mappings (Figure 1). This reinforces the reliability of the method and of the interaction networks generated: because the homologous PPIs were mapped from phylogenetically close organisms, the chances that they occur in Cp are greatly increased.

Figure 1 - Organisms from which the interactions were mapped.

Although this evidence suggests that these interactions really occur in Cp ovis, we further performed both statistical and literature-based validation to check the reliability of the predicted interaction networks.

3.1.4.2 - In silico PPI network validation

We were able to show that the node degree distribution follows a power law, which together with the shortest-path analysis suggests that the predicted networks are scale-free and possess relevant characteristics of biological networks (Supplementary Material S1). Comparing the clustering coefficient, correlation and R-Squared regression values of the predicted Cp interaction networks, we observed that the values are higher than those obtained from random networks. With a p-value < 2.2e-16, the Shapiro-Wilk normality test demonstrated that the predicted interaction networks do not show a normal distribution (Supplementary Material S2).
All analyses suggest that the networks were not formed by spurious interactions and may have a biological bias, probably due to evolutionary pressure exerted over the interactions (Shapiro e Wilk, 1965). Moreover, the high clustering coefficient of the predicted networks suggests the existence of self-organization inside the biological cell driven by the interactions (Galeota et al., 2015). The statistical values of the predicted networks are quite close to those of other works using the same methodology (Rezende et al., 2012). Finally, based on support from the biological literature, we validated some conserved clusters identified in the networks, indicating that the predicted interactions exist in nature and therefore likely take place in C. pseudotuberculosis (Supplementary Material S3).

With the PPI networks predicted and validated, we also modeled the network of each Cp strain (Supplementary Material S4 to S12). Almost all pairs of predicted interactions are common to the nine Cp ovis strains (core-interactome), which is not surprising since Cp is extremely clonal (Soares et al., 2013). On average, 16,669 interactions were predicted for each Cp ovis strain. In this work, we focused primarily on validating these interactions with computational methods or through literature support. The strain-specific interactions, or accessory interactions, are also important and cannot be ignored, as they can explain the biology of a specific strain. However, here we focused on exploring the PPIs common to the nine Cp ovis strains (core-interactome), aiming to better understand the serovar ovis rather than a single strain. Based on our predicted networks, we identified the conserved essential proteins in the serovar ovis.

3.1.4.3 - Essential proteins

The hub proteins are highly interconnected, forming a dense network of interactions, and probably participate in various cellular processes and metabolic pathways.
Thus, these proteins are termed essential, since the knockout of any one of them can disrupt the interaction network (Han et al., 2004). From the interaction network point of view, essentiality is measured by the degree of a protein (Khuri e Wuchty, 2015). It is therefore natural to conclude that these essential proteins interact with many other proteins, perhaps exerting various biological activities and participating in several metabolic pathways; thus, the inhibition of these proteins could interrupt their activity in various biological complexes (Han et al., 2004). Laboratory studies are necessary to confirm these hypotheses in Cp, because every organism may have a particular and alternative repertoire of proteins for responding to various types of stress (Caufield et al., 2015).

To identify the essential proteins in the Cp ovis PPI network, we selected the 15% of proteins with the most interactions, termed hubs, that are conserved in all nine strains. In this way, we identified 181 essential hub proteins with 68 or more interactions. In the set of essential proteins, we find proteins involved in biological processes related to carbon metabolism, cell envelope and cell wall, DNA metabolism, nucleotide biosynthesis, folding, translocation, ribosomal translation factors, tRNA synthetases, RNA metabolism and respiratory pathways, among others. To verify the essentiality of these Cp proteins, we searched for homologous proteins in the DEG database. Among the 181 essential proteins, only one had no homology to bacterial DEG proteins, showing the effectiveness of our method for identifying essential proteins (Supplementary Material S13). Fewer essential proteins might be matched in DEG if we used a more restrictive cut-off, which would yield a longer list of Cp-exclusive essential proteins without homologs in DEG. The DNA repair protein (RecN) was the only Cp essential protein not found in DEG.
RecN is responsible for maintaining DNA integrity when the cell is exposed to various stress conditions. Although the mechanism is conserved, both the metabolic pathways and the proteins can differ between species (Eisen e Hanawalt, 1999). In E. coli and Clostridium difficile, the LexA repressor interacts with RecA, regulating the DNA damage response (Walter et al., 2014); LexA is also reported to regulate RecN (Rostas et al., 1987), keeping the same expression pattern in Shewanella oneidensis when submitted to stress (Brown et al., 2006). All these interactions are also found in the C. pseudotuberculosis PPI network, in which the interaction between LexA and RecN in the biovar ovis, as well as their interactions with the proteins encoded by the genes recA, recO, recR, recF and recG, are also conserved (Figure 2). This suggests an important role for both the RecN and LexA proteins. Using RNA-Seq data, we verified that RecN and LexA showed no significant change in their expression, indicating constitutive expression under conditions of thermal shock, acid stress and osmotic stress (Pinto et al., 2014), which is an expected characteristic of essential genes.

Figure 2 - Partial C. pseudotuberculosis DNA repair RecN interactions network.

The vast majority of the proteins have homologous proteins in DEG; however, this does not reduce the importance of describing their essentiality. Considering that Cp is not covered by DEG to date, the description of essentiality in this organism is novel for all 181 proteins. However, while most essential proteins have homologs in over 20 organisms, three proteins have homologs in a single organism covered by DEG, which indicates either a lack of experiments supporting their essentiality, a lack of protein conservation across species, or that the essentiality of these proteins is not conserved across species (Caufield et al., 2015). These proteins are Catalase (KatA), Endonuclease III (Nth) and Trigger factor (Tig). KatA has DEG homology to KatE from Salmonella enterica.
KatA is an oxidoreductase enzyme that decomposes hydrogen peroxide (H2O2) at a rate of 40 million molecules per second (Nelson e Cox, 2002). In C. glutamicum, the levels of KatA increase quickly in response to the addition of H2O2 (Milse et al., 2014), and KatA was highly up-regulated in the SOS and stress responses (Park et al., 2014); the same occurs in C. pseudotuberculosis when exposed to acid medium (Pinto et al., 2014). Due to its fast response to oxidative stress, KatA is an important survival mechanism in host macrophages and may therefore have biotechnological or pharmaceutical applications (Cutler, 2005; Mitra, 2014). Endonuclease III (Nth) has DEG homology to a protein from Haemophilus influenzae. Nth is a base excision repair enzyme (Sahbani et al., 2014) that participates in a pathway that prevents the loss of DNA functionality caused, e.g., by spontaneous mutagenic lesions (Saito et al., 1997) or near-UV radiation (Serafini e Schellhorn, 1999). This mechanism has been well studied and is conserved in the Corynebacterium species (Resende et al., 2011). Trigger factor (Tig) has DEG homology to a protein from Pseudomonas aeruginosa. Tig participates in the protein folding process. In Escherichia coli, Tig cooperates with the chaperone protein DnaK to promote protein folding; however, it is not essential at intermediate growth temperatures (Deuerling et al., 1999). In Exiguobacterium antarcticum, a gram-positive psychrotrophic bacterium, only Tig was overexpressed in response to cold; the remaining chaperone proteins were underexpressed at 0°C (Dall et al., 2014). For C. pseudotuberculosis at 50°C, no significant change was observed in the expression of Tig or of the chaperonin GroEL, whereas DnaK was overexpressed (Pinto et al., 2014). It would be necessary to expose C. pseudotuberculosis to lower temperatures to check the behavior of Tig.
Additionally, to identify potential biomarkers or therapeutic targets among the essential proteins, a search for homologous proteins in the host organisms O. aries, C. hircus, B. taurus, E. caballus and H. sapiens was performed. Considering the BLASTp alignment results (Supplementary Material S14), we identified 41 non-host homologous proteins, 24 having no alignment hit against O. aries and C. hircus proteins and 17 having both low identity (0-38%) and low coverage (0-44%) (Figure 3). Alignment details against the hosts can be found in Supplementary Material S15.

Figure 3 - Homology distribution of Cp essential proteins aligned against hosts. Dark green: proteins homologous to host. Yellow: proteins with low identity against hosts (identity < 30%). Dark red: non-host homologous proteins, proteins with low identity and low coverage alignment against hosts (identity x coverage <= 10%). Dark blue: non-host homologous proteins, proteins with no alignment hits against O. aries and C. hircus. Light blue: non-host homologous proteins, proteins with no alignment hits against the five hosts. The alignment details can be found in Supplementary Material S15.
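The categories in the Figure 3 legend can be read as a simple decision rule, sketched below in Python. The thresholds (identity < 30%, identity x coverage <= 10%) come from the legend, but the order in which the rules are applied, the use of the best hit per host, and taking the maximum over hosts are assumptions made only for this illustration; the function and label names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Best BLASTp hit per host as (identity, coverage) fractions, or None if no hit.
Hit = Optional[Tuple[float, float]]

SMALL_RUMINANTS = ("Ovis aries", "Capra hircus")


def classify(best_hits: Dict[str, Hit]) -> str:
    """Assign one Figure 3 category to a Cp protein (assumed rule order)."""
    hits = [h for h in best_hits.values() if h is not None]
    if not hits:
        return "no hit in any host"
    if all(best_hits.get(host) is None for host in SMALL_RUMINANTS):
        return "no hit in O. aries and C. hircus"
    if max(identity * coverage for identity, coverage in hits) <= 0.10:
        return "low identity and low coverage"
    if max(identity for identity, _ in hits) < 0.30:
        return "low identity"
    return "host homolog"


# Hypothetical alignment values, for illustration only.
example = classify({
    "Ovis aries": (0.25, 0.30),
    "Capra hircus": (0.24, 0.28),
    "Bos taurus": None,
    "Equus caballus": None,
    "Homo sapiens": (0.22, 0.35),
})  # every product is below 0.10, so: "low identity and low coverage"
```

Under these assumptions, the first three labels correspond to the non-host homologous groups of the legend (light blue, dark blue and dark red), while the last two correspond to the yellow and dark green groups.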
The 24 non-host homologous proteins without hits against hosts are: chorismate synthase (aroC), dihydrodipicolinate reductase (dapB), DNA primase (dnaG), elongation factor P (efp), cell division protein (ftsZ), ATP phosphoribosyl transferase (hisG), dihydroxy-acid dehydratase (ilvD), aspartate kinase (lysC), UDP-N-acetylglucosamine (murA), transcription anti-termination protein (nusG), uridylate kinase (pyrH), DNA repair protein (recN), transcription termination factor (rho), 50S ribosomal protein L1 (rplA), 50S ribosomal protein L10 (rplJ), 50S ribosomal protein L31 (rpmE), DNA-directed RNA polymerase subunit alpha (rpoA), 30S ribosomal protein S3 (rpsC), 30S ribosomal protein S6 (rpsF), 30S ribosomal protein S13 (rpsM), Holliday junction DNA helicase subunit (ruvA), SsrA-binding protein/SmpB superfamily (smpB), indole-3-glycerol phosphate synthase (trpC2) and anthranilate synthase (trpE). These 41 (24+17) non-host homologous essential proteins of Cp are good choices for therapeutic and diagnostic purposes, not only because of the disruption that their inhibition may cause in the intra-species interactions but also because they have greater potential to participate in inter-species interactions with the host (Zhou et al., 2014). Within the set of non-host homologous essential proteins, two classes draw special attention, both participating in the initial steps of the aromatic amino acid metabolic pathways, which are well characterized in Corynebacterium glutamicum (Ikeda, 2006): the proteins encoded by the trp operon, involved in tryptophan biosynthesis, and the protein prephenate dehydratase (pheA).

The cluster analysis draws attention to the Cp iron acquisition system, a well-characterized system that contributes to the survival and virulence of microorganisms (Köster, 2001; Kunkle e Schmitt, 2005).
The Cp cluster presents the interactions among proteins of multiple iron acquisition systems, a strategy to acquire iron from different sources or under low availability (Wandersman e Delepelaire, 2004), suggesting both alternative metabolic pathways and alternative proteins from different operons exerting the same function. In the Cp networks, these multiple systems interact and consist mainly of proteins from the fag, ciu, fec and hmu operons (Supplementary Material S3).

The use of interaction networks for identifying essential proteins can have a better sensitivity than other approaches. While we identified 181 essential proteins, of which 41 were non-host homologous, approaches using three-dimensional structures identify fewer than 10 essential proteins (Hassan et al., 2014). Besides the essential proteins, the identified interactions are equally important in Cp, as they allow the search for small-molecule inhibitors of binding interactions (Mora e Donaldson, 2012; Zoraghi e Reiner, 2013; Villoutreix et al., 2014), making modern drug discovery research feasible (Sheng et al., 2015). Such interaction networks can also be used together with RNA-Seq or proteomics experiments to assist in data interpretation. As an example of a biological application, the PPI network of the C. pseudotuberculosis 1002 strain was used to investigate the interactions among the proteins identified as exclusive and differentially regulated in cells exposed to nitrosative stress (Silva et al., 2014). The results obtained in this work might serve as a basis for further essentiality studies in other organisms using the interaction network. By knowing the interaction partners of a protein, it is possible to provide a systemic view of the organism (Anh et al., 2015).

3.1.5 - Conclusions

Here we report, for the first time, the PPI networks for nine Cp ovis strains and the biological relevance of the essential proteins identified in the networks.
In addition to the validated networks, our contributions include the identification of 181 Cp essential proteins, 41 of them non-host homologous and hence good candidates for drug development or CLA diagnosis (Supplementary Material S13-S15). Since the essential proteins (hubs) interact with many others, it is natural to assume that they associate differentially in various biological processes, in their own species as well as in the host, thereby participating in the formation of different clusters with other proteins to perform their functions; hence, they are attractive targets for therapeutic and diagnostic purposes. As with the essential proteins, each specific interaction is a potential candidate for the identification of inhibitors (Villoutreix et al., 2014; Gowthaman, Lyskov e Karanicolas, 2015), opening several drug development opportunities against C. pseudotuberculosis. The PPI networks reported here are valuable tools for researchers to identify proteins or interactions as potential targets, and may have a better sensitivity than other approaches. The experimental validation of the predicted interactome is outside the scope of this study but is vital and will be carried out in the near future.

3.1.6 - Author Contributions

Conceived and designed the experiments: ELF. Designed and modeled the database in PostgreSQL DBMS: ELF. Developed routines in PL/PgSQL: ELF. Performed the experiments: ELF. Analyzed the data: ELF. Structured the paper: ELF, MG. Wrote the paper: ELF. Performed the clusters description: PVSDC, WMS. Performed the essential protein description: ELF. Participated in revising the draft: ALL. Contributed materials/analysis tools/structure: JB, MG, RR, RSF, AS and VA.

3.1.7 - Funding

Coordenação de Aperfeiçoamento de Pessoal de Ensino Superior (CAPES), Conselho Nacional de Pesquisa (CNPq) and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (Fapemig).
3.1.8 – Supplementary Material

3.1.8.1 – Shortest path and Degree distribution analysis.

Supplementary Pictures S1: Shortest path and Degree distribution analysis. Shortest Path analysis of the nine Corynebacterium pseudotuberculosis serovar ovis strains (Figures 4-12). Degree distribution analysis of the nine C. pseudotuberculosis serovar ovis strains (Figures 13-21). The red line indicates the perfect power-law distribution.

Figure 4 - Cp1002 Shortest Path analysis
Figure 5 - Cp267 Shortest Path analysis
Figure 6 - Cp3995 Shortest Path analysis
Figure 7 - Cp4202 Shortest Path analysis
Figure 8 - CpC231 Shortest Path analysis
Figure 9 - Cpfrc Shortest Path analysis
Figure 10 - CpI19 Shortest Path analysis
Figure 11 - CpP54B96 Shortest Path analysis
Figure 12 - CpPAT10 Shortest Path analysis
Figure 13 - CpPAT10 Degree distribution analysis. Clustering coefficient = 0.407, Correlation = 0.938, R-Squared = 0.790, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 14 - Cp1002 Degree distribution analysis. Clustering coefficient = 0.408, Correlation = 0.933, R-Squared = 0.822, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 15 - Cp267 Degree distribution analysis. Clustering coefficient = 0.402, Correlation = 0.953, R-Squared = 0.785, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 16 - Cp3995 Degree distribution analysis. Clustering coefficient = 0.410, Correlation = 0.933, R-Squared = 0.798, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 17 - Cp4202 Degree distribution analysis. Clustering coefficient = 0.410, Correlation = 0.928, R-Squared = 0.799, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 18 - CpC231 Degree distribution analysis. Clustering coefficient = 0.407, Correlation = 0.936, R-Squared = 0.825, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 19 - Cpfrc Degree distribution analysis. Clustering coefficient = 0.408, Correlation = 0.930, R-Squared = 0.786, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 20 - CpI19 Degree distribution analysis. Clustering coefficient = 0.403, Correlation = 0.932, R-Squared = 0.813, Shapiro-Wilk test = p-value < 2.2e-16.
Figure 21 - CpP54B96 Degree distribution analysis. Clustering coefficient = 0.404, Correlation = 0.935, R-Squared = 0.800, Shapiro-Wilk test = p-value < 2.2e-16.

3.1.8.2 – In silico PPI network validation

Supplementary Pictures S2: In silico PPI network validation. Degree distribution analysis of nine interaction networks formed from 16,000 pairs of interactions randomly selected among all possible distinct interactions of the Corynebacterium pseudotuberculosis Cp267 strain. The distribution of the pairs was analyzed with the NetworkAnalyzer plugin (Assenov et al., 2008). The red line indicates the perfect power-law distribution (Barabási e Oltvai, 2004). All random networks had a normal distribution and a clustering coefficient of 0.007.

Figure 22 – Random interaction network 01. Correlation = -0.064, R-squared = 0.038
Figure 23 - Random interaction network 02. Correlation = -0.015, R-squared = 0.001
Figure 24 - Random interaction network 03. Correlation = -0.028, R-squared = 0.073
Figure 25 - Random interaction network 04. Correlation = -0.031, R-squared = 0.017
Figure 26 - Random interaction network 05. Correlation = -0.072, R-squared = 0.059
Figure 27 - Random interaction network 06. Correlation = -0.049, R-squared = 0.027
Figure 28 - Random interaction network 07. Correlation = -0.042, R-squared = 0.021
Figure 29 - Random interaction network 08. Correlation = -0.029, R-squared = 0.003
Figure 30 - Random interaction network 09. Correlation = -0.012, R-squared = 0.020

Table 3 - Statistical comparison between the Cp ovis predicted networks and random networks.
Organism         Clustering Coefficient  Correlation       R-Squared       Shapiro-Wilk normality test
Cp1002           0.408                   0.933             0.822           p-value < 2.2e-16
Cp267            0.402                   0.953             0.785           p-value < 2.2e-16
Cp3995           0.410                   0.933             0.798           p-value < 2.2e-16
Cp4202           0.410                   0.928             0.799           p-value < 2.2e-16
CpC231           0.407                   0.936             0.825           p-value < 2.2e-16
Cpfrc41          0.408                   0.930             0.786           p-value < 2.2e-16
CpI19            0.403                   0.932             0.813           p-value < 2.2e-16
CpP54B96         0.404                   0.935             0.800           p-value < 2.2e-16
CpPAT10          0.407                   0.938             0.790           p-value < 2.2e-16
Random Networks  0.007 (for all)         -0.012 to -0.072  0.001 to 0.073  Not performed

The Clustering Coefficient, Correlation and R-Squared were calculated by the NetworkAnalyzer plugin (Assenov et al., 2008). The Shapiro-Wilk test was performed in R (Royston, 1982).

3.1.8.2.1 – References
1 Assenov, Y., Ramírez, F., Schelhorn, S.-E., Lengauer, T. & Albrecht, M. Computing topological parameters of biological networks. Bioinformatics 24, 282-284 (2008).
2 Barabási, A. L. & Oltvai, Z. N. Network biology: understanding the cell's functional organization. Nature Reviews Genetics 5, 101-113 (2004).
3 Royston, J. An extension of Shapiro and Wilk's W test for normality to large samples. Applied Statistics, 115-124 (1982).

3.1.8.3 – Analyses of protein clusters

Supplementary Material S3: Analyses of protein clusters formed from the Corynebacterium pseudotuberculosis biovar ovis protein-protein interaction network. The figures provide further details and information: in addition to the proteins that form each cluster, some proteins interacting with the cluster were included. Proteins are represented by their respective coding genes. In the network pictures, the color and size of nodes and edges were configured to show specific properties of the network, always from the lowest to the highest value of the chosen property. The node size (from smallest to largest) and color (ranging from yellow through light green to dark green) represent the property Degree.
The node border size (from smallest to largest) and color (ranging from white through pink to dark red) represent the “Betweenness Centrality” property. The edge color, on a scale from red through yellow and light green to dark green, represents the score from the public database where the interaction was mapped. The lowest score is 0.70 in all networks. The edge width, from thinnest to widest, represents the interaction score pair (ISP). The lowest value of ISP is 0.5625.

3.1.8.3.1 - Complex analysis
Complexes are formed by groups of identical proteins (homomers) or different proteins (heteromers), and their organization is important for performing specific activities in a biological process (Dai et al., 2014). Such complexes are subject to evolutionary selection to form metabolic pathways (Marsh et al., 2013). In an interaction network, complexes are large groups of densely connected proteins forming clusters (Morris et al., 2011). To identify the clusters in the predicted networks, we used the Markov Cluster Algorithm (MCL) with the inflation value set to 3.0 (Van Dongen, 2000), as implemented in the clusterMaker plugin (Morris et al., 2011) available for the Cytoscape software (Shannon et al., 2003). In addition, to validate the interaction networks, a literature search was performed to verify the existence of these clusters in other organisms, in the form of operons or metabolic pathways. For the visualization of the PPI network and the complexes, we used the Circular or the Edge-weighted Spring Embedded Cytoscape layout (Kohl, Wiese e Warscheid, 2011).

3.1.8.3.2 - Ribosomal and RNA polymerase cluster
This complex is a network representation of the protein-protein interactions (PPI) formed during the translational process of ribosomes (ribosomal RNAs + proteins) in C. pseudotuberculosis. It is formed by 53 ribosomal proteins (RP) and four of the five proteins that comprise the RNA polymerase (RNAP). All proteins are conserved in C.
pseudotuberculosis biovar ovis strains, where the presence of transcriptional and translational machinery components is noted. The RPs in the network are encoded by 23 rpl genes (rplBICEMKAQSDNLTFPOVJRWUXY), 10 rpm genes (rpmAEHBDCGIFJ) and 20 rps genes (rpsLBKIDEOJGCMHARSPNFQT) (Haddadin e Harcum, 2005). The RNAP proteins are encoded by the genes rpoA, rpoB, rpoC and rpoZ (Coenye e Vandamme, 2005; Teixeira et al., 2008) (Figure 31). In the interaction network, it can be observed that some operons contain both genes encoding ribosomal proteins and genes encoding RNAP subunits, for example, the rplKAJL-rpoBC operon, which encodes proteins of the large ribosomal subunit and also the β and β' subunits of RNAP (Teixeira et al., 2008). Since in prokaryotes the transcriptional and translational systems are coupled and synchronized in space and time, such information may be relevant for understanding the dependence between these two processes (McGary e Nudler, 2013): when transcripts are generated for RPs, transcripts for RNAP proteins are probably generated as well, and their products then join other components to assemble the respective machinery. Escherichia coli was the first organism to have its ribosomal components (rRNA + proteins) elucidated (Stelzl et al., 2001), and hence it is widely used as a model for studies of ribosomal gene clusters in bacteria, owing to the similarity in the formation and organization of these clusters. In C. glutamicum and C. diphtheriae, eleven gene clusters encoding 42 ribosomal proteins have been described. Compared with the E. coli gene clusters, seven of them are organized in the same way and four have high similarity (Martín et al., 2003). Furthermore, when we look at different bacterial genomes, or even at different strains, we do not observe the conservation of all RPs (Coenye e Vandamme, 2005).
This can possibly modify the pattern of interactions between the components of the translational and transcriptional machinery and somehow influence the expression of different genes in a given environmental condition.

Figure 31 - Network formed by the interaction of RNA polymerase and ribosomal proteins, represented by their encoding genes.

Recent studies have used in vitro analyses to investigate possible physical contact between the components of the ribosomal machinery and RNAP, and hence to determine the influence of one machinery over the other. In one study, it was observed that the complex formed by the proteins encoded by the genes nusG-rpsJ binds RNAP to the 30S subunit of the prokaryotic ribosome (Castro-Roa e Zenkin, 2012). In another study, the S1 protein was also shown to bind RNAP and stimulate transcriptional activity (Sukhodolets e Garges, 2003); these interactions are also observed in the networks of the present study. Other important observations can be made in the network, such as the large number of interactions of the proteins encoded by the genes rpoB, rpoC and rpoA with RPs, and the absence of interactions between the protein encoded by rpoZ and RPs. This can be justified by the fact that rpoZ is a sigma factor responsible for recognizing the binding site. After the beta (β, encoded by rpoB), beta' (β', encoded by rpoC) and alpha (α, encoded by rpoA) subunits form the RNAP, the rpoZ product disconnects from the binding site. The network analysis can also help us select molecular targets for possible drug action. The proteins encoded by the genes rpoA, rpoB and rpoC are highly connected to RPs, so all of them can potentially serve as candidate targets for drug development. An example in the literature is the inhibition of the RNAP β subunit (encoded by the rpoB gene) by the antibiotic rifampicin.
There are also antibiotics, such as tetracycline, paromomycin, spectinomycin and streptomycin, that exert their inhibitory activity on some proteins of the ribosomal 30S complex (Adékambi, Drancourt e Raoult, 2009).

3.1.8.3.3 - Oligopeptide transport system cluster
The Opp transporters, which belong to the ABC (ATP-binding cassette) transporter family, have been identified and characterized in several bacterial species, both gram-positive and gram-negative (Braibant e Gilot, 2000; Monnet, 2003). This system consists of five protein subunits: OppA, responsible for capturing peptides from the extracytoplasmic medium; OppB and OppC, which form the transmembrane channel through which the oligopeptides are transported to the intracellular environment; and OppD and OppF, which are located in the bacterial cytoplasm and are responsible for the hydrolysis of ATP molecules, generating the energy for the internalization of peptides (Braibant e Gilot, 2000). From a genetic point of view, the genes encoding these subunits are organized as the operon oppABCDF (Hiron et al., 2007) (Figure 32). In bacteria, the main function of Opp is probably the acquisition of peptides to be used as a carbon and nitrogen source. In E. coli, it was demonstrated that this system is associated with the internalization of residues of various amino acid types (Naider e Becker, 1975). A study of Lactococcus lactis has shown that a functional peptide transport system is required for the growth of the bacterium in milk (Smid, Plapp e Konings, 1989). According to the generated interaction network, the Opp system is directly linked to the protein dihydrodipicolinate synthase (nanL), which participates in L-lysine biosynthesis, suggesting that this system may be associated with L-lysine metabolism.
Figure 32 - Network formed by the interaction of Opp proteins, represented by their encoding genes.

To date, no study has been conducted to demonstrate the role of the Opp system in the transport of essential and nonessential amino acids in C. pseudotuberculosis. However, it was shown that the Opp system could contribute to the adhesion process of this pathogen. In experimental infection of a murine model, oppD mutant strains showed the same virulence potential as the wild-type strain (Moraes et al., 2014). In Moraxella catarrhalis, it was demonstrated that the Opp system is also involved in the acquisition of arginine and contributes to the fitness and persistence of the pathogen in the respiratory tract (Jones et al., 2014). These studies demonstrate the versatility of the Opp system in pathogenic bacteria.

3.1.8.3.4 - Cobalamin biosynthesis cluster
Cobalamin (CBL, vitamin B12), a member of the class of structurally complex cofactors (Rodionov et al., 2003; Croft et al., 2005), is synthesized by a number of Archaea and Bacteria (Roth, Lawrence e Bobik, 1996; Scott e Roessner, 2002). However, the prosthetic group CBL is essential for the enzymatic activity of several enzymes in all three biological domains (Yin e Bauer, 2013). In Bacteria and Archaea, CBL-dependent enzymes include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, etc. (Rodionov et al., 2003). The biosynthesis pathways of the cofactors CBL, chlorophyll and haem begin with the compound 5-aminolevulinic acid (ALA). Through some enzymatic steps, ALA is converted into uroporphyrinogen III, the last common intermediate compound (Frankenberg, Moser e Jahn, 2003; Heldt et al., 2005; Yin e Bauer, 2013). It is noteworthy that all these cofactors are derived from tetrapyrrole molecules (Rondon, Trzebiatowski e Escalante-Semerena, 1996; Frankenberg, Moser e Jahn, 2003; Heldt et al., 2005).
In the predicted PPI network for the C. pseudotuberculosis CBL complex, we note the presence of several holoenzymes (HemABCDEL) interconnected with the holoenzymes (CobABDFGHJKLMNOQST) (Figure 33).

Figure 33 - Network formed by the interaction of Cob proteins, represented by their encoding genes.

This may suggest a co-evolutionary dependence between the clusters. A parallel can be drawn with Rhodobacter sphaeroides, where excess haem inhibits the enzyme 5-aminolevulinic acid synthase, affecting the biosynthesis of chlorophyll (Yin e Bauer, 2013). Thus, observing the CBL network and the interactions between the different protein clusters, we can assume the existence of several, considerably more complex, regulatory mechanisms. For cobalamin production, multiple steps and structural rearrangement of transmethylation are required (Rodionov et al., 2003). In C. pseudotuberculosis, 15 cob genes catalyze these reactions; most of them are located in the main cob operon, while the remaining genes (cobA, cobB, cobC and cobD) are not. This fact may indicate the contribution of these genes to the external assimilation of vitamin B12 precursors or to secondary processes of de novo biosynthesis, as identified in Pseudomonas denitrificans (Roth, Lawrence e Bobik, 1996). The cbi (cobinamide) gene cluster, responsible for CBL biosynthesis by the anaerobic pathway (Moore e Warren, 2012), is absent from the network, so we can postulate that C. pseudotuberculosis uses solely the aerobic pathway to produce CBL (Rodionov et al., 2003), bearing in mind that C. pseudotuberculosis is a facultative anaerobic microorganism (Dorella et al., 2006).

3.1.8.3.5 - Iron uptake and intracellular regulation cluster
This complex is a representation of the PPI network for the capture process and intracellular regulation of iron (Fe) in C. pseudotuberculosis.
Fe is an essential cofactor for diverse enzymatic activities in different metabolic processes (e.g., DNA replication, ATP synthesis, DNA repair and respiration) in all eukaryotic organisms and various prokaryotes (Smith, 2004; Trost et al., 2010; Schalk, 2013). In pathogenic bacteria such as C. pseudotuberculosis, the Fe+ ion acquisition system contributes to the survival and virulence of the microorganism (Köster, 2001; Kunkle e Schmitt, 2005). A single bacterium can have multiple Fe acquisition systems. This feature is used as a strategy to acquire Fe from different sources and under low availability of this cofactor (Wandersman e Delepelaire, 2004). Thus, the complex represents these multiple systems and consists of 22 proteins encoded by the genes fagABCD, ciuABCD, fecCDE (CD), hmuUVTO, htaA, pstA, fhuD, fpeC1, hemE and dtxR (Figure 34).

Figure 34 - Network formed by the interaction of Iron uptake proteins, represented by their encoding genes.

During the infection process, C. pseudotuberculosis is able to survive and multiply within macrophages and hence escape from the host immune response (Trost et al., 2010). This ability can be related to the use of distinct or multiple siderophores (SIDS) (Correnti e Strong, 2012), synthesized by C. pseudotuberculosis or captured from the external environment (Schalk, 2013). In C. pseudotuberculosis, the SIDS are synthesized by the products of the genes fagD (Contreras et al., 2014) (represented in the network) and ciuE (Trost et al., 2010) (not present in the network). These SIDS probably compete for the iron ion (Fe+) with the iron transporters used by the macrophage (Schalk, 2013). Another source of Fe+ can be the transfer of the prosthetic group heme-Fe to the inside of C. pseudotuberculosis through the hmuT receptor; the interaction between hmuT and hemE can be seen in the network. This interaction occurs for the transfer of Heme-Fe to the inside of C.
pseudotuberculosis; it then undergoes a degradation process, releasing Fe+. In this degradation process, hmuO acts in the cleavage of the tetrapyrrole ring of the Heme-Fe group (Contreras et al., 2014). Additionally, in the network the cell-surface hemin receptor protein (htaA) interacts exclusively with the proteins encoded by the hmuTUV genes, which are responsible for hemin binding and transport. These interactions agree with the literature on C. diphtheriae, in which HtaA was able to acquire hemin from hemoglobin and transport it to the cytosol through an ABC transporter (Allen e Schmitt, 2011). These observations suggest that the interaction network is consistent and that C. pseudotuberculosis can use the same strategy for iron acquisition. The network also contains other systems for capturing iron, such as the Fag, Fec and Ciu proteins, as part of the C. pseudotuberculosis strategy to acquire Fe+. One strategy that has been adopted to combat resistant bacteria is the ‘Trojan Horse’, which uses the iron uptake system to enter and kill the cell. The idea is based on the synthesis of a siderophore-drug complex, which makes the siderophore-mediated iron acquisition pathways potential targets for drug delivery (Górska, Sloderbach e Marszałł, 2014). Recently, a detailed review of the iron acquisition strategies of gram-positive pathogens was published in which the cluster proteins are cited, supporting the integrity of the predicted interaction network. Since iron is important for survival and infection in gram-positive bacteria, the mechanisms of iron acquisition, transport and processing become important areas of study, whose understanding might enable the development of new strategies to combat these organisms (Sheldon e Heinrichs, 2015).
3.1.8.3.6 - Cell division and peptidoglycan biosynthesis
In various bacteria, there is a coupling and fine coordination between the processes related to cell division (cytokinesis), the formation of the peptidoglycan layer that makes up the cell wall, and the DNA replication and segregation systems (Lutkenhaus e Addinall, 1997; Buss et al., 2015). In C. pseudotuberculosis, 36 proteins represent this process, and their interactions are shown in the predicted network (Figure 35), highlighting the FtsZIWHYXE proteins involved in cell division (Lutkenhaus e Addinall, 1997; Errington, Daniel e Scheffers, 2003) and the MurAFDEGIBC proteins responsible for the biosynthesis of peptidoglycans (El Zoeiby, Sanschagrin e Levesque, 2003). In the cytokinesis process, the FtsZ protein plays a central role in the formation of the constriction ring at the cytoplasmic membrane and in the anchoring and recruitment of another set of proteins related to the cell division process (Lutkenhaus e Addinall, 1997; Errington, Daniel e Scheffers, 2003). In the network, the FtsZ protein is highly connected to its neighbors, and these multiple connections can be seen as a representation of the recruitment and anchoring activity conducted by FtsZ. As FtsZ is the main component of the cell division process, it needs to act in harmony with the enzymes involved in the synthesis of the new cell wall (Carballido-López e Errington, 2003). In the C. pseudotuberculosis network, these enzymes are mainly represented by the MurABCDEFGI and mraY proteins, related to the synthesis of the new multilayer peptidoglycan cell wall (Vollmer, Blanot e De Pedro, 2008). Thus, the network shows a possible harmony between the components responsible for peptidoglycan biosynthesis and the FtsZ protein. It is worth noting the role of the FtsW protein in the transport of nascent peptidoglycan to the outside of the plasma membrane.
In the network, we could observe the presence of the proteins encoded by the parA, parB and smc genes, related to the chromosome partitioning process; of soj, with ATPase activity; and of scpA, related to the condensation and segregation of the bacterial chromosome during cytokinesis. These proteins interact mainly with FtsZ, showing that FtsZ serves as a support for these proteins to perform their activities. Complementary approaches using PPI networks can be of great value in overcoming the challenges posed by the increasing number of pathogenic bacteria resistant to several current therapies. Thus, the organization of the network elements and the connections between them can help us identify and select new molecular targets for the development of more effective therapies. Currently, several compounds are being synthesized to inhibit peptidoglycan synthesis and cell division steps (Den Blaauwen, Andreu e Monasterio, 2014). Examples are fosfomycin (phosphomycin), 4-thiazolidinone and phosphinic acid derivatives, which act as inhibitors of MurA, MurB and MurCDEF, respectively (El Zoeiby, Sanschagrin e Levesque, 2003). In this case, the bacterium does not survive because it cannot form the peptidoglycan layers. Inhibitors that block the beginning of cell division by preventing the formation of the constriction ring have been explored and tested against FtsZ, for example sanguinarine, which shows inhibitory activity but is not specific to the target FtsZ (Den Blaauwen, Andreu e Monasterio, 2014). Therefore, further studies are needed to find more efficient inhibitors and more promising targets against various bacteria, especially against C. pseudotuberculosis; the protein-protein interaction networks are an important tool for this purpose.
Figure 35 - Network formed by the interaction of proteins involved in cell division and peptidoglycan biosynthesis, both represented by their encoding genes.

In general, the clusters whose proteins are described in the literature (although in other organisms) support the consistency of our predicted interaction network, reinforcing that the interactions can truly occur in Cp ovis. An example is the iron acquisition cluster, whose participant proteins were cited in a recent review (Sheldon e Heinrichs, 2015). Although the clusters were identified and characterized individually in the interaction networks, it is common for some proteins to interact in several clusters, possibly exerting a different function in each one. This is the case of the Iron uptake, Cobalamin biosynthesis and Heme clusters, whose cooperation was characterized and described in other organisms (Köster, 2001). Likewise, clusters or interactions not previously described, or poorly characterized in the literature, could bring new and relevant information about Cp ovis. From the cluster analysis, we conclude the following: (i) some proteins, operons and interaction participants in the clusters are well described in the literature for other gram-positive organisms, supporting that the predicted interaction networks are biologically feasible for Cp ovis; and (ii) although some proteins and operons are well described in the literature, in some cases the interactions between these elements are not; hence, the interaction network has the potential to contribute more information, leading to a better understanding of Cp ovis and generating new testable hypotheses. The lack of information in the literature regarding certain interactions makes the PPI networks an important tool to better understand cellular behavior and to raise new hypotheses about the biochemical processes of Cp ovis, making it possible to direct future experiments to test the essentiality or druggability of these interactions.
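
The cluster detection step used throughout this section (MCL with the inflation value set to 3.0) can be illustrated with a minimal sketch. The code below is not the clusterMaker implementation used in this work; it is a toy, pure-Python version of the expansion and inflation cycle described by Van Dongen (2000), assuming an unweighted, undirected network with self-loops added to every node. The function name mcl, the node labels a to f and the example edges are hypothetical and do not come from the predicted Cp networks.

```python
def mcl(edges, inflation=3.0, max_iter=100, tol=1e-9):
    """Toy Markov clustering of an undirected edge list (illustrative only)."""
    nodes = sorted({n for edge in edges for n in edge})
    idx = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)

    # Adjacency matrix with self-loops (assumption: unweighted edges).
    m = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for a, b in edges:
        m[idx[a]][idx[b]] = 1.0
        m[idx[b]][idx[a]] = 1.0

    def normalize(mat):
        # Rescale every column so that it sums to 1 (column-stochastic).
        sums = [sum(mat[i][j] for i in range(size)) for j in range(size)]
        return [[mat[i][j] / sums[j] for j in range(size)] for i in range(size)]

    m = normalize(m)
    for _ in range(max_iter):
        # Expansion: square the matrix (random walks of length two).
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)
        ]
        # Inflation: element-wise power, then column re-normalization.
        inflated = normalize([[v ** inflation for v in row] for row in expanded])
        delta = max(
            abs(inflated[i][j] - m[i][j]) for i in range(size) for j in range(size)
        )
        m = inflated
        if delta < tol:
            break

    # Rows that keep mass are attractors; their non-zero columns form a cluster.
    clusters = set()
    for i in range(size):
        members = tuple(nodes[j] for j in range(size) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(list(c) for c in clusters)


# Hypothetical toy network: two triangles joined by a single bridge edge (c-d).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
toy_clusters = mcl(toy_edges, inflation=3.0)
```

In this toy case the two triangles are recovered as separate clusters despite the bridge between them. Higher inflation values generally produce more, smaller clusters; real analyses would typically also use the interaction scores as edge weights, which this sketch omits.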
3.1.8.3.7 - References 1 Dai, Q.-G., Guo, M.-Z., Liu, X.-Y., Teng, Z.-X. & Wang, C.-Y. CPL: Detecting Protein Complexes by Propagating Labels on Protein-Protein Interaction Network. Journal of Computer Science and Technology 29, 1083-1093 (2014). 2 Marsh, J. A. et al. Protein complexes are under evolutionary selection to assemble via ordered pathways. Cell 153, 461-470 (2013). 3 Morris, J. H. et al. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC bioinformatics 12, 436 (2011). 4 Van Dongen, S. A cluster algorithm for graphs. Report-Information systems, 1-40 (2000). 5 Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research 13, 2498-2504 (2003). 6 Kohl, M., Wiese, S. & Warscheid, B. in Data Mining in Proteomics 291-303 (Springer, 2011). 7 Haddadin, F. a. T. & Harcum, S. W. Transcriptome profiles for high‐cell‐density recombinant and wild‐ type Escherichia coli. Biotechnology and bioengineering 90, 127-153 (2005). 8 Teixeira, D. et al. The tufB–secE–nusG–rplKAJL–rpoB gene cluster of the liberibacters: sequence comparisons, phylogeny and speciation. International Journal of Systematic and Evolutionary Microbiology 58, 1414-1421 (2008). 9 Coenye, T. & Vandamme, P. Organisation of the S10, spc and alpha ribosomal protein gene clusters in prokaryotic genomes. FEMS microbiology letters 242, 117-126 (2005). 10 McGary, K. & Nudler, E. RNA polymerase and the ribosome: the close relationship. Current opinion in microbiology 16, 112-117 (2013). 11 Stelzl, U., Connell, S., Nierhaus, K. H. & Wittmann‐Liebold, B. Ribosomal proteins: role in ribosomal functions. eLS (2001). 12 Martı́n, J. F., Barreiro, C., González-Lavado, E. & Barriuso, M. Ribosomal RNA and ribosomal proteins in corynebacteria. J. Biotechnol 104, 41-53 (2003). 13 Castro-Roa, D. & Zenkin, N. In vitro experimental system for analysis of transcription–translation coupling. Nucleic acids research 40, e45-e45 (2012). 
114 14 Sukhodolets, M. V. & Garges, S. Interaction of Escherichia coli RNA polymerase with the ribosomal protein S1 and the Sm-like ATPase Hfq. Biochemistry 42, 8022-8034 (2003). 15 Adékambi, T., Drancourt, M. & Raoult, D. The rpoB gene as a tool for clinical microbiologists. Trends in microbiology 17, 37-45 (2009). 16 Monnet, V. Bacterial oligopeptide-binding proteins. Cellular and Molecular Life Sciences CMLS 60, 2100-2114 (2003). 17 Braibant, M. & Gilot, P. The ATP binding cassette (ABC) transport systems of Mycobacterium tuberculosis. FEMS microbiology reviews 24, 449-467 (2000). 18 Hiron, A., Borezée-Durant, E., Piard, J.-C. & Juillard, V. Only one of four oligopeptide transport systems mediates nitrogen nutrition in Staphylococcus aureus. Journal of bacteriology 189, 5119-5129 (2007). 19 Naider, F. & Becker, J. M. Multiplicity of oligopeptide transport systems in Escherichia coli. Journal of bacteriology 122, 1208-1215 (1975). 20 Smid, E. J., Plapp, R. & Konings, W. Peptide uptake is essential for growth of Lactococcus lactis on the milk protein casein. Journal of bacteriology 171, 6135-6140 (1989). 21 Moraes, P. M. et al. Characterization of the Opp Peptide Transporter of Corynebacterium pseudotuberculosis and Its Role in Virulence and Pathogenicity. BioMed research international 2014 (2014). 22 Jones, M. M. et al. Role of the Oligopeptide Permease ABC Transporter of Moraxella catarrhalis in Nutrient Acquisition and Persistence in the Respiratory Tract. Infection and immunity 82, 4758-4766 (2014). 23 Croft, M. T., Lawrence, A. D., Raux-Deery, E., Warren, M. J. & Smith, A. G. Algae acquire vitamin B12 through a symbiotic relationship with bacteria. Nature 438, 90-93 (2005). 24 Rodionov, D. A., Vitreschak, A. G., Mironov, A. A. & Gelfand, M. S. Comparative genomics of the vitamin B12 metabolism and regulation in prokaryotes. Journal of Biological Chemistry 278, 41148- 41159 (2003). 25 Roth, J., Lawrence, J. & Bobik, T. 
Cobalamin (coenzyme B12): synthesis and biological significance. Annual Reviews in Microbiology 50, 137-181 (1996). 26 Scott, A. & Roessner, C. Biosynthesis of cobalamin (vitamin B (12)). Biochemical Society Transactions 30, 613-620 (2002). 27 Yin, L. & Bauer, C. E. Controlling the delicate balance of tetrapyrrole biosynthesis. Philosophical Transactions of the Royal Society of London B: Biological Sciences 368, 20120262 (2013). 28 Frankenberg, N., Moser, J. & Jahn, D. Bacterial heme biosynthesis and its biotechnological application. Applied microbiology and biotechnology 63, 115-127 (2003). 29 Heldt, D. et al. Aerobic synthesis of vitamin B12: ring contraction and cobalt chelation. Biochemical Society Transactions 33, 815-819 (2005). 30 Rondon, M. R., Trzebiatowski, J. R. & Escalante-Semerena, J. C. Biochemistry and molecular genetics of cobalamin biosynthesis. Progress in nucleic acid research and molecular biology 56, 347-384 (1996). 31 Moore, S. & Warren, M. The anaerobic biosynthesis of vitamin B12. Biochemical Society Transactions 40, 581 (2012). 115 32 Dorella, F. A., Pacheco, L. G. C., Oliveira, S. C., Miyoshi, A. & Azevedo, V. Corynebacterium pseudotuberculosis: microbiology, biochemical properties, pathogenesis and molecular studies of virulence. Veterinary research 37, 201-218 (2006). 33 Smith, J. L. The physiological role of ferritin-like compounds in bacteria. Critical reviews in microbiology 30, 173-185 (2004). 34 Schalk, I. J. Innovation and Originality in the Strategies Developed by Bacteria To Get Access to Iron. Chembiochem 14, 293-294 (2013). 35 Trost, E. et al. The complete genome sequence of Corynebacterium pseudotuberculosis FRC41 isolated from a 12-year-old girl with necrotizing lymphadenitis reveals insights into gene-regulatory networks contributing to virulence. BMC genomics 11, 728 (2010). 36 Köster, W. ABC transporter-mediated uptake of iron, siderophores, heme and vitamin B 12. Research in microbiology 152, 291-301 (2001). 37 Kunkle, C. 
A. & Schmitt, M. P. Analysis of a DtxR-regulated iron transport and siderophore biosynthesis gene cluster in Corynebacterium diphtheriae. Journal of bacteriology 187, 422-433 (2005). 38 Wandersman, C. & Delepelaire, P. Bacterial iron sources: from siderophores to hemophores. Annu. Rev. Microbiol. 58, 611-647 (2004). 39 Correnti, C. & Strong, R. K. Mammalian siderophores, siderophore-binding lipocalins, and the labile iron pool. Journal of Biological Chemistry 287, 13524-13531 (2012). 40 Contreras, H., Chim, N., Credali, A. & Goulding, C. W. Heme uptake in bacterial pathogens. Current opinion in chemical biology 19, 34-41 (2014). 41 Allen, C. E. & Schmitt, M. P. Novel hemin binding domains in the Corynebacterium diphtheriae HtaA protein interact with hemoglobin and are critical for heme iron utilization by HtaA. Journal of bacteriology 193, 5374-5385 (2011). 42 Górska, A., Sloderbach, A. & Marszałł, M. P. Siderophore–drug complexes: potential medicinal applications of the ‘Trojan horse’strategy. Trends in pharmacological sciences 35, 442-449 (2014). 43 Sheldon, J. R. & Heinrichs, D. E. Recent developments in understanding the iron acquisition strategies of gram positive pathogens. FEMS microbiology reviews, fuv009 (2015). 44 Lutkenhaus and, J. & Addinall, S. Bacterial cell division and the Z ring. Annual review of biochemistry 66, 93-116 (1997). 45 Buss, J. et al. A Multi-layered Protein Network Stabilizes the Escherichia coli FtsZ-ring and Modulates Constriction Dynamics. (2015). 46 Errington, J., Daniel, R. A. & Scheffers, D.-J. Cytokinesis in bacteria. Microbiology and Molecular Biology Reviews 67, 52-65 (2003). 47 El Zoeiby, A., Sanschagrin, F. & Levesque, R. C. Structure and function of the Mur enzymes: development of novel inhibitors. Molecular microbiology 47, 1-12 (2003). 48 Carballido-López, R. & Errington, J. A dynamic bacterial cytoskeleton. Trends in cell biology 13, 577- 583 (2003). 49 Vollmer, W., Blanot, D. & De Pedro, M. A. 
Peptidoglycan structure and architecture. FEMS microbiology reviews 32, 149-167 (2008).
50 den Blaauwen, T., Andreu, J. M. & Monasterio, O. Bacterial cell division proteins as antibiotic targets. Bioorganic chemistry 55, 27-38 (2014).

3.1.8.4 – Cp267 PPI network
Figure 36 - Cp267 PPI network

3.1.8.5 – Cp3995 PPI network
Figure 37 - Cp3995 PPI network

3.1.8.6 – Cp4202 PPI network
Figure 38 - Cp4202 PPI network

3.1.8.7 – CpC231 PPI network
Figure 39 - CpC231 PPI network

3.1.8.8 – CpFRC PPI network
Figure 40 - CpFRC PPI network

3.1.8.9 – CpI19 PPI network
Figure 41 - CpI19 PPI network

3.1.8.10 – CpP54B96 PPI network
Figure 42 - CpP54B96 PPI network

3.1.8.11 – CpPAT10 PPI network
Figure 43 - CpPAT10 PPI network

3.1.8.12 – Cp1002 PPI network
Figure 44 - Cp1002 PPI network

3.1.8.13 – List of top 15% proteins with higher degree against DEG
Supplementary Material
Supplementary Table S13: List of the top 15% of proteins by interaction degree, totaling 181 essential hub proteins. The amino acid sequences of the hub proteins were compared against the bacterial protein sequences in the Database of Essential Genes (DEG) (Zhang, Ou and Zhang, 2004), v. 11.2, updated on July 3, 2015.

DEG Blast Genome Result - 8Xblk57KOz
Job ID: 8Xblk57KOz, completed on Tue Jul 28 01:32:28 2015, Beijing time.
Organism: Acinetobacter baylyi ADP1; Bacillus subtilis 168; Bacteroides fragilis 638R; Bacteroides thetaiotaomicron VPI-548; Burkholderia pseudomallei K96243; Burkholderia thailandensis E264; Campylobacter jejuni subsp.
jejuni NCTC 11168 = ATCC 700819; Caulobacter crescentus; Escherichia coli MG1655 I; Escherichia coli MG1655 II; Francisella novicida U112; Haemophilus influenzae Rd KW20; Helicobacter pylori 26695; Mycobacterium tuberculosis H37Rv; Mycobacterium tuberculosis H37Rv II; Mycobacterium tuberculosis H37Rv III; Mycoplasma genitalium G37; Mycoplasma pulmonis UAB CTIP; Porphyromonas gingivalis ATCC 33277; Pseudomonas aeruginosa PAO1; Pseudomonas aeruginosa PAO1; Pseudomonas aeruginosa UCBPP-PA14; Salmonella enterica serovar Typhi; Pseudomonas aeruginosa PAO1; Salmonella enterica serovar Typhimurium SL1344; Salmonella enterica subsp. enterica serovar Typhimurium str. 14028S; Salmonella typhimurium LT2; Shewanella oneidensis MR-1; Sphingomonas wittichii RW1; Staphylococcus aureus N315; Staphylococcus aureus NCTC 8325; Streptococcus pneumoniae; Streptococcus pyogenes MGAS5448; Streptococcus pyogenes NZ131; Streptococcus sanguinis; Vibrio cholerae N16961;
Parameters: deg.py -i /var/www/tubic/cgi-bin/blast/temp_seq/8Xblk57KOz/seq.txt -db /var/www/tubic/cgi-bin/blast/temp_seq/8Xblk57KOz/db -type seq -score 100 -email edson.folador@gmail.com -job 8Xblk57KOz -F F -e 0.00001 -M BLOSUM62 -g T -v 100 -b 100 -blastprogram blastp
Total protein-coding genes in your sequence: 181 genes
In your sequence, the No. of genes having homologs with DEG: 180 genes.
In DEG, the No. of genes having homologs with your sequence: 4356 genes.

Your Query | Protein | No. of homologs in DEG | DEG AC Number
ackA 96 | acetate kinase | 6 | DEG10140081; DEG10060294; DEG10180359; DEG10030589; DEG10220242; DEG10020202
adk 172 | adenylate kinase | 24 | DEG10170313; DEG10160162; DEG10010056; DEG10030207; DEG10060142; DEG10120257; DEG10340158; DEG10320059; DEG10130162; DEG10210026; DEG10190055; DEG10110036; DEG10310077; DEG10330165; DEG10240056; DEG10140221; DEG10290187; DEG10180089; DEG10380017; DEG10220167; DEG10020257; DEG10350054; DEG10200155; DEG10070123
alaS 106 | alanyl-tRNA synthetase | 29 | DEG10220278; DEG10380163; DEG10270472; DEG10050295; DEG10340311; DEG10060236; DEG10160188; DEG10360038; DEG10290291; DEG10100415; DEG10200338; DEG10030105; DEG10010189; DEG10110153; DEG10170223; DEG10340111; DEG10350227; DEG10020178; DEG10230270; DEG10130175; DEG10320224; DEG10140212; DEG10250501; DEG10330190; DEG10210070; DEG10310060; DEG10180418; DEG10370149; DEG10120181
apt 72 | Adenine phosphoribosyltransferase | 8 | DEG10080089; DEG10030222; DEG10290185; DEG10050439; DEG10310120; DEG10180087; DEG10140122; DEG10060229
argC 86 | N-acetyl-gamma-glutamyl-phosphate reductase | 7 | DEG10270317; DEG10130194; DEG10180564; DEG10280024; DEG10250346; DEG10100280; DEG10300019
argF 115 | ornithine carbamoyltransferase | 9 | DEG10130179; DEG10130195; DEG10280190; DEG10100218; DEG10250256; DEG10100284; DEG10280094; DEG10270250; DEG10240054
argG 77 | argininosuccinate synthase | 7 | DEG10280324; DEG10130167; DEG10240014; DEG10350178; DEG10270318; DEG10100285; DEG10250348
argS 112 | arginyl-tRNA synthetase | 27 | DEG10160081; DEG10230168; DEG10240020; DEG10180318; DEG10120144; DEG10350020; DEG10270225; DEG10380228; DEG10140084; DEG10330083; DEG10320175; DEG10340516; DEG10290353; DEG10170056; DEG10200407; DEG10060311; DEG10100191; DEG10010259; DEG10210210; DEG10360303; DEG10130019; DEG10190126; DEG10220065; DEG10250230; DEG10280448; DEG10020056; DEG10370219
aroB 86 | 3-dehydroquinate synthase | 7 | DEG10130454; DEG10280002; DEG10050099; DEG10250498; DEG10310131; DEG10100410; DEG10200369
aroC 125 | Chorismate synthase | 9 | DEG10080112; DEG10050092; DEG10130248; DEG10330069; DEG10360070; DEG10250499; DEG10280523; DEG10100412; DEG10200378
asd 98 | Aspartate-semialdehyde dehydrogenase | 16 | DEG10010139; DEG10220113; DEG10330298; DEG10130082; DEG10360111; DEG10340074; DEG10050206; DEG10240377; DEG10230221; DEG10160294; DEG10150120; DEG10250719; DEG10100582; DEG10320294; DEG10190236; DEG10180519
aspS 101 | Aspartyl-tRNA synthetase | 71 | DEG10230041; DEG10290198; DEG10160082; DEG10230108; DEG10270474; DEG10070142; DEG10160126; DEG10150090; DEG10290095; DEG10060109; DEG10030264; DEG10370220; DEG10100563; DEG10010153; DEG10340426; DEG10180315; DEG10330200; DEG10120021; DEG10020181; DEG10270631; DEG10350280; DEG10140128; DEG10380068; DEG10380229; DEG10070078; DEG10280009; DEG10350040; DEG10320174; DEG10320231; DEG10330084; DEG10110114; DEG10240217; DEG10130100; DEG10220240; DEG10180155; DEG10030140; DEG10360151; DEG10370076; DEG10030238; DEG10380076; DEG10330128; DEG10170227; DEG10170035; DEG10130158; DEG10290234; DEG10340280; DEG10110060; DEG10060092; DEG10370069; DEG10080027; DEG10340389; DEG10120036; DEG10230258; DEG10190083; DEG10190125; DEG10220253; DEG10320103; DEG10220043; DEG10160197; DEG10250503; DEG10010018; DEG10010192; DEG10020159; DEG10250697; DEG10020039; DEG10360043; DEG10110164; DEG10210138; DEG10240043; DEG10200246; DEG10060025
atpA 127 | ATP synthase subunit alpha | 44 | DEG10030559; DEG10380085; DEG10200418; DEG10380087; DEG10060328; DEG10150334; DEG10200416; DEG10360331; DEG10120357; DEG10230082; DEG10120359; DEG10210080; DEG10130026; DEG10350417; DEG10310006; DEG10290397; DEG10350419; DEG10130028; DEG10270237; DEG10250245; DEG10140165; DEG10280104; DEG10280102; DEG10250243; DEG10100207; DEG10240356; DEG10100205; DEG10240358; DEG10060330; DEG10270239; DEG10030561; DEG10070182; DEG10020238; DEG10370084; DEG10070184; DEG10360329; DEG10080206; DEG10140095; DEG10140097; DEG10140079; DEG10370086; DEG10210078; DEG10140273; DEG10290395
atpD 111 | ATP synthase subunit beta | 52 | DEG10030559; DEG10380085; DEG10200418; DEG10310006; DEG10060328; DEG10150334; DEG10200416; DEG10080206; DEG10120357; DEG10360316; DEG10120359; DEG10210080; DEG10130026; DEG10240272; DEG10350417; DEG10380087; DEG10290397; DEG10290395; DEG10340350; DEG10130028; DEG10230082; DEG10250245; DEG10140165; DEG10280104; DEG10280102; DEG10250243; DEG10240358; DEG10240356; DEG10100205; DEG10210078; DEG10100207; DEG10060330; DEG10120308; DEG10100196; DEG10270239; DEG10030561; DEG10070182; DEG10020238; DEG10270230; DEG10270237; DEG10070184; DEG10360329; DEG10360331; DEG10140095; DEG10140097; DEG10250235; DEG10140079; DEG10220347; DEG10370086; DEG10370084; DEG10140273; DEG10350419
atpG 85 | ATP synthase subunit gamma | 19 | DEG10130027; DEG10350418; DEG10240357; DEG10140096; DEG10250244; DEG10290396; DEG10200417; DEG10100206; DEG10080207; DEG10060329; DEG10360330; DEG10270238; DEG10210079; DEG10030560; DEG10070183; DEG10380086; DEG10280103; DEG10370085; DEG10120358
carA 93 | carbamoyl-phosphate synthase small chain | 6 | DEG10250259; DEG10130343; DEG10100221; DEG10350093; DEG10240302; DEG10280051
cysK 78 | cysteine synthase | 7 | DEG10350174; DEG10130192; DEG10120293; DEG10270228; DEG10100194; DEG10250233; DEG10350313
dapA 74 | Dihydrodipicolinate synthase | 21 | DEG10330063; DEG10270496; DEG10080171; DEG10230047; DEG10350274; DEG10340286; DEG10320197; DEG10290178; DEG10220439; DEG10180377; DEG10160062; DEG10360048; DEG10130493; DEG10200132; DEG10250534; DEG10100437; DEG10280080; DEG10190142; DEG10010140; DEG10050109; DEG10150232
dapB 72 | Dihydrodipicolinate reductase | 21 | DEG10340026; DEG10190004; DEG10070080; DEG10240136; DEG10220433; DEG10180009; DEG10270499; DEG10200436; DEG10080085; DEG10030471; DEG10050467; DEG10280033; DEG10130496; DEG10150289; DEG10360274; DEG10330006; DEG10320008; DEG10290098; DEG10010156; DEG10250539; DEG10160006
dnaB 90 | Replicative DNA helicase | 30 | DEG10170008; DEG10330337; DEG10010264; DEG10050567; DEG10230028; DEG10190283; DEG10020007; DEG10370223; DEG10350103; DEG10130289; DEG10290335; DEG10120221; DEG10180580; DEG10030066; DEG10060076; DEG10270016; DEG10340304; DEG10210214; DEG10380232; DEG10140196; DEG10220277; DEG10240260; DEG10160333; DEG10280467; DEG10200206; DEG10250016; DEG10100006; DEG10320337; DEG10070109; DEG10360282
dnaG 109 | DNA primase | 25 | DEG10120215; DEG10130353; DEG10240367; DEG10100370; DEG10010179; DEG10210086; DEG10060211; DEG10230266; DEG10320247; DEG10170208; DEG10140167; DEG10200373; DEG10370090; DEG10250453; DEG10380091; DEG10050169; DEG10270420; DEG10220377; DEG10180456; DEG10360024; DEG10160216; DEG10020170; DEG10070161; DEG10290119; DEG10330219
dnaK 239 | Chaperone protein DnaK | 46 | DEG10170214; DEG10240007; DEG10290207; DEG10150262; DEG10250060; DEG10220194; DEG10290096; DEG10240157; DEG10160001; DEG10360275; DEG10210188; DEG10200006; DEG10270062; DEG10230243; DEG10080013; DEG10160241; DEG10180385; DEG10380203; DEG10340449; DEG10350009; DEG10240216; DEG10130207; DEG10100038; DEG10280144; DEG10200186; DEG10330001; DEG10370193; DEG10290351; DEG10190198; DEG10180486; DEG10060246; DEG10220183; DEG10230317; DEG10140074; DEG10360243; DEG10050129; DEG10110001; DEG10020173; DEG10320001; DEG10330244; DEG10180002; DEG10010198; DEG10030073; DEG10360162; DEG10310027; DEG10280075
efp 133 | elongation factor P | 15 | DEG10140194; DEG10350297; DEG10290217; DEG10180585; DEG10270470; DEG10200101; DEG10060017; DEG10100408; DEG10070028; DEG10240206; DEG10250496; DEG10120006; DEG10030245; DEG10020167; DEG10170204
engA 105 | GTP-binding protein EngA | 66 | DEG10130096; DEG10030494; DEG10130311; DEG10150286; DEG10200195; DEG10170166; DEG10170194; DEG10060007; DEG10010137; DEG10120258; DEG10010182; DEG10150241; DEG10280349; DEG10220018; DEG10050356; DEG10350005; DEG10230249; DEG10060268; DEG10360036; DEG10240278; DEG10160061; DEG10380040; DEG10220019; DEG10070211; DEG10180396; DEG10140009; DEG10360159; DEG10170349; DEG10140144; DEG10240204; DEG10020161; DEG10140023; DEG10320198; DEG10330062; DEG10240293; DEG10120365; DEG10290279; DEG10060274; DEG10010162; DEG10170210; DEG10340439; DEG10120110; DEG10360266; DEG10050005; DEG10260085; DEG10130059; DEG10180543; DEG10160053; DEG10100299; DEG10320205; DEG10340440; DEG10150084; DEG10230250; DEG10020137; DEG10380059; DEG10250362; DEG10370045; DEG10190143; DEG10200205; DEG10180380; DEG10210158; DEG10110143; DEG10190149; DEG10330054; DEG10290131; DEG10380130
eno 232 | enolase | 21 | DEG10010237; DEG10100156; DEG10380080; DEG10350276; DEG10130245; DEG10320227; DEG10060336; DEG10140199; DEG10290297; DEG10250192; DEG10110158; DEG10170081; DEG10370079; DEG10330196; DEG10020073; DEG10270189; DEG10160193; DEG10210097; DEG10070168; DEG10190165; DEG10120154
ffh 123 | Signal recognition particle protein | 56 | DEG10240125; DEG10350369; DEG10270524; DEG10100467; DEG10330294; DEG10380066; DEG10280268; DEG10060239; DEG10230083; DEG10350024; DEG10110193; DEG10320217; DEG10280321; DEG10070065; DEG10370131; DEG10080130; DEG10020123; DEG10160183; DEG10340351; DEG10020124; DEG10070070; DEG10230049; DEG10210140; DEG10180523; DEG10180407; DEG10330185; DEG10190158; DEG10120191; DEG10030109; DEG10130268; DEG10170147; DEG10170146; DEG10320298; DEG10340288; DEG10120343; DEG10010122; DEG10010123; DEG10160290; DEG10190240; DEG10240026; DEG10380142; DEG10060035; DEG10050261; DEG10200459; DEG10220251; DEG10290384; DEG10200455; DEG10220042; DEG10210115; DEG10250568; DEG10030018; DEG10370067; DEG10140132; DEG10140264; DEG10290133; DEG10250567
fmt 116 | Methionyl-tRNA formyltransferase | 38 | DEG10240003; DEG10330331; DEG10060300; DEG10180208; DEG10010113; DEG10340404; DEG10270171; DEG10050569; DEG10310005; DEG10320263; DEG10230268; DEG10380180; DEG10070214; DEG10160327; DEG10250267; DEG10290009; DEG10190203; DEG10030007; DEG10070114; DEG10210163; DEG10290147; DEG10370165; DEG10180489; DEG10100227; DEG10150004; DEG10020090; DEG10140017; DEG10270255; DEG10050518; DEG10220436; DEG10240263; DEG10250176; DEG10280133; DEG10280463; DEG10360007; DEG10130497; DEG10170137; DEG10120183
folA 71 | Dihydrofolate reductase | 25 | DEG10060193; DEG10120040; DEG10340385; DEG10150012; DEG10010151; DEG10070193; DEG10140203; DEG10250537; DEG10360015; DEG10200310; DEG10280411; DEG10180010; DEG10290317; DEG10130087; DEG10170184; DEG10380113; DEG10220457; DEG10350306; DEG10190005; DEG10370107; DEG10210110; DEG10050323; DEG10030078; DEG10320009; DEG10020156
folD 126 | bifunctional protein folD | 20 | DEG10290173; DEG10140091; DEG10150189; DEG10270593; DEG10160158; DEG10070147; DEG10130339; DEG10350282; DEG10320063; DEG10060008; DEG10310111; DEG10120105; DEG10360076; DEG10100532; DEG10330161; DEG10240209; DEG10190059; DEG10250662; DEG10110039; DEG10010172
frr 147 | Ribosome-recycling factor (RRF) | 30 | DEG10230126; DEG10130197; DEG10290153; DEG10270518; DEG10010131; DEG10330037; DEG10120044; DEG10050292; DEG10340098; DEG10150096; DEG10320036; DEG10200260; DEG10190031; DEG10240237; DEG10250559; DEG10070051; DEG10210145; DEG10280062; DEG10370059; DEG10060354; DEG10030447; DEG10100457; DEG10310023; DEG10160036; DEG10380056; DEG10140072; DEG10170158; DEG10360146; DEG10220391; DEG10180042
ftsH 90 | cell division protein | 26 | DEG10190184; DEG10080192; DEG10120163; DEG10200390; DEG10380004; DEG10360272; DEG10330229; DEG10250396; DEG10140307; DEG10160226; DEG10100569; DEG10110173; DEG10290109; DEG10240299; DEG10340120; DEG10130342; DEG10350095; DEG10350460; DEG10250704; DEG10370004; DEG10220007; DEG10230277; DEG10280402; DEG10020038; DEG10060369; DEG10030130
ftsY 91 | cell division protein FtsY | 58 | DEG10230049; DEG10210115; DEG10170147; DEG10240026; DEG10330294; DEG10280268; DEG10060239; DEG10070070; DEG10350024; DEG10110193; DEG10200459; DEG10280321; DEG10020124; DEG10350369; DEG10370131; DEG10080130; DEG10020123; DEG10160183; DEG10340351; DEG10380066; DEG10230083; DEG10240125; DEG10210140; DEG10180523; DEG10180407; DEG10320298; DEG10190240; DEG10120191; DEG10140264; DEG10130268; DEG10270524; DEG10170146; DEG10330185; DEG10340288; DEG10240293; DEG10120343; DEG10010122; DEG10010123; DEG10160290; DEG10190158; DEG10380142; DEG10100467; DEG10060035; DEG10050261; DEG10320217; DEG10220251; DEG10290384; DEG10200455; DEG10220042; DEG10070065; DEG10250568; DEG10030018; DEG10370067; DEG10140132; DEG10030135; DEG10030109; DEG10290133; DEG10250567
ftsZ 184 | Cell division protein ftsZ | 31 | DEG10340057; DEG10290360; DEG10350376; DEG10030475; DEG10070204; DEG10310088; DEG10140186; DEG10230203; DEG10160021; DEG10190018; DEG10110013; DEG10200339; DEG10370157; DEG10100322; DEG10210062; DEG10380170; DEG10330022; DEG10240115; DEG10050413; DEG10120032; DEG10010109; DEG10080167; DEG10360220; DEG10060191; DEG10180024; DEG10280449; DEG10130478; DEG10250404; DEG10020110; DEG10320022; DEG10170131
fusA 171 | Elongation factor G | 86 | DEG10150286; DEG10020086; DEG10210198; DEG10100449; DEG10350413; DEG10180508; DEG10180509; DEG10340483; DEG10110170; DEG10340049; DEG10120051; DEG10060114; DEG10120365; DEG10360266; DEG10350405; DEG10350406; DEG10180471; DEG10340492; DEG10250133; DEG10250132; DEG10020174; DEG10140150; DEG10030135; DEG10230160; DEG10110190; DEG10010032; DEG10160222; DEG10180568; DEG10370036; DEG10280414; DEG10320291; DEG10120342; DEG10220335; DEG10160298; DEG10230195; DEG10300067; DEG10260053; DEG10140071; DEG10100100; DEG10130059; DEG10020051; DEG10020052; DEG10190182; DEG10170048; DEG10170049; DEG10010137; DEG10350216; DEG10220422; DEG10280185; DEG10010033; DEG10270119; DEG10130160; DEG10140160; DEG10330302; DEG10170166; DEG10270506; DEG10200381; DEG10100099; DEG10380031; DEG10050188; DEG10220060; DEG10020137; DEG10380190; DEG10290114; DEG10320249; DEG10130313; DEG10050638; DEG10060366; DEG10330225; DEG10130146; DEG10370177; DEG10200009; DEG10240293; DEG10370071; DEG10070012; DEG10290030; DEG10060071; DEG10290087; DEG10230154; DEG10020099; DEG10270120; DEG10220276; DEG10360202; DEG10190231; DEG10210136; DEG10250549
gap 122 | glyceraldehyde-3-phosphate dehydrogenase | 26 | DEG10060242; DEG10230289; DEG10100232; DEG10270261; DEG10020194; DEG10280266; DEG10220016; DEG10340137; DEG10370037; DEG10250274; DEG10170077; DEG10160091; DEG10130309; DEG10210197; DEG10050001; DEG10180303; DEG10190123; DEG10380032; DEG10360021; DEG10130176; DEG10140018; DEG10320124; DEG10310185; DEG10330093; DEG10020070; DEG10070229
glmS 87 | glucosamine--fructose-6-phosphate | 27 | DEG10100542; DEG10150333; DEG10370138; DEG10120122; DEG10250670; DEG10020242; DEG10180545; DEG10200027; DEG10290391; DEG10190260; DEG10070011; DEG10070113; DEG10170302; DEG10240015; DEG10280494; DEG10130187; DEG10380152; DEG10010067; DEG10360327; DEG10210196; DEG10250154; DEG10350017; DEG10270602; DEG10270149; DEG10310182; DEG10100131; DEG10320312
gltA 125 | Citrate synthase | 6 | DEG10270203; DEG10130349; DEG10250165; DEG10240376; DEG10350491; DEG10280364
gltX1 129 | glutamyl-tRNA synthetase | 49 | DEG10050487; DEG10130234; DEG10030209; DEG10270537; DEG10230068; DEG10230162; DEG10240227; DEG10340318; DEG10050111; DEG10190138; DEG10150119; DEG10280365; DEG10220425; DEG10320075; DEG10210203; DEG10150191; DEG10360112; DEG10160067; DEG10370032; DEG10380029; DEG10160144; DEG10020041; DEG10070232; DEG10180117; DEG10140266; DEG10130461; DEG10330147; DEG10360074; DEG10100479; DEG10290169; DEG10170036; DEG10330068; DEG10120137; DEG10320192; DEG10120039; DEG10250585; DEG10350228; DEG10030428; DEG10010021; DEG10310173; DEG10180366; DEG10280246; DEG10350266; DEG10340494; DEG10220100; DEG10200249; DEG10060373; DEG10110046; DEG10190068
glyA 251 | Serine hydroxymethyltransferase | 17 | DEG10350437; DEG10350340; DEG10360253; DEG10250204; DEG10120284; DEG10130263; DEG10150275; DEG10280131; DEG10010253; DEG10020234; DEG10270194; DEG10240161; DEG10160056; DEG10140131; DEG10060325; DEG10330057; DEG10180392
gmk 99 | guanylate kinase | 27 | DEG10130451; DEG10230101; DEG10010111; DEG10120168; DEG10340379; DEG10180540; DEG10140269; DEG10380181; DEG10060088; DEG10080056; DEG10250261; DEG10200213; DEG10240176; DEG10210164; DEG10320306; DEG10290064; DEG10100223; DEG10330282; DEG10050642; DEG10360324; DEG10160278; DEG10350326; DEG10280276; DEG10220294; DEG10370166; DEG10170134; DEG10190251
greA 82 | Transcription elongation factor GreA | 8 | DEG10170220; DEG10250200; DEG10070073; DEG10140210; DEG10310029; DEG10020176; DEG10080151; DEG10060232
groEL 105 | Chaperonin | 26 | DEG10170284; DEG10060323; DEG10010077; DEG10360216; DEG10350338; DEG10270080; DEG10340356; DEG10380223; DEG10320342; DEG10030742; DEG10180584; DEG10100059; DEG10250085; DEG10120323; DEG10110220; DEG10290080; DEG10330342; DEG10100537; DEG10240163; DEG10230091; DEG10210039; DEG10220290; DEG10160337; DEG10370214; DEG10070105; DEG10200093
groEL1 125 | Chaperonin GroEL | 26 | DEG10170284; DEG10060323; DEG10010077; DEG10360216; DEG10350338; DEG10270080; DEG10340356; DEG10380223; DEG10320342; DEG10030742; DEG10180584; DEG10100059; DEG10250085; DEG10120323; DEG10110220; DEG10290080; DEG10330342; DEG10100537; DEG10200093; DEG10230091; DEG10210039; DEG10220290; DEG10160337; DEG10370214; DEG10070105; DEG10240163
guaA 142 | GMP synthase | 24 | DEG10170020; DEG10270006; DEG10050500; DEG10280368; DEG10270597; DEG10020024; DEG10250005; DEG10360157; DEG10120208; DEG10100221; DEG10350093; DEG10050104; DEG10250666; DEG10350246; DEG10100534; DEG10250259; DEG10130295; DEG10240249; DEG10130018; DEG10080069; DEG10220125; DEG10280158; DEG10240302; DEG10280051
guaB 93 | Inosine-5'-monophosphate dehydrogenase | 19 | DEG10240247; DEG10150086; DEG10220288; DEG10070111; DEG10020023; DEG10240037; DEG10270599; DEG10100536; DEG10080148; DEG10250667; DEG10360158; DEG10280044; DEG10250668; DEG10010005; DEG10270598; DEG10130475; DEG10310139; DEG10350247; DEG10200324
gyrA 158 | DNA gyrase subunit A | 50 | DEG10060003; DEG10200196; DEG10290228; DEG10320187; DEG10170177; DEG10120126; DEG10230048; DEG10170005; DEG10330078; DEG10200197; DEG10270005; DEG10150117; DEG10180351; DEG10160208; DEG10190132; DEG10180449; DEG10020004; DEG10140142; DEG10370111; DEG10110131; DEG10280495; DEG10280027; DEG10210120; DEG10210122; DEG10250004; DEG10290331; DEG10130327; DEG10140048; DEG10060172; DEG10330211; DEG10340287; DEG10030489; DEG10150298; DEG10220082; DEG10350321; DEG10010149; DEG10380137; DEG10220187; DEG10010004; DEG10130032; DEG10230219; DEG10370127; DEG10160075; DEG10120318; DEG10240181; DEG10320241; DEG10020150; DEG10380117; DEG10360286; DEG10360128
gyrB 121 | DNA gyrase subunit B | 57 | DEG10030490; DEG10340055; DEG10170176; DEG10060002; DEG10170004; DEG10270004; DEG10120327; DEG10190175; DEG10050551; DEG10160209; DEG10200287; DEG10350001; DEG10230200; DEG10220076; DEG10370110; DEG10020003; DEG10130002; DEG10140141; DEG10250003; DEG10020149; DEG10370077; DEG10210123; DEG10290006; DEG10290333; DEG10030004; DEG10320307; DEG10240353; DEG10280496; DEG10060171; DEG10350069; DEG10330281; DEG10150299; DEG10130487; DEG10220341; DEG10010148; DEG10380116; DEG10160277; DEG10210096; DEG10120150; DEG10110200; DEG10010003; DEG10380078; DEG10230175; DEG10070149; DEG10050180; DEG10340543; DEG10180452; DEG10200030; DEG10330212; DEG10360003; DEG10070042; DEG10280291; DEG10320242; DEG10190253; DEG10360287; DEG10100002; DEG10140293
hemE 180 | uroporphyrinogen decarboxylase | 17 | DEG10030053; DEG10350416; DEG10330260; DEG10240355; DEG10270485; DEG10320333; DEG10200480; DEG10180574; DEG10290069; DEG10130299; DEG10150309; DEG10280295; DEG10160257; DEG10250521; DEG10120367; DEG10110214; DEG10360302
hisD 98 | histidinol dehydrogenase | 5 | DEG10250321; DEG10130112; DEG10280286; DEG10270301; DEG10100258
hisF 79 | Imidazole glycerol phosphate synthase subunit | 10 | DEG10360119; DEG10100263; DEG10100262; DEG10050161; DEG10250325; DEG10280279; DEG10280280; DEG10130468; DEG10130467; DEG10250326
hisG 73 | ATP phosphoribosyl transferase | 6 | DEG10100317; DEG10250397; DEG10050158; DEG10130111; DEG10280287; DEG10270370
hisS 151 | histidyl-tRNA synthetase | 30 | DEG10130095; DEG10270475; DEG10060024; DEG10220456; DEG10020182; DEG10340359; DEG10240276; DEG10160060; DEG10370221; DEG10210142; DEG10140129; DEG10320199; DEG10330061; DEG10170228; DEG10190144; DEG10200424; DEG10120363; DEG10210212; DEG10080218; DEG10100416; DEG10230092; DEG10380230; DEG10350115; DEG10250504; DEG10280507; DEG10180381; DEG10290282; DEG10360160; DEG10010193; DEG10110144
ileS 129 | isoleucyl-tRNA synthetase | 91 | DEG10330352; DEG10290306; DEG10160148; DEG10240062; DEG10230300; DEG10340272; DEG10280301; DEG10130006; DEG10290105; DEG10170264; DEG10210040; DEG10030147; DEG10060284; DEG10120199; DEG10160079; DEG10070203; DEG10220258; DEG10180006; DEG10320184; DEG10310102; DEG10010110; DEG10020186; DEG10120108; DEG10240137; DEG10170240; DEG10360173; DEG10320349; DEG10210063; DEG10010218; DEG10190066; DEG10380176; DEG10270449; DEG10130366; DEG10120038; DEG10030503; DEG10080214; DEG10360249; DEG10020210; DEG10060222; DEG10280409; DEG10370028; DEG10110126; DEG10360169; DEG10010199; DEG10290288; DEG10170133; DEG10330151; DEG10380169; DEG10050502; DEG10160347; DEG10350060; DEG10320072; DEG10220171; DEG10370155; DEG10210162; DEG10250479; DEG10060273; DEG10350223; DEG10330003; DEG10150271; DEG10060013; DEG10230038; DEG10190293; DEG10140090; DEG10350361; DEG10100396; DEG10320005; DEG10370162; DEG10140258; DEG10200097; DEG10180599; DEG10220095; DEG10010010; DEG10160003; DEG10310141; DEG10380024; DEG10050332; DEG10180113; DEG10330081; DEG10280125; DEG10270288; DEG10200161; DEG10140025; DEG10020112; DEG10340148; DEG10180342; DEG10190129; DEG10130395; DEG10110002; DEG10250305; DEG10150075
ilvA 72 | Threonine dehydratase | 14 | DEG10130192; DEG10100268; DEG10130038; DEG10130102; DEG10250330; DEG10270228; DEG10100194; DEG10250233; DEG10280282; DEG10070217; DEG10270292; DEG10250310; DEG10270307; DEG10050519
ilvC 117 | ketol-acid reductoisomerase | 8 | DEG10100483; DEG10350074; DEG10240080; DEG10130393; DEG10270541; DEG10200308; DEG10280099; DEG10250588
ilvD 89 | Dihydroxy-acid dehydratase | 7 | DEG10350063; DEG10240068; DEG10280502; DEG10100014; DEG10300056; DEG10250028; DEG10270032
infA 75 | translation initiation factor IF-1 | 25 | DEG10170312; DEG10050175; DEG10010058; DEG10100549; DEG10060144; DEG10320086; DEG10250679; DEG10330142; DEG10210027; DEG10030338; DEG10190071; DEG10240318; DEG10120192; DEG10130081; DEG10150146; DEG10340459; DEG10160139; DEG10080250; DEG10020256; DEG10140219; DEG10370020; DEG10290248; DEG10280500; DEG10200327; DEG10220399
infB 144 | translation initiation factor IF-2 | 100 | DEG10150286; DEG10370177; DEG10020086; DEG10210198; DEG10270026; DEG10100449; DEG10350413; DEG10270156; DEG10180508; DEG10340483; DEG10280309; DEG10100036; DEG10110170; DEG10340049; DEG10120051; DEG10060114; DEG10120365; DEG10360266; DEG10250009; DEG10270346; DEG10350111; DEG10350405; DEG10180471; DEG10340492; DEG10250133; DEG10250132; DEG10020174; DEG10140150; DEG10250685; DEG10250684; DEG10030135; DEG10230160; DEG10110190; DEG10280185; DEG10180568; DEG10370036; DEG10270679; DEG10280414; DEG10320291; DEG10240038; DEG10120342; DEG10220335; DEG10270010; DEG10160298; DEG10230195; DEG10200381; DEG10320249; DEG10250057; DEG10140071; DEG10100100; DEG10130059; DEG10020051; DEG10020052; DEG10190182; DEG10170048; DEG10170049; DEG10010137; DEG10350216; DEG10220422; DEG10010032; DEG10010033; DEG10200394; DEG10270119; DEG10270610; DEG10270611; DEG10270612; DEG10160222; DEG10130160; DEG10140160; DEG10330302; DEG10170166; DEG10270506; DEG10100099; DEG10270627; DEG10220276; DEG10220060; DEG10020137; DEG10380190; DEG10290114; DEG10260053; DEG10130313; DEG10060366; DEG10330225; DEG10050236; DEG10270048; DEG10200009; DEG10240293; DEG10370071; DEG10070012; DEG10290030; DEG10060071; DEG10290087; DEG10270059; DEG10230154; DEG10020099; DEG10350030; DEG10270120; DEG10380031; DEG10210136; DEG10250549
infC 121 | translation initiation factor IF-3 | 27 | DEG10030595; DEG10100277; DEG10060164; DEG10010207; DEG10220197; DEG10380102; DEG10360093; DEG10080017; DEG10280284; DEG10200124; DEG10340212; DEG10130386; DEG10070155; DEG10170246; DEG10160094; DEG10290211; DEG10230016; DEG10050473; DEG10120269; DEG10020191; DEG10190120; DEG10140092; DEG10210135; DEG10330096; DEG10180288; DEG10250342; DEG10320128
katA 74 | catalase | 1 | DEG10260019
ksgA 122 | Dimethyladenosine transferase | 7 | DEG10290316; DEG10050176; DEG10270183; DEG10030079; DEG10220031; DEG10310225; DEG10300018
ldh 77 | L-lactate dehydrogenase | 4 | DEG10050433; DEG10280153; DEG10020293; DEG10200456
lepA 82 | GTP-binding protein LepA | 89 | DEG10150286; DEG10020086; DEG10210198; DEG10100449; DEG10350413; DEG10270224; DEG10180508; DEG10180509; DEG10250229; DEG10340483; DEG10110170; DEG10340049; DEG10100190; DEG10120051; DEG10060114; DEG10120365; DEG10360266; DEG10350405; DEG10350406; DEG10180471; DEG10340492; DEG10250133; DEG10250132; DEG10020174; DEG10020137; DEG10030135; DEG10230160; DEG10110190; DEG10280185; DEG10010033; DEG10180568; DEG10370036; DEG10320291; DEG10120342; DEG10220335; DEG10160298; DEG10230195; DEG10200381; DEG10260053; DEG10140071; DEG10100100; DEG10130059; DEG10020051; DEG10020052; DEG10190182; DEG10170048; DEG10170049; DEG10010137; DEG10350216; DEG10220422; DEG10010032; DEG10160222; DEG10270119; DEG10130160; DEG10140160; DEG10330302; DEG10170166; DEG10270506; DEG10300067; DEG10100099; DEG10220276; DEG10050188; DEG10220060; DEG10140150; DEG10380190; DEG10290114; DEG10320249; DEG10130313; DEG10050638; DEG10060366; DEG10330225; DEG10050236; DEG10130146; DEG10370177; DEG10200009; DEG10240293; DEG10370071; DEG10070012; DEG10290030; DEG10060071; DEG10290087; DEG10230154; DEG10020099; DEG10270120; DEG10380031; DEG10360202; DEG10190231; DEG10210136; DEG10250549
leuB 89 | 3-isopropylmalate dehydrogenase | 12 | DEG10240059; DEG10100480; DEG10130080; DEG10330110; DEG10270538; DEG10280489; DEG10350057; DEG10250586; DEG10260017; DEG10160108; DEG10050347; DEG10110075
leuC 80 | 3-isopropylmalate dehydratase large subunit | 16 | DEG10260002; DEG10120348; DEG10310107; DEG10270536; DEG10280512; DEG10240369; DEG10290068; DEG10360072; DEG10200458; DEG10250584; DEG10270272; DEG10100247; DEG10050348; DEG10350493; DEG10130078; DEG10250287
leuS 143 | leucyl-tRNA synthetase | 41 | DEG10160148; DEG10330151; DEG10170264; DEG10060273; DEG10380169; DEG10250186; DEG10230300; DEG10200472; DEG10380024; DEG10320072; DEG10220171; DEG10050332; DEG10180113; DEG10280301; DEG10290105; DEG10370155; DEG10120199; DEG10360173; DEG10210040; DEG10210063; DEG10010218; DEG10240137; DEG10130366; DEG10350361; DEG10060222; DEG10140025; DEG10270182; DEG10270013; DEG10230180; DEG10020210; DEG10120117; DEG10340547; DEG10370028; DEG10100151; DEG10130395; DEG10070203; DEG10250012; DEG10190066; DEG10100005; DEG10210150; DEG10280074
lysA 84 | diaminopimelate decarboxylase | 9 | DEG10280304; DEG10100192; DEG10080050; DEG10250231; DEG10280503; DEG10270226; DEG10340035; DEG10130332; DEG10200058
lysC 91 | Aspartate kinase | 16 | DEG10050040; DEG10270648; DEG10360039; DEG10130174; DEG10030465; DEG10290290; DEG10280491; DEG10200111; DEG10240225; DEG10350268; DEG10150095; DEG10360147; DEG10250720; DEG10100583; DEG10080227; DEG10220008
metG 119 | methionyl-tRNA synthetase | 53 | DEG10160148; DEG10330151; DEG10320184; DEG10170025; DEG10380169; DEG10010199; DEG10010010; DEG10020186; DEG10100151; DEG10140185; DEG10380024; DEG10320072; DEG10220171; DEG10170240; DEG10050332; DEG10180113; DEG10330081; DEG10280301; DEG10130122; DEG10240137; DEG10290105; DEG10190066; DEG10370053; DEG10010218; DEG10140025; DEG10020210; DEG10370155; DEG10250186; DEG10200182; DEG10050455; DEG10380050; DEG10120117; DEG10360173; DEG10270182; DEG10340547; DEG10060013; DEG10200472; DEG10230180; DEG10120199; DEG10220048; DEG10160079; DEG10180342; DEG10210040; DEG10350361; DEG10370028; DEG10190129; DEG10130395; DEG10110126; DEG10020031; DEG10170264; DEG10210150; DEG10290247; DEG10280074
metK 121 | S-adenosylmethionine synthase | 30 | DEG10340012; DEG10130250; DEG10120331; DEG10290091; DEG10080031; DEG10230184; DEG10320238; DEG10190171; DEG10160205; DEG10250264; DEG10330208; DEG10170266; DEG10220389; DEG10010219; DEG10380159; DEG10240011; DEG10100226; DEG10030087; DEG10350014; DEG10060034; DEG10280374; DEG10020211; DEG10270253; DEG10180439; DEG10070146; DEG10200012; DEG10210133; DEG10370145; DEG10110166; DEG10140274
miaA 92 | tRNA dimethylallyltransferase | 12 | DEG10270492; DEG10130278; DEG10050031; DEG10330347; DEG10180589; DEG10100432; DEG10120240; DEG10280097; DEG10250530; DEG10200305; DEG10160342; DEG10290077
murA 100 | UDP-N-acetylglucosamine | 25 | DEG10260011; DEG10080106; DEG10290344; DEG10200439; DEG10190189; DEG10340375; DEG10170295; DEG10270240; DEG10360236; DEG10320254; DEG10250246; DEG10070059; DEG10100209; DEG10320092; DEG10020231; DEG10010252; DEG10030507; DEG10120111; DEG10220236; DEG10160232; DEG10130110; DEG10230099; DEG10330235; DEG10200328; DEG10280119
ndk 162 | nucleoside diphosphate kinase | 9 | DEG10150082; DEG10350113; DEG10030162; DEG10290209; DEG10240274; DEG10280006; DEG10200220; DEG10340412; DEG10180383
nth 73 | Endonuclease III | 1 | DEG10050614
nusA 73 | Transcription elongation protein | 28 | DEG10190183; DEG10010136; DEG10350217; DEG10080305; DEG10370178; DEG10330226; DEG10160223; DEG10110171; DEG10070137; DEG10250550; DEG10170163; DEG10340048; DEG10270507; DEG10220059; DEG10240294; DEG10060112; DEG10230194; DEG10100450; DEG10120366; DEG10360267; DEG10200010; DEG10130058; DEG10020136; DEG10380191; DEG10140070; DEG10290113; DEG10310049; DEG10030134
nusG 121 | Transcription anti-termination protein NusG | 20 | DEG10200385; DEG10330266; DEG10030046; DEG10350412; DEG10270112; DEG10280259; DEG10240346; DEG10340490; DEG10310052; DEG10050240; DEG10250122; DEG10290021; DEG10230159; DEG10130047; DEG10190275; DEG10160263; DEG10360211; DEG10120340; DEG10080222; DEG10220333
obgE 122 | GTPase ObgE | 30 | DEG10200195; DEG10190185; DEG10170235; DEG10240122; DEG10350373; DEG10210085; DEG10020183; DEG10200048; DEG10160228; DEG10140057; DEG10250476; DEG10110174; DEG10280434; DEG10070058; DEG10060316; DEG10270446; DEG10380157; DEG10130308; DEG10120388; DEG10290318; DEG10220166; DEG10180201; DEG10180473; DEG10100391; DEG10370143; DEG10020018; DEG10330231; DEG10070001; DEG10010194; DEG10140136
pgi 73 | glucose-6-phosphate isomerase | 16 | DEG10210205; DEG10340391; DEG10020081; DEG10290311; DEG10380026; DEG10260101; DEG10070233; DEG10200034; DEG10220238; DEG10250171; DEG10230110; DEG10120162; DEG10170096; DEG10140040; DEG10100140; DEG10370030
pgk 194 | phosphoglycerate kinase | 23 | DEG10130237; DEG10100233; DEG10010240; DEG10290093; DEG10140086; DEG10160203; DEG10230087; DEG10320237; DEG10340353; DEG10370204; DEG10170078; DEG10070030; DEG10330206; DEG10210041; DEG10220080; DEG10050168; DEG10060241; DEG10030088; DEG10180437; DEG10270262; DEG10250275; DEG10020071; DEG10190170
pheA 69 | Prephenate dehydratase | 5 | DEG10310031; DEG10280509; DEG10130259; DEG10050414; DEG10250755
pheS 121 | phenylalanyl-tRNA synthetase subunit alpha | 29 | DEG10060162; DEG10380088; DEG10010204; DEG10100278; DEG10360091; DEG10050468; DEG10200121; DEG10280242; DEG10190118; DEG10230086; DEG10130383; DEG10340352; DEG10320131; DEG10020101; DEG10290190; DEG10170120; DEG10330100; DEG10120202; DEG10160098; DEG10150140; DEG10210098; DEG10220370; DEG10350220; DEG10270315; DEG10080066; DEG10250344; DEG10030248; DEG10370087; DEG10140134
pnp 176 | Polyribonucleotide nucleotidyltransferase | 30 | DEG10100275; DEG10330136; DEG10130272; DEG10200437; DEG10290226; DEG10350318; DEG10200007; DEG10180469; DEG10250540; DEG10370112; DEG10260064; DEG10250339; DEG10190076; DEG10210121; DEG10030375; DEG10240184; DEG10270500; DEG10320095; DEG10160134; DEG10340147; DEG10360265; DEG10380118; DEG10050436; DEG10020139; DEG10050181; DEG10350077; DEG10240084; DEG10360126; DEG10130071; DEG10290116
polA 73 | DNA polymerase I | 16 | DEG10100274; DEG10380025; DEG10290388; DEG10370029; DEG10270313; DEG10340018; DEG10300115; DEG10160269; DEG10330273; DEG10060219; DEG10250338; DEG10130376; DEG10020195; DEG10140287; DEG10110207; DEG10210008
ppa 76 | inorganic pyrophosphatase | 16 | DEG10200011; DEG10130033; DEG10330351; DEG10180595; DEG10050058; DEG10140175; DEG10120212; DEG10030510; DEG10360179; DEG10320347; DEG10250708; DEG10060290; DEG10160346; DEG10270637; DEG10190292; DEG10240074
prfA 75 | Peptide chain release factor 1 | 51 | DEG10340132; DEG10160089; DEG10230286; DEG10340070; DEG10320161; DEG10360257; DEG10100500; DEG10050564; DEG10170297; DEG10220446; DEG10250614; DEG10130286; DEG10200238; DEG10220263; DEG10350279; DEG10320232; DEG10060216; DEG10330201; DEG10070056; DEG10140028; DEG10370075; DEG10170071; DEG10070038; DEG10020065; DEG10360152; DEG10150279; DEG10100198; DEG10270559; DEG10120325; DEG10380135; DEG10110165; DEG10230217; DEG10020235; DEG10010255; DEG10270231; DEG10120035; DEG10030423; DEG10210093; DEG10160198; DEG10370126; DEG10200114; DEG10250236; DEG10380074; DEG10210114; DEG10010242; DEG10240308; DEG10280423; DEG10330091; DEG10290326; DEG10130472; DEG10190105
proA 83 | gamma-glutamyl phosphate reductase | 5 | DEG10220146; DEG10240139; DEG10130093; DEG10350360; DEG10280229
proS 74 | prolyl-tRNA synthetase | 44 | DEG10030594; DEG10240124; DEG10030179; DEG10220196; DEG10010134; DEG10340293; DEG10220203; DEG10360094; DEG10350370; DEG10160093; DEG10160049; DEG10340211; DEG10380220; DEG10130123; DEG10110078; DEG10050493; DEG10180054; DEG10120270; DEG10290274; DEG10170161; DEG10120305; DEG10270508; DEG10230053; DEG10320127; DEG10230015; DEG10280231; DEG10210193; DEG10100451; DEG10360042; DEG10290210; DEG10150236; DEG10350221; DEG10190121; DEG10050246; DEG10320048; DEG10250551; DEG10020134; DEG10190044; DEG10330095; DEG10180289; DEG10200267; DEG10370210; DEG10330050; DEG10070124
prsA 143 | Ribose-phosphate pyrophosphokinase | 28 | DEG10120236; DEG10240029; DEG10060045; DEG10160085; DEG10170027; DEG10250190; DEG10130356; DEG10010013; DEG10350025; DEG10380006; DEG10340271; DEG10220013; DEG10330087; DEG10210006; DEG10290330; DEG10100154; DEG10230034; DEG10050578; DEG10270187; DEG10360260; DEG10370006; DEG10180204; DEG10080122; DEG10190101; DEG10020034; DEG10320165; DEG10200079; DEG10140039
purA 128 | Adenylosuccinate synthetase | 10 | DEG10250063; DEG10130178; DEG10350117; DEG10270064; DEG10240279; DEG10100041; DEG10030539; DEG10280506; DEG10360284; DEG10220310
purD 81
Phosphoribosylamine-- glycine ligase 13 DEG10130292; DEG10330246; DEG10150294; DEG10050320; DEG10270137; DEG10250149; DEG10120148; DEG10030037; DEG10320260; DEG10100125; DEG10160243; DEG10360280; DEG10310164; purE 84 Phosphoribosyl amino imidazole carboxylase 4 DEG10340513; DEG10220210; DEG10100525; DEG10280351; purF 72 amidophosphoribosyltransferase 33 DEG10250758; DEG10100542; DEG10100607; DEG10150333; DEG10290391; DEG10120122; DEG10250670; DEG10020242; DEG10180545; DEG10270676; DEG10200027; DEG10370138; DEG10250431; DEG10190260; DEG10070011; DEG10100131; DEG10170302; DEG10240015; DEG10280494; DEG10130187; DEG10380152; DEG10100345; DEG10010067; DEG10360327; DEG10210196; DEG10270602; DEG10350017; DEG10250154; DEG10270149; DEG10310182; DEG10070113; DEG10320312; DEG10270393; purH 80 bifunctional 8 DEG10130291; DEG10270172; DEG10250177; DEG10260100; DEG10220182; DEG10340450; DEG10280056; DEG10100145; pyk 205 Pyruvate kinase 13 DEG10140083; DEG10300074; DEG10270310; DEG10060183; DEG10070152; DEG10380153; DEG10370139; DEG10170252; DEG10050566; DEG10020196; DEG10210090; DEG10250333; DEG10100271; pyrB 106 aspartate carbamoyltransferase 9 DEG10130179; DEG10130195; DEG10280190; DEG10100218; DEG10250256; DEG10100284; DEG10280094; DEG10270250; DEG10240054; 135 pyrD 85 Dihydroorotate dehydrogenase 2 6 DEG10260012; DEG10130186; DEG10350211; DEG10250401; DEG10300053; DEG10270373; pyrH 124 uridylate kinase 31 DEG10230125; DEG10130196; DEG10270519; DEG10290152; DEG10120043; DEG10330036; DEG10320035; DEG10360039; DEG10150095; DEG10070153; DEG10350258; DEG10110017; DEG10190030; DEG10240236; DEG10210146; DEG10370058; DEG10060353; DEG10340100; DEG10050387; DEG10310170; DEG10100458; DEG10030448; DEG10160035; DEG10380055; DEG10140073; DEG10200261; DEG10360147; DEG10220392; DEG10250560; DEG10180041; DEG10170157; recA 105 recombinase A 5 DEG10020142; DEG10280079; DEG10160189; DEG10080023; DEG10330191; rho 105 Transcription termination factor Rho 54 DEG10200477; DEG10030559; DEG10100394; 
DEG10380085; DEG10190182; DEG10310006; DEG10060328; DEG10170196; DEG10200416; DEG10350417; DEG10080206; DEG10120357; DEG10140097; DEG10350216; DEG10360316; DEG10180471; DEG10210080; DEG10240272; DEG10220347; DEG10380087; DEG10290395; DEG10340350; DEG10130028; DEG10230082; DEG10250245; DEG10140165; DEG10280104; DEG10250477; DEG10280102; DEG10290065; DEG10240356; DEG10270447; DEG10100207; DEG10060330; DEG10230195; DEG10120308; DEG10100196; DEG10270239; DEG10070182; DEG10020238; DEG10270230; DEG10210078; DEG10070184; DEG10360329; DEG10350110; DEG10140095; DEG10160253; DEG10250235; DEG10140079; DEG10330256; DEG10370086; DEG10370084; DEG10320314; DEG10030039; rnc 81 Ribonuclease III 11 DEG10180397; DEG10270525; DEG10100468; DEG10020121; DEG10320206; DEG10010120; DEG10190150; DEG10250569; DEG10080111; DEG10050006; DEG10290130; rpe 80 Ribulose-phosphate 3- epimerase 15 DEG10360029; DEG10070087; DEG10290060; DEG10050179; DEG10220096; DEG10130116; DEG10150022; DEG10200023; DEG10070227; DEG10180517; DEG10170139; DEG10280217; DEG10270256; DEG10250268; DEG10080276; rplA 115 50S ribosomal protein L1 20 DEG10290023; DEG10340488; DEG10230158; DEG10170041; DEG10380054; DEG10120338; DEG10060064; DEG10240344; DEG10030048; DEG10360209; DEG10150031; DEG10370057; DEG10250124; DEG10210147; DEG10020044; DEG10220331; DEG10130049; DEG10280261; DEG10140007; DEG10010025; rplB 126 50S ribosomal protein L2 30 DEG10010038; DEG10360199; DEG10030534; DEG10160303; DEG10370012; DEG10190226; DEG10180503; DEG10280180; DEG10330307; DEG10240333; DEG10120056; DEG10130429; DEG10050268; DEG10230151; DEG10110187; DEG10220417; DEG10380011; DEG10020275; DEG10060125; DEG10350402; DEG10270126; DEG10340478; DEG10200138; DEG10100106; DEG10290035; DEG10250137; DEG10210013; DEG10170331; DEG10140238; DEG10320286; rplC 124 50S ribosomal protein L3 30 DEG10130432; DEG10310224; DEG10060122; DEG10220420; DEG10030537; DEG10010035; DEG10370010; DEG10160300; DEG10190229; DEG10340481; DEG10280183; DEG10180506; DEG10290032; 
DEG10140241; DEG10330304; DEG10240336; DEG10120053; DEG10230153; DEG10110189; DEG10350404; DEG10380010; DEG10200135; DEG10360201; DEG10270124; DEG10020278; DEG10320289; DEG10210010; DEG10100103; DEG10250135; DEG10170334; rplD 113 50S ribosomal protein L4 30 DEG10130431; DEG10280182; DEG10060123; DEG10310223; DEG10030536; DEG10010036; DEG10370011; DEG10080262; DEG10160301; DEG10190228; DEG10340480; DEG10180505; DEG10330305; DEG10140240; DEG10250136; DEG10240335; DEG10120054; DEG10230152; DEG10110188; DEG10200136; DEG10020277; DEG10350403; DEG10210011; DEG10360200; DEG10270125; DEG10220419; DEG10320288; DEG10100104; DEG10290033; DEG10170333; rplE 123 50S ribosomal protein L5 27 DEG10240326; DEG10150044; DEG10120065; DEG10230149; DEG10360196; DEG10050277; DEG10020266; DEG10210020; DEG10340469; DEG10290044; DEG10100115; DEG10320277; DEG10200147; DEG10140229; DEG10130420; DEG10170322; DEG10010047; DEG10060134; DEG10110185; DEG10030525; DEG10350400; DEG10080256; DEG10160312; DEG10190217; DEG10280171; DEG10220409; DEG10330316; rplF 107 50S ribosomal protein 31 DEG10240324; DEG10170319; DEG10130417; DEG10180496; DEG10360194; DEG10230148; DEG10020263; DEG10120068; DEG10110184; DEG10340466; 136 L6 DEG10220406; DEG10370016; DEG10210022; DEG10270131; DEG10140226; DEG10320274; DEG10280168; DEG10100117; DEG10350399; DEG10150046; DEG10290047; DEG10060137; DEG10050280; DEG10010050; DEG10030522; DEG10380015; DEG10250143; DEG10160315; DEG10200150; DEG10190214; DEG10330319; rplI 126 50S ribosomal protein L9 5 DEG10030065; DEG10020006; DEG10140197; DEG10060075; DEG10010265; rplJ 100 50S ribosomal protein L10 23 DEG10170042; DEG10120337; DEG10240343; DEG10160261; DEG10060295; DEG10330264; DEG10350410; DEG10180569; DEG10290024; DEG10020045; DEG10140108; DEG10320329; DEG10200082; DEG10280380; DEG10030049; DEG10250127; DEG10100090; DEG10010026; DEG10310054; DEG10210112; DEG10360208; DEG10130050; DEG10190276; rplK 118 50S ribosomal protein L11 25 DEG10170040; DEG10120339; DEG10240345; 
DEG10060063; DEG10150030; DEG10290022; DEG10280260; DEG10160262; DEG10080221; DEG10100089; DEG10340489; DEG10350411; DEG10330265; DEG10020043; DEG10250123; DEG10200088; DEG10130048; DEG10320328; DEG10310053; DEG10140006; DEG10050166; DEG10030047; DEG10360210; DEG10220332; DEG10210148; rplL 111 50S ribosomal protein L7/L12 25 DEG10030050; DEG10060296; DEG10170043; DEG10120336; DEG10240342; DEG10150032; DEG10220329; DEG10160260; DEG10110211; DEG10330263; DEG10340486; DEG10290025; DEG10140107; DEG10020046; DEG10310055; DEG10200083; DEG10100091; DEG10010027; DEG10360207; DEG10210113; DEG10180570; DEG10130051; DEG10190277; DEG10320330; DEG10280379; rplM 121 50S ribosomal protein L13 28 DEG10240144; DEG10290341; DEG10100544; DEG10130371; DEG10340425; DEG10060339; DEG10320258; DEG10360234; DEG10350355; DEG10080010; DEG10020248; DEG10330240; DEG10280497; DEG10310194; DEG10050527; DEG10250673; DEG10180481; DEG10190194; DEG10170306; DEG10140190; DEG10010064; DEG10220339; DEG10150258; DEG10120287; DEG10160237; DEG10210189; DEG10200175; DEG10030117; rplN 112 50S ribosomal protein L14 24 DEG10290042; DEG10240328; DEG10020268; DEG10120063; DEG10050275; DEG10180500; DEG10100113; DEG10200145; DEG10320279; DEG10130422; DEG10170324; DEG10150042; DEG10060132; DEG10010045; DEG10160310; DEG10030527; DEG10340471; DEG10220411; DEG10080258; DEG10280173; DEG10140231; DEG10330314; DEG10210019; DEG10190219; rplO 102 50S ribosomal protein L15 26 DEG10290051; DEG10170315; DEG10240320; DEG10130413; DEG10010054; DEG10180493; DEG10060140; DEG10360192; DEG10030518; DEG10230146; DEG10340462; DEG10220402; DEG10320270; DEG10350397; DEG10140223; DEG10280164; DEG10330323; DEG10150049; DEG10120072; DEG10050284; DEG10110181; DEG10160319; DEG10080253; DEG10020259; DEG10200153; DEG10190210; rplP 102 50S ribosomal protein L16 28 DEG10290039; DEG10360197; DEG10120060; DEG10150039; DEG10030530; DEG10050272; DEG10370015; DEG10160307; DEG10250140; DEG10100110; DEG10060129; DEG10190222; DEG10200142; DEG10340474; 
DEG10170327; DEG10240330; DEG10130425; DEG10010042; DEG10110186; DEG10380013; DEG10220413; DEG10020271; DEG10270128; DEG10140234; DEG10210017; DEG10280176; DEG10320282; DEG10330311; rplQ 115 50S ribosomal protein L17 22 DEG10150053; DEG10330330; DEG10060149; DEG10290057; DEG10080248; DEG10250675; DEG10030511; DEG10160326; DEG10190204; DEG10240312; DEG10170307; DEG10130407; DEG10360186; DEG10010063; DEG10050289; DEG10120079; DEG10020251; DEG10370024; DEG10140214; DEG10210030; DEG10320264; DEG10200159; rplR 98 50S ribosomal protein L18 24 DEG10240323; DEG10170318; DEG10130416; DEG10010051; DEG10020262; DEG10120069; DEG10220405; DEG10340465; DEG10210023; DEG10140225; DEG10320273; DEG10100118; DEG10280167; DEG10150047; DEG10060138; DEG10330320; DEG10290048; DEG10180495; DEG10050281; DEG10030521; DEG10310216; DEG10160316; DEG10200151; DEG10190213; rplS 114 50S ribosomal protein 19 DEG10130446; DEG10050095; DEG10280270; DEG10240192; DEG10010126; DEG10160179; DEG10100463; DEG10060360; DEG10330181; DEG10020127; 137 L19 DEG10180403; DEG10210125; DEG10030113; DEG10140172; DEG10200031; DEG10290137; DEG10170151; DEG10320213; DEG10190155; rplT 110 50S ribosomal protein L20 23 DEG10030597; DEG10150139; DEG10060166; DEG10010205; DEG10220199; DEG10020189; DEG10360092; DEG10340214; DEG10200122; DEG10190119; DEG10130384; DEG10080019; DEG10170244; DEG10320130; DEG10160096; DEG10290213; DEG10050475; DEG10120267; DEG10140094; DEG10210134; DEG10250343; DEG10180287; DEG10330098; rplU 92 50S ribosomal protein L21 19 DEG10190187; DEG10220350; DEG10160230; DEG10320252; DEG10130364; DEG10170238; DEG10100393; DEG10240120; DEG10060197; DEG10280432; DEG10340142; DEG10200050; DEG10330233; DEG10120165; DEG10010196; DEG10020185; DEG10030076; DEG10290320; DEG10140127; rplV 101 50S ribosomal protein L22 24 DEG10220415; DEG10060127; DEG10150037; DEG10030532; DEG10050270; DEG10080261; DEG10370013; DEG10160305; DEG10190224; DEG10330309; DEG10200140; DEG10130427; DEG10120058; DEG10010040; DEG10170329; 
DEG10210015; DEG10020273; DEG10340476; DEG10290037; DEG10140236; DEG10280178; DEG10250138; DEG10320284; DEG10100108; rplW 94 50S ribosomal protein L23 23 DEG10130430; DEG10060124; DEG10310222; DEG10280181; DEG10010037; DEG10160302; DEG10190227; DEG10180504; DEG10030535; DEG10330306; DEG10240334; DEG10120055; DEG10050267; DEG10200137; DEG10020276; DEG10340479; DEG10220418; DEG10290034; DEG10100105; DEG10210012; DEG10170332; DEG10140239; DEG10320287; rplX 81 50S ribosomal protein L24 25 DEG10240327; DEG10180499; DEG10120064; DEG10050276; DEG10020267; DEG10200146; DEG10320278; DEG10100114; DEG10290043; DEG10130421; DEG10150043; DEG10170323; DEG10060133; DEG10010046; DEG10310219; DEG10380014; DEG10030526; DEG10340470; DEG10080257; DEG10220410; DEG10160311; DEG10280172; DEG10140230; DEG10330315; DEG10190218; rpmA 86 50S ribosomal protein L27 21 DEG10190186; DEG10200049; DEG10160229; DEG10170236; DEG10060199; DEG10210107; DEG10220349; DEG10320251; DEG10240121; DEG10100392; DEG10290319; DEG10150274; DEG10280433; DEG10340141; DEG10120166; DEG10140126; DEG10330232; DEG10010195; DEG10020184; DEG10030077; DEG10080051; rpmE 78 50S ribosomal protein L31 11 DEG10240271; DEG10120084; DEG10200400; DEG10050257; DEG10010256; DEG10340354; DEG10080086; DEG10100197; DEG10280069; DEG10170298; DEG10020237; rpmH 68 50S ribosomal protein L34 17 DEG10160274; DEG10240350; DEG10210201; DEG10020302; DEG10200108; DEG10080287; DEG10010271; DEG10190256; DEG10120010; DEG10140052; DEG10330278; DEG10060376; DEG10030002; DEG10050353; DEG10320310; DEG10290003; DEG10170351; rpoA 177 DNA-directed RNA polymerase subunit alpha 30 DEG10290056; DEG10060148; DEG10100546; DEG10230142; DEG10250676; DEG10030512; DEG10210029; DEG10160325; DEG10350394; DEG10190205; DEG10110177; DEG10240313; DEG10130408; DEG10120307; DEG10170308; DEG10330329; DEG10360187; DEG10010062; DEG10380019; DEG10050288; DEG10120078; DEG10340455; DEG10020252; DEG10270605; DEG10140215; DEG10370023; DEG10220394; DEG10280159; DEG10320265; 
DEG10200158; rpoB 182 DNA-directed RNA polymerase subunit beta 31 DEG10170044; DEG10030051; DEG10120335; DEG10240341; DEG10150033; DEG10220328; DEG10310056; DEG10110212; DEG10330262; DEG10380021; DEG10270116; DEG10140206; DEG10020047; DEG10290026; DEG10250129; DEG10200084; DEG10060281; DEG10050165; DEG10010028; DEG10280378; DEG10230156; DEG10100093; DEG10360206; DEG10350409; DEG10370026; DEG10210032; DEG10160259; DEG10180571; DEG10130052; DEG10190278; DEG10320331; rpoC 148 DNA-directed RNA polymerase subunit beta 28 DEG10170045; DEG10030052; DEG10140205; DEG10120334; DEG10240340; DEG10180572; DEG10220327; DEG10110213; DEG10380022; DEG10330261; DEG10270117; DEG10290027; DEG10020048; DEG10200085; DEG10060280; DEG10100094; DEG10010029; DEG10230155; DEG10360205; DEG10350408; DEG10370027; DEG10160258; DEG10210033; DEG10130053; DEG10070226; 138 DEG10250130; DEG10190279; DEG10320332; rpsA 111 30S ribosomal protein S1 33 DEG10100275; DEG10330136; DEG10130272; DEG10200437; DEG10290226; DEG10350318; DEG10200007; DEG10180469; DEG10250540; DEG10370112; DEG10260064; DEG10250339; DEG10190076; DEG10170095; DEG10210121; DEG10030375; DEG10290116; DEG10270500; DEG10310113; DEG10320095; DEG10160134; DEG10110224; DEG10340147; DEG10360265; DEG10380118; DEG10050436; DEG10020139; DEG10050181; DEG10350077; DEG10240084; DEG10360126; DEG10130071; DEG10240184; rpsB 138 30S ribosomal protein S2 27 DEG10140202; DEG10290150; DEG10120041; DEG10100460; DEG10330034; DEG10200263; DEG10030450; DEG10340423; DEG10210207; DEG10180039; DEG10150094; DEG10050330; DEG10240234; DEG10060052; DEG10010129; DEG10130265; DEG10220337; DEG10230259; DEG10080317; DEG10380224; DEG10190028; DEG10160033; DEG10020131; DEG10370216; DEG10320033; DEG10250562; DEG10170155; rpsC 118 30S ribosomal protein S3 30 DEG10330310; DEG10360198; DEG10030531; DEG10150038; DEG10060128; DEG10050271; DEG10080260; DEG10160306; DEG10370014; DEG10190223; DEG10200141; DEG10130426; DEG10240331; DEG10120059; DEG10010041; DEG10170328; 
DEG10230150; DEG10380012; DEG10320283; DEG10340475; DEG10350401; DEG10270127; DEG10020272; DEG10220414; DEG10140235; DEG10210016; DEG10280177; DEG10250139; DEG10290038; DEG10100109; rpsD 128 30S ribosomal protein S4 28 DEG10290055; DEG10370222; DEG10140283; DEG10230143; DEG10060252; DEG10250677; DEG10030513; DEG10020203; DEG10160324; DEG10350395; DEG10190206; DEG10100547; DEG10240314; DEG10200336; DEG10130409; DEG10010215; DEG10330328; DEG10120077; DEG10050287; DEG10210213; DEG10340456; DEG10360188; DEG10270606; DEG10380231; DEG10280132; DEG10220395; DEG10320266; DEG10170258; rpsE 127 30S ribosomal protein S5 30 DEG10240322; DEG10170317; DEG10010052; DEG10130415; DEG10360193; DEG10230147; DEG10020261; DEG10250144; DEG10220404; DEG10370017; DEG10340464; DEG10210024; DEG10270132; DEG10320272; DEG10140224; DEG10100119; DEG10280166; DEG10350398; DEG10060139; DEG10330321; DEG10180494; DEG10290049; DEG10050282; DEG10120070; DEG10030520; DEG10110183; DEG10080254; DEG10160317; DEG10200152; DEG10190212; rpsF 101 30S ribosomal protein S6 21 DEG10020019; DEG10130287; DEG10120223; DEG10050174; DEG10030062; DEG10240257; DEG10220127; DEG10180591; DEG10210049; DEG10200208; DEG10290338; DEG10320346; DEG10170017; DEG10340400; DEG10250015; DEG10080236; DEG10310142; DEG10010269; DEG10160343; DEG10360283; DEG10330348; rpsG 118 30S ribosomal protein S7 28 DEG10170047; DEG10210199; DEG10150034; DEG10280186; DEG10110191; DEG10130145; DEG10370035; DEG10010031; DEG10340484; DEG10140161; DEG10290029; DEG10320292; DEG10220423; DEG10030060; DEG10240338; DEG10120050; DEG10060070; DEG10330301; DEG10160297; DEG10080219; DEG10100098; DEG10050189; DEG10380030; DEG10360203; DEG10200382; DEG10190232; DEG10350407; DEG10020050; rpsH 111 30S ribosomal protein S8 26 DEG10240325; DEG10130418; DEG10180497; DEG10120067; DEG10360195; DEG10050279; DEG10020264; DEG10290046; DEG10250142; DEG10220407; DEG10210021; DEG10270130; DEG10140227; DEG10280169; DEG10320275; DEG10100116; DEG10200149; DEG10010049; 
DEG10170320; DEG10060136; DEG10340467; DEG10310217; DEG10030523; DEG10160314; DEG10190215; DEG10330318; rpsI 129 30S ribosomal protein S9 22 DEG10130372; DEG10240145; DEG10050526; DEG10380217; DEG10250672; DEG10360233; DEG10020247; DEG10320257; DEG10290340; DEG10060338; DEG10170305; DEG10190193; DEG10180480; DEG10010065; DEG10160236; DEG10340424; DEG10220338; DEG10140191; DEG10200176; DEG10120288; DEG10030118; DEG10330239; rpsJ 119 30S ribosomal protein S10 23 DEG10130433; DEG10030538; DEG10060121; DEG10150035; DEG10220421; DEG10010034; DEG10280184; DEG10340482; DEG10180507; DEG10330303; DEG10320290; DEG10140242; DEG10240337; DEG10120052; DEG10080263; DEG10160299; DEG10200134; DEG10190230; DEG10020279; DEG10370009; 139 DEG10100102; DEG10210009; DEG10290031; rpsK 129 30S ribosomal protein S11 26 DEG10150052; DEG10290054; DEG10100548; DEG10060147; DEG10130410; DEG10310204; DEG10030514; DEG10210028; DEG10160323; DEG10190207; DEG10110178; DEG10240315; DEG10280160; DEG10330327; DEG10170309; DEG10120076; DEG10050286; DEG10010061; DEG10340457; DEG10360189; DEG10020253; DEG10140216; DEG10370022; DEG10220396; DEG10320267; DEG10200157; rpsL 174 30S ribosomal protein S12 28 DEG10120049; DEG10170046; DEG10370034; DEG10060069; DEG10030059; DEG10010030; DEG10280187; DEG10080220; DEG10220424; DEG10270118; DEG10130144; DEG10050190; DEG10340485; DEG10140162; DEG10310057; DEG10290028; DEG10020049; DEG10320293; DEG10240339; DEG10100097; DEG10160296; DEG10330300; DEG10210200; DEG10360204; DEG10200383; DEG10190233; DEG10250131; DEG10180510; rpsM 117 30S ribosomal protein S13 27 DEG10170310; DEG10150051; DEG10290053; DEG10050285; DEG10180491; DEG10130411; DEG10360190; DEG10250678; DEG10030515; DEG10160322; DEG10060146; DEG10240316; DEG10190208; DEG10330326; DEG10120075; DEG10340458; DEG10380018; DEG10010060; DEG10270607; DEG10020254; DEG10280161; DEG10370021; DEG10220397; DEG10200156; DEG10140217; DEG10230144; DEG10320268; rpsN 102 30S ribosomal protein S14 20 DEG10180498; DEG10010048; 
DEG10130419; DEG10150045; DEG10170321; DEG10320276; DEG10290045; DEG10120066; DEG10220408; DEG10340468; DEG10210218; DEG10160313; DEG10190216; DEG10280170; DEG10140228; DEG10050278; DEG10330317; DEG10200148; DEG10030524; DEG10020265; rpsO 125 30S ribosomal protein S15 18 DEG10010138; DEG10340521; DEG10150285; DEG10180470; DEG10050538; DEG10240083; DEG10170168; DEG10160220; DEG10200008; DEG10140125; DEG10060344; DEG10080177; DEG10290115; DEG10130070; DEG10330223; DEG10030137; DEG10120152; DEG10220366; rpsP 103 30S ribosomal protein S16 21 DEG10050096; DEG10320216; DEG10160182; DEG10030110; DEG10010124; DEG10190157; DEG10020125; DEG10170148; DEG10120330; DEG10330184; DEG10060362; DEG10130449; DEG10210127; DEG10380104; DEG10180406; DEG10140174; DEG10310092; DEG10240189; DEG10290134; DEG10340315; DEG10150088; rpsQ 93 30S ribosomal protein S17 25 DEG10160309; DEG10020269; DEG10120062; DEG10310220; DEG10050274; DEG10250141; DEG10180501; DEG10190220; DEG10100112; DEG10200144; DEG10170325; DEG10130423; DEG10290041; DEG10140232; DEG10150041; DEG10060131; DEG10010044; DEG10030528; DEG10220412; DEG10340472; DEG10270129; DEG10280174; DEG10320280; DEG10210018; DEG10330313; rpsR 111 30S ribosomal protein S18 18 DEG10280019; DEG10030064; DEG10170019; DEG10120222; DEG10050172; DEG10240259; DEG10340399; DEG10060074; DEG10330350; DEG10130288; DEG10080234; DEG10010267; DEG10200207; DEG10020021; DEG10210051; DEG10160345; DEG10290336; DEG10190290; rpsS 108 30S ribosomal protein S19 23 DEG10010039; DEG10060126; DEG10150036; DEG10030533; DEG10290036; DEG10160304; DEG10180502; DEG10190225; DEG10330308; DEG10240332; DEG10120057; DEG10130428; DEG10050269; DEG10020274; DEG10220416; DEG10340477; DEG10200139; DEG10210014; DEG10100107; DEG10140237; DEG10170330; DEG10280179; DEG10320285; rpsT 68 30S ribosomal protein S20 17 DEG10340056; DEG10200004; DEG10100385; DEG10150273; DEG10290309; DEG10080009; DEG10320003; DEG10180004; DEG10120017; DEG10140254; DEG10020175; DEG10010184; DEG10130204; 
DEG10280535; DEG10240055; DEG10310208; DEG10030144; ruvA 70 Holliday junction DNA helicase subunit RuvA 6 DEG10350352; DEG10170234; DEG10220179; DEG10030355; DEG10300026; DEG10070007; ruvB 83 Holliday junction DNA helicase subunit RuvB 12 DEG10170233; DEG10160083; DEG10350353; DEG10250510; DEG10340021; DEG10330085; DEG10210007; DEG10070009; DEG10080186; DEG10110113; DEG10300027; DEG10220302; 140 secA 103 Preprotein translocase subunit SecA 29 DEG10270340; DEG10030473; DEG10010243; DEG10150247; DEG10120164; DEG10230302; DEG10360217; DEG10160023; DEG10250638; DEG10370198; DEG10200375; DEG10130107; DEG10170070; DEG10140027; DEG10020064; DEG10060054; DEG10330024; DEG10290357; DEG10240117; DEG10100514; DEG10270575; DEG10250368; DEG10380210; DEG10220292; DEG10190021; DEG10180026; DEG10020297; DEG10320024; DEG10210052; secY 134 Preprotein translocase subunit secY 30 DEG10290052; DEG10170314; DEG10130412; DEG10180492; DEG10010055; DEG10060141; DEG10360191; DEG10230145; DEG10030517; DEG10220401; DEG10340461; DEG10250145; DEG10160320; DEG10210025; DEG10270133; DEG10350396; DEG10140222; DEG10240319; DEG10190209; DEG10280163; DEG10330324; DEG10120073; DEG10380016; DEG10110180; DEG10080252; DEG10020258; DEG10370019; DEG10200154; DEG10320269; DEG10100121; serS 144 seryl-tRNA synthetase 32 DEG10030229; DEG10220348; DEG10370181; DEG10330138; DEG10100605; DEG10170006; DEG10200296; DEG10270673; DEG10010006; DEG10250754; DEG10020005; DEG10190075; DEG10230295; DEG10240174; DEG10070134; DEG10290216; DEG10150149; DEG10320091; DEG10310039; DEG10130362; DEG10340140; DEG10160136; DEG10060004; DEG10350327; DEG10360084; DEG10120157; DEG10380193; DEG10080295; DEG10180145; DEG10280108; DEG10140016; DEG10210174; smpB 97 SsrA-binding protein/SmpB superfamily 8 DEG10060046; DEG10220298; DEG10340345; DEG10300083; DEG10170083; DEG10140135; DEG10050343; DEG10020075; sodA 162 Manganese superoxide dismutase 7 DEG10120355; DEG10130191; DEG10250756; DEG10270675; DEG10350056; DEG10150244; DEG10360215; 
thrS 160 Threonyl-tRNA synthetase 29 DEG10030594; DEG10060308; DEG10120270; DEG10220196; DEG10360094; DEG10380064; DEG10100421; DEG10080016; DEG10340211; DEG10130387; DEG10250516; DEG10210142; DEG10110078; DEG10050493; DEG10170248; DEG10290210; DEG10160093; DEG10200074; DEG10230015; DEG10140296; DEG10350221; DEG10020192; DEG10190121; DEG10070202; DEG10270481; DEG10330095; DEG10180289; DEG10370065; DEG10320127; thyA 90 Thymidylate synthase 23 DEG10230105; DEG10150011; DEG10210109; DEG10270498; DEG10360014; DEG10070139; DEG10240193; DEG10280412; DEG10030143; DEG10290127; DEG10180428; DEG10250538; DEG10340384; DEG10130088; DEG10170185; DEG10220458; DEG10350305; DEG10320229; DEG10120280; DEG10160195; DEG10200309; DEG10330198; DEG10020157; tig 148 trigger factor Tig 1 DEG10300030; tkt 90 transketolase 20 DEG10220364; DEG10180438; DEG10170174; DEG10240134; DEG10350363; DEG10360020; DEG10130437; DEG10330207; DEG10050368; DEG10010170; DEG10100237; DEG10270265; DEG10290092; DEG10250277; DEG10200450; DEG10160204; DEG10020147; DEG10140193; DEG10010146; DEG10280265; topA 81 DNA topoisomerase I 22 DEG10240001; DEG10380140; DEG10220157; DEG10250713; DEG10020129; DEG10320157; DEG10050491; DEG10110077; DEG10290253; DEG10010128; DEG10130083; DEG10100575; DEG10180221; DEG10200333; DEG10360108; DEG10190108; DEG10060097; DEG10070067; DEG10210116; DEG10140176; DEG10270641; DEG10170154; tpiA 120 triosephosphate isomerase 20 DEG10070195; DEG10120349; DEG10360269; DEG10380070; DEG10230264; DEG10060350; DEG10100234; DEG10220136; DEG10010239; DEG10170079; DEG10050225; DEG10270263; DEG10130057; DEG10140169; DEG10250276; DEG10340409; DEG10020072; DEG10240085; DEG10210091; DEG10200247; trmD 136 tRNA (guanine-N(1)-)- methyltransferase 24 DEG10100464; DEG10060361; DEG10380105; DEG10320214; DEG10240191; DEG10160180; DEG10020126; DEG10330182; DEG10210126; DEG10190156; DEG10180404; DEG10110150; DEG10130447; DEG10120328; DEG10270523; DEG10010125; DEG10350309; DEG10370101; DEG10030112; DEG10140173; 
DEG10070040; DEG10290136; DEG10170150; DEG10250565; trpC2 68 Indole-3-glycerol phosphate synthase 5 DEG10100267; DEG10280366; DEG10250329; DEG10270306; DEG10130297; 141 trpD 88 Anthranilate phosphoribosyl transferase 5 DEG10100339; DEG10280367; DEG10130296; DEG10050501; DEG10250424; trpE 72 anthranilate synthase 15 DEG10100265; DEG10250633; DEG10100378; DEG10100150; DEG10250185; DEG10130114; DEG10110107; DEG10270181; DEG10280233; DEG10280391; DEG10130045; DEG10050115; DEG10250328; DEG10250463; DEG10270305; trpG 68 Anthranilate synthase component II 13 DEG10270597; DEG10080069; DEG10130295; DEG10130018; DEG10270006; DEG10050500; DEG10250005; DEG10280368; DEG10250666; DEG10050421; DEG10280051; DEG10360157; DEG10100534; trpS 146 tryptophanyl-tRNA synthetase 27 DEG10190234; DEG10100529; DEG10060101; DEG10340429; DEG10250658; DEG10360235; DEG10350275; DEG10220010; DEG10370227; DEG10170101; DEG10380240; DEG10260068; DEG10290061; DEG10120321; DEG10130345; DEG10030541; DEG10020084; DEG10140295; DEG10080239; DEG10210217; DEG10200016; DEG10270590; DEG10230255; DEG10280420; DEG10180516; DEG10070244; DEG10010091; truA 127 tRNA pseudouridine synthase A 4 DEG10250674; DEG10100545; DEG10050599; DEG10060153; truB 102 tRNA pseudouridine synthase B 2 DEG10340009; DEG10050460; trxA1 72 Thioredoxin 10 DEG10220004; DEG10060099; DEG10030730; DEG10010202; DEG10150321; DEG10180400; DEG10130381; DEG10110205; DEG10290070; DEG10170122; trxB 97 Thioredoxin reductase 19 DEG10100609; DEG10060083; DEG10240290; DEG10270683; DEG10010230; DEG10130147; DEG10200359; DEG10050440; DEG10240172; DEG10180527; DEG10220259; DEG10140281; DEG10140297; DEG10010241; DEG10020096; DEG10170073; DEG10250766; DEG10070179; DEG10210167; tsf 160 elongation factor Ts 29 DEG10290151; DEG10010130; DEG10120042; DEG10330035; DEG10340422; DEG10370217; DEG10210206; DEG10380225; DEG10350259; DEG10230260; DEG10320034; DEG10110016; DEG10240235; DEG10270520; DEG10060352; DEG10130264; DEG10220336; DEG10310156; DEG10100459; 
DEG10030449; DEG10160034; DEG10190029; DEG10020132; DEG10200262; DEG10140201; DEG10250561; DEG10360148; DEG10180040; DEG10170156; tuf 144 Elongation factor Tu 88 DEG10150286; DEG10020086; DEG10210198; DEG10100449; DEG10350413; DEG10270224; DEG10180508; DEG10180509; DEG10250229; DEG10340483; DEG10110170; DEG10340049; DEG10100190; DEG10120051; DEG10060114; DEG10120365; DEG10360266; DEG10350405; DEG10350406; DEG10180471; DEG10340492; DEG10250133; DEG10250132; DEG10020174; DEG10020137; DEG10030135; DEG10230160; DEG10110190; DEG10280185; DEG10160222; DEG10180568; DEG10370036; DEG10280414; DEG10320291; DEG10120342; DEG10220335; DEG10160298; DEG10230195; DEG10200381; DEG10320249; DEG10140071; DEG10290030; DEG10130059; DEG10020051; DEG10020052; DEG10190182; DEG10170048; DEG10170049; DEG10010137; DEG10350216; DEG10220422; DEG10010032; DEG10010033; DEG10270119; DEG10130160; DEG10140160; DEG10330302; DEG10170166; DEG10270506; DEG10100099; DEG10220276; DEG10050188; DEG10220060; DEG10140150; DEG10380190; DEG10290114; DEG10260053; DEG10130313; DEG10060366; DEG10330225; DEG10050236; DEG10130146; DEG10370177; DEG10200009; DEG10240293; DEG10370071; DEG10070012; DEG10100100; DEG10060071; DEG10290087; DEG10230154; DEG10020099; DEG10270120; DEG10380031; DEG10360202; DEG10190231; DEG10210136; DEG10250549; tyrS 133 tyrosyl-tRNA synthetase 29 DEG10120232; DEG10340017; DEG10160102; DEG10310168; DEG10140182; DEG10020205; DEG10060368; DEG10280202; DEG10180273; DEG10230185; DEG10270322; DEG10350354; DEG10380020; DEG10190114; DEG10200237; DEG10070234; DEG10250355; DEG10030129; DEG10290124; DEG10170260; DEG10010216; DEG10330105; DEG10050579; DEG10320142; DEG10220064; DEG10370025; DEG10100291; DEG10210031; DEG10130004; upp 74 uracil phosphoribosyltransferase 3 DEG10060021; DEG10020233; DEG10140121; 142 uvrB 119 UvrABC system protein B 10 DEG10050448; DEG10060055; DEG10250631; DEG10270219; DEG10050452; DEG10020066; DEG10270573; DEG10050321; DEG10020228; DEG10010112; uvrC 90 UvrABC system protein 
C 6 DEG10060174; DEG10100231; DEG10250271; DEG10020104; DEG10270258; DEG10180295; valS 185 valyl-tRNA synthetase 95 DEG10330352; DEG10290306; DEG10160148; DEG10240062; DEG10230300; DEG10340272; DEG10280301; DEG10130006; DEG10270288; DEG10370053; DEG10170264; DEG10210040; DEG10030147; DEG10220095; DEG10060284; DEG10120199; DEG10020031; DEG10070203; DEG10220258; DEG10180006; DEG10170025; DEG10310102; DEG10010110; DEG10020186; DEG10120108; DEG10170240; DEG10360173; DEG10320349; DEG10210063; DEG10250186; DEG10240137; DEG10380176; DEG10270449; DEG10130366; DEG10120038; DEG10030503; DEG10080214; DEG10360249; DEG10230180; DEG10020210; DEG10060222; DEG10280409; DEG10370028; DEG10360169; DEG10020112; DEG10290288; DEG10280074; DEG10330151; DEG10380169; DEG10050502; DEG10160347; DEG10350060; DEG10320072; DEG10220171; DEG10370155; DEG10210162; DEG10250479; DEG10060273; DEG10350223; DEG10330003; DEG10010218; DEG10150271; DEG10270182; DEG10060013; DEG10230038; DEG10190293; DEG10140090; DEG10380050; DEG10350361; DEG10100396; DEG10320005; DEG10370162; DEG10140258; DEG10200097; DEG10180599; DEG10200472; DEG10150075; DEG10100151; DEG10160003; DEG10310141; DEG10380024; DEG10050332; DEG10180113; DEG10280125; DEG10290105; DEG10200161; DEG10190066; DEG10140025; DEG10010199; DEG10340148; DEG10340547; DEG10130395; DEG10110002; DEG10170133; DEG10250305; ychF 89 GTP-binding protein YchF 30 DEG10190185; DEG10170235; DEG10240122; DEG10350373; DEG10210085; DEG10020183; DEG10200048; DEG10160228; DEG10270446; DEG10250476; DEG10110174; DEG10280434; DEG10070058; DEG10060316; DEG10220166; DEG10380157; DEG10290318; DEG10120388; DEG10130308; DEG10020018; DEG10280250; DEG10180201; DEG10180473; DEG10100391; DEG10370143; DEG10140057; DEG10330231; DEG10070001; DEG10010194; DEG10140136; References: 1 Zhang, R., Ou, H. Y. & Zhang, C. T. DEG: a database of essential genes. Nucleic acids research 32, D271-D272 (2004). 
3.1.8.14 – Alignment output for 181 essential proteins against five hosts

Supplementary Material S14: Alignment output for 181 essential proteins against five hosts. This material contains the output of the BLASTp alignment of the 181 essential proteins against the natural hosts Ovis aries, Capra hircus, Bos taurus, Equus caballus and Homo sapiens; it is provided on the CD that accompanies this thesis.

3.1.8.15 – Essential proteins homology against hosts

3.2 - Label-free proteomic analysis to confirm the predicted proteome of Corynebacterium pseudotuberculosis under nitrosative stress mediated by nitric oxide

Wanderson M. Silva, Rodrigo D. Carvalho, Siomar C. Soares, Isabela F. S. Bastos, Edson Luiz Folador, Gustavo H. M. F. Souza, Yves Le Loir, Anderson Miyoshi, Artur Silva and Vasco Azevedo

In the experimental comparative proteomics work conducted by Dr. Wanderson M. Silva, differentially expressed proteins were identified when a sample of C. pseudotuberculosis strain 1002 subjected to nitrosative stress was compared with a control sample. With the interaction networks in hand, a subnetwork containing the interactions among a specific set of proteins was created in this work. Thus, the partial interaction network for strain 1002 was formed by the interactions among the differentially expressed proteins plus the proteins expressed exclusively under the stress condition. The interaction network provided a systemic view of the proteins involved in the response to nitrosative stress and, together with other experiments, helped interpret the biological mechanisms that allow C. pseudotuberculosis to resist and survive when exposed to the stress condition. The article describing this work was published in BMC Genomics in December 2014, with DOI 10.1186/1471-2164-15-1065, and is also available at http://www.biomedcentral.com/1471-2164/15/1065.
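The subnetwork described above can be pictured as an induced subgraph: only the interactions whose two partners both belong to the set of proteins of interest are kept. The Python sketch below is a minimal illustration under that assumption, not the code used in this work; the function name `induced_subnetwork`, the edge-list representation and the protein identifiers are hypothetical.

```python
def induced_subnetwork(interactions, proteins_of_interest):
    """Return the interactions whose two partners are both in the selected set."""
    selected = set(proteins_of_interest)
    return [(a, b) for a, b in interactions if a in selected and b in selected]


# Hypothetical toy network; identifiers are placeholders, not real Cp loci.
network = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P4")]
differential = {"P1", "P2"}  # differentially expressed under stress (assumed)
exclusive = {"P4"}           # expressed only under stress (assumed)

subnetwork = induced_subnetwork(network, differential | exclusive)
print(subnetwork)
```

Keeping only edges with both endpoints in the selected set is one possible reading of "interactions among the proteins"; a neighbourhood-based subnetwork, which also keeps first neighbours, would be an alternative design.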
3.2.1 - Background
3.2.2 - Methods
3.2.3 - Results
3.2.4 - Discussion
3.2.5 - Conclusions
3.2.6 - References

4 - General Discussion

The work developed in this thesis produced two main results. The first was the validation of a generic methodology for predicting protein-protein interactions, described in the methodology chapter. The second was obtained by applying this validated methodology to predict the interaction networks of nine strains of Corynebacterium pseudotuberculosis biovar ovis.

In the first study, we aimed to identify and validate metrics, extracted from the values of BLASTp alignments, that could be used to distinguish false from true interactions. To this end, we used the public database DIP, which contains experimental and curated interactions, as the gold standard. We also used the public databases (pDB) STRING, IntAct and PSIbase to map the interactions. Thus, using BLASTp and the amino acid sequences of each interaction in FASTA format, we performed the reciprocal alignment, then mapped and transferred the interactions found in the pDB to DIP. With DIP as our gold standard, we counted the false and true interactions statistically. Because DIP contains only true interactions, the set of negative interactions was created by randomly pairing identifiers from the same database, at five times the number of positive interactions. For this purpose, we generated two distinct datasets to be evaluated, both containing the reciprocal alignments between the pDB and DIP generated by BLASTp. In the first set of alignments, only the first BLASTp alignment was considered, on the grounds that it is the most likely to correspond to a homologous protein.
In the second set of alignments, the first 20 BLASTp alignments were considered, in order to identify additional alignments between homologous proteins. For both datasets, the alignment values returned by BLASTp were retrieved, namely the score, e-value, bit score, similarity, identity and coverage. In addition, we generated subsets with combinations of the values obtained from the BLASTp alignments. In total, 42 distinct subsets of predictions were generated for evaluation (two datasets with seven metrics for three pDB). Each subset, or combination of subsets, was evaluated with the Receiver Operating Characteristic (ROC) curve, in order to identify the metric with the largest Area Under the Curve (AUC), that is, the one that best distinguishes true from false interactions. We thereby identified, for each pDB, the BLASTp alignment values that contribute most to the predictions.

The combination of the identity and coverage values extracted from the alignments formed the best metric, corresponding to an AUC of 0.96 for an individual pDB and an AUC of 0.93 for the combination of pDB. The cutoff of 0.70 for the identity-times-coverage metric corresponds to a specificity of 0.95 and a sensitivity of 0.90, showing that our method predicts protein-protein interactions efficiently. In addition, instead of using only the first BLASTp alignment, we used the first 20 alignments, increasing the number of predicted interaction pairs and the coverage of the interaction network. Consequently, we also sharply increased the number of alignments and interaction pairs to be handled and processed. Using more than one BLASTp alignment generates redundancy in the predicted interaction pairs, both across the pDB and among the homologous proteins contained in the 20 BLASTp alignments.
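The identity-times-coverage cutoff and the redundancy among predicted pairs can be illustrated with a short Python sketch. It is a simplified illustration, not the pipeline used in this work. It assumes that identity and coverage are expressed as fractions in [0, 1], that the 0.70 cutoff is inclusive, and that each candidate pair carries a single identity and coverage value; how the values of the two partner alignments are combined is not stated in this section. The function names and the toy data are hypothetical.

```python
def score(identity, coverage):
    """Identity-times-coverage metric; both inputs are fractions in [0, 1]."""
    return identity * coverage


def filter_and_deduplicate(candidates, cutoff=0.70):
    """Keep pairs scoring at or above the cutoff, one entry per unordered pair.

    candidates: iterable of (protein_a, protein_b, identity, coverage).
    Returns a dict mapping frozenset({a, b}) to the best score seen.
    """
    best = {}
    for a, b, identity, coverage in candidates:
        s = score(identity, coverage)
        if s < cutoff:
            continue  # below the cutoff: treated as a likely false interaction
        pair = frozenset((a, b))  # (a, b) and (b, a) are the same interaction
        if s > best.get(pair, 0.0):
            best[pair] = s
    return best


# Hypothetical candidates, e.g. the same pair mapped from two different pDB.
candidates = [
    ("A", "B", 0.90, 0.95),
    ("B", "A", 0.80, 0.90),  # redundant with the first entry
    ("A", "C", 0.60, 0.90),  # 0.54, below the cutoff
    ("C", "D", 1.00, 0.70),  # exactly at the cutoff, kept
]
predicted = filter_and_deduplicate(candidates)
print(len(predicted))
```

Collapsing redundant pairs early, while keeping only the best score, is one way to limit the volume of data that has to be handled when 20 alignments per protein are used.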
From a technological standpoint, this amount of non-useful data can cause problems, requiring more powerful computers or a more efficient algorithm for processing.

In the second study, we applied this methodology with the validated metrics to generate the interaction networks for nine strains of C. pseudotuberculosis (Cp) biovar ovis. Following the methodology, we ran the reciprocal alignment of the nine Cp strains against the pDB, identified the interaction pairs, and used the identity-times-coverage values extracted from the BLASTp alignments to compute the metric and generate the interaction networks. As a result, approximately 16,000 interaction pairs were predicted for each Cp strain, ~99% of them mapped from the genus Corynebacterium, that is, from a phylogenetically close organism, which increases the biological likelihood that the predicted interactions actually occur in Cp. Of these predicted interaction pairs, 15,495 are conserved among the nine strains of Cp biovar ovis. This set of conserved interactions was used for the cluster analysis and the identification of essential proteins. First, however, we took care to validate the predicted interaction networks and check whether they had the characteristics of biological networks. We therefore assessed the predicted networks with respect to shortest path and checked whether the node degree followed a scale-free distribution approximating a power law. Both topological analyses suggest that all predicted interaction networks have characteristics of biological networks.

In addition, we checked whether the predicted interaction networks could have been generated at random.
Thus, we submitted the generated interaction networks to the Shapiro-Wilk normality test, which rejected the hypothesis that the interaction networks follow a normal distribution, with a p-value < 2.2e-16 (Shapiro and Wilk, 1965). We also compared the predicted interaction networks with randomly generated interaction networks. In this comparison, the values of the clustering coefficient, correlation and R² differ markedly between the two types of networks, suggesting that the predicted networks were not formed by spurious or random interactions but carry a biological bias, possibly due to the evolutionary pressure exerted on these interactions in the organism. Moreover, the high clustering coefficient suggests self-organization in the Cp cell driven by the interactions (Galeota et al., 2015).

Confident that we were analyzing biological interaction networks, we proceeded with the analysis of the protein clusters and the essential proteins. Among the clusters found, we selected the five with the largest number of proteins to be analyzed with support from the literature; they are formed mainly by proteins of (i) the ribosome and RNA polymerase, (ii) the oligopeptide transport system, (iii) cobalamin biosynthesis, (iv) iron acquisition and intracellular regulation, and (v) cell division and cell wall biosynthesis. In analyzing the clusters, the biological bias acting on them and on the interactions is identified and supported by descriptions in the literature and by characterization with experimental methods, even if in other, phylogenetically close organisms. This systems-biology knowledge obtained from the literature can then be transferred to Cp via the interaction network, enabling a better understanding of the organism.
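The comparison between a predicted network and a randomly generated one, described earlier in this discussion, can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the analysis performed in this work: the random model (uniform sampling of the same number of edges over the same nodes), the function names and the toy network are assumptions made for the example.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def average_clustering(edges):
    """Mean local clustering coefficient over nodes that have at least one edge.

    Nodes with fewer than two neighbours contribute 0 to the mean.
    """
    adjacency = build_adjacency(edges)
    if not adjacency:
        return 0.0
    total = 0.0
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count the edges that exist among the neighbours of this node.
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adjacency)


def random_network(nodes, n_edges, seed=0):
    """Sample n_edges distinct node pairs uniformly at random (assumed null model)."""
    rng = random.Random(seed)
    all_pairs = list(combinations(sorted(nodes), 2))
    return rng.sample(all_pairs, n_edges)


# Hypothetical toy network: two triangles joined by one edge.
toy = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
       ("d", "e"), ("e", "f"), ("d", "f")]
nodes = {n for edge in toy for n in edge}
shuffled = random_network(nodes, len(toy), seed=1)
print(average_clustering(toy), average_clustering(shuffled))
```

Enumerating all node pairs is quadratic in the number of proteins, which is acceptable for a bacterial proteome of a few thousand proteins but would need a different sampler for much larger networks.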
Likewise, the lack of information in the literature about some interactions makes protein-protein interaction networks an important tool for better analyzing and understanding the cellular behavior of Cp, allowing new hypotheses to be raised and new laboratory experiments to be directed at testing the druggability and essentiality of these proteins and interactions.

Among the 15,495 interactions conserved across the nine interaction networks predicted for Cp, 181 essential proteins were identified, considering mainly the interaction degree (Khuri and Wuchty, 2015); they participate mainly in carbon metabolism, cell envelope and cell wall synthesis, nucleotide biosynthesis, folding, translocation, ribosome formation, transcription factors, tRNA synthesis, RNA metabolism and the respiratory metabolic pathway. Among these essential proteins, only the DNA repair protein (RecN) was not identified as essential in the DEG database. While most of the essential proteins have homologs in more than 20 DEG organisms, three other Cp essential proteins showed homology with only one DEG organism: catalase (KatA), endonuclease III (Nth) and trigger factor (Tig). This can be explained by the fact that essentiality is not always conserved between species (Caufield et al., 2015). Among the essential proteins, 41 showed no homology to their hosts, making them good candidates for use in diagnosis or as drug targets.

Beyond the identification of clusters and essential proteins, interaction networks can be used together with other experimental techniques to help interpret results. Thus, using the protein-protein interaction network generated for C. pseudotuberculosis strain 1002, we identified the interactions among the proteins with low and high expression, as well as the exclusively expressed proteins, under nitrosative stress.
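The degree-based selection of essential-protein candidates and the subsequent host-homology filter discussed above can be sketched in Python. This is a minimal illustration and not the procedure used in this work: the degree threshold, the function names and the toy identifiers are hypothetical, and the set of proteins with host hits stands in for the result of a BLASTp search against the host proteomes.

```python
def degree_by_protein(interactions):
    """Count distinct interaction partners for every protein."""
    partners = {}
    for a, b in interactions:
        if a == b:
            continue  # self-interactions do not add a partner
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {protein: len(p) for protein, p in partners.items()}


def hub_candidates(interactions, min_degree):
    """Proteins with at least min_degree partners, highest degree first."""
    degrees = degree_by_protein(interactions)
    hubs = [p for p, k in degrees.items() if k >= min_degree]
    return sorted(hubs, key=lambda p: (-degrees[p], p))


def without_host_homologs(candidates, proteins_with_host_hits):
    """Drop candidates that have a homolog in any host (assumed precomputed)."""
    hits = set(proteins_with_host_hits)
    return [p for p in candidates if p not in hits]


# Hypothetical conserved interactions; identifiers are placeholders.
conserved = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P5", "P1")]
hubs = hub_candidates(conserved, min_degree=2)
targets = without_host_homologs(hubs, {"P1"})
print(hubs, targets)
```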
The systemic view of the proteins involved in the stress condition, provided by the interaction network, helped interpret the results of the comparative proteomics experiment. Looking at the interaction networks in more detail, and considering the results obtained during this thesis, it is clear that many other studies derived from or combined with the interaction networks can be developed, whether experimental or computational.

5 - Conclusion and Perspectives

In this work, we analyzed and validated a set of metrics able to efficiently map orthologous interactions from public databases, while also increasing the coverage of an interaction network. For the first time, we used this validated methodology to map the protein-protein interactions of nine strains of C. pseudotuberculosis biovar ovis. In addition, we generated the interaction network of the differentially and exclusively expressed genes to help interpret the results of a comparative proteomics experiment. More important than the statistical validation applied to the predicted networks, which shows that they have characteristics of biological networks, is the biological evidence, supported by the literature, found in the analysis of the clusters and essential proteins. Thus, the method for predicting protein-protein interaction networks proves to be an important tool for biologists to study and understand organisms of interest at the systems biology level, as well as a valuable tool for predicting essential proteins, with potential use in diagnosis or as drug targets. In this work, besides the 181 predicted essential proteins, there are approximately 15,000 interactions conserved among the nine strains to be explored experimentally and to generate future work.
Among the prospects for future work, experimental or computational, we can mention:
- Study the identified clusters and interactions that have biological relevance, aiming to better understand C. pseudotuberculosis and its pathogenicity and to direct new laboratory work (Marsh et al., 2013);
- Experimentally test the identified essential proteins;
- Re-annotate hypothetical proteins based on the function of their interaction partners found in the network (Peng et al., 2014; Hao et al., 2015);
- Develop a public database to make the predicted interactions for C. pseudotuberculosis available, along with an efficient and user-friendly way to visualize them;
- Apply the developed methodology to predict the protein-protein interaction networks of other organisms of biotechnological interest;
- Considering the assemblies generated for the newly sequenced genomes of biovar equi, predict the protein-protein interactions and compare the differences and similarities with biovar ovis;
- Cross the interaction networks with data generated by RNA-Seq, SNP, proteomics or other biological experiments, aiming to extract information and understand how these proteins interact and cooperate under the conditions tested.

In this regard, in collaboration with Dr. Wanderson Marques Silva, work is in progress to characterize the total proteome of strains 1002 (biovar ovis) and 258 (biovar equi) and to explore the differences between the two biovars that may provide data on the biology of this pathogen. Proteomics experiments have already been performed and approximately 1,321 proteins of C. pseudotuberculosis have been characterized. These proteins will be analyzed in the interaction networks considering their expression level and whether they belong to the core interactome or to the specific interactome of each biovar.

Bibliography

ABEBE, D.; SISAY TESSEMA, T.
Determination of Corynebacterium pseudotuberculosis prevalence and antimicrobial susceptibility pattern of isolates from lymph nodes of sheep and goat at organic export abattoir, Modjo, Ethiopia. Letters in Applied Microbiology, 2015. ISSN 1472-765X.
ADÉKAMBI, T.; DRANCOURT, M.; RAOULT, D. The rpoB gene as a tool for clinical microbiologists. Trends in microbiology, v. 17, n. 1, p. 37-45, 2009. ISSN 0966-842X.
ALLEN, C. E.; SCHMITT, M. P. Novel hemin binding domains in the Corynebacterium diphtheriae HtaA protein interact with hemoglobin and are critical for heme iron utilization by HtaA. Journal of bacteriology, v. 193, n. 19, p. 5374-5385, 2011. ISSN 0021-9193.
ANH, N. H. et al. Discovery of pathways in protein-protein interaction networks using a genetic algorithm. Data & Knowledge Engineering, 2015. ISSN 0169-023X.
ASSENOV, Y. et al. Computing topological parameters of biological networks. Bioinformatics, v. 24, n. 2, p. 282-284, 2008. ISSN 1367-4803.
BAIRD, G. J.; FONTAINE, M. C. Corynebacterium pseudotuberculosis and its Role in Ovine Caseous Lymphadenitis. Journal of comparative pathology, v. 137, n. 4, p. 179-210, 2007. ISSN 0021-9975.
BARABÁSI, A. L.; OLTVAI, Z. N. Network biology: understanding the cell's functional organization. Nature Reviews Genetics, v. 5, n. 2, p. 101-113, 2004. ISSN 1471-0056.
BARH, D. et al. Conserved host–pathogen PPIs Globally conserved inter-species bacterial PPIs based conserved host-pathogen interactome derived novel target in C. pseudotuberculosis, C. diphtheriae, M. tuberculosis, C. ulcerans, Y. pestis, and E. coli targeted by Piper betel compounds. Integrative Biology, v. 5, n. 3, p. 495-509, 2013.
BETUL, K.; ERIC, A. Experimental evolution of protein-protein interaction networks. Biochemical Journal, v. 453, n. 3, p. 311-319, 2013. ISSN 1470-8728.
BRAIBANT, M.; GILOT, P. The ATP binding cassette (ABC) transport systems of Mycobacterium tuberculosis. FEMS microbiology reviews, v. 24, n. 4, p. 449-467, 2000. ISSN 1574-6976.
BRAUN, P.; GINGRAS, A. C. History of protein–protein interactions: From egg‐white to complex networks. Proteomics, v. 12, n. 10, p. 1478-1498, 2012. ISSN 1615-9861.
BROWN, S. D. et al. Molecular dynamics of the Shewanella oneidensis response to chromate stress. Molecular & Cellular Proteomics, v. 5, n. 6, p. 1054-1071, 2006. ISSN 1535-9476.
BUSS, J. et al. A Multi-layered Protein Network Stabilizes the Escherichia coli FtsZ-ring and Modulates Constriction Dynamics. 2015. ISSN 1553-7404.
BUTLER, W.; AHEARN, D.; KILBURN, J. High-performance liquid chromatography of mycolic acids as a tool in the identification of Corynebacterium, Nocardia, Rhodococcus, and Mycobacterium species. Journal of clinical microbiology, v. 23, n. 1, p. 182-185, 1986. ISSN 0095-1137.
CAMACHO, C. et al. BLAST+: architecture and applications. BMC bioinformatics, v. 10, n. 1, p. 421, 2009. ISSN 1471-2105.
CARBALLIDO-LÓPEZ, R.; ERRINGTON, J. A dynamic bacterial cytoskeleton. Trends in cell biology, v. 13, n. 11, p. 577-583, 2003. ISSN 0962-8924.
CASTRO-ROA, D.; ZENKIN, N. In vitro experimental system for analysis of transcription–translation coupling. Nucleic acids research, v. 40, n. 6, p. e45-e45, 2012. ISSN 0305-1048.
CAUFIELD, J. H. et al. Protein Complexes in Bacteria. PLOS Computational Biology, v. 11, n. 2, 2015. ISSN 1553-734X.
CERDEIRA, L. T. et al. Whole-genome sequence of Corynebacterium pseudotuberculosis PAT10 strain isolated from sheep in Patagonia, Argentina. Journal of bacteriology, v. 193, n. 22, p. 6420-6421, 2011. ISSN 0021-9193.
COENYE, T.; VANDAMME, P. Organisation of the S10, spc and alpha ribosomal protein gene clusters in prokaryotic genomes. FEMS microbiology letters, v. 242, n. 1, p. 117-126, 2005. ISSN 0378-1097.
COLOM-CADENA, A. et al. Management of a caseous lymphadenitis outbreak in a new Iberian ibex (Capra pyrenaica) stock reservoir. Acta Veterinaria Scandinavica, v. 56, n. 1, p. 83, 2014. ISSN 1751-0147.
CONTRERAS, H. et al.
Heme uptake in bacterial pathogens. Current opinion in chemical biology, v. 19, p. 34-41, 2014. ISSN 1367-5931.
CORRENTI, C.; STRONG, R. K. Mammalian siderophores, siderophore-binding lipocalins, and the labile iron pool. Journal of Biological Chemistry, v. 287, n. 17, p. 13524-13531, 2012. ISSN 0021-9258.
CROFT, M. T. et al. Algae acquire vitamin B12 through a symbiotic relationship with bacteria. Nature, v. 438, n. 7064, p. 90-93, 2005. ISSN 0028-0836.
CUI, T.; HE, Z.-G. Improved understanding of pathogenesis from protein interactions in Mycobacterium tuberculosis. Expert review of proteomics, n. 0, p. 1-11, 2014. ISSN 1478-9450.
CUTLER, R. G. Oxidative stress and aging: catalase is a longevity determinant enzyme. Rejuvenation research, v. 8, n. 3, p. 138-140, 2005. ISSN 1549-1684.
DAI, Q.-G. et al. CPL: Detecting Protein Complexes by Propagating Labels on Protein-Protein Interaction Network. Journal of Computer Science and Technology, v. 29, n. 6, p. 1083-1093, 2014. ISSN 1000-9000.
DALL, H. P. et al. Omics profiles used to evaluate the gene expression of Exiguobacterium antarcticum B7 during cold adaptation. BMC genomics, v. 15, n. 1, p. 986, 2014. ISSN 1471-2164.
DE LAS RIVAS, J.; FONTANILLO, C. Protein–protein interaction networks: unraveling the wiring of molecular machines within the cell. Briefings in Functional Genomics, 2012. ISSN 2041-2649.
DEN BLAAUWEN, T.; ANDREU, J. M.; MONASTERIO, O. Bacterial cell division proteins as antibiotic targets. Bioorganic chemistry, v. 55, p. 27-38, 2014. ISSN 0045-2068.
DEUERLING, E. et al. Trigger factor and DnaK cooperate in folding of newly synthesized proteins. Nature, v. 400, n. 6745, p. 693-696, 1999. ISSN 0028-0836.
DORELLA, F. A. et al. Corynebacterium pseudotuberculosis: microbiology, biochemical properties, pathogenesis and molecular studies of virulence. Veterinary research, v. 37, n. 2, p. 201-218, 2006. ISSN 0928-4249.
EISEN, J. A.; HANAWALT, P. C.
A phylogenomic study of DNA repair genes, proteins, and processes. Mutation Research/DNA Repair, v. 435, n. 3, p. 171-213, 1999. ISSN 0921-8777.
EL ZOEIBY, A.; SANSCHAGRIN, F.; LEVESQUE, R. C. Structure and function of the Mur enzymes: development of novel inhibitors. Molecular microbiology, v. 47, n. 1, p. 1-12, 2003. ISSN 1365-2958.
ERRINGTON, J.; DANIEL, R. A.; SCHEFFERS, D.-J. Cytokinesis in bacteria. Microbiology and Molecular Biology Reviews, v. 67, n. 1, p. 52-65, 2003. ISSN 1092-2172.
ESTRADA, E. Virtual identification of essential proteins within the protein interaction network of yeast. Proteomics, v. 6, n. 1, p. 35-40, 2006. ISSN 1615-9861.
FLÓREZ, A. et al. Protein network prediction and topological analysis in Leishmania major as a tool for drug target selection. BMC bioinformatics, v. 11, n. 1, p. 484, 2010. ISSN 1471-2105.
FOLADOR, E. L. et al. An improved interolog mapping-based computational prediction of protein-protein interactions with increased network coverage. Integrative Biology, 2014.
FRANCESCHINI, A. et al. STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic acids research, v. 41, n. D1, p. D808-D815, 2013. ISSN 0305-1048.
FRANKENBERG, N.; MOSER, J.; JAHN, D. Bacterial heme biosynthesis and its biotechnological application. Applied microbiology and biotechnology, v. 63, n. 2, p. 115-127, 2003. ISSN 0175-7598.
GALEOTA, E. et al. The hierarchical organization of natural protein interaction networks confers self-organization properties on pseudocells. BMC Systems Biology, v. 9, n. Suppl 3, p. S3, 2015. ISSN 1752-0509.
GARMA, L. et al. How Many Protein-Protein Interactions Types Exist in Nature? PloS one, v. 7, n. 6, p. e38913, 2012. ISSN 1932-6203.
GONZALEZ, M. W.; KANN, M. G. Protein interactions and disease. PLoS computational biology, v. 8, n. 12, p. e1002819, 2012. ISSN 1553-7358.
GOWTHAMAN, R.; LYSKOV, S.; KARANICOLAS, J.
DARC 2.0: Improved Docking and Virtual Screening at Protein Interaction Sites. PloS one, v. 10, n. 7, p. e0131612, 2015. ISSN 1932-6203.
GÓRSKA, A.; SLODERBACH, A.; MARSZAŁŁ, M. P. Siderophore–drug complexes: potential medicinal applications of the ‘Trojan horse’ strategy. Trends in pharmacological sciences, v. 35, n. 9, p. 442-449, 2014. ISSN 0165-6147.
HADDADIN, F. A. T.; HARCUM, S. W. Transcriptome profiles for high‐cell‐density recombinant and wild‐type Escherichia coli. Biotechnology and bioengineering, v. 90, n. 2, p. 127-153, 2005. ISSN 1097-0290.
HAN, J.-D. J. et al. Evidence for dynamically organized modularity in the yeast protein–protein interaction network. Nature, v. 430, n. 6995, p. 88-93, 2004. ISSN 0028-0836.
HAO, T. et al. Function Annotation of Proteins in Eriocheir sinensis Based on the Protein-Protein Interaction Network. The Proceedings of the Third International Conference on Communications, Signal Processing, and Systems, 2015, Springer. p. 831-837.
HARIHARAN, H. et al. Serological detection of caseous lymphadenitis in sheep and goats using a commercial ELISA in Grenada, West Indies. 2014.
HASSAN, S. S. et al. Complete genome sequence of Corynebacterium pseudotuberculosis biovar ovis strain P54B96 isolated from antelope in South Africa obtained by Rapid Next Generation Sequencing Technology. Standards in genomic sciences, v. 7, n. 2, p. 189, 2012.
______. Proteome scale comparative modeling for conserved drug and vaccine targets identification in Corynebacterium pseudotuberculosis. BMC genomics, v. 15, n. Suppl 7, p. S3, 2014. ISSN 1471-2164.
HELDT, D. et al. Aerobic synthesis of vitamin B12: ring contraction and cobalt chelation. Biochemical Society Transactions, v. 33, n. 4, p. 815-819, 2005. ISSN 0300-5127.
HERMJAKOB, H. et al. IntAct: an open source molecular interaction database. Nucleic acids research, v. 32, n. suppl 1, p. D452-D455, 2004. ISSN 0305-1048.
HIRON, A. et al.
Only one of four oligopeptide transport systems mediates nitrogen nutrition in Staphylococcus aureus. Journal of bacteriology, v. 189, n. 14, p. 5119-5129, 2007. ISSN 0021-9193.
HÄUSER, R. et al. A Second-generation Protein–Protein Interaction Network of Helicobacter pylori. Molecular & Cellular Proteomics, v. 13, n. 5, p. 1318-1329, 2014. ISSN 1535-9476.
HÉMOND, V. et al. Lymphadénite axillaire à Corynebacterium pseudotuberculosis chez une patiente de 63 ans. Médecine et maladies infectieuses, v. 39, n. 2, p. 136-139, 2009. ISSN 0399-077X.
IKEDA, M. Towards bacterial strains overproducing L-tryptophan and other aromatics by metabolic engineering. Applied microbiology and biotechnology, v. 69, n. 6, p. 615-626, 2006. ISSN 0175-7598.
IVANOVIĆ, S. et al. Caseous lymphadenitis in goats. Biotechnology in Animal Husbandry, v. 25, n. 5-6-2, p. 999-1007, 2009. ISSN 1450-9156.
JEONG, H. et al. Lethality and centrality in protein networks. arXiv preprint cond-mat/0105306, 2001.
JONES, M. M. et al. Role of the Oligopeptide Permease ABC Transporter of Moraxella catarrhalis in Nutrient Acquisition and Persistence in the Respiratory Tract. Infection and immunity, v. 82, n. 11, p. 4758-4766, 2014. ISSN 0019-9567.
JUNG, B. Y. et al. Serology and clinical relevance of Corynebacterium pseudotuberculosis in native Korean goats (Capra hircus coreanae). Tropical Animal Health and Production, p. 1-5, ISSN 0049-4747.
______. Serology and clinical relevance of Corynebacterium pseudotuberculosis in native Korean goats (Capra hircus coreanae). Tropical animal health and production, v. 47, n. 4, p. 657-661, 2015. ISSN 0049-4747.
KHURI, S.; WUCHTY, S. Essentiality and centrality in protein interaction networks revisited. BMC Bioinformatics, v. 16, n. 1, p. 109, 2015. ISSN 1471-2105.
KOHL, M.; WIESE, S.; WARSCHEID, B. Cytoscape: software for visualization and analysis of biological networks. In: (Ed.). Data Mining in Proteomics: Springer, 2011. p. 291-303. ISBN 1607619865.
KUNKLE, C. A.; SCHMITT, M. P. Analysis of a DtxR-regulated iron transport and siderophore biosynthesis gene cluster in Corynebacterium diphtheriae. Journal of bacteriology, v. 187, n. 2, p. 422-433, 2005. ISSN 0021-9193.
KÖSTER, W. ABC transporter-mediated uptake of iron, siderophores, heme and vitamin B12. Research in microbiology, v. 152, n. 3, p. 291-301, 2001. ISSN 0923-2508.
LAGE, K. Protein-protein interactions and genetic diseases: The Interactome. Biochimica et Biophysica Acta (BBA)-Molecular Basis of Disease, 2014. ISSN 0925-4439.
LI, H. et al. A Computational Method to Identify Druggable Binding Sites That Target Protein-Protein Interactions. 2014.
LI, M. et al. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data. BMC systems biology, v. 6, n. 1, p. 15, 2012. ISSN 1752-0509.
LIU, Z.-P. et al. Inferring a protein interaction map of Mycobacterium tuberculosis based on sequences and interologs. BMC bioinformatics, v. 13, n. Suppl 7, p. S6, 2012. ISSN 1471-2105.
LO, Y. et al. Reconstructing genome-wide protein-protein interaction networks using multiple strategies with homologous mapping. PloS one, v. 10, n. 1, p. e0116347, 2015. ISSN 1932-6203.
LOPES, T. et al. Complete Genome Sequence of Corynebacterium pseudotuberculosis Strain Cp267, Isolated from a Llama. Journal of bacteriology, v. 194, n. 13, p. 3567-3568, 2012. ISSN 0021-9193.
LUO, H. et al. DEG 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements. Nucleic acids research, v. 42, n. D1, p. D574-D580, 2014. ISSN 0305-1048.
LUTKENHAUS, J.; ADDINALL, S. Bacterial cell division and the Z ring. Annual review of biochemistry, v. 66, n. 1, p. 93-116, 1997. ISSN 0066-4154.
MARSH, J. A. et al. Protein complexes are under evolutionary selection to assemble via ordered pathways. Cell, v. 153, n. 2, p. 461-470, 2013. ISSN 0092-8674.
MARTÍN, J. F. et al.
Ribosomal RNA and ribosomal proteins in corynebacteria. J. Biotechnol, v. 104, p. 41-53, 2003.
MCGARY, K.; NUDLER, E. RNA polymerase and the ribosome: the close relationship. Current opinion in microbiology, v. 16, n. 2, p. 112-117, 2013. ISSN 1369-5274.
MILSE, J. et al. Transcriptional response of Corynebacterium glutamicum ATCC 13032 to hydrogen peroxide stress and characterization of the OxyR regulon. Journal of biotechnology, v. 190, p. 40-54, 2014. ISSN 0168-1656.
MIRA, C. et al. Epidemiological and Histopathological Studies on Caseous Lymphadenitis in Slaughtered Goats in Algeria. lung, v. 6, p. 26.5, 2014.
MITRA, A. Biology, Genetic Aspects, and Oxidative Stress Response of Streptomyces and Strategies for Bioremediation of Toxic Metals. Microbial Biodegradation and Bioremediation, p. 287, 2014. ISSN 0128004827.
MONNET, V. Bacterial oligopeptide-binding proteins. Cellular and Molecular Life Sciences CMLS, v. 60, n. 10, p. 2100-2114, 2003. ISSN 1420-682X.
MOORE, S.; WARREN, M. The anaerobic biosynthesis of vitamin B12. Biochemical Society Transactions, v. 40, n. 3, p. 581, 2012. ISSN 0300-5127.
MORA, A.; DONALDSON, I. M. Effects of protein interaction data integration, representation and reliability on the use of network properties for drug target prediction. BMC bioinformatics, v. 13, n. 1, p. 294, 2012. ISSN 1471-2105.
MORAES, P. M. et al. Characterization of the Opp Peptide Transporter of Corynebacterium pseudotuberculosis and Its Role in Virulence and Pathogenicity. BioMed research international, v. 2014, 2014. ISSN 2314-6133.
MORRIS, J. H. et al. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC bioinformatics, v. 12, n. 1, p. 436, 2011. ISSN 1471-2105.
MOSCA, R. et al. Towards a detailed atlas of protein–protein interactions. Current opinion in structural biology, v. 23, n. 6, p. 929-940, 2013. ISSN 0959-440X.
MULDER, N. J. et al. Using biological networks to improve our understanding of infectious diseases.
Computational and Structural Biotechnology Journal, 2014. ISSN 2001-0370.
NAIDER, F.; BECKER, J. M. Multiplicity of oligopeptide transport systems in Escherichia coli. Journal of bacteriology, v. 122, n. 3, p. 1208-1215, 1975. ISSN 0021-9193.
NELSON, D.; COX, M. Lehninger, Princípios de Bioquímica. Sarvier, v. 3ª edição, São Paulo, p. 202, 2002.
ORCHARD, S. et al. Protein interaction data curation: the International Molecular Exchange (IMEx) consortium. Nature methods, v. 9, n. 4, p. 345-350, 2012. ISSN 1548-7091.
OREIBY, A. et al. Caseous lymphadenitis in small ruminants in Egypt. Tierärztliche Praxis Großtiere, v. 42, n. 5, p. 271-277, 2014. ISSN 1434-1220.
OSMAN, A. Y. et al. Caseous Lymphadenitis in a Goat: A Case Report. International Journal of Livestock Research, v. 5, n. 3, p. 128-132, 2015.
PARK, H.-S. et al. Transcriptomic analysis of Corynebacterium glutamicum in the response to the toxicity of furfural present in lignocellulosic hydrolysates. Process Biochemistry, 2014. ISSN 1359-5113.
PELAY‐GIMENO, M. et al. Structure‐Based Design of Inhibitors of Protein–Protein Interactions: Mimicking Peptide Binding Epitopes. Angewandte Chemie International Edition, 2015. ISSN 1521-3773.
PENG, W. et al. Improving protein function prediction using domain and protein complexes in PPI networks. BMC systems biology, v. 8, n. 1, p. 35, 2014. ISSN 1752-0509.
PETHICK, F. E. et al. Complete Genome Sequences of Corynebacterium pseudotuberculosis Strains 3/99-5 and 42/02-A, Isolated from Sheep in Scotland and Australia, Respectively. Journal of Bacteriology, v. 194, n. 17, p. 4736-4737, 2012. ISSN 0021-9193.
PINTO, A. C. et al. Differential transcriptional profile of Corynebacterium pseudotuberculosis in response to abiotic stresses. BMC genomics, v. 15, n. 1, p. 14, 2014. ISSN 1471-2164.
RESENDE, B. et al. DNA repair in Corynebacterium model. Gene, v. 482, n. 1, p. 1-7, 2011. ISSN 0378-1119.
REZENDE, A. M. et al.
Computational Prediction of Protein-Protein Interactions in Leishmania Predicted Proteomes. PloS one, v. 7, n. 12, p. e51304, 2012. ISSN 1932-6203. RODIONOV, D. A. et al. Comparative genomics of the vitamin B12 metabolism and regulation in prokaryotes. Journal of Biological Chemistry, v. 278, n. 42, p. 41148-41159, 2003. ISSN 0021-9258. RONDON, M. R.; TRZEBIATOWSKI, J. R.; ESCALANTE-SEMERENA, J. C. Biochemistry and molecular genetics of cobalamin biosynthesis. Progress in nucleic acid research and molecular biology, v. 56, p. 347-384, 1996. ISSN 0079-6603. ROSTAS, K. et al. Nucleotide sequence and LexA regulation of the Escherichia coli recN gene. Nucleic acids research, v. 15, n. 13, p. 5041-5049, 1987. ISSN 0305-1048. ROTH, J.; LAWRENCE, J.; BOBIK, T. Cobalamin (coenzyme B12): synthesis and biological significance. Annual Reviews in Microbiology, v. 50, n. 1, p. 137-181, 1996. ISSN 0066- 4227. ROYSTON, J. An extension of Shapiro and Wilk's W test for normality to large samples. Applied Statistics, p. 115-124, 1982. ISSN 0035-9254. RUIZ, J. C. et al. Evidence for reductive genome evolution and lateral acquisition of virulence functions in two Corynebacterium pseudotuberculosis strains. PLoS One, v. 6, n. 4, p. e18551, 2011. ISSN 1932-6203. SAHBANI, S. K. et al. The relative contributions of DNA strand breaks, base damage and clustered lesions to the loss of DNA functionality induced by ionizing radiation. Radiation research, v. 181, n. 1, p. 99-110, 2014. ISSN 0033-7587. SAITO, Y. et al. Characterization of endonuclease III (nth) and endonuclease VIII (nei) mutants of Escherichia coli K-12. Journal of bacteriology, v. 179, n. 11, p. 3783-3785, 1997. ISSN 0021-9193. SAKMANOĞLU, A. et al. Identification and antimicrobial susceptibility of Corynebacterium pseudotuberculosis isolated from sheep. Eurasian Journal of Veterinary Sciences, v. 31, n. 2, p. 116-121, 2015. ISSN 1309-6958. SANTAROSA, B. P. et al. 
MENINGOENCEFALITE SUPURATIVA POR Corynebacterium pseudotuberculosis EM CABRA COM LINFADENITE CASEOSA: RELATO DE CASO. Veterinária e Zootecnia, v. 21, n. 4, p. 537-542, 2015. ISSN 2178-3764. clxxx SCHALK, I. J. Innovation and Originality in the Strategies Developed by Bacteria To Get Access to Iron. Chembiochem, v. 14, n. 3, p. 293-294, 2013. ISSN 1439-7633. SCOTT, A.; ROESSNER, C. Biosynthesis of cobalamin (vitamin B (12)). Biochemical Society Transactions, v. 30, n. 4, p. 613-620, 2002. ISSN 0300-5127. SELIM, S. Oedematous skin disease of buffalo in Egypt. Journal of Veterinary Medicine, Series B, v. 48, n. 4, p. 241-258, 2001. ISSN 1439-0450. SERAFINI, D. M.; SCHELLHORN, H. E. Endonuclease III and endonuclease IV protect Escherichia coli from the lethal and mutagenic effects of near-UV irradiation. Canadian journal of microbiology, v. 45, n. 7, p. 632-637, 1999. ISSN 0008-4166. SEYFFERT, N. et al. High seroprevalence of caseous lymphadenitis in Brazilian goat herds revealed by< i> Corynebacterium pseudotuberculosis secreted proteins-based ELISA. Research in veterinary science, v. 88, n. 1, p. 50-55, 2010. ISSN 0034-5288. SHANNON, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research, v. 13, n. 11, p. 2498-2504, 2003. ISSN 1088-9051. SHAPIRO, S. S.; WILK, M. B. An analysis of variance test for normality (complete samples). Biometrika, p. 591-611, 1965. ISSN 0006-3444. SHARAN, R. et al. Conserved patterns of protein interaction in multiple species. Proceedings of the National Academy of Sciences of the United States of America, v. 102, n. 6, p. 1974-1979, 2005. ISSN 0027-8424. SHELDON, J. R.; HEINRICHS, D. E. Recent developments in understanding the iron acquisition strategies of gram positive pathogens. FEMS microbiology reviews, p. fuv009, 2015. ISSN 1574-6976. SHENG, C. et al. State-of-the-art strategies for targeting protein–protein interactions by small-molecule inhibitors. 
Chemical Society Reviews, 2015. SILVA, A. et al. Complete genome sequence of Corynebacterium pseudotuberculosis I19, a strain isolated from a cow in Israel with bovine mastitis. Journal of bacteriology, v. 193, n. 1, p. 323-324, 2011. ISSN 0021-9193. SILVA, W. M. et al. Label-free proteomic analysis to confirm the predicted proteome of Corynebacterium pseudotuberculosis under nitrosative stress mediated by nitric oxide. BMC genomics, v. 15, n. 1, p. 1065, 2014. ISSN 1471-2164. clxxxi SMID, E. J.; PLAPP, R.; KONINGS, W. Peptide uptake is essential for growth of Lactococcus lactis on the milk protein casein. Journal of bacteriology, v. 171, n. 11, p. 6135-6140, 1989. ISSN 0021-9193. SMITH, J. L. The physiological role of ferritin-like compounds in bacteria. Critical reviews in microbiology, v. 30, n. 3, p. 173-185, 2004. ISSN 1040-841X. SOARES, S. C. et al. The pan-genome of the animal pathogen Corynebacterium pseudotuberculosis reveals differences in genome plasticity between the biovar ovis and equi strains. PloS one, v. 8, n. 1, p. e53818, 2013. ISSN 1932-6203. SONGER, J. G. et al. Biochemical and genetic characterization of Corynebacterium pseudotuberculosis. American journal of veterinary research, v. 49, n. 2, p. 223-226, 1988. ISSN 0002-9645. STELZL, U. et al. Ribosomal proteins: role in ribosomal functions. eLS, 2001. ISSN 047001590X. SUKHODOLETS, M. V.; GARGES, S. Interaction of Escherichia coli RNA polymerase with the ribosomal protein S1 and the Sm-like ATPase Hfq. Biochemistry, v. 42, n. 26, p. 8022- 8034, 2003. ISSN 0006-2960. TANG, Y. et al. CytoNCA: A cytoscape plugin for centrality analysis and evaluation of protein interaction networks. Biosystems, 2014. ISSN 0303-2647. TAYLOR, I. W.; WRANA, J. L. Protein interaction networks in medicine and disease. Proteomics, v. 12, n. 10, p. 1706-1716, 2012. ISSN 1615-9861. TEIXEIRA, D. et al. The tufB–secE–nusG–rplKAJL–rpoB gene cluster of the liberibacters: sequence comparisons, phylogeny and speciation. 
International Journal of Systematic and Evolutionary Microbiology, v. 58, n. 6, p. 1414-1421, 2008. ISSN 1466-5026. TROST, E. et al. The complete genome sequence of Corynebacterium pseudotuberculosis FRC41 isolated from a 12-year-old girl with necrotizing lymphadenitis reveals insights into gene-regulatory networks contributing to virulence. BMC genomics, v. 11, n. 1, p. 728, 2010. ISSN 1471-2164. VAN DONGEN, S. A cluster algorithm for graphs. Report-Information systems, n. 10, p. 1- 40, 2000. ISSN 1386-3681. VILLOUTREIX, B. O. et al. Drug‐ and Opportunities for Drug Discovery and Chemical Biology. Molecular informatics, v. 33, n. 6‐7, p. 414-437, 2014. ISSN 1868-1751. clxxxii VOIGT, K. et al. Eradication of caseous lymphadenitis under extensive management conditions on a Scottish hill farm. Small Ruminant Research, 2012. ISSN 0921-4488. VOLLMER, W.; BLANOT, D.; DE PEDRO, M. A. Peptidoglycan structure and architecture. FEMS microbiology reviews, v. 32, n. 2, p. 149-167, 2008. ISSN 1574-6976. WALTER, B. M. et al. The LexA regulated genes of the Clostridium difficile. BMC microbiology, v. 14, n. 1, p. 88, 2014. ISSN 1471-2180. WANDERSMAN, C.; DELEPELAIRE, P. Bacterial iron sources: from siderophores to hemophores. Annu. Rev. Microbiol., v. 58, p. 611-647, 2004. ISSN 0066-4227. WANG, J. et al. Recent advances in clustering methods for protein interaction networks. BMC genomics, v. 11, n. Suppl 3, p. S10, 2010. ISSN 1471-2164. WETIE, A. G. N. et al. Protein–protein interactions: switch from classical methods to proteomics and bioinformatics-based approaches. Cellular and Molecular Life Sciences, v. 71, n. 2, p. 205-228, 2014. ISSN 1420-682X. WETIE, N. et al. Investigation of stable and transient protein–protein interactions: Past, present, and future. Proteomics, 2013. ISSN 1615-9861. WILLIAMSON, P.; NAIRN, M. E. Lesions caused by Corynebacterium pseudotuberculosis in the scrotum of rams. Australian Veterinary Journal, v. 56, n. 10, p. 496-498, 1980. ISSN 1751-0813. 
WINDSOR, P. A. Control of caseous lymphadenitis. Veterinary Clinics of North America: Food Animal Practice, v. 27, n. 1, p. 193-202, 2011. ISSN 0749-0720. XENARIOS, I. et al. DIP: the database of interacting proteins. Nucleic acids research, v. 28, n. 1, p. 289-291, 2000. ISSN 0305-1048. YIN, L.; BAUER, C. E. Controlling the delicate balance of tetrapyrrole biosynthesis. Philosophical Transactions of the Royal Society of London B: Biological Sciences, v. 368, n. 1622, p. 20120262, 2013. ISSN 0962-8436. ZHANG, R.; OU, H. Y.; ZHANG, C. T. DEG: a database of essential genes. Nucleic acids research, v. 32, n. suppl 1, p. D271-D272, 2004. ISSN 0305-1048. ZHANG, X.; XU, J.; XIAO, W.-X. A New Method for the Discovery of Essential Proteins. PloS one, v. 8, n. 3, p. e58763, 2013. ISSN 1932-6203. clxxxiii ZHOU, H. et al. Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions. Biol Direct, v. 9, n. 5, 2014. ZORAGHI, R.; REINER, N. E. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology, v. 16, n. 5, p. 566-572, 2013. ISSN 1369-5274. clxxxiv Anexos clxxxv I - C. pseudotuberculosis Phop confers virulence and may be targeted by natural compounds Sandeep Tiwari, Marcília Pinheiro da Costa, Sintia Almeida, Syed Shah Hassan, Syed Babar Jamal, Alberto Oliveira, Edson Luiz Folador, Flavia Rocha, Vinícius Augusto Carvalho de Abreu, Fernanda Dorella, Rafael Hirata, Diana Magalhães de Oliveira, Maria Fátima da Silva Teixeira, Artur Silva, Debmalya Barh e Vasco Azevedo Após a construção de uma rede de interação proteína-proteína, seja por método experimental ou computacional, diversas análises podem ser executadas. Dentre estas análises, podemos citar a comparação entre duas ou mais redes de interação, a análise de um conjunto específico de proteínas como um cluster, a análise de uma via metabólica de interesse ou mesmo análise de interação entre proteínas específicas. 
In this work, a partial interaction network was generated for the proteins encoded by two specific genes of interest, phoP and phoR. The network, which spans the first through the third interaction level of the phoPR system, made it possible to plan laboratory experiments to verify how the expression of these two genes could regulate the expression of other proteins. After the article was submitted, the image of the interaction network (Figure 45) was removed at the reviewers' request, since experimental evidence already supported the results.

Figure 45. Partial interaction network of the proteins encoded by the phoPR genes.

This work was carried out in collaboration with MSc. Sandeep Tiwari and was published in September 2014 in the journal Integrative Biology, DOI 10.1039/C4IB00140K, available at http://pubs.rsc.org/en/content/articlehtml/2014/ib/c4ib00140k.

I.I - Introduction
I.II - Materials and methods
I.III - Result and discussion
I.IV - Conclusion
I.V - References

II - Other results

Five published articles are presented here whose results are not directly related to protein-protein interaction networks but which were developed during the doctoral period. Because these activities differ from the main theme of the thesis, they broaden knowledge in Bioinformatics, and these collaborations were a great opportunity for new learning. In the case of genome assembly, annotation and curation, the learning goes even further: beyond the techniques and tools used in the assembly and annotation process, curation, although a "manual" and laborious task, prompts biological reflection on the organism and leads to a better understanding of its genes, proteins and their organization.
Although it receives little scientific recognition, the work of genome assembly, annotation and curation is extremely relevant, because it is the basis for future scientific work, including in silico predictions of protein-protein interactions such as those developed in this thesis. In addition to manual genome curation, but still related to it, two scripts were developed in the Perl programming language for the following purposes: (i) to correct the start and stop codon positions of structural elements after the curation of a genome that was fragmented and distributed among several curators, a situation that arises mainly after the correction of frameshifts caused by homopolymer regions, when the coordinates of the structural elements change and consequently shift the subsequent coordinates of the portion curated by another researcher, which then need to be corrected; and (ii) to automatically transfer the annotation of an already curated genome to another genome undergoing annotation. These scripts were not developed with publication in mind, but to be used by the group to speed up automatic annotation and genome curation, in some of which I had the opportunity to take part. Listed below are four published scientific articles in which I collaborated mainly on the functional annotation and genome curation steps. In the fifth published article, my collaboration consisted mainly of running bioinformatics programs and analyzing the results returned.

II.I - Genome Sequence of Lactococcus lactis subsp. lactis NCDO 2118, a GABA-Producing Strain
II.I.I - References
II.II - Genome Sequence of Corynebacterium pseudotuberculosis MB20 bv.
equi Isolated from a Pectoral Abscess of an Oldenburg Horse in California
II.II.I - References
II.III - Genome Sequence of Corynebacterium ulcerans Strain 210932
II.III.I - References
II.IV - Genome Sequence of Corynebacterium ulcerans Strain FRC11
II.IV.I - References
II.V - Proteome scale comparative modeling for conserved drug and vaccine targets identification in Corynebacterium pseudotuberculosis

Syed Shah Hassan1, Sandeep Tiwari1, Luís Carlos Guimarães1, Syed Babar Jamal1, Edson Folador1, Neha Barve Sharma4,5, Siomar de Castro Soares1, Síntia Almeida1, Amjad Ali1, Arshad Islam6, Fabiana Dias Póvoa2, Vinicius Augusto Carvalho de Abreu1, Neha Jain4,5, Antaripa Bhattacharya5, Lucky Juneja4,5, Anderson Miyoshi1, Artur Silva3, Debmalya Barh5, Adrian Gustavo Turjanski7, Vasco Azevedo1 and Rafaela Salgado Ferreira2*

* Corresponding author: Rafaela S Ferreira rafaelasf@gmail.com

Author Affiliations
1 Laboratory of Cellular and Molecular Genetics, Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
2 Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
3 Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
4 School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
5 Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal, India
6 Department of Chemistry, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
7 Structural Bioinformatics Group, Institute of Physical Chemistry of Materials, Environment and Energy, University of Buenos Aires, Argentina

BMC Genomics 2014, 15(Suppl 7):S3 doi:10.1186/1471-2164-15-S7-S3
The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2164/15/S7/S3
© 2014
Hassan et al.; licensee BioMed Central Ltd.

II.V.I - Abstract

Corynebacterium pseudotuberculosis (Cp) is a pathogenic bacterium that causes caseous lymphadenitis (CLA), ulcerative lymphangitis, mastitis, and edematous skin disease in a broad spectrum of hosts, including ruminants, thereby threatening economic and dairy industries worldwide. Currently there is no effective drug or vaccine available against Cp. To identify new targets, we adopted a novel integrative strategy, which began with the prediction of the modelome (tridimensional protein structures for the proteome of an organism, generated through comparative modeling) for 15 previously sequenced C. pseudotuberculosis strains. This pan-modelomics approach identified a set of 331 conserved proteins having 95-100% intra-species sequence similarity. Next, we combined subtractive proteomics and modelomics to reveal a set of 10 Cp proteins, which may be essential for the bacteria. Of these, 4 proteins (tcsR, mtrA, nrdI, and ispH) were essential and non-host homologs (considering human, horse, cow and sheep as hosts) and satisfied all criteria of being putative targets. Additionally, we subjected these 4 proteins to virtual screening of a drug-like compound library. In all cases, molecules predicted to form favorable interactions and which showed high complementarity to the target were found among the top ranking compounds. The remaining 6 essential proteins (adk, gapA, glyA, fumC, gnd, and aspA) have homologs in the host proteomes. Their active site cavities were compared to the respective cavities in host proteins. We propose that some of these proteins can be selectively targeted using structure-based drug design (SBDD) approaches. Our results facilitate the selection of C. pseudotuberculosis putative proteins for developing broad-spectrum novel drugs and vaccines.
A few of the targets identified here have been validated in other microorganisms, suggesting that our modelome strategy is effective and can also be applied to other pathogens.

II.V.II - Background

Antimicrobial resistance, involving a rapid loss of effectiveness of antibiotic treatment, and the increasing number of multi-resistant microbial strains pose global challenges and threats. Efforts to find new drug and/or vaccine targets to control them are therefore becoming indispensable. Corynebacterium pseudotuberculosis (Cp) is a pathogen of great veterinary and economic importance, since it affects animal livestock, mainly sheep and goats, worldwide, and its presence is reported in other mammals in several Arabic, Asiatic, East and West African and North and South American countries, as well as in Australia [1]. C. pseudotuberculosis is a Gram-positive, facultative intracellular, and pleomorphic organism; it is non-motile, although it presents fimbriae [2]. Based on the rpoB gene (encoding the β subunit of RNA polymerase), it shows a close phylogenetic relationship with other type strains of CMNR (Corynebacterium, Mycobacterium, Nocardia and Rhodococcus), a group that comprises genera of great medical, veterinary and biotechnological importance [1,3]. A recent study showed that phylogenetic analysis for the identification of Corynebacterium and other CMNR species based on rpoB gene sequences is more accurate than analysis based on 16S rRNA [4]. Its pathogenicity and biological impact have already led to the sequencing of various strains of this pathogen from a wide range of hosts [3].

The pathogen causes several infectious diseases in goat and sheep populations (biovar ovis), including caseous lymphadenitis (CLA), a chronic contagious disease characterized by abscess formation in superficial lymph nodes and in subcutaneous tissues. In severe cases, biovar equi infects the lungs, kidneys, liver and spleen, thereby threatening the herd life of the infected animals [2,5].
The disease has rarely been reported in humans, as a result of occupational exposure, with symptoms similar to lymphadenitis abscesses [6-8]. The bacteria can survive for several weeks in soil under adverse conditions, which seems to contribute to their resistance and to disease transmission [9,10]. Direct contact with infectious secretions or contaminated materials is the primary source of pathogen transmission between animals, but most frequently the infection occurs through exposed skin lacerations [5]. Given the medical importance of Cp and the lack of effective medicines, in this study we applied a computational strategy to search for new molecular targets in this bacterium.

Recently, computational approaches such as reverse vaccinology, differential genome analyses [11], and subtractive and comparative microbial genomics have become popular for rapid identification of novel targets in the post-genomic era [12,13]. These approaches were used to identify targets in various human pathogens, such as Mycobacterium tuberculosis [14], Helicobacter pylori [15], Burkholderia pseudomallei [16], Neisseria gonorrhoeae [17], Pseudomonas aeruginosa [18] and Salmonella typhi [19]. In general, such approaches follow the principle that genes/proteins must be essential to the pathogen and preferably have no homology to the host proteins [20]. Nevertheless, essential targets that are homologous to their corresponding host proteins may also serve as molecular targets for the development of structure-based selective inhibitors. In this case, the targets must show significant differences in the active sites or in other druggable pockets when pathogen and host proteins are compared [21-23].

Once a molecular target is chosen, the conventional experimental methods for drug discovery consist of testing many synthetic molecules or natural products to identify lead compounds. Such practices are laborious, time consuming and require high investments [24,25].
On the other hand, computational methods for structure-based rational drug design can expedite the process of ligand identification and the molecular understanding of interactions between receptor and ligand [26]. Such approaches depend on the availability of structural information about the target protein. Because experimental structures are available in the PDB (Protein Data Bank) for only a small percentage of the known protein sequences, comparative modeling is frequently the method of choice for obtaining 3D coordinates for proteins of interest [27] for the development of specific drugs and docking analyses [28,29].

In this work, we used a modelomic approach for the predicted proteome of C. pseudotuberculosis species. This served to bridge the gap between raw genomic information and the identification of good therapeutic targets based on the three-dimensional structures. The novelty of this strategy lies in using the structural information from high-throughput comparative modeling of large-scale proteomics data for inhibitor identification, potentially leading to the discovery of compounds able to prevent bacterial growth. The predicted proteomes of 15 C. pseudotuberculosis strains were modeled (pan-modelome) using the MHOLline workflow. The intra-species conserved proteome (core-modelome) with adequate 3D models was further filtered for proteins essential to the bacteria, using the Database of Essential Genes (DEG). This led to the identification of 4 essential bacterial proteins without homologs in the host proteomes, which were employed in virtual screening of compound libraries. Furthermore, we investigated a set of 6 essential host-homologous proteins. We observed residues in the predicted bacterial protein cavities that are completely different from the ones found in the homologous domains, and that therefore could be specifically targeted. By applying this computational strategy we provide a final list of predicted putative targets in C.
pseudotuberculosis, in biovar ovis and equi. They could provide insight into the design of peptide vaccines, and into the identification of lead, natural and drug-like compounds that bind to these proteins.

II.V.III - Materials and methods

II.V.III.I - Genomes selection

Proteomes predicted from the genomes of fifteen C. pseudotuberculosis strains, including both biovar equi and biovar ovis (Table 1), were used in this study. Most of these genomes were sequenced by our group and are available at NCBI. We downloaded the genome sequences in gbk format from the NCBI server (ftp://ftp.ncbi.nih.gov/genomes/Bacteria) and the corresponding protein sequences (curated CDSs) were exported using the Artemis Annotation Tool [30] for further analyses.

Table 1. Strains of C. pseudotuberculosis employed in the pan-modelome study, and their respective information regarding genomes statistics, disease prevalence and broad-spectrum hosts.

II.V.III.II - Pan-modelome construction

A high-throughput biological workflow, MHOLline (http://www.mholline.lncc.br), was used to predict the modelome (complete set of protein 3D models for the whole proteome) for each Cp strain. MHOLline uses the program MODELLER [31] for protein 3D structure prediction through comparative modeling. Furthermore, the workflow includes the BLASTp (Basic Local Alignment Search Tool for Protein) [32], HMMTOP (Prediction of transmembrane helices and topology of proteins) [33], BATS (Blast Automatic Targeting for Structures), FILTERS, ECNGet (Get Enzyme Commission Number), MODELLER and PROCHECK [34] programs. The protocol used here was adapted from the original work by Capriles et al., 2010 [35]. Briefly, the input files of protein sequences were used in FASTA format for all strains because MHOLline accepts only .faa format files for the whole process.
Firstly, MHOLline selected the template structures available at the Protein Data Bank (PDB) via BLASTp (version 2.2.18), using the default parameters (e-value ≤ 10e-5). Secondly, the program BATS refined the BLASTp search for template sequence identification into different groups, namely G0, G1, G2 and G3. Only the protein sequences in the group G2, which are characterized by an e-value ≤ 10e-5, Identity ≥ 0.25 and LVI ≤ 0.7, were selected (LVI is the length variation index of the BATS program for sequence coverage: the lower the LVI value, the higher the sequence coverage, and vice versa). Among the MHOLline output files, the group G2 contained the largest number of protein sequences (≥ 50% for each input file). Subsequently, the "Filter" tool classified the group G2 sequences into seven distinct quality model groups, from "Very High" to "Very Low", depending on the quality of the template structure for a given query protein sequence. The program MODELLER then modeled all these groups in an automated manner. The number of sequences in the group G2 varies for each C. pseudotuberculosis strain. Only the first four distinct quality model groups of G2 were taken into consideration in this study (http://www.mholline.lncc.br). These were:
1- Very High quality model sequences (identity ≥ 75%) (LVI ≤ 0.1)
2- High quality model sequences (identity ≥ 50% and < 75%) (LVI ≤ 0.1)
3- Good quality model sequences (identity ≥ 50%) (LVI > 0.1 and ≤ 0.3)
4- Medium to Good quality models (identity ≥ 35% and < 50%) (LVI ≤ 0.3)
The percentage of identity represents identity between query and template sequences; an LVI ≤ 0.1 is equivalent to coverage of more than 90%, while an LVI ≤ 0.3 corresponds to coverage of more than 70%. Therefore, all protein 3D models considered in this study were built from sequences for which there existed a template with identity ≥ 35% and coverage over 70% (LVI ≤ 0.3).
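The four quality groups retained from G2 can be read as a small decision rule on identity and LVI. The sketch below is only an illustration of those thresholds as stated in the text; it is not MHOLline's "Filter" code, and the function name `g2_quality_group` and the use of percentages for identity are assumptions made here.

```python
from typing import Optional


def g2_quality_group(identity_pct: float, lvi: float) -> Optional[str]:
    """Map a G2 sequence to one of the four quality groups kept in the study.

    identity_pct: query/template identity in percent (assumed unit).
    lvi: BATS length variation index (lower means higher coverage).
    Returns None when the sequence falls outside the four groups.
    """
    if identity_pct >= 75 and lvi <= 0.1:
        return "Very High"
    if 50 <= identity_pct < 75 and lvi <= 0.1:
        return "High"
    if identity_pct >= 50 and 0.1 < lvi <= 0.3:
        return "Good"
    if 35 <= identity_pct < 50 and lvi <= 0.3:
        return "Medium to Good"
    return None
```

Under this reading, any sequence for which the function returns a group has identity of at least 35% and LVI of at most 0.3, which matches the summary criterion given at the end of the paragraph.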
Later on, the ECNGet tool assigned an Enzyme Commission (EC) number to each sequence in G2, according to the best PDB template. The MODELLER (v9v5) program performed the automated global alignment and 3D protein model construction. Finally, the program PROCHECK (v3.5.4) evaluated the constructed models based on their stereochemical quality. Additionally, transmembrane regions in the input protein sequences were predicted by HMMTOP, for the identification of putative vaccine and drug targets.

II.V.III.III - Identification of intra-species conserved genes/proteins

The words genes and proteins are used interchangeably here; both refer to the same protein target of the pathogen. For the identification of highly conserved proteins with 3D models in all Cp strains (≥ 95% sequence identity), the standalone release of NCBI BLASTp+ (v2.2.26) was acquired from the NCBI ftp site (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/), installed on a local machine, and a search was performed for all strains using Cp1002 as the reference genome. The highly conserved proteins were selected with a comparative genomics/proteomics approach based on an all-against-all BLASTp analysis with a cutoff value of E = 0.0001 [12,17,20,36].

II.V.III.IV - Analyses of essential and non-host homologous (ENH) proteins

To select conserved targets that were essential to the bacteria, a subtractive genomics approach was followed [20]. Briefly, the set of core-modelome proteins from C. pseudotuberculosis was compared against the Database of Essential Genes (DEG) for homology analyses. DEG contains experimentally validated essential genes from 20 bacteria [37]. The BLASTp cutoff values used were: E-value = 0.0001, bit score ≥ 100, identity ≥ 35% [20]. Furthermore, the pool of essential genes was subjected to NCBI-BLASTp (E-value = 0.0001, bit score ≥ 100, identity ≥ 35%) against the human, equine, bovine and ovine proteomes to identify essential non-host homologous targets [12].
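The subtractive step above amounts to two threshold filters over BLASTp hits followed by a set difference. A minimal sketch follows, assuming the hits are available in the standard 12-column tabular BLAST+ output (`-outfmt 6`: query, subject, percent identity, ..., e-value, bit score) and reading "E-value = 0.0001" as an upper bound. The function names and the toy protein identifiers are hypothetical; this is not the authors' pipeline.

```python
import csv
import io


def passing_queries(blast_tsv: str, max_evalue: float = 1e-4,
                    min_bitscore: float = 100.0,
                    min_identity: float = 35.0) -> set:
    """Return query IDs with at least one hit that meets all three cutoffs.

    Expects BLAST+ tabular output (-outfmt 6): column 0 is the query ID,
    column 2 the percent identity, column 10 the e-value, column 11 the bit score.
    """
    hits = set()
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        pident, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if evalue <= max_evalue and bitscore >= min_bitscore and pident >= min_identity:
            hits.add(row[0])
    return hits


def split_essential(core_ids: set, deg_tsv: str, host_tsv: str):
    """Split core proteins into essential non-host (ENH) and essential host (EH) homologs."""
    essential = core_ids & passing_queries(deg_tsv)        # homolog found in DEG
    host_homologs = essential & passing_queries(host_tsv)  # also matches a host proteome
    return essential - host_homologs, host_homologs
```

The same cutoffs are reused for both searches because the text quotes identical values for the DEG and host comparisons; proteins returned in the second set correspond to the essential host homologous (EH) pool discussed in the next subsection.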
The set of essential non-host homologous proteins was further cross-checked against the PDB database with NCBI-BLASTp, using default parameters, to find any structural similarity with the available host homologous protein structures, keeping the cutoff level at ≤ 15% for query coverage. These proteins were checked for their biochemical pathway using KEGG (Kyoto Encyclopedia of Genes and Genomes) [38], virulence using PAIDB (Pathogenicity island database) [39], functionality using UniProt (Universal Protein Resource) [40], and cellular localization using CELLO (subCELlular LOcalization predictor) [41]. The final list of targets was based on 12 criteria as described previously [20].

II.V.III.V - Analyses of essential and host homologous (EH) proteins

We extended our analyses to also consider protein targets that were predicted to be essential to bacterial survival but showed homology to host proteins. This was based on the possibility of finding differences between bacterial and host proteins that can be used to rationally design inhibitors. The pool of essential protein targets that showed cutoff values equal to or higher than those used for essential non-host homologs through NCBI-BLASTp was treated as host homologous proteins. Like the essential non-host homologous proteins, these were also analyzed for pathway involvement, virulence, functional annotation and cellular localization. To verify the presence of significant residue differences in druggable protein cavities, a structural comparison was performed for each pathogen protein and its corresponding host protein with the molecular visualization program PyMOL (v1.5, Schrödinger, LLC) (http://www.pymol.org). The related published data of each template structure for each host homolog was also cross-checked for information about these residues, using the PDB code of each template structure as input to the PDBelite server [42].
The Catalytic Site Atlas (CSA) was also consulted to get robust information on the active site residues of the druggable enzyme targets [43]. CSA is a database documenting enzyme active sites and catalytic residues in enzymes of known 3D structure. It has 2 types of entry: original hand-annotated entries with literature references, and homologous entries, found by PSI-BLAST alignment to an individual original entry using an e-value cutoff of 0.00005. CSA can be accessed via a 4-letter PDB code. The equivalent residue that aligns in the query sequence to the catalytic residue found in the original entry is documented. Although DoGSiteScorer predicts the druggable protein cavities, the host homologous proteins were further subjected to CASTp (Computed Atlas of Surface Topography of Proteins) [44], Pocket-Finder and Q-SiteFinder [45] to get more reliable and robust results about the druggable cavities of the target proteins.

II.V.III.VI - Prediction of druggable pockets

3D structure information and druggability analyses are important factors for prioritizing and validating putative pathogen targets [46,47]. As mentioned above, for druggability analyses the final list of essential non-host and host homologous protein targets, in PDB format, was subjected to DoGSiteScorer [48], an automated pocket detection and analysis tool for calculating the druggability of protein cavities. For each cavity detected, the program returns the residues present in the pocket and a druggability score ranging from 0 to 1. The closer to 1 the obtained values are, the more druggable the protein cavity is predicted to be, i.e. the cavities are predicted to be more likely to bind ligands with high affinity [48]. DoGSiteScorer also calculates volume, surface area, lipophilic surface, depth and other related parameters for each predicted cavity.
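Choosing a pocket from such scores is a simple filter-and-rank operation. The sketch below assumes the per-cavity results have already been collected into records; the `Cavity` class, its field names and the example values are hypothetical and do not reflect DoGSiteScorer's actual output format. The 0.80 default is the druggability threshold quoted in the virtual screening section.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Cavity:
    """Hypothetical record for one predicted pocket."""
    name: str
    drug_score: float   # druggability score in [0, 1]
    volume: float       # pocket volume, in whatever unit the tool reports


def most_druggable(cavities: Iterable[Cavity],
                   min_score: float = 0.80) -> Optional[Cavity]:
    """Return the highest-scoring cavity at or above min_score, or None."""
    eligible = [c for c in cavities if c.drug_score >= min_score]
    return max(eligible, key=lambda c: c.drug_score, default=None)


# Toy example with made-up values.
pockets = [Cavity("P0", 0.86, 950.0), Cavity("P1", 0.81, 400.0), Cavity("P2", 0.42, 120.0)]
best = most_druggable(pockets)
```

Returning `None` when no cavity reaches the threshold keeps the "no druggable pocket" case explicit instead of silently falling back to a low-scoring cavity.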
II.V.III.VII - Virtual screening and docking analyses

The ligand library, containing 11,193 drug-like molecules with a Tanimoto cutoff level of 60%, was obtained from the ZINC database [49]. Proteins were inspected in MVD (Molegro Virtual Docker) [50] for structural errors such as missing atoms or erroneous bonds and protonation states. The cavities predicted with DoGSiteScorer (druggability ≥ 0.80) for all protein targets were compared with the cavities detected by MVD. The most druggable cavity, according to DoGSiteScorer, was subjected to virtual screening. MVD includes three search algorithms for molecular docking, namely MolDock Optimizer [50], MolDock Simplex Evolution (SE), and Iterated Simplex (IS). In this work the MolDock Optimizer search algorithm, which is based on a differential evolution algorithm, was employed. The default parameters of the guided differential evolution algorithm were used: a) population size = 50, b) crossover rate = 0.9, and c) scaling factor = 0.5. The 200 top-ranked compounds for each protein were analyzed in Chimera for shape complementarity and hydrogen bond interactions, leading to the selection of a final set of 10 compounds for each target protein.

II.V.IV - Results and discussion

II.V.IV.I - Modelome and common targets in C. pseudotuberculosis species

Here we report the identification of common putative targets among 15 strains of C. pseudotuberculosis based on the construction of genome-scale three-dimensional structural models of proteins. Structural information on target proteins can aid in drug and/or vaccine design and in the discovery of new lead compounds [51]. The approach employed here generated high-confidence structural models from orthologous proteins through the MHOLline workflow (Figure 1).
To identify the common conserved proteins with a sequence similarity of 95-100%, a comparative genomics approach was performed in which all G2 sequences classified by BATS from "Very High" to "Medium to Good" quality, from 14 Cp strains, were aligned to the G2 sequences of Cp1002, taken as the reference genome for this study. In total, a set of 331 protein sequences conserved in all strains was selected. An overview of the different steps involved in this computational approach for the genome-scale modelome and the prioritization of putative drug and vaccine targets is given in Figure 2a-b.

Figure 1. High throughput (efficiency) of the MHOLline biological workflow for genome-scale modelome (3D models) prediction. Predicted proteomes from the genomes of 15 C. pseudotuberculosis strains were fed to the MHOLline workflow in FASTA format. The blue line represents the number of input data, according to the left-hand y-axis. The bars show the numbers of MHOLline output data (according to the right-hand y-axis): not aligned sequences (G0, green bars); sequences for which there is a template structure available at RCSB PDB (yellow bars); sequences with acceptable template structures that were modeled in the MHOLline workflow (G2, red bars); sequences with predicted transmembrane regions (HMMTOP, purple bars); and sequences that were predicted as enzymes in each genome and were assigned an EC number (ECNGet, gray bars). The x-axis represents the C. pseudotuberculosis genomes used in this study.

Figure 2. Overview of different computational steps employed in the identification of putative essential targets (non-host homologous and host homologous) for drugs and vaccines from the core-proteome of 15 C. pseudotuberculosis strains.

Figure 2b. Intra-species subtractive modelomics workflow for conserved targets identification in C. pseudotuberculosis species.
The table (from left to right) shows the total number of protein sequences fed as input data in FASTA format to the MHOLline workflow (upper forward arrow). The remaining columns show the output data of group G2 (upper backward arrow), first from the BATS and then from the Filter tools of the MHOLline workflow. Columns 4 to 7 give the number of protein sequences of different qualities for all 15 Cp strains, where the sequences of 14 Cp strains were compared, using BLASTp, to the sequences of the Cp1002 strain as reference for the identification of conserved protein targets (core-modelome). The funnel shows how this workflow processes and filters a large quantity of genomic data for the identification of putative drug and vaccine targets of a pathogen.

II.V.IV.II - Identification of ENH and EH proteins as putative drug and/or vaccine targets

To identify essential proteins as putative therapeutic targets in C. pseudotuberculosis, the proteins of the core-modelome were compared to the Database of Essential Genes (DEG). Based on this filter, the number of selected targets was reduced drastically to a final set of only 10 targets. These were compared to the aforementioned corresponding host proteomes, leading to the identification of 4 essential non-host homologous proteins (ENH, Table 2) and 6 essential host homologous proteins (EH, Table 3).

Table 2. Drug and/or vaccine targets prioritization parameters and functional annotation of the four essential non-host homologous putative targets.

Table 3. Drug and/or vaccine targets prioritization parameters and functional annotation of the six essential host homologous putative targets.

Among the ENH proteins, two targets were selected from a pathway unique to bacteria, the two-component signaling system. These targets are tcsR (two-component response regulator) and mtrA (two-component sensory transduction transcriptional regulatory protein).
While tcsR is a novel protein target, as it has not been described so far as a target in any organism, mtrA has already been reported as a target in Mycobacterium [52] and provides multidrug resistance to Mycobacterium avium [53]. Therefore, targeting mtrA in C. pseudotuberculosis may also be effective in controlling CLA. The remaining ENH protein targets, nrdI and ispH, also participate in biochemical pathways. NrdI (ribonucleoside-diphosphate reductase alpha chain) is a flavodoxin which contains a diferric-tyrosyl radical cofactor and it is involved in nucleotide metabolism in E. coli [54]. It has been reported as a putative target in several pathogens including C. pseudotuberculosis, Corynebacterium diphtheriae and Mycobacterium tuberculosis [20]. The target ispH (4-hydroxy-3-methylbut-2-enyl diphosphate reductase; EC 1.17.1.2) is an essential cytoplasmic enzyme in Escherichia coli [55]. This iron-sulfur protein plays a crucial role in terpene metabolism of various pathogenic bacteria [56,57] and it is a predicted target in Salmonella typhimurium [58] and Plasmodium falciparum [59]. It should be noted that, according to the cutoff threshold for NCBI-BLASTp that we have followed, ispH shows homology only to the human host. So, if human is not considered as a possible host, ispH can also be considered as a common putative target. The roles of these proteins in different metabolic pathways were confirmed from the KEGG [38] and MetaCyc [60] databases.

II.V.IV.III - Prioritization parameters of drug and/or vaccine targets

Previous studies have shown several factors that can aid in determining the suitability of therapeutic targets [46]. The availability of 3D structural information, the main approach of our study, is very helpful in drug development. Other important factors for drug targets include a low molecular weight (MW) and high druggability.
On the other hand, for vaccine targets, information about subcellular localization is important, and proteins that contain transmembrane motifs are preferred [36,46,61,62]. We have determined most of these prioritizing properties for the 10 essential proteins (Tables 2 and 3). Interestingly, according to the target-prioritizing criteria, all targets have a low MW and are predicted to be localized in the cytoplasmic compartment of Cp. Druggability evaluation with DoGSiteScorer [48] for all conserved targets allowed the prediction of numerous druggable cavities, with at least one druggable cavity for each Cp target. For the 4 ENH proteins tcsR, mtrA, nrdI and ispH, 3, 5, 5 and 2 cavities with score ≥ 0.80 were observed, respectively. For each protein, the cavity that exhibited the highest druggability score was selected for docking analyses. For the 6 EH targets adk, gapA, glyA, fumC, gnd and aspA, 1, 3, 3, 2, 8 and 6 cavities were observed, respectively, according to the aforementioned druggability score criterion (Tables 2 and 3). Here, in each case, the most druggable predicted cavity was structurally compared with the cavities in the respective host proteins.

II.V.IV.IV - Virtual screening and molecular docking analyses of ENH targets

For each ENH target protein (mtrA, ispH, tcsR and nrdI), the top 200 drug-like molecules from virtual screening were visually inspected to select 10 molecules that showed favorable interactions with the target. The biological importance of each target and an analysis of the predicted protein-ligand interactions are described below. ZINC codes and MolDock scores of the selected ligands, the number of hydrogen bonds, and the protein residues involved in these interactions are shown in a table for each target protein (Tables 4, 5, 6, 7). Figures showing the predicted binding mode for one of the 10 selected ligands are also shown for each target (Additional files 1, 2, 3, 4, 5).

Table 4.
ZINC codes, MolDock scores and predicted hydrogen bonds for the ten compounds selected among the top ranking 200 molecules against Cp1002_0515 (MtrA, DNA-binding response regulator).

Table 5. ZINC codes, MolDock scores and predicted hydrogen bonds for the ten compounds selected among the top ranking 200 molecules against Cp1002_0742 (IspH, 4-hydroxy-3-methylbut-2-enyl diphosphate reductase).

Table 6. ZINC codes, MolDock scores and predicted hydrogen bonds for the ten compounds selected among the top ranking 200 molecules against Cp1002_1648 (TcsR, Two component transcriptional regulator).

Table 7. ZINC codes, MolDock scores and predicted hydrogen bonds for the ten compounds selected among the top ranking 200 molecules against Cp1002_1676 (NrdI).

Additional file 1. Docking representation of the best drug-like compound ZINC75109074 in the most druggable protein cavity of Cp1002_0515 (MtrA, DNA-binding response regulator). Three hydrogen bonds were observed with Thr73, Asp48 and Arg116.

Additional file 2. Docking representation of compound ZINC00510419 in the most druggable protein cavity of Cp1002_0742 (IspH, 4-hydroxy-3-methylbut-2-enyl diphosphate reductase). Residues Cys39, Thr225, Ser250, His68 and Asn252 are predicted to make seven hydrogen bonds to this ligand.

Additional file 3. Docking representation of the best drug-like compound ZINC00510419 in the most druggable protein cavity of Cp1002_1648 (TcsR, Two component transcriptional regulator). Hydrogen bonds were observed with residues Val76, Gln185 and Asn193.

Additional file 4. Docking representation of the best drug-like compound ZINC04721321 in the most druggable protein cavity of Cp1002_1676 (NrdI protein). Hydrogen bonds were observed with residues Ser8, Thr13 and Leu116.

Additional file 5 (a-f). Comparison among the most druggable cavities from essential bacterial and the respective host homologue proteins.
Protein structures are shown as cartoon (green for the bacterial protein and gray for the Ovis aries host protein). Other host proteins are not shown for simplicity, but the same substitutions were present in all host proteins analyzed. Residues that differ in the bacterial and host cavity are highlighted in sticks and labeled (bacterial labels in green and host labels in black). a) Cp1002_0692 (Glyceraldehyde 3-phosphate dehydrogenase); b) Cp1002_0385 (adenylate kinase); c) Cp1002_0728 (serine hydroxymethyltransferase); d) Cp1002_0738 (fumarate hydratase class II), where the site shown is formed by three monomers, which are represented in green, blue and orange. No residues are highlighted, since the active sites are identical between bacteria and host; e) Cp1002_1005 (6-phosphogluconate dehydrogenase); f) Cp1002_1042 (aspartate ammonia-lyase). Figures were prepared with PyMOL.

Cp1002_0515 (MtrA, DNA-binding response regulator) is part of the two-component signal transduction system consisting of the sensor kinase (histidine protein kinase, HK) and the response regulator, MtrB and MtrA respectively. This system is highly conserved in Corynebacteria and Mycobacteria and it is essential for their survival, allowing them to adapt to environmental changes. Homologs of MtrA and MtrB are present in many species of the genera Corynebacterium, Mycobacterium, Nocardia, Rhodococcus (CMNR), and others like Thermomonospora, Leifsonia, Streptomyces, Propionibacterium, and Bifidobacterium [63]. MtrA represents the fourth family member of the OmpR/PhoB family of response regulators. Like other family members, MtrA has been reported to be essential in M. tuberculosis [64]. It possesses an N-terminal regulatory domain and a C-terminal helix-turn-helix DNA-binding domain, already indicating that this response regulator functions as a transcriptional regulator, with phosphorylation of the regulatory domain modulating the activity of the protein [65].
Based on a comparison with a crystallographic structure of the MtrA template (2GWR, MtrA from M. tuberculosis), the active site residues involved in H-bond interactions with the crystallographic ligand are Val145, Gln151, Ile152 and Leu154. Although none of these residues is predicted to form hydrogen bonds with the ten selected docked ligands, these molecules were predicted to interact with other residues in the pocket. Table 4 shows the 10 selected ligands according to their minimum energy values and number of hydrogen bond interactions. ZINC75109074 (N-benzyl-N-[[2-(2-thienyl)-1H-imidazol-4-yl]methyl]prop-2-en-1-amine) is shown here as the top scoring ligand (Additional file 1).

Cp1002_0742 (IspH, 4-hydroxy-3-methylbut-2-enyl diphosphate reductase) is an iron-sulfur oxidoreductase enzyme that plays a key role in the metabolism of terpenes in several pathogens. Terpenes constitute a large class of natural compounds. Their biosynthesis initiates with the building blocks isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), and differs in bacteria and mammals [57]. In bacteria and other pathogenic microorganisms the enzyme IspH catalyzes the last step in the production of IPP and DMAPP. The three structural units of the enzyme harbor a cubic iron-sulfur cluster at their center, enabling the enzyme to accomplish a challenging reaction by converting an allyl alcohol to two isoprene components. The iron-sulfur proteins normally participate in electron transfers. The IspH enzyme, thereby, in a similar fashion, binds the substrate directly to the iron-sulfur cluster [57]. In the template crystal structure of IspH (PDB 3KE8), it has been shown that His41, His74, His124, Thr167, Ser225, Ser226, Asn227 and Ser269 are the active site residues that are involved in hydrogen bond interactions with the ligand 4-hydroxy-3-methylbutyl diphosphate (EIP).
Also, Cys12, Cys96, Cys197 and EIP have been shown to make metal interactions with the Fe4S4 iron-sulfur cluster. Although the ten selected drug-like compounds (Table 5) did not show any interaction with the aforementioned IspH residues, they are predicted to make very good hydrogen bond interactions with other surrounding residues of the predicted cavity. The predicted binding mode of the best scoring compound, ZINC00510419, is shown in Additional file 2. Good shape complementarity and 6 hydrogen bond interactions are observed in this complex.

Cp1002_1648 (TcsR, Two component transcriptional regulator) is a novel target without host homologous proteins. Unlike MtrA and IspH, in this case the template structure for TcsR, from Escherichia coli (PDB 1A04), did not contain any ligand, and no reported information was found about the ligand-residue interactions in its cavities. Therefore, among the cavities identified by MVD, the best cavity for virtual screening analysis was simply chosen based on the highest druggability score given by DoGSiteScorer. Compound ZINC00510419 (Additional file 3) was the top-ranking compound, forming a network of 3 hydrogen bonds with Val76, Gln185 and Asn193. Table 6 lists the 10 compounds selected for this target.

Cp1002_1676 (NrdI protein) belongs to the nrdI protein family, a unique group of metalloenzymes that are essential for cell proliferation [66]. It is classified as a ribonucleotide reductase (RNR), an iron-dependent enzyme that belongs to the class of oxidoreductases (EC 1.17.4.1) acting on CH or CH2 groups with a disulfide as acceptor [67]. The class Ia enzyme supplies deoxynucleotides during normal aerobic growth. The class Ib RNR plays a similar role although its function in E. coli is not clear, but it is reported to be expressed under oxidative stress and iron-limited conditions [68].
Class I RNR enzymes have two homodimeric subunits, α2 (NrdE), where nucleotide reduction takes place, and β2 (NrdF), containing an unidentified metallocofactor for initiating nucleotide reduction in α2. Although the exact function of NrdI within RNR has not yet been fully characterized, it is found in the same operon as NrdE and NrdF, and encodes an unusual flavodoxin, a bacterial electron-transfer protein that includes a flavin mononucleotide that has been proposed to be involved in metallocofactor biosynthesis and/or maintenance. It has also been proposed that NrdI plays an important role in E. coli class Ib RNR cluster assembly. Recent in vitro studies have shown that a stable diferric-tyrosyl radical (FeIII2-Y·) and dimanganese (III)-Y· (MnIII2-Y·) cofactors are active in nucleotide reduction [69]. The former can be formed by self-assembly from FeII and O2, while the latter cofactor can be generated from MnII-2-NrdF, but only in the presence of O2 and NrdI protein [54,69]. RNR is responsible for the de novo conversion of ribonucleoside diphosphates into deoxyribonucleoside diphosphates and it is essential for DNA synthesis and repair [70]. The active site residues of RNR, in the template structure of NrdI protein (PDB 3N3A), include Ser8, Ser9, Ser11, Ser48, Asn13, Asn83, Thr14, Tyr49, Ala89 and Gly91, all of which are involved in a hydrogen bond network with the isoalloxazine ring of the cofactor flavin mononucleotide (FMN, PDB 3N3A) [71]. Interestingly, two of these residues, Ser8 and Tyr49, were predicted to make hydrogen bonds with all 10 selected ligands (Table 7). The interactions between the top scoring compound ZINC01585114 (5-nitro-3,4-diphenyl-2-furamide) and the residues from the predicted target cavity are shown in Additional file 4.
Furthermore, the drug-like molecule ZINC00510419 (3,4-bis(5-methylisoxazole-3-carbonyl)-1,2,5-oxadiazole 2-oxide) was among the top ten selected molecules for three of the pathogen target proteins, showing good H-bond interactions. It ranked first against the targets Cp1002_0742 (MolDock score = -151.376, no. of H-bonds = 7) and Cp1002_1648 (MolDock score = -167.633, no. of H-bonds = 3) and ranked fourth against the target Cp1002_1676 (MolDock score = -154.064, no. of H-bonds = 4).

II.V.IV.V - Essential host homologous as putative targets

To compare the predicted EH protein targets to their host homologs, two approaches were taken. First, ClustalX (v2.1, http://www.clustal.org), a multiple sequence alignment program, was used to find residues that differ between bacterial and host proteins. As expected, a high percentage of residues was found to be conserved, but significant differences were also observed. Most sequence identities are between 35% and 50% (Table 8), except for fumarate hydratase, which shows 54% sequence identity to human and equine homologous proteins, but no hits in bovine and ovine proteomes.

Table 8. Percentage of sequence identity between C. pseudotuberculosis and host homologous proteins.

Next, to determine if the observed differences could be exploited in the rational design of ligands selective to bacterial proteins, we focused on the predicted druggable cavities. A structural alignment to the host homologous proteins was performed and the cavities were compared in PyMOL. In most cases, DoGSiteScorer predicted more than one cavity for each input Cp protein structure. The number of residues in the bacterial predicted cavity that differ from the residues in the cavity of the host protein, for all druggable pockets, varied from zero to seven (Table 9).

Table 9. Comparison of the residues from druggable cavities in C. pseudotuberculosis proteins and the corresponding residues in structurally aligned host protein cavities.
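The residue-level comparison summarized in Table 9 can be illustrated with a minimal sketch, assuming the structural alignment has already been reduced to a mapping from each bacterial cavity position to the aligned host residue. The function name and data layout are hypothetical and are not part of PyMOL or DoGSiteScorer. The three GapA substitutions are those reported in this section (Lys157, Arg229 and Asn311 replaced by Asp, Thr and Ala); the conserved position is invented purely to show that identical residues are skipped.

```python
def cavity_differences(bacterial, host):
    """List cavity positions whose residue differs between pathogen and host.

    bacterial: dict mapping bacterial residue number -> three-letter code
    host:      dict mapping the same bacterial residue number -> aligned
               host residue (positions without an aligned residue are skipped)
    """
    differences = []
    for position in sorted(bacterial):
        bac_res = bacterial[position]
        host_res = host.get(position)
        if host_res is not None and host_res != bac_res:
            differences.append((f"{bac_res}{position}", host_res))
    return differences


# GapA substitutions taken from the text; position 0 is a hypothetical
# conserved residue added only for illustration.
gapa_cavity = {157: "Lys", 229: "Arg", 311: "Asn", 0: "Gly"}
host_cavity = {157: "Asp", 229: "Thr", 311: "Ala", 0: "Gly"}

diffs = cavity_differences(gapa_cavity, host_cavity)
for bacterial_residue, host_residue in diffs:
    print(f"{bacterial_residue} -> {host_residue}")
```

The length of the returned list corresponds to the kind of per-cavity count of differing residues that Table 9 reports.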
For the conserved host homologous targets Cp1002_0385 (adk, Adenylate kinase), Cp1002_0692 (gapA, Glyceraldehyde 3-phosphate dehydrogenase), Cp1002_0728 (glyA, Serine hydroxymethyltransferase), Cp1002_0738 (fumC, Fumarate hydratase class II/fumarase), Cp1002_1005 (gnd, 6-Phosphogluconate dehydrogenase) and Cp1002_1042 (aspA, Aspartate ammonia-lyase/aspartase), three, four, five, zero, seven and three different residues were observed, respectively. Then, a more detailed analysis was performed for the cavity with the highest predicted druggability in each protein. The results are described below, together with information about the biological importance of each target protein.

Cp1002_0692 (GapA, Glyceraldehyde 3-phosphate dehydrogenase, GAPDH/G3PDH, EC 1.2.1.12) catalyzes the sixth step of glycolysis. In addition, GAPDH has recently been shown to be involved in several non-metabolic processes, including transcription activation, initiation of apoptosis [72], fast axonal or axoplasmic transport and endoplasmic reticulum to Golgi vesicle shuttling [73,74]. This enzyme has been reported as an anti-trypanosomatid and anti-leishmania drug target in structure-based drug design efforts [21-23]. Furthermore, it has been shown to be an interesting putative drug and vaccine target in malaria pathogenesis [75]. Comparison of protein cavities reveals significant differences between bacterial and host proteins, with replacement of bacterial Lys157, Arg229 and Asn311 by Asp, Thr and Ala, respectively. Such differences result in a more basic cavity in bacteria, making it possible to rationally design selective ligands, especially negatively charged molecules that interact with Lys157 and Arg229, or compounds able to form hydrogen bonds with Asn311 (Additional file 5).

Nucleoside monophosphate kinases play a vital part in sustaining the intracellular nucleotide pools in all living organisms.
Cp1002_0385 (Adk, Adenylate kinase, EC 2.7.4.3) is a ubiquitous enzyme, which catalyzes the reversible Mg2+-dependent transfer of the terminal phosphate group from ATP to AMP, releasing two molecules of ADP [76]. Only one highly druggable cavity was predicted for adenylate kinase, with a druggability score = 0.81. Three residues in the bacterial cavity were different from the hosts: Leu, Met and Val in the hosts replaced Phe35, Ile53 and Thr64, respectively (Additional file 5). These differences impact the cavity volume, since the aromatic and bulky Phe is replaced by Leu, and the ability to make hydrogen bonds, through the replacement of a Thr by a Val. Therefore, the bacterial cavity is smaller and more hydrophilic, making it possible to envision the rational design of selective ligands that interact with Thr64.

Cp1002_0728 (GlyA, Serine hydroxymethyltransferase, EC 2.1.2.1) is an enzyme that plays an important role in cellular one-carbon pathways by catalyzing the reversible, simultaneous conversions of L-serine to glycine (retro-aldol cleavage) and tetrahydrofolate to 5,10-methylenetetrahydrofolate [77]. In Plasmodium, serine hydroxymethyltransferase (SHMT) has been reported as an attractive drug target [78]. For this protein, 3 residues were found to differ between bacteria and host: Ala99 and Ala101 replaced two Ser residues, while Trp177 replaced Thr (Additional file 5). At first glance these changes could have a big impact on the active site, generating a considerably more hydrophilic pocket in the hosts. However, careful inspection of the pocket reveals that the side chains of these residues are not turned towards the pocket, so these differences probably would not allow the rational design of selective ligands.
Cp1002_0738 (FumC, Fumarate hydratase class II/fumarase, EC 4.2.1.2) catalyzes the reversible hydration/dehydration of fumarate to S-malate during the ubiquitous Krebs cycle, through the aci-carboxylate intermediate subsequent to olefin production [79]. There are two classes of fumarases: Class I fumarases, composed of heat-labile, iron-sulfur (4Fe-4S) homodimeric enzymes, only found in prokaryotes; and Class II fumarases, made of thermostable homotetrameric enzymes [80] found in both prokaryotes and eukaryotic mitochondria. Class II belongs to a superfamily that also includes aspartate-ammonia lyases, arginino-succinatases, δ-crystallins and 3-carboxy-cis,cis-muconate lactonizing enzymes. All these enzymes release fumarate from different substrates, ranging from adenylosuccinate to malate [81-84]. FumC of Escherichia coli is the first member of the class II fumarase family whose structure has been solved, and it has provided most of the structural information [85]. Inhibition of fumarase in the tricarboxylic acid (TCA) cycle has been reported as a potential molecular target of bismuth drugs in Helicobacter pylori [86]. Comparison of the active site cavity of this protein, which is formed at the interface of three monomers, revealed no differences between bacteria and hosts (Additional file 5).

Cp1002_1005 (Gnd, 6-Phosphogluconate dehydrogenase, EC 1.1.1.44) is an enzyme from the pentose phosphate pathway. It forms ribulose 5-phosphate from 6-phosphogluconate. The enzyme 6-phosphogluconate dehydrogenase is a potential drug target for the parasitic protozoan Trypanosoma brucei, the causative organism of human African trypanosomiasis [87]. Three druggable sites with score > 0.80 were detected in this protein. As opposed to the observation for other proteins, the most druggable predicted cavity (score = 0.88) was not the active site. Leu, Lys and Val residues in the hosts replace residues Met94, Gln96 and Ile148 in the bacterial cavity, respectively (Additional file 5).
The most significant of these differences is the replacement of Gln by Lys, which could make binding of negatively charged molecules more favorable to the host proteins.

Cp1002_1042 (AspA, Aspartate ammonia-lyase/aspartase, EC 4.3.1.1) catalyzes the deamination of aspartic acid to form fumarate and ammonia [88]. Recent progress in preparing enantiopure L-aspartic acid derivatives, highly valuable tools for biological research and chiral building blocks for pharmaceuticals and food additives, makes it a target of interest for industrial applications. On the other hand, the important role that it plays in microbial nitrogen metabolism makes it a putative drug target for overcoming bacterial pathogenesis [89]. Based on the sequence alignment for this protein, two significant differences in residues are observed in the most druggable pocket: bacterial His447 and Ile428 are replaced by Leu and Lys in host proteins. Such differences should allow rational ligand design. It is interesting to note that additional differences in the position of the helices that contain these residues increase the difference between the active sites (Additional file 5).

Based on the above-mentioned analyses, we conclude that it would be difficult to rationally design selective ligands for Cp1002_0738 (FumC, Fumarate hydratase class II), since no residue differences were observed in the most druggable cavity, and for Cp1002_0728 (GlyA, Serine hydroxymethyltransferase), where the side chains of differing residues are not turned toward the druggable pocket.
On the other hand, for the putative essential and host homologous targets Cp1002_0692 (GapA, Glyceraldehyde 3-phosphate dehydrogenase), Cp1002_0385 (Adk, Adenylate kinase), Cp1002_1005 (Gnd, 6-Phosphogluconate dehydrogenase) and Cp1002_1042 (AspA, Aspartate ammonia-lyase), significant differences were observed in druggable pockets, suggesting that despite the existence of a host homologous protein they could be good targets for the design of ligands selective only to the bacterial proteins.

II.V.V - Conclusion

Here, for the first time, genomic information was used to determine the conserved predicted proteome of 15 strains of C. pseudotuberculosis, along with its three-dimensional structural information. Even though the structural information discussed is fully computationally predicted, and could therefore deviate from eventually solved experimental structures, we have been careful to concentrate on the analysis of protein models for which there were good templates that provided high quality models, minimizing this concern. The data presented here can effectively contribute to guiding further research for antibiotic and vaccine development. The final dataset can provide valuable information for designing molecular biology and immunization experiments in animal models to validate the targets of a pathogen, as well as for experimental structure determination protocols.

The criterion for target selection in C. pseudotuberculosis was stringent, resulting in a small set of prioritized putative drug and vaccine targets, of which four are essential and non-homologous and six are essential and host homologous proteins. For the latter, a detailed structural comparison between the residues of the predicted cavities of host and pathogen proteins has been performed, showing in most cases the potential for the development of selective ligands.
Therefore, we suggest that the whole set can be considered for antimicrobial chemotherapy, especially the four essential non-host homologous targets. The in silico approaches followed in this study might aid in the development of novel therapeutic drugs and vaccines in a broad spectrum of hosts at the intraspecies level against C. pseudotuberculosis. Furthermore, the strategy described here could also be applied to other pathogenic microorganisms.

II.V.VI - Authors' contributions

Coordinated entire work: SSH RSF VA DB. Performed all in silico analyses: SSH RSF ST SBJ NBS FDP LCG. Cross-analyzed genome contents, pan-modelome construction, conserved pan-modelome, subtractive modelome approach, virtual screening & docking analyses and residue level structural comparison: SSH RSF ST FDP AI SCS SA DB AGT. Provided timely consultation and reviewed the manuscript: VA AI SCS SA DB NBS LCG AA AM AS VACA AGT. Read and approved the final manuscript: RSF SSH ST AI SCS SBJ SA DB NBS LCG AGT AA AM AS VA. Conceived and designed the work: SSH RSF VA DB. Analyzed the data: SSH RSF ST AI SCS SBJ SA DB NBS LCG AA AB LJ AGT AM AS VA. Wrote the paper: SSH RSF ST.

II.V.VII - Conflict of interest

The authors declare that they have no competing interests.

II.V.VIII - Acknowledgements

We acknowledge financial support from the funding agencies CNPq, CAPES and FAPEMIG. Hassan S.S. acknowledges the receipt of a fellowship under the "TWAS-CNPq Postgraduate Fellowship Program" for doctoral studies. This article has been published as part of BMC Genomics Volume 15 Supplement 7, 2014: Proceedings of the 9th International Conference of the Brazilian Association for Bioinformatics and Computational Biology (X-Meeting 2013). The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/15/S7.

II.V.IX - References

1.
Hassan SS, Schneider MP, Ramos RT, Carneiro AR, Ranieri A, Guimaraes LC, Ali A, Bakhtiar SM, Pereira Ude P, dos Santos AR, et al.: Whole-genome sequence of Corynebacterium pseudotuberculosis strain Cp162, isolated from camel. Journal of bacteriology 2012, 194(20):5718-5719.
2. Dorella FA, Pacheco LG, Oliveira SC, Miyoshi A, Azevedo V: Corynebacterium pseudotuberculosis: microbiology, biochemical properties, pathogenesis and molecular studies of virulence. Veterinary research 2006, 37(2):201-218.
3. Soares SC, Trost E, Ramos RT, Carneiro AR, Santos AR, Pinto AC, Barbosa E, Aburjaile F, Ali A, Diniz CA, et al.: Genome sequence of Corynebacterium pseudotuberculosis biovar equi strain 258 and prediction of antigenic targets to improve biotechnological vaccine production. Journal of biotechnology 2012.
4. Khamis A, Raoult D, La Scola B: Comparison between rpoB and 16S rRNA gene sequencing for molecular identification of 168 clinical isolates of Corynebacterium. Journal of clinical microbiology 2005, 43(4):1934-1936.
5. Williamson LH: Caseous lymphadenitis in small ruminants. Vet Clin North Am Food Anim Pract 2001, 17(2):359-371.
6. Peel MM, Palmer GG, Stacpoole AM, Kerr TG: Human lymphadenitis due to Corynebacterium pseudotuberculosis: report of ten cases from Australia and review. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 1997, 24(2):185-191.
7. Luis MA, Lunetta AC: [Alcohol and drugs: preliminary survey of Brazilian nursing research]. Revista latino-americana de enfermagem 2005, 13 Spec No:1219-1230.
8. Mills AE, Mitchell RD, Lim EK: Corynebacterium pseudotuberculosis is a cause of human necrotising granulomatous lymphadenitis. Pathology 1997, 29(2):231-233.
9. Augustine JL, Renshaw HW: Survival of Corynebacterium pseudotuberculosis in axenic purulent exudate on common barnyard fomites. American journal of veterinary research 1986, 47(4):713-715.
10. Yeruham I, Friedman S, Perl S, Elad D, Berkovich Y, Kalgard Y: A herd level analysis of a Corynebacterium pseudotuberculosis outbreak in a dairy cattle herd. Veterinary dermatology 2004, 15(5):315-320.
11. Perumal D, Lim CS, Sakharkar KR, Sakharkar MK: Differential genome analyses of metabolic enzymes in Pseudomonas aeruginosa for drug target identification. In silico biology 2007, 7(4-5):453-465.
12. Barh D, Gupta K, Jain N, Khatri G, Leon-Sicairos N, Canizalez-Roman A, Tiwari S, Verma A, Rahangdale S, Shah Hassan S, et al.: Conserved host-pathogen PPIs. Integrative biology : quantitative biosciences from nano to macro 2013.
13. Pizza M, Scarlato V, Masignani V, Giuliani MM, Arico B, Comanducci M, Jennings GT, Baldi L, Bartolini E, Capecchi B, et al.: Identification of vaccine candidates against serogroup B meningococcus by whole-genome sequencing. Science 2000, 287(5459):1816-1820.
14. Asif SM, Asad A, Faizan A, Anjali MS, Arvind A, Neelesh K, Hirdesh K, Sanjay K: Dataset of potential targets for Mycobacterium tuberculosis H37Rv through comparative genome analysis. Bioinformation 2009, 4(6):245-248.
15. Dutta A, Singh SK, Ghosh P, Mukherjee R, Mitter S, Bandyopadhyay D: In silico identification of potential therapeutic targets in the human pathogen Helicobacter pylori. In silico biology 2006, 6(1-2):43-47.
16. Chong CE, Lim BS, Nathan S, Mohamed R: In silico analysis of Burkholderia pseudomallei genome sequence for potential drug targets. In silico biology 2006, 6(4):341-346.
17. Barh D, Kumar A: In silico identification of candidate drug and vaccine targets from various pathways in Neisseria gonorrhoeae. In silico biology 2009, 9(4):225-231.
18. Sakharkar KR, Sakharkar MK, Chow VT: A novel genomics approach for the identification of drug targets in pathogens, with special reference to Pseudomonas aeruginosa. In silico biology 2004, 4(3):355-360.
19. Rathi B, Sarangi AN, Trivedi N: Genome subtraction for novel target definition in Salmonella typhi. Bioinformation 2009, 4(4):143-150.
20. Barh D, Jain N, Tiwari S, Parida BP, D'Afonseca V, Li L, Ali A, Santos AR, Guimaraes LC, de Castro Soares S, et al.: A novel comparative genomics analysis for common drug and vaccine targets in Corynebacterium pseudotuberculosis and other CMN group of human pathogens. Chemical biology & drug design 2011, 78(1):73-84.
21. Aronov AM, Verlinde CL, Hol WG, Gelb MH: Selective tight binding inhibitors of trypanosomal glyceraldehyde-3-phosphate dehydrogenase via structure-based drug design. Journal of medicinal chemistry 1998, 41(24):4790-4799.
22. Singh S, Malik BK, Sharma DK: Molecular modeling and docking analysis of Entamoeba histolytica glyceraldehyde-3 phosphate dehydrogenase, a potential target enzyme for anti-protozoal drug development. Chemical biology & drug design 2008, 71(6):554-562.
23. Suresh S, Bressi JC, Kennedy KJ, Verlinde CL, Gelb MH, Hol WG: Conformational changes in Leishmania mexicana glyceraldehyde-3-phosphate dehydrogenase induced by designed inhibitors. Journal of molecular biology 2001, 309(2):423-435.
24. Adams CP, Brantner VV: Estimating the cost of new drug development: is it really 802 million dollars? Health affairs 2006, 25(2):420-428.
25. Kola I, Landis J: Can the pharmaceutical industry reduce attrition rates? Nature reviews Drug discovery 2004, 3(8):711-715.
26. Congreve M, Murray CW, Blundell TL: Structural biology and drug discovery. Drug discovery today 2005, 10(13):895-907.
27. Baker D, Sali A: Protein structure prediction and structural genomics. Science 2001, 294(5540):93-96.
28. Cavasotto CN, Phatak SS: Homology modeling in drug discovery: current trends and applications. Drug discovery today 2009, 14(13-14):676-683.
29. Behera DK, Behera PM, Acharya L, Dixit A, Padhi P: In silico biology of H1N1: molecular modelling of novel receptors and docking studies of inhibitors to reveal new insight in flu treatment. Journal of biomedicine & biotechnology 2012, 2012:714623.
30.
Mural RJ: ARTEMIS: a tool for displaying and annotating DNA sequence. Briefings in bioinformatics 2000, 1(2):199-200. 31. Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A: Comparative protein structure modeling using MODELLER. Current protocols in protein science / editorial board, John E Coligan [et al] 2007. Chapter 2:Unit 2 9 32. Mount DW: Using the Basic Local Alignment Search Tool (BLAST). CSH protocols 2007. 2007:pdb top17 33. Tusnady GE, Simon I: The HMMTOP transmembrane topology prediction server. Bioinformatics 2001, 17(9):849-850. 34. Laskowski RA, Macarthur MW, Moss DS, Thornton JM: Procheck - a Program to Check the Stereochemical Quality of Protein Structures. J Appl Crystallogr 1993, 26:283-291. 35. Capriles PV, Guimaraes AC, Otto TD, Miranda AB, Dardenne LE, Degrave WM: Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment. BMC genomics 2010, 11:610. 36. Abadio AK, Kioshima ES, Teixeira MM, Martins NF, Maigret B, Felipe MS: Comparative genomics allowed the identification of drug targets against human fungal pathogens. BMC genomics 2011, 12:75. 37. Zhang R, Ou HY, Zhang CT: DEG: a database of essential genes. Nucleic acids research 2004, 32(Database):D271-272. 38. Kanehisa M, Goto S: KEGG: kyoto encyclopedia of genes and genomes. Nucleic acids research 2000, 28(1):27- 30. 39. Yoon SH, Park YK, Lee S, Choi D, Oh TK, Hur CG, Kim JF: Towards pathogenomics: a web-based resource for pathogenicity islands. Nucleic acids research 2007, 35(Database):D395-400. 40. Magrane M, Consortium U: UniProt Knowledgebase: a hub of integrated protein data. Database : the journal of biological databases and curation 2011, 2011:bar009. 41. Yu CS, Lin CJ, Hwang JK: Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n-peptide compositions. 
Protein science : a publication of the Protein Society 2004, 13(5):1402-1406. 42. Velankar S, Alhroub Y, Best C, Caboche S, Conroy MJ, Dana JM, Fernandez Montecelo MA, van Ginkel G, Golovin A, Gore SP, et al.: PDBe: Protein Data Bank in Europe. Nucleic acids research 2012, 40(Database):D445-452. 43. Porter CT, Bartlett GJ, Thornton JM: The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Nucleic acids research 2004, 32(Database):D129-133. 44. Dundas J, Ouyang Z, Tseng J, Binkowski A, Turpaz Y, Liang J: CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues. Nucleic acids research 2006, 34(Web Server):W116-118. 45. Laurie AT, Jackson RM: Q-SiteFinder: an energy-based method for the prediction of protein-ligand binding sites. Bioinformatics 2005, 21(9):1908-1916. 46. Aguero F, Al-Lazikani B, Aslett M, Berriman M, Buckner FS, Campbell RK, Carmona S, Carruthers IM, Chan AW, Chen F, et al.: Genomic-scale prioritization of drug targets: the TDR Targets database. Nature reviews Drug discovery 2008, 7(11):900-907. 47. Butt AM, Nasrullah I, Tahir S, Tong Y: Comparative genomics analysis of Mycobacterium ulcerans for the identification of putative essential genes and therapeutic candidates. PloS one 2012, 7(8):e43080. 48. Volkamer A, Kuhn D, Rippmann F, Rarey M: DoGSiteScorer: a web server for automatic binding site prediction, analysis and druggability assessment. Bioinformatics 2012, 28(15):2074-2075. 49. Voigt JH, Bienfait B, Wang S, Nicklaus MC: Comparison of the NCI open database with seven large chemical structural databases. Journal of chemical information and computer sciences 2001, 41(3):702-712. 50. Thomsen R, Christensen MH: MolDock: a new technique for high-accuracy molecular docking. Journal of medicinal chemistry 2006, 49(11):3315-3321. 51. Hopkins AL, Groom CR: The druggable genome. 
Nature reviews Drug discovery 2002, 1(9):727-730. 52. Li Y, Zeng J, He ZG: Characterization of a functional C-terminus of the Mycobacterium tuberculosis MtrA responsible for both DNA binding and interaction with its two-component partner protein, MtrB. Journal of biochemistry 2010, 148(5):549-556. 53. Cangelosi GA, Do JS, Freeman R, Bennett JG, Semret M, Behr MA: The two-component regulatory system mtrAB is required for morphotypic multidrug resistance in Mycobacterium avium. Antimicrobial agents and chemotherapy 2006, 50(2):461-468. ccxxv 54. Cotruvo JA, Stubbe J: NrdI, a flavodoxin involved in maintenance of the diferric-tyrosyl radical cofactor in Escherichia coli class Ib ribonucleotide reductase. Proceedings of the National Academy of Sciences of the United States of America 2008, 105(38):14383-14388. 55. McAteer S, Coulson A, McLennan N, Masters M: The lytB gene of Escherichia coli is essential and specifies a product needed for isoprenoid biosynthesis. Journal of bacteriology 2001, 183(24):7403-7407. 56. Eberl M, Hintz M, Reichenberg A, Kollas AK, Wiesner J, Jomaa H: Microbial isoprenoid biosynthesis and human gammadelta T cell activation. FEBS letters 2003, 544(1-3):4-10. 57. Span I, Wang K, Wang W, Zhang Y, Bacher A, Eisenreich W, Li K, Schulz C, Oldfield E, Groll M: Discovery of acetylene hydratase activity of the iron-sulphur protein IspH. Nature communications 2012, 3:1042. 58. Plaimas K, Eils R, Konig R: Identifying essential genes in bacterial metabolic networks with machine learning methods. BMC systems biology 2010, 4:56. 59. Vinayak S, Sharma YD: Inhibition of Plasmodium falciparum ispH (lytB) gene expression by hammerhead ribozyme. Oligonucleotides 2007, 17(2):189-200. 60. Caspi R, Altman T, Dale JM, Dreher K, Fulcher CA, Gilham F, Kaipa P, Karthikeyan AS, Kothari A, Krummenacker M, et al.: The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. 
Nucleic acids research 2010, 38(Database):D473-479. 61. Caffrey CR, Rohwer A, Oellien F, Marhofer RJ, Braschi S, Oliveira G, McKerrow JH, Selzer PM: A comparative chemogenomics strategy to predict potential drug targets in the metazoan pathogen, Schistosoma mansoni. PloS one 2009, 4(2):e4413. 62. Crowther GJ, Shanmugam D, Carmona SJ, Doyle MA, Hertz-Fowler C, Berriman M, Nwaka S, Ralph SA, Roos DS, Van Voorhis WC, et al.: Identification of attractive drug targets in neglected-disease pathogens using an in silico approach. PLoS neglected tropical diseases 2010, 4(8):e804. 63. Brocker M, Mack C, Bott M: Target genes, consensus binding site, and role of phosphorylation for the response regulator MtrA of Corynebacterium glutamicum. Journal of bacteriology 2011, 193(5):1237-1249. 64. Zahrt TC, Deretic V: An essential two-component signal transduction system in Mycobacterium tuberculosis. Journal of bacteriology 2000, 182(13):3832-3838. 65. Friedland N, Mack TR, Yu M, Hung LW, Terwilliger TC, Waldo GS, Stock AM: Domain orientation in the inactive response regulator Mycobacterium tuberculosis MtrA provides a barrier to activation. Biochemistry 2007, 46(23):6733-6743. 66. Lammers M, Follmann H: The Ribonucleotide Reductases - a Unique Group of Metalloenzymes Essential for Cell-Proliferation. Struct Bond 1983, 54:27-91. 67. Nordlund P, Reichard P: Ribonucleotide reductases. Annual review of biochemistry 2006, 75:681-706. 68. Monje-Casas F, Jurado J, Prieto-Alamo MJ, Holmgren A, Pueyo C: Expression analysis of the nrdHIEF operon from Escherichia coli. Conditions that trigger the transcript level in vivo. The Journal of biological chemistry 2001, 276(21):18031-18037. 69. Cotruvo JA, Stubbe J: An active dimanganese(III)-tyrosyl radical cofactor in Escherichia coli class Ib ribonucleotide reductase. Biochemistry 2010, 49(6):1297-1309. 70. Elledge SJ, Zhou Z, Allen JB: Ribonucleotide reductase: regulation, regulation, regulation. 
Trends in biochemical sciences 1992, 17(3):119-123. 71. Boal AK, Cotruvo JA, Stubbe J, Rosenzweig AC: Structural basis for activation of class Ib ribonucleotide reductase. Science 2010, 329(5998):1526-1530. 72. Tarze A, Deniaud A, Le Bras M, Maillier E, Molle D, Larochette N, Zamzami N, Jan G, Kroemer G, Brenner C: GAPDH, a novel regulator of the pro-apoptotic mitochondrial membrane permeabilization. Oncogene 2007, 26(18):2606-2620. 73. Zala D, Hinckelmann MV, Yu H, Lyra da Cunha MM, Liot G, Cordelieres FP, Marco S, Saudou F: Vesicular glycolysis provides on-board energy for fast axonal transport. Cell 2013, 152(3):479-491. 74. Bressi JC, Verlinde CL, Aronov AM, Shaw ML, Shin SS, Nguyen LN, Suresh S, Buckner FS, Van Voorhis WC, Kuntz ID, et al.: Adenosine analogues as selective inhibitors of glyceraldehyde-3-phosphate dehydrogenase of Trypanosomatidae via structure-based drug design. Journal of medicinal chemistry 2001, 44(13):2080-2093. 75. Pal-Bhowmick I, Andersen J, Srinivasan P, Narum DL, Bosch J, Miller LH: Binding of aldolase and glyceraldehyde-3-phosphate dehydrogenase to the cytoplasmic tails of Plasmodium falciparum merozoite duffy binding-like and reticulocyte homology ligands. mBio 2012., 3(5) 76. Bellinzoni M, Haouz A, Grana M, Munier-Lehmann H, Shepard W, Alzari PM: The crystal structure of Mycobacterium tuberculosis adenylate kinase in complex with two molecules of ADP and Mg2+ supports an associative mechanism for phosphoryl transfer. Protein science : a publication of the Protein Society 2006, 15(6):1489-1493. 77. Appaji Rao N, Ambili M, Jala VR, Subramanya HS, Savithri HS: Structure-function relationship in serine hydroxymethyltransferase. Biochimica et biophysica acta 2003, 1647(1-2):24-29. 78. Sopitthummakhun K, Thongpanchang C, Vilaivan T, Yuthavong Y, Chaiyen P, Leartsakulpanich U: Plasmodium serine hydroxymethyltransferase as a potential anti-malarial target: inhibition studies using improved methods for enzyme production and assay. 
Malaria journal 2012, 11:194. 79. Mechaly AE, Haouz A, Miras I, Barilone N, Weber P, Shepard W, Alzari PM, Bellinzoni M: Conformational changes upon ligand binding in the essential class II fumarase Rv1098c from Mycobacterium tuberculosis. FEBS letters 2012, 586(11):1606-1611. ccxxvi 80. Woods SA, Schwartzbach SD, Guest JR: Two biochemically distinct classes of fumarase in Escherichia coli. Biochimica et biophysica acta 1988, 954(1):14-26. 81. Sampaleanu LM, Vallee F, Slingsby C, Howell PL: Structural studies of duck delta 1 and delta 2 crystallin suggest conformational changes occur during catalysis. Biochemistry 2001, 40(9):2732-2742. 82. Yang J, Wang Y, Woolridge EM, Arora V, Petsko GA, Kozarich JW, Ringe D: Crystal structure of 3-carboxy- cis,cis-muconate lactonizing enzyme from Pseudomonas putida, a fumarase class II type cycloisomerase: enzyme evolution in parallel pathways. Biochemistry 2004, 43(32):10424-10434. 83. Toth EA, Yeates TO: The structure of adenylosuccinate lyase, an enzyme with dual activity in the de novo purine biosynthetic pathway. Structure 2000, 8(2):163-174. 84. Tsai M, Koo J, Yip P, Colman RF, Segall ML, Howell PL: Substrate and product complexes of Escherichia coli adenylosuccinate lyase provide new insights into the enzymatic mechanism. Journal of molecular biology 2007, 370(3):541-554. 85. Weaver TM, Levitt DG, Donnelly MI, Stevens PP, Banaszak LJ: The multisubunit active site of fumarase C from Escherichia coli. Nature structural biology 1995, 2(8):654-662. 86. Chen Z, Zhou Q, Ge R: Inhibition of fumarase by bismuth(III): implications for the tricarboxylic acid cycle as a potential target of bismuth drugs in Helicobacter pylori. Biometals : an international journal on the role of metal ions in biology, biochemistry, and medicine 2012, 25(1):95-102. 87. Ruda GF, Campbell G, Alibu VP, Barrett MP, Brenk R, Gilbert IH: Virtual fragment screening for novel inhibitors of 6-phosphogluconate dehydrogenase. 
Bioorganic & medicinal chemistry 2010, 18(14):5056-5062. 88. Shi W, Dunbar J, Jayasekera MM, Viola RE, Farber GK: The structure of L-aspartate ammonia-lyase from Escherichia coli. Biochemistry 1997, 36(30):9136-9144. 89. de Villiers M, Puthan Veetil V, Raj H, de Villiers J, Poelarends GJ: Catalytic mechanisms and biocatalytic applications of aspartate and methylaspartate ammonia lyases. ACS chemical biology 2012, 7(10):1618-1628. ccxxvii II.VI - Curriculum Vitae Edson Luiz Folador Curriculum Vitae Junho/2015 ccxxviii _________________________________________________________________________________ II.VI.I - Dados pessoais Nome Edson Luiz Folador Filiação Eloi Nelso Folador e Jadviga Kinga Folador Nascimento 23/11/1972 - Cascavel/PR - Brasil Identidade 19958749 PC - MG - 25/09/2012 CPF 528.696.521-00 _________________________________________________________________________________ II.VI.II - Formação acadêmica/titulação 2013 - Atual Doutorado em Bioinformática. Universidade Federal de Minas Gerais, UFMG, Belo Horizonte, Brasil Título: Predição e análise comparativa da rede de interação proteína-proteína para os biovares ovis e equi de Corynebacterium pseudotuberculosis Orientador: Vasco Ariston de Carvalho Azevedo Bolsista do(a): Conselho Nacional de Desenvolvimento Científico e Tecnológico 2006 - 2008 Mestrado em Tecnologia em Saúde. Pontifícia Universidade Católica do Paraná, PUC/PR, Curitiba, Brasil Título: GO-SIEVe: Software para determinar códigos de evidência em anotação gênica, Ano de obtenção: 2008 Orientador: Humberto Maciel França Madeira 2003 - 2004 Especialização em Desenvolvimento de Sistemas Web e Apoio a Decisão. Universidade Paranaense, UNIPAR, Umuarama, Brasil Título: Bancos de Dados Relacionais: Um Estudo da Viabilidade de utilização de Tabela Resumo Orientador: Angelo Alfredo Sucolotti 1999 - 2002 Graduação em Sistemas de Informação. 
Universidade Paranaense, UNIPAR, Umuarama, Brazil. Title: Development of a Financial Control System. Advisor: Angelo Alfredo Sucolloti

II.VI.III - Complementary training
2014 - 2014: Short course in PATRIC: integrated resources for the study of pathogenicity. Universidade Federal de Minas Gerais, UFMG, Belo Horizonte, Brazil. Scholarship from: Conselho Nacional de Desenvolvimento Científico e Tecnológico
2014 - 2014: University extension course in Higher Education Teaching. Universidade Federal de Minas Gerais, UFMG, Belo Horizonte, Brazil
2014 - 2014: Short course in Practical Bioinformatics on Gene Functional Network. Universidade Federal de Minas Gerais, UFMG, Belo Horizonte, Brazil. Scholarship from: Conselho Nacional de Desenvolvimento Científico e Tecnológico
2012 - 2012: Short course in Transcriptome Data Assembly, Annotation and Extraction. Centro de Pesquisa René Rachou, CPQRR, Brazil
2012 - 2012: Short course in RNAseq. Universidade Federal de Minas Gerais, UFMG, Belo Horizonte, Brazil
2011 - 2011: Short course in Techniques for Genome Assembly and Analysis. Universidade Estadual de Campinas, UNICAMP, Campinas, Brazil
2010 - 2010: Short course in Summer Course in Bioinformatics. Universidade de São Paulo, USP, São Paulo, Brazil
2005 - 2005: Short course in Moodle Tutor Training. Universidade de Brasília, UNB, Brasília, Brazil
2002 - 2002: Short course in Data Warehouse. Universidade Paranaense, UNIPAR, Umuarama, Brazil
2002 - 2002: Short course in PHP. Universidade Paranaense, UNIPAR, Umuarama, Brazil
2002 - 2002: Short course in Computer Assembly and Maintenance. Universidade Paranaense, UNIPAR, Umuarama, Brazil
2002 - 2002: Short course in Interbase. Universidade Paranaense, UNIPAR, Umuarama, Brazil
2001 - 2001: Short course in TCP/IP.
Universidade Paranaense, UNIPAR, Umuarama, Brazil
2001 - 2001: Short course in Computing Resources Applied to Biology Teaching. Universidade Paranaense, UNIPAR, Umuarama, Brazil
1999 - 1999: Short course in Networks and Telecommunications. Universidade Paranaense, UNIPAR, Umuarama, Brazil
1999 - 1999: Short course in Information Systems Architecture Model. Universidade Paranaense, UNIPAR, Umuarama, Brazil
1999 - 1999: Short course in Internet Metrics. Universidade Paranaense, UNIPAR, Umuarama, Brazil
1998 - 1998: Short course in How to Calculate Cost and Sale Price in Retail. Serviço Brasileiro de Apoio às Micro e Pequenas Empresas, SEBRAE, Brasília, Brazil
1997 - 1998: Spanish Language. Centro de Línguas Estrangeiras Modernas, CELEM, Brazil
1993 - 1993: Short course in Creativity in Sales. Serviço Nacional de Aprendizagem Comercial, SENAC, Brazil
1993 - 1993: Short course in How to Implement Basic Financial Controls in the. Serviço Brasileiro de Apoio às Micro e Pequenas Empresas, SEBRAE, Brasília, Brazil
1993 - 1993: Short course in How to Calculate Costs and Set Sale Prices. Serviço Brasileiro de Apoio às Micro e Pequenas Empresas, SEBRAE, Brasília, Brazil
1992 - 1992: Short course in Customer Service Techniques and Sales Motivation. Serviço Nacional de Aprendizagem Comercial, SENAC, Brazil

II.VI.IV - Professional experience
1.
Universidade Federal de Minas Gerais - UFMG
Institutional affiliation
2013 - Present: Affiliation: Scholarship holder, Position: Bioinformatics Analyst, Workload: 40 hours, Regime: Exclusive dedication
Activities
08/2014 - Present: Research and Development, Instituto de Ciências Biológicas. Research lines: Prediction and comparative analysis of the protein-protein interaction network for 15 strains of the ovis and equi biovars of Corynebacterium pseudotuberculosis
08/2013 - Present: Other technical-scientific activity, Instituto de Ciências Biológicas. Specification: Administration of a database management system; curation and functional annotation of genomes; development of routines in PL/pgSQL; development of computer routines in Bash or Perl to solve problems in bioinformatics; database modeling for protein-protein interaction prediction and transfer of genetic annotation
08/2013 - 07/2014: Research and Development, Instituto de Ciências Biológicas. Research lines: Validation of a computational methodology for predicting protein-protein interaction networks
2.
Centro de Pesquisa René Rachou - CPQRR
Institutional affiliation
2012 - 2013: Affiliation: Scholarship holder, Position: Scholarship holder, Workload: 40 hours, Regime: Exclusive dedication
Activities
03/2012 - 06/2012: Training, LPCM. Specification: Programming logic for bioinformatics with practical examples in the Perl programming language
03/2012 - 05/2013: Specialized technical service, LPCM. Specification: Administration and modeling of the epitope prediction database; administration and modeling of the Bioinformatics laboratory databases; development of bioinformatics routines in the C, Perl and PHP programming languages
3. Instituto Nacional de Câncer - INCA
Institutional affiliation
2009 - 2011: Affiliation: CNPq DTI-1 scholarship holder, Position: Bioinformatics Analyst, Workload: 40 hours, Regime: Exclusive dedication
Activities
11/2009 - 11/2009: Graduate teaching, Programa de Pós-Graduação em Oncologia (PPGO). Courses taught: Introduction to Bioinformatics (Database Module)
03/2009 - 06/2012: Specialized technical service, Research Coordination, Laboratório de Bioinformática e Biologia Computacional (LBBC). Specification: Development of applications and routines, mainly in Perl, PHP and HTML; development of a Laboratory Information Management System (LIMS) for proteomics
03/2009 - 02/2012: Specialized technical service, Research Coordination, Laboratório de Bioinformática e Biologia Computacional (LBBC). Specification: Database administration (DBA): installation, configuration, management and modeling of databases under the database management system (DBMS)
Postgres.
4. Instituto de Estudos Avançados e Pós-Graduação - ESAP
Institutional affiliation
2006 - 2008: Affiliation: Formal employee (CLT), Position: Full professor, Workload: 8 hours, Regime: Part-time
Activities
07/2008 - 12/2008: Undergraduate teaching, Information Systems. Courses taught: Design and Analysis of Algorithms II
02/2008 - 07/2008: Undergraduate teaching, Information Systems. Courses taught: Design and Analysis of Algorithms I
10/2007 - 12/2008: Management and Administration, Information Systems program. Positions held: Program Coordinator
08/2007 - 12/2007: Undergraduate teaching, Information Systems. Courses taught: Databases I
02/2007 - 06/2007: Undergraduate teaching, Information Systems. Courses taught: Databases II
02/2007 - 06/2007: Undergraduate teaching, Business Administration. Courses taught: Computing Resources II
07/2006 - 12/2006: Undergraduate teaching, Information Systems. Courses taught: Software Engineering I
5.
Universidade Estadual do Oeste do Paraná - UNIOESTE
Institutional affiliation
2005 - 2005: Affiliation: Collaborator, Position: Research project collaborator, Workload: 2 hours, Regime: Part-time
2003 - 2005: Affiliation: Collaborator, Position: Full professor, Workload: 24 hours, Regime: Part-time
Activities
07/2004 - 07/2004: Councils, Committees and Consulting, Conselho de Ensino, Pesquisa e Extensão. Specification: Evaluation committee for the Software Engineering course teaching assistantship
01/2004 - 12/2004: Undergraduate teaching, Agricultural Engineering. Courses taught: Data Processing
01/2004 - 12/2004: Undergraduate teaching, Civil Engineering. Courses taught: Introduction to Computing
01/2004 - 12/2004: Undergraduate teaching, Informatics. Courses taught: Databases I
07/2003 - 12/2003: Undergraduate teaching, Informatics. Courses taught: Algorithms and Data Structures, Software Engineering
07/2003 - 12/2003: Undergraduate teaching, Civil Engineering. Courses taught: Introduction to Computing
6. União Panamericana de Ensino - UNIPAN
Institutional affiliation
2004 - 2007: Affiliation: Other, Position: Full professor, Workload: 4 hours, Regime: Part-time
Activities
01/2007 - 07/2007: Undergraduate teaching, Computer Science. Courses taught: Data Searching and Sorting
01/2006 - 12/2006: Undergraduate teaching, Computer Science. Courses taught: Databases, Data Searching and Sorting
01/2005 - 12/2005: Undergraduate teaching, Computer Science. Courses taught: Data Structures, Searching and Sorting; Databases
03/2004 - 12/2004: Undergraduate teaching, Computer Science. Courses taught: Data Structures, Searching and Sorting - C
7.
União Educacional do Médio Oeste Paranaense Ltda - UNIMEO
Institutional affiliation
2004 - 2004: Affiliation: Other, Position: Full professor, Workload: 8 hours, Regime: Part-time
Activities
07/2004 - 10/2004: Undergraduate teaching, Information Systems. Courses taught: Data Searching and Sorting - C
02/2004 - 06/2004: Undergraduate teaching, Information Systems. Courses taught: Object-Oriented Data Design and Analysis, Data Structures - C
8. Maxicon System Ltda - MAXICON
Institutional affiliation
2002 - 2003: Affiliation: Employee, Position: Senior Programmer, Workload: 44 hours, Regime: Exclusive dedication
2001 - 2002: Affiliation: Intern, Position: Programmer, Workload: 40 hours, Regime: Full-time
Activities
07/2001 - 02/2003: Specialized technical service, Systems Development. Specification: Analysis and development of systems on an Oracle database with a Forms 6.0 front end and the PL/SQL programming language
9. Salgado & Haddad Ltda - CDI
Institutional affiliation
1995 - 1996: Affiliation: Employee, Position: Computer Instructor, Workload: 20 hours, Regime: Part-time
Activities
08/1995 - 09/1996: Training. Specification: Training in the Word, Excel and PowerPoint applications
10.
Comercial de Calçados Âncora Ltda - ÂNCORA
Institutional affiliation
1992 - 1995: Affiliation: Employee, Position: Manager, Workload: 44 hours, Regime: Full-time
Activities
02/1992 - 03/1995: Management and Administration. Positions held: Manager
11. Grisa & Grisa Ltda - GRISA
Institutional affiliation
1989 - 1991: Affiliation: Employee, Position: In-store Salesperson, Workload: 44 hours, Regime: Full-time
Activities
03/1989 - 02/1991: Specialized technical service. Specification: Counter salesperson, credit clerk

II.VI.V - Research lines
1. Prediction and comparative analysis of the protein-protein interaction network for 15 strains of the ovis and equi biovars of Corynebacterium pseudotuberculosis
2. Validation of a computational methodology for predicting protein-protein interaction networks

II.VI.VI - Projects
Research projects
2015 - Present: Study of the interactome and exosome in Corynebacterium pseudotuberculosis in search of new therapeutic targets
Description: Macrophages have difficulty eliminating C. pseudotuberculosis. Unraveling how the interaction between pathogen and host occurs, characterizing the response cascade at the transcriptional level in both organisms simultaneously, and elucidating the effect of the secreted exosome on the host immune response would open a range of avenues in the search for effective solutions to this problem. Both the pathogen and the host seek a rapid, adaptive and effective response for their own survival.
Thus, perceiving changes in the environment and transmitting that information to assemble an ideal response network is the key to understanding the whole process by which the organisms maintain themselves in the environment. Call for projects MEC/MCTI/CAPES/CNPq/FAPs nº 09/2014.
Status: In progress. Nature: Research project. Students involved: Academic master's (4); Doctorate (2). Members: Edson Luiz Folador; Adriana Ribeiro Carneiro (Project leader)
2013 - Present: Academic cooperation network for the study and development of tools for structural and functional genomics
Description: To strengthen and expand academic exchange among the inter-unit graduate programs in Bioinformatics of UFMG (CAPES 6) and USP (5), in Biotechnology of UFPA (CAPES 5) and in Bioinformatics of UFPR (CAPES 3), by creating a network aimed at increasing the training of human resources in Computational Biology, in response to the present call. Call nº 51/2013 BIOLOGIA COMPUTACIONAL.
Status: In progress. Nature: Research project. Students involved: Academic master's (7); Doctorate (6). Members: Edson Luiz Folador; HASSAN, SYED SHAH; TIWARI, SANDEEP; ALMEIDA, SINTIA; OLIVEIRA, ALBERTO; Diego Cesar Batista Mariano; Letícia C. Oliveira; Vinicius Augusto Carvalho de Abreu; Vasco Azevedo (Project leader); Rafaela Salgado Ferreira

II.VI.VII - Bibliographic production
Full articles published in journals
1. FOLADOR EL, OLIVEIRA, ALBERTO, TIWARI, SANDEEP, JAMAL, SYED BABAR, FERREIRA, R. S., BARH, D., Ghosh, P., SILVA, A., AZEVEDO, V. In silico protein-protein interactions: avoiding data and method biases over sensitivity and specificity. Current Protein and Peptide Science, v.16, p.1 -, 2015.
2.
FOLADOR, EDSON LUIZ, HASSAN, SYED SHAH, LEMKE, NEY, BARH, DEBMALYA, SILVA, ARTUR, FERREIRA, RAFAELA SALGADO, AZEVEDO, VASCO An improved interolog mapping-based computational prediction of protein-protein interactions with increased network coverage. Integrative Biology., v.6, p.1080 - 1087, 2014.
3. SILVA, WANDERSON M, CARVALHO, RODRIGO D, SOARES, SIOMAR C, BASTOS, ISABELA FS, FOLADOR, EDSON L, SOUZA, GUSTAVO HMF, LE LOIR, YVES, MIYOSHI, ANDERSON, SILVA, ARTUR, AZEVEDO, VASCO Label-free proteomic analysis to confirm the predicted proteome of Corynebacterium pseudotuberculosis under nitrosative stress mediated by nitric oxide. BMC Genomics., v.15, p.1065 -, 2014.
4. TIWARI, SANDEEP, DA COSTA, MARCÍLIA PINHEIRO, ALMEIDA, SINTIA, HASSAN, SYED SHAH, JAMAL, SYED BABAR, OLIVEIRA, ALBERTO, FOLADOR, EDSON LUIZ, ROCHA, FLAVIA, DE ABREU, VINÍCIUS AUGUSTO CARVALHO, DORELLA, FERNANDA, HIRATA, RAFAEL, DE OLIVEIRA, DIANA MAGALHAES, DA SILVA TEIXEIRA, MARIA FÁTIMA, SILVA, ARTUR, BARH, DEBMALYA, AZEVEDO, VASCO C. pseudotuberculosis Phop confers virulence and may be targeted by natural compounds. Integrative Biology., v.9, p.1 - 12, 2014.
5. HASSAN, S. S., TIWARI, SANDEEP, GUIMARÃES, LUIS CARLOS, JAMAL, SYED BABAR, FOLADOR, EDSON LUIZ, SHARMA, N. B., SOARES, SIOMAR DE CASTRO, ALMEIDA, SINTIA, ALI, A., ISLAM, A., POVOA, F. D., ABREU, V. A. C., JAIN, N., BHATTACHARYA, A., JUNEJA, L., MIYOSHI, A., SILVA, A., BARH, D., TURJANSKI, A. G., AZEVEDO, V., FERREIRA, R. S. Proteome scale comparative modeling for conserved drug and vaccine targets identification in Corynebacterium pseudotuberculosis. BMC Genomics., v.15, p.S3 -, 2014.
6. REZENDE, ANTONIO M., FOLADOR, EDSON L., RESENDE, DANIELA DE M., RUIZ, J. C. Computational Prediction of Protein-Protein Interactions in Leishmania Predicted Proteomes. Plos One., v.7, p.e51304 -, 2012.
7. BARAUNA, R. A., GUIMARAES, L. C., VERAS, A. A. O., DE SA, P. H. C. G., GRACAS, D. A., PINHEIRO, K. C., SILVA, A. S. S., FOLADOR, E. L., BENEVIDES, L.
J., VIANA, M. V. C., CARNEIRO, A. R., SCHNEIDER, M. P. C., SPIER, S. J., EDMAN, J. M., RAMOS, R. T. J., AZEVEDO, V., SILVA, A. Genome Sequence of Corynebacterium pseudotuberculosis MB20 bv. equi Isolated from a Pectoral Abscess of an Oldenburg Horse in California. Genome Announcements., v.2, p.e00977-14 - e00977-14, 2014.
8. BENEVIDES, LEANDRO DE JESUS, VIANA, MARCUS VINICIUS CANÁRIO, MARIANO, DIEGO CÉSAR BATISTA, ROCHA, FLÁVIA DE SOUZA, BAGANO, PRISCILLA CAROLINNE, FOLADOR, EDSON LUIZ, PEREIRA, FELIPE LUIZ, DORELLA, FERNANDA ALVES, LEAL, CARLOS AUGUSTO GOMES, CARVALHO, ALEX FIORINI, SOARES, SIOMAR DE CASTRO, CARNEIRO, ADRIANA, RAMOS, ROMMEL, BADELL-OCANDO, EDGAR, GUISO, NICOLE, SILVA, ARTUR, FIGUEIREDO, HENRIQUE, AZEVEDO, VASCO, GUIMARÃES, LUIS CARLOS Genome Sequence of Corynebacterium ulcerans Strain FRC11. Genome Announcements., v.3, p.e00112-15 -, 2015.
9. VIANA, M. V. C., DE JESUS BENEVIDES, L., BATISTA MARIANO, D. C., DE SOUZA ROCHA, F., BAGANO VILAS BOAS, P. C., FOLADOR, E. L., PEREIRA, F. L., ALVES DORELLA, F., GOMES LEAL, C. A., FIORINI DE CARVALHO, A., SILVA, A., DE CASTRO SOARES, S., PEREIRA FIGUEIREDO, H. C., AZEVEDO, V., GUIMARAES, L. C. Genome Sequence of Corynebacterium ulcerans Strain 210932. Genome Announcements., v.2, p.e01233-14 - e01233-14, 2014.
10. OLIVEIRA, L C, SARAIVA, T D L, SOARES, S C, RAMOS, R T J, SA, P H C G, CARNEIRO, A R, MIRANDA, F, FREIRE, M, RENAN, W, JUNIOR, A F O, SANTOS, A R, PINTO, A C, SOUZA, B M, CASTRO, C P, DINIZ, C A A, ROCHA, C S, MARIANO, D C B, DE AGUIAR, E L, FOLADOR, E L, BARBOSA, E G V, ABURJAILE, F F, GONCALVES, L A, GUIMARAES, L C, AZEVEDO, M, AGRESTI, P C M, SILVA, R F, TIWARI, S, ALMEIDA, S S, HASSAN, S S, PEREIRA, V B, ABREU, V A C, PEREIRA, U P, DORELLA, F A, CARVALHO, A F, PEREIRA, F L, LEAL, C A G, FIGUEIREDO, H C P, SILVA, A, MIYOSHI, A, AZEVEDO, V Genome Sequence of Lactococcus lactis subsp. lactis NCDO 2118, a GABA-Producing Strain. Genome Announcements., v.2, p.e00980-14 - e00980-14, 2014.
11.
TAVARES, RAPHAEL, SCHERER, NICOLE DE MIRANDA, PAULETTI, BIANCA ALVES, ARAÚJO, ELÓI, FOLADOR, EDSON LUIZ, ESPINDOLA, GABRIEL, Ferreira, Carlos Gil, LEME, ADRIANA FRANCO PAES, DE OLIVEIRA, PAULO SERGIO LOPES, Passetti, Fabio SpliceProt: a protein sequence repository of predicted human splice variants. Proteomics (Weinheim. Print)., v.14, p.181 - 185, 2014.
12. Santos, Paula F, Ruiz, Jerônimo C, Soares, Rodrigo PP, Moreira, Douglas S, Rezende, Antônio M, Folador, Edson L, Oliveira, Guilherme C, Romanha, Alvaro J, Murta, Silvane MF Molecular characterization of the hexose transporter gene in benznidazole resistant and susceptible populations of Trypanosoma cruzi. Parasites & Vectors., v.5, p.161 - 186, 2012.
13. WAJNBERG, G., BRAIT, M., FOLADOR, E.L., PARRELLA, P., CAIMS, P., BARBANO, R., FERREIRA, C.G., PASSETTI, F., SIDRANSKY, D., HOQUE, M.O. 573 Copy Number Variation Analysis for Identification of Novel Disease-related Regions in Bladder Cancer. European Journal of Cancer., v.48, p.S136 -, 2012.
14. Renaud, Gabriel, Neves, Pedro, Folador, Edson L, Ferreira, Carlos Gil, Passetti, Fabio Segtor: Rapid Annotation of Genomic Coordinates and Single Nucleotide Variations Using Segment Trees. Plos One., v.6, p.e26715 -, 2011.
15. BIDARRA, Jorge, Folador, Edson L, CAVASIN, Rodrigo José, MARCON, Marlon xListas - Um léxico eletrônico para a Língua Portuguesa. Línguas & Letras (UNIOESTE)., v.1, p.6 - 6, 2005.

Published book chapters
1. ABURJAILE, F. F., SANTANA, M. P., VIANA, M. V. C., SILVA, WANDERSON M, FOLADOR EL, SILVA, A., AZEVEDO, V. Genomics In: A Textbook of Biotechnology. 1 ed. Irving, TX 75039, USA: SM Online Publishers LLC, 2015, v.1, p. 32-50.

Papers published in event proceedings (abstract)
1.
Folador, Edson L, Gomes, Renata B., Neves, Pedro, Renaud, Gabriel, Ferreira, Carlos Gil, Abdelhay, Eliane, Passetti, Fabio pLIMS: an innovative approach to manage and analyze 2D/1D protein gel In: International Workshop on Genomic Databases - IWGD, 2010, Buzios. IWGD'10 Abstracts book., 2010.
2. Folador, Edson L, Gomes, Renata B., Neves, Pedro, Renaud, Gabriel, Ferreira, Carlos Gil, Abdelhay, Eliane, Passetti, Fabio PLIMS: A Bioinformatic tool for the 2D/1D protein gel electrophoresis experiments management and analysis In: X-meeting, 2009, Angra dos Reis. X-meeting abstracts book 2009., 2009.
3. Folador, Edson L, SUCOLOTTI, Angelo A. Estudo da Viabilidade do Uso de Tabelas Resumo em Banco de Dados Relacional In: III Encontro de iniciação Científica, III Fórum de Pesquisa, 2004, Umuarama. 3º Encontro de Iniciação Científica e Fórum de Pesquisa. Unipar - Umuarama - PR: DEGPP/Unipar, 2004. v.3. p.249 - 250

II.VI.VIII - Paper presentations and lectures
1. VIANA, M. V. C., BENEVIDES, L. J., MARIANO, D. C. B., ROCHA, FLAVIA, FOLADOR, E. L., PEREIRA, F. L., DORELLA, F. A., LEAL, C. A. G., CARVALHO, A. F., SILVA, A., SOARES, S. C., FIGUEIREDO, H. C. P., AZEVEDO, V., GUIMARAES, L. C. Complete genome sequence of Corynebacterium ulcerans strain 210932, 2014. (Congress, Paper presentation)
2. BENEVIDES, L. J., VIANA, M. V. C., MARIANO, D. C. B., ROCHA, FLAVIA, FOLADOR, E. L., PEREIRA, F. L., DORELLA, F. A., CARVALHO, A. F., LEAL, C. A. G., SILVA, A., SOARES, S. C., FIGUEIREDO, H. C. P., AZEVEDO, V., GUIMARAES, L. C. Complete genome sequence of Corynebacterium ulcerans 210931, 2014. (Seminar, Paper presentation)
3. Mariano, D. C. B, OLIVEIRA, L. C., Folador EL, DE AGUIAR, E. L., BENEVIDES, L. J., PEREIRA, F. L., RAMOS, R. T. J., AZEVEDO, V. SIMBA: A web tools for complete assembly of bacterial genomes, 2014. (Congress, Paper presentation)
4.
Folador, Edson L, Gomes, Renata B., Neves, Pedro, Renaud, Gabriel, Ferreira, Carlos Gil, Abdelhay, Eliane, Passetti, Fabio Current status of the pLIMS project: a Bioinformatics tool to promote collaborative 1D/2D-PAGE proteomics experiments, 2011. (Congress, Paper presentation)
5. Madeira, Humberto M. F., Malucelli, Andreia, Folador, Edson L GO-SIEVE - A method to aid the assignment of evidence codes in genome annotations, 2010. (Congress, Paper presentation)
6. Folador, Edson L, Gomes, Renata B., Renaud, Gabriel, Neves, Pedro, Ferreira, Carlos Gil, Passetti, Fabio pLIMS: uma abordagem inovadora para gerenciamento e análise de experimentos em gel de eletroforeses 2D/1D de proteína para projetos colaborativos, 2010. (Congress, Paper presentation)
7. Folador, Edson L, Gomes, Renata B., Renaud, Gabriel, Neves, Pedro, Ferreira, Carlos Gil, Passetti, Fabio pLIMS: Ferramenta de bioinformática para gerenciamento e análise de experimentos em gel de eletroforese 1D/2D, 2009. (Congress, Paper presentation)
8. Folador, Edson L, Malucelli, Andreia, Madeira, Humberto M. F. GO-SIEV – Software system for inferring annotation evidence from already annotated genes, 2007. (Congress, Paper presentation)

II.VI.IX - Unregistered computer software
1. Folador, Edson L, Passetti, Fabio pLIMS: uma abordagem inovadora para gerenciamento e análise de experimentos em gel de eletroforeses 2D/1D para projetos colaborativos, 2009
2. Folador, Edson L GO-SIEVe - Software para inferir códigos de evidência em anotação genética, 2008
3. Folador, Edson L Sistema de Controle de Auto Peças, 2001
4. Folador, Edson L Sistema de Controle para pedidos de Compras Bibliográficas, 2001

Other technical production
1. Folador EL Introdução a Bioinformática, 2012. (Extension, Short course taught)
2. Folador EL O uso de ferramentas de Bioinformática para a inovação científica em Oncologia, 2012. (Extension, Short course taught)
3.
Passetti, Fabio, Folador, Edson L I Curso prático de introdução à programação para Bioinformática, 2011. (Extension, Short course taught)

II.VI.X - Advising and Supervision
Completed advising and supervision
Undergraduate final projects
1. Jeferson do Nascimento. Aplicação de data mining na busca de padrões de dados referente à criminalidade no município de Cascavel. 2006. Degree program (Computer Science) - União Pan-Americana de Ensino

II.VI.XI - Events
Participation in events
1. Poster / panel presentation at X-Meeting, 2014. (Congress) SIMBA: A web tools for complete assembly of bacterial genomes.
2. Publications Ethics and Optimizing your Chances of Acceptance in Journals, 2014. (Seminar).
3. X-Meeting, 2013. (Congress).
4. Poster / panel presentation at X-Meeting, 2011. (Congress) Current status of the pLIMS project: a Bioinformatics tool to promote collaborative 1D/2D-PAGE proteomics experiments.
5. III Fórum de Integração dos Alunos de Pós-Graduação, 2011. (Meeting).
6. Curso de Bioinformática - Algoritmos e técnicas computacionais para montagem e análise de genomas., 2011. (Seminar).
7. Poster / panel presentation at X-Meeting, 2010. (Congress) GO-SIEVE - A METHOD TO AID THE ASSIGNMENT OF EVIDENCE CODES IN GENOME ANNOTATIONS.
8. Oral presentation at International Workshop on Genomic Databases - IWGD, 2010. (Congress) pLIMS: uma abordagem inovadora para gerenciamento e análise de experimentos em gel de eletroforeses 2D/1D de proteína para projetos colaborativos.
9. Curso de verão em bioinformática (USP), 2010. (Seminar).
10. Poster / panel presentation at X-meeting, 2009. (Congress) pLIMS: Ferramenta de bioinformática para gerenciamento e análise de experimentos em gel de eletroforese 1D/2D.
11. GE Day, 2009. (Meeting).
12. Oral presentation at X-meeting, 2007. (Congress) GO-SIEV - Software system for inferring annotation evidence from already annotated genes.
13.
II EPAC - Encontro Paranaense de Computação, 2007. (Meeting).
14. I EPAC - Encontro Paranaense de Computação, 2005. (Meeting).
15. 3ª Semana de Informática, 2003. (Meeting).

II.VI.XII - Event organization
1. Passetti, Fabio, Folador, Edson L I Curso prático de introdução à programação para Bioinformática, 2011. (Other, Event organization)
2. Kessler, Neivor, Oliveira, Lindomar S., Folador, Edson L, Santos, Vera B. Empresa Destaque 2007, 2007. (Other, Event organization)

II.VI.XIII - Participation in final-project examination committees
Undergraduate
1. Konopatzki, Angélica Lima, Gavioli, Alan, Folador, Edson L Examination committee member for Susana Paula Saretto Ferronatto. Mapeamento tecnológico dos estabelecimentos de ensino médio de Cascavel nas instituições públicas e privadas, 2007 (Computer Science) União Pan-Americana de Ensino
2. Konopatzki, Angélica Lima, Folador, Edson L, Wagner, Emerson Examination committee member for Matheus de Lima Boza. Mineração de dados para definição do perfil da saúde pública em Cascavel com relação às doenças crônicas não-transmissíveis, 2007 (Computer Science) União Pan-Americana de Ensino
3. Wagner, Emerson, Chrusciak, Daniele, Folador, Edson L Examination committee member for Giancarlo E. C. Fiorenza. Modelo para implantação de tecnologia da informação em prefeituras municipais de pequeno porte, 2007 (Computer Science) União Pan-Americana de Ensino
4. Antiquera, Paulo R. da Silva, Folador, Edson L, Chrusciak, Daniele Examination committee member for Alexandre Magno Semmer. Persistência em banco de dados relacional para sistemas web, 2007 (Computer Science) União Pan-Americana de Ensino
5. Piovesan, Suzan Lelly Borges, Gavioli, Alan, Folador, Edson L Examination committee member for Jony Carlos Palaoro. Protótipo de algoritmo genético para roteamento de rodovias, 2007 (Computer Science) União Pan-Americana de Ensino

II.VI.XIV - Participation in judging committees
1.
Processo de Seleção de Monitores (teaching assistant selection process), 2004 Universidade Estadual do Oeste do Paraná
____________________________________________________________________________
II.VI.XV - Other relevant information
1 Ranked 3rd in the CEFET/MG public competitive examination for the subject Algorithms and Computer Programming (Algoritmos e Programação de Computadores). General call (Edital geral) Nº 149/2014 and specific call (Edital específico) Nº 62/14.
http://www.jusbrasil.com.br/diarios/72348349/dou-secao-3-30-06-2014-pg-60.
http://pesquisa.in.gov.br/imprensa/servlet/INPDFViewer?jornal=3&pagina=60&data=30/06/2014&captchafield=firistAccess