Accessing the Variability of Multicopy Genes in Complex Genomes using Unassembled Next-Generation Sequencing Reads: The Case of Trypanosoma cruzi Multigene Families

dc.creator: João Luís Reis-Cunha
dc.creator: Vanessa Gomes Fraga
dc.creator: Lucia Maria da Cunha Galvão
dc.creator: Lilian Lacerda Bueno
dc.creator: Ricardo Toshio Fujiwara
dc.creator: Mariana Santos Cardoso
dc.creator: Gustavo Coutinho Cerqueira
dc.creator: Daniella C. Bartholomeu
dc.creator: Anderson Coqueiro-dos-Santos
dc.creator: Samuel Alexandre Pimenta-Carvalho
dc.creator: Larissa Pinheiro Marques
dc.creator: Gabriela F. Rodrigues-Luiz
dc.creator: Rodrigo P. Baptista
dc.creator: Laila Viana de Almeida
dc.creator: Nathan Ravi Medeiros Honorato
dc.creator: Francisco Pereira Lobo
dc.date.accessioned: 2023-12-06T19:30:58Z
dc.date.accessioned: 2025-09-09T00:57:49Z
dc.date.available: 2023-12-06T19:30:58Z
dc.date.issued: 2022-10-20
dc.description.sponsorship: CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico
dc.description.sponsorship: CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
dc.format.mimetype: pdf
dc.identifier.doi: https://doi.org/10.1128/mbio.02319-22
dc.identifier.issn: 2150-7511
dc.identifier.uri: https://hdl.handle.net/1843/61810
dc.language: eng
dc.publisher: Universidade Federal de Minas Gerais
dc.relation.ispartof: mBio
dc.rights: Open Access
dc.subject: Antigens
dc.subject: Museums
dc.subject: Trypanosoma cruzi
dc.subject: Genomes
dc.subject: Genes
dc.subject.other: Multicopy genes
dc.subject.other: Variability
dc.subject.other: Copy number variation
dc.subject.other: Complex genomes
dc.subject.other: T. cruzi
dc.subject.other: MASP
dc.subject.other: Mucins
dc.subject.other: Transsialidases
dc.subject.other: Antigenicity
dc.title: Accessing the Variability of Multicopy Genes in Complex Genomes using Unassembled Next-Generation Sequencing Reads: The Case of Trypanosoma cruzi Multigene Families
dc.type: Journal article
local.citation.epage: 15
local.citation.spage: e02319-22
local.citation.volume: XX
local.description.resumo: Repetitive elements cause assembly fragmentation in complex eukaryotic genomes, limiting the study of their variability. The genome of Trypanosoma cruzi, the parasite that causes Chagas disease, has a high repetitive content, including multigene families. Although many T. cruzi multigene families encode surface proteins that play pivotal roles in host-parasite interactions, their variability is currently underestimated, as their high repetitive content results in collapsed gene variants. To estimate sequence variability and copy number variation of multigene families, we developed a read-based approach that is independent of gene-specific read mapping and de novo assembly. This methodology was used to estimate the copy number and variability of MASP, TcMUC, and trans-sialidase (TS), the three largest T. cruzi multigene families, in 36 strains, including members of all six parasite discrete typing units (DTUs). We found that these three families present a specific pattern of variability and copy number among the distinct parasite DTUs. Inter-DTU hybrid strains presented a higher variability of these families, suggesting that maintaining a larger content of their members could be advantageous. In addition, in a chronic murine model and chronic Chagasic human patients, the immune response was focused on TS antigens, suggesting that targeting TS conserved sequences could be a potential avenue to improve diagnosis and vaccine design against Chagas disease. Finally, the proposed approach can be applied to study multicopy genes in any organism, opening new avenues to access sequence variability in complex genomes.
IMPORTANCE Sequences that have several copies in a genome, such as multicopy-gene families, mobile elements, and microsatellites, are among the most challenging genomic segments to study. They are frequently underestimated in genome assemblies, hampering the correct assessment of these important players in genome evolution and adaptation. Here, we developed a new methodology to estimate variability and copy numbers of repetitive genomic regions and employed it to characterize the T. cruzi multigene families MASP, TcMUC, and trans-sialidase (TS), which are important virulence factors in this parasite. We showed that multigene families vary in sequence and content among the parasite's lineages, whereas hybrid strains have a higher sequence variability that could be advantageous to the parasite's survivability. By identifying conserved sequences within multigene families, we showed that the mammalian host immune response toward these multigene families is usually focused on the TS multigene family. These TS conserved and immunogenic peptides can be explored in future work as diagnostic targets or vaccine candidates for Chagas disease. Finally, this methodology can be easily applied to any organism of interest, which will aid in our understanding of complex genomic regions.
local.identifier.orcid: https://orcid.org/0000-0003-0974-7357
local.publisher.country: Brazil
local.publisher.department: ICB - DEPARTAMENTO DE BIOQUÍMICA E IMUNOLOGIA
local.publisher.department: ICB - DEPARTAMENTO DE PARASITOLOGIA
local.publisher.department: ICB - INSTITUTO DE CIÊNCIAS BIOLÓGICAS
local.publisher.initials: UFMG
local.url.externa: https://journals.asm.org/doi/10.1128/mbio.02319-22

Files

Original bundle

Name: Accessing the Variability of Multicopy Genes in Complex Genomes using Unassembled Next-Generation Sequencing Reads_ The Case of Trypanosoma cruzi Multigene Families.pdf
Size: 2.78 MB
Format: Adobe Portable Document Format

License bundle

Name: License.txt
Size: 1.99 KB
Format: Plain Text