Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings

Dörr D, Stoye J, Böcker S, Jahn K (2014)
BMC Genomics 15(Suppl. 6: Proc. of RECOMB-CG 2014): S2.

Journal article | Published | English
 
Abstract / Note
Background: Comparative analyses of chromosomal gene orders are successfully used to predict gene clusters in bacterial and fungal genomes. Present models for detecting sets of co-localized genes in chromosomal sequences require prior knowledge of the gene family assignments of the genes in the dataset of interest. These families are often computationally predicted on the basis of sequence similarity or higher-order features of gene products. Errors introduced in this process are amplified in subsequent gene order analyses and may thus degrade gene cluster prediction.

Results: In this work, we present a new dynamic model and efficient computational approaches for gene cluster prediction suitable for scenarios ranging from traditional gene family-based gene cluster prediction, via multiple conflicting gene family annotations, to gene family-free analysis, in which gene clusters are predicted solely on the basis of a pairwise similarity measure between the genes of different genomes. We evaluate our gene family-free model against a gene family-based model on a dataset of 93 bacterial genomes.

Conclusions: Our model is able to detect gene clusters that would also be detected with well-established gene family-based approaches. Moreover, we show that it is able to detect conserved regions that are missed by gene family-based methods due to wrong or deficient gene family assignments.
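To make the idea of common intervals in indeterminate strings more concrete, the following is a minimal Python sketch. It is not taken from the paper. It assumes a simplified definition in which each genome is a list of positions, each position carries a set of candidate gene family labels (modelling multiple conflicting annotations), and two intervals match when the unions of their label sets are equal. The function name `common_intervals` and the labels `f1` to `f5` are hypothetical, and the brute-force enumeration is for illustration only and does not reflect the efficient algorithms the abstract refers to.

```python
def common_intervals(s, t):
    """Return all pairs ((i, j), (k, l)) such that the union of the label
    sets in s[i..j] equals the union of the label sets in t[k..l].

    Assumption: this equal-union criterion is a simplified stand-in for
    the model described in the abstract, not the paper's exact definition.
    Indices are 0-based and inclusive.
    """
    # Index every interval of s by the union of its label sets.
    by_set = {}
    for i in range(len(s)):
        acc = set()
        for j in range(i, len(s)):
            acc |= s[j]
            by_set.setdefault(frozenset(acc), []).append((i, j))

    # Scan every interval of t and look up intervals of s with the same union.
    result = []
    for k in range(len(t)):
        acc = set()
        for l in range(k, len(t)):
            acc |= t[l]
            for interval in by_set.get(frozenset(acc), []):
                result.append((interval, (k, l)))
    return sorted(result)


# Hypothetical toy genomes: position 1 of s and position 0 of t each carry
# two conflicting family annotations.
s = [{"f1"}, {"f2", "f3"}, {"f4"}]
t = [{"f2", "f3"}, {"f1"}, {"f5"}, {"f4"}]

for pair in common_intervals(s, t):
    print(pair)
```

In this toy input, the interval covering the first two positions of `s` matches the first two positions of `t` even though the gene order differs, which is the kind of co-localization such a model is meant to capture.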
Publication year
2014
Journal title
BMC Genomics
Volume
15
Issue
Suppl. 6: Proc. of RECOMB-CG 2014
Article no.
S2
ISSN
1471-2164
Funding information
Open-access publication costs were funded by the Deutsche Forschungsgemeinschaft and Bielefeld University.
Page URI
https://pub.uni-bielefeld.de/record/2687033

Cite

Dörr D, Stoye J, Böcker S, Jahn K. Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics. 2014;15(Suppl. 6: Proc. of RECOMB-CG 2014): S2.
Dörr, D., Stoye, J., Böcker, S., & Jahn, K. (2014). Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics, 15(Suppl. 6: Proc. of RECOMB-CG 2014), S2. doi:10.1186/1471-2164-15-S6-S2
Dörr, Daniel, Stoye, Jens, Böcker, Sebastian, and Jahn, Katharina. 2014. “Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings”. BMC Genomics 15 (Suppl. 6: Proc. of RECOMB-CG 2014): S2.
Dörr, D., Stoye, J., Böcker, S., and Jahn, K. (2014). Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics 15:S2.
Dörr, D., et al., 2014. Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics, 15(Suppl. 6: Proc. of RECOMB-CG 2014): S2.
D. Dörr, et al., “Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings”, BMC Genomics, vol. 15, 2014, S2.
Dörr, D., Stoye, J., Böcker, S., Jahn, K.: Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics. 15, S2 (2014).
Dörr, Daniel, Stoye, Jens, Böcker, Sebastian, and Jahn, Katharina. “Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings”. BMC Genomics 15.Suppl. 6: Proc. of RECOMB-CG 2014 (2014): S2.
All files available under the following license(s):
Copyright Statement:
This item is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access
Last uploaded
2019-09-06T09:18:25Z
MD5 checksum
084536c32a0dcb8d56920d30f53c67e3


1 citation in Europe PMC

Data provided by Europe PubMed Central.

Finding approximate gene clusters with Gecko 3.
Winter S, Jahn K, Wehner S, Kuchenbecker L, Marz M, Stoye J, Bocker S., Nucleic Acids Res. 44(20), 2016
PMID: 27679480

Sources

PMID: 25571793