Genome mining tools

From ActinoBase
Revision as of 23:52, 17 November 2021 by Nelly

Actinobacteria are talented producers of secondary metabolites, many of which have useful biological activities. Thanks to the development of many targeted genome mining tools for bacteria, we can now identify previously uncharacterized biosynthetic gene clusters (BGCs) for natural products.

Some useful genome mining resources are listed below:

antiSMASH:1 (antibiotics and Secondary Metabolite Analysis Shell)

  • Identification and annotation of secondary metabolite gene clusters

PRISM:2 (PRediction Informatics for Secondary Metabolism)

  • Identification of nonribosomal peptides, type I and II polyketides and RiPPs

BAGEL:3

  • Identification of gene clusters for bacteriocins and RiPPs

CLUSEAN:4

  • Annotation and analysis of secondary metabolite gene clusters

ClusterFinder:5

  • Identification of secondary metabolite gene clusters

ARTS:6 (Antibiotic Resistant Target Seeker)

  • Genome mining for secondary metabolites and potential antibiotics based on antibiotic resistance targets

2metDB:7

  • Genome mining for polyketides and nonribosomal peptides

PKMiner:8

  • Genome mining for type II polyketide synthases

SBSPKS:9

  • Sequence analysis of polyketide synthases

RiPPMiner:10

  • Genome mining and deciphering chemical structures of RiPPs

RODEO:11 (Rapid ORF Description and Evaluation Online)

  • Identification of biosynthetic gene clusters and prediction of RiPP precursor peptides

RiPPER:12

  • Identification of RiPP precursor peptides and biosynthetic gene clusters
  • A RiPPER user guide (chapter 14) is available; email Dr Andy Truman (Andrew.truman@jic.ac.uk) to request a copy.

BiG-SCAPE:13

  • Identification of gene cluster families across a collection of genomes. BiG-SCAPE classifies antiSMASH outputs into gene cluster families. The same paper also describes CORASON, which searches for a given gene cluster across a set of genomes.
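
To make the idea of a gene cluster family concrete, here is a toy sketch. It treats each BGC as a set of protein domain names, scores every pair with the Jaccard index, and joins BGCs whose similarity reaches a cutoff. It is not BiG-SCAPE's implementation, and the BGC names, domain sets and 0.5 cutoff are hypothetical values chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_into_families(bgcs, cutoff=0.5):
    """Group BGCs into families by single linkage on Jaccard similarity.

    bgcs: dict mapping a BGC name to a set of domain names (hypothetical data).
    Returns a sorted list of sorted name lists, one list per family.
    """
    # Union-find parent table: every BGC starts as its own family.
    parent = {name: name for name in bgcs}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    # Merge the families of any two BGCs that are similar enough.
    for a, b in combinations(bgcs, 2):
        if jaccard(bgcs[a], bgcs[b]) >= cutoff:
            parent[find(a)] = find(b)

    families = {}
    for name in bgcs:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


# Hypothetical domain content for four BGCs.
example_bgcs = {
    "bgc_A": {"PKS_KS", "PKS_AT", "ACP"},
    "bgc_B": {"PKS_KS", "PKS_AT", "ACP", "Thioesterase"},
    "bgc_C": {"Condensation", "AMP-binding", "PCP"},
    "bgc_D": {"Condensation", "AMP-binding", "PCP", "Thioesterase"},
}

families = group_into_families(example_bgcs, cutoff=0.5)
print(families)
```

Because every pair of BGCs is compared, the work grows quadratically with the number of BGCs, which is acceptable for a collection of genomes but becomes costly for very large datasets.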

EvoMining:14

  • Identification of expansions of gene families from central metabolism that have been recruited into biosynthetic gene clusters.

BiG-SLiCE:15

  • Identification of cross-species BGC patterns. This specialized big-data method avoids network comparison, which allows it to classify millions of BGCs.
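
One general way to avoid all-against-all comparison is to turn each BGC into a fixed-length feature vector and assign it to the nearest of a small number of family centroids, so the cost grows with the number of BGCs rather than the number of BGC pairs. The sketch below illustrates that idea only. It is not BiG-SLiCE's code, and the feature list, centroids and BGC names are hypothetical.

```python
import math

# Hypothetical feature order: each BGC becomes a fixed-length count vector.
FEATURES = ["PKS_KS", "PKS_AT", "Condensation", "AMP-binding"]


def to_vector(domains):
    """Count how often each feature occurs in a BGC's list of domains."""
    return [domains.count(feature) for feature in FEATURES]


def nearest_centroid(vector, centroids):
    """Return the name of the centroid closest to vector (Euclidean distance)."""
    return min(centroids, key=lambda name: math.dist(vector, centroids[name]))


# Hypothetical family centroids in the same feature space.
centroids = {
    "polyketide_like": [2.0, 2.0, 0.0, 0.0],
    "nrps_like": [0.0, 0.0, 2.0, 2.0],
}

# Hypothetical BGCs described by their domain lists.
bgcs = {
    "bgc_1": ["PKS_KS", "PKS_AT", "PKS_KS"],
    "bgc_2": ["Condensation", "AMP-binding", "AMP-binding"],
}

# Each BGC is compared only with the centroids, never with other BGCs.
assignments = {
    name: nearest_centroid(to_vector(domains), centroids)
    for name, domains in bgcs.items()
}
print(assignments)
```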


  1. Medema, M.H., Blin, K., Cimermancic, P., de Jager, V., Zakrzewski, P., Fischbach, M.A., Weber, T., Takano, E., Breitling, R. (2011) antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Research, 39 doi: 10.1093/nar/gkr466
  2. Skinnider, M.A., Merwin, N.J., Johnston, C.W., Magarvey, N.A. (2017) PRISM 3: expanded prediction of natural product chemical structures from microbial genomes. Nucleic Acids Research, doi: 10.1093/nar/gkx320
  3. Van Heel, A.J., de Jong, A., Song, C., Viel, J.H., Kok, J., Kuipers, O.P. (2018) BAGEL4: a user-friendly webserver to thoroughly mine RiPPs and bacteriocins. Nucleic Acids Research, 46 doi: 10.1093/nar/gky383
  4. Weber, T., Rausch, C., Lopez, P., Hoof, I., Gaykova, V., Huson, D.H., Wohlleben, W. (2009) CLUSEAN: a computer-based framework for the automated analysis of bacterial secondary metabolite biosynthetic gene clusters. Journal of Biotechnology, 140:13-7 doi: 10.1016/j.jbiotec.2009.01.007
  5. Cimermancic, P., Medema, M.H., Claesen, J., Kurita, K., Wieland Brown, L.C., Mavrommatis, K., Pati, A., Godfrey, P.A., Koehrsen, M., Clardy, J., Birren, B.W., Takano, E., Sali, A., Linington, R.G., Fischbach, M.A. (2014) Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell, 158:412-21 doi: 10.1016/j.cell.2014.06.034
  6. Alanjary, M., Kronmiller, B., Adamek, M., Blin, K., Weber, T., Huson, D., Philmus, B., Ziemert, N. (2017) The Antibiotic Resistant Target Seeker (ARTS), an exploration engine for antibiotic cluster prioritization and novel drug target discovery. Nucleic Acids Research, 45(W1):W42-W48 doi: 10.1093/nar/gkx360
  7. Bachmann, B.O., and Ravel, J. (2009) Chapter 8. Methods for in silico prediction of microbial polyketide and nonribosomal peptide biosynthetic pathways from DNA sequence data. Methods in Enzymology, 458:181-217 doi: 10.1016/S0076-6879(09)04808-3
  8. Kim, J., and Yi, G.S. (2012) PKMiner: a database for exploring type II polyketide synthases. BMC Microbiology, 12:169 doi: 10.1186/1471-2180-12-169
  9. Anand, S., Prasad, M.V., Yadav, G., Kumar, N., Shehara, J., Ansari, M.Z., Mohanty, D. (2010) SBSPKS: structure-based sequence analysis of polyketide synthases. Nucleic Acids Research, 38(Web Server issue):W487-96 doi: 10.1093/nar/gkq340
  10. Agrawal, P., Khater, S., Gupta, M., Sain, N., Mohanty, D. (2017) RiPPMiner: a bioinformatics resource for deciphering chemical structures of RiPPs based on prediction of cleavage and cross-links. Nucleic Acids Research, 45 doi: 10.1093/nar/gkx408
  11. Tietz, J.I., Schwalen, C.J., Patel, P.S., Maxson, T., Blair, P.M., Tai, H.C., Zakai, U.I., Mitchell, D.A. (2017) A new genome-mining tool redefines the lasso peptide biosynthetic landscape. Nature Chemical Biology, 13(5):470-478 doi: 10.1038/nchembio.2319
  12. Santos-Aberturas, J., Chandra, G., Frattaruolo, L., Lacret, R., Pham, T.H., Vior, N.M., Eyles, T.H., Truman, A.W. (2019) Uncovering the unexplored diversity of thioamidated ribosomal peptides in Actinobacteria using the RiPPER genome mining tool. Nucleic Acids Research, 47(9):4624-4637 doi: 10.1093/nar/gkz192
  13. Navarro-Muñoz, J.C., Selem-Mojica, N., Mullowney, M.W. et al. (2020) A computational framework to explore large-scale biosynthetic diversity. Nature Chemical Biology, 16:60-68 doi: 10.1038/s41589-019-0400-9
  14. Sélem-Mojica, N., Aguilar, C., Gutiérrez-García, K., Martínez-Guerrero, C.E., Barona-Gómez, F. (2019) EvoMining reveals the origin and fate of natural product biosynthetic enzymes. Microbial Genomics, doi: 10.1099/mgen.0.000260
  15. Kautsar, S.A., van der Hooft, J.J.J., de Ridder, D., Medema, M.H. (2021) BiG-SLiCE: A highly scalable tool maps the diversity of 1.2 million biosynthetic gene clusters. GigaScience, 10(1):giaa154 doi: 10.1093/gigascience/giaa154