FENNEC: Functional Exploration of Natural Networks and Ecological Communities

Assessment of species composition in ecological communities and networks is an important aspect of biodiversity research. Yet, for many ecological questions the ecological properties (traits) of organisms in a community are more informative than their scientific names. Furthermore, other properties like threat status, invasiveness, or human usage are relevant for many studies, but they cannot be evaluated directly from taxonomic names alone. Although various public databases collect such trait information, enriching existing community tables with those traits is still a tedious manual task, especially for large data sets. For example, meta-barcoding and automatic image processing approaches are designed for high-throughput analyses and yield thousands of taxa for hundreds of samples in very short time frames.

We developed the FENNEC, a web-based workbench that eases this process by automatically mapping publicly available trait data to the user’s community tables. We run a public instance holding traits that cover a range of topics including specialization, invasiveness, vulnerability, and agricultural relevance. Scientists are free to use the FENNEC as a resource for their ecological research.
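As a rough illustration of what mapping traits onto a community table involves, the following Python sketch sums the counts per trait value in each sample. It does not use the FENNEC itself or its API; the taxon names, trait values, and the function name map_traits are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical community table: sample -> {taxon: count}
community = {
    "sample_A": {"taxon_1": 12, "taxon_2": 3},
    "sample_B": {"taxon_1": 1, "taxon_3": 7},
}

# Hypothetical trait lookup: taxon -> trait value (placeholder values)
invasiveness = {
    "taxon_1": "no",
    "taxon_3": "yes",
}


def map_traits(community, trait, missing="unknown"):
    """Sum the counts per trait value for each sample.

    Taxa without an entry in the trait lookup are reported under `missing`
    so that gaps in trait coverage stay visible.
    """
    result = {}
    for sample, counts in community.items():
        per_value = defaultdict(int)
        for taxon, count in counts.items():
            per_value[trait.get(taxon, missing)] += count
        result[sample] = dict(per_value)
    return result


trait_table = map_traits(community, invasiveness)
```

Reporting unmapped taxa separately, rather than dropping them, shows how much of each sample the trait data actually covers.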

Website: https://fennec.molecular.eco

Freely available at GitHub:  https://github.com/molbiodiv/fennec

Preprint: https://www.biorxiv.org/content/early/2017/09/27/194308

AliTV – Alignment Toolbox and Visualization

Comparing the genome structures of organisms can yield interesting insights into evolutionary processes. Such comparisons require whole genome alignments, which are difficult to interpret without proper visualization. AliTV utilizes d3.js to create interactive visualizations of whole genome alignments.

Example visualizations including the alignment of seven chloroplast genomes are available online.

Freely available at GitHub:  https://github.com/AliTVTeam/AliTV

Publication: https://peerj.com/articles/cs-116/

TBro: visualization and management of de novo transcriptomes

A web-based transcriptome browser suitable for de novo transcriptomics. It has been used to analyze the Venus flytrap transcriptome.

TBro is a web application that allows biologists to browse the vast amount of data generated by RNA-seq experiments. Powerful search options exist to find transcripts of interest. All information for each transcript is aggregated on a single page. Transcripts of interest can be organized in carts and analyzed together.

Freely available at GitHub:  https://github.com/TBroTeam/TBro

Publication: https://academic.oup.com/database/article/doi/10.1093/database/baw146/2742073

biojs-io-biom: A JavaScript library for handling data in Biological Observation Matrix (BIOM) format.

This library provides an easy-to-use interface for interacting with data in BIOM format. The library itself is written in ES6 and is tested with Mocha. To provide compatibility with both versions 1.0 and 2.1 of the BIOM format, a lightweight conversion server has been developed. A public instance of the conversion server is available.
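For readers unfamiliar with the format, the sketch below shows the general shape of a BIOM 1.0 (JSON) table as a Python dictionary and expands its sparse matrix into a dense one. It is written in Python for illustration and is not the biojs-io-biom API; the identifiers and counts are made up, and the helper name sparse_to_dense is hypothetical.

```python
# A minimal BIOM 1.0 style table; IDs and counts are placeholders.
table = {
    "id": None,
    "format": "Biological Observation Matrix 1.0.0",
    "format_url": "http://biom-format.org",
    "type": "OTU table",
    "generated_by": "example sketch",
    "date": "2017-01-01T00:00:00",
    "rows": [
        {"id": "OTU_1", "metadata": None},
        {"id": "OTU_2", "metadata": None},
    ],
    "columns": [
        {"id": "sample_A", "metadata": None},
        {"id": "sample_B", "metadata": None},
        {"id": "sample_C", "metadata": None},
    ],
    "matrix_type": "sparse",
    "matrix_element_type": "int",
    "shape": [2, 3],
    # Sparse entries are [row index, column index, value] triples.
    "data": [[0, 0, 5], [0, 2, 1], [1, 1, 8]],
}


def sparse_to_dense(biom):
    """Return the count matrix as a list of rows (observations x samples)."""
    n_rows, n_cols = biom["shape"]
    if biom["matrix_type"] == "dense":
        return [list(row) for row in biom["data"]]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for row, col, value in biom["data"]:
        dense[row][col] = value
    return dense


dense = sparse_to_dense(table)
```

Version 2.1 of the format is stored as HDF5 rather than JSON, which is why a conversion step is needed to handle both versions.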

Freely available at GitHub:  https://github.com/molbiodiv/biojs-io-biom

Publication: https://f1000research.com/articles/5-2348/v2

bcgTree: automatized phylogenetic tree building from bacterial core genomes

The need for multi-gene analyses in scientific fields such as phylogenetics and DNA barcoding has increased in recent years. In particular, these approaches are increasingly important for differentiating bacterial species, where reliance on the standard 16S rDNA marker can result in poor resolution. Additionally, the assembly of bacterial genomes has become a standard task due to advances in next-generation sequencing technologies. We created a bioinformatic pipeline, bcgTree, which uses assembled bacterial genomes, either from databases or from the user's own sequencing results, to reconstruct their phylogenetic history. The pipeline automatically extracts 107 essential single-copy core genes, found in a majority of bacteria, using hidden Markov models and performs a partitioned maximum-likelihood analysis.
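A partitioned analysis needs the per-gene alignments concatenated into one supermatrix, together with the column range each gene occupies. The Python sketch below illustrates only that bookkeeping step under simplifying assumptions; it is not bcgTree's own code, and the gene names, genome names, sequences, and the function name concatenate are hypothetical.

```python
def concatenate(alignments):
    """Concatenate per-gene alignments into a supermatrix.

    alignments maps gene name -> {genome name: aligned sequence}.
    Genomes lacking a gene are padded with gap characters.
    Returns (supermatrix, partitions), where partitions is a list of
    (gene, first column, last column) with 1-based inclusive columns.
    """
    genomes = sorted({g for aln in alignments.values() for g in aln})
    parts = {g: [] for g in genomes}
    partitions = []
    start = 1
    for gene in sorted(alignments):
        aln = alignments[gene]
        length = len(next(iter(aln.values())))
        if any(len(seq) != length for seq in aln.values()):
            raise ValueError(f"sequences of {gene} differ in length")
        for genome in genomes:
            parts[genome].append(aln.get(genome, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {g: "".join(p) for g, p in parts.items()}
    return supermatrix, partitions


# Toy protein alignments (placeholder data)
example = {
    "gene_a": {"genome_1": "MKV", "genome_2": "MRV"},
    "gene_b": {"genome_1": "AL-T"},
}
supermatrix, partitions = concatenate(example)
```

Keeping the partition boundaries lets the tree-building step fit a separate substitution model to each gene instead of treating the supermatrix as one homogeneous block.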

Freely available at GitHub:  https://github.com/molbiodiv/bcgTree

Publication: http://www.nrcresearchpress.com/doi/abs/10.1139/gen-2015-0175


chloroExtractor: extraction and assembly of the chloroplast genome from whole genome shotgun data

The chloroExtractor is a Perl-based program that provides a pipeline for extracting chloroplast DNA from whole genome plant data. Large amounts of chloroplast DNA can cause problems for the assembly of whole genome data. One solution to this problem can be a core extraction before sequencing, but this can be expensive. The chloroExtractor takes whole genome data and extracts the chloroplast DNA, so that the different kinds of DNA are easily separated. Furthermore, the chloroExtractor tries to assemble the extracted chloroplast DNA. This is possible because of the conserved nature of the chloroplast's primary and secondary structure. Through k-mer filtering, the k-mers that contain chloroplast sequences are extracted and can then be used to assemble the chloroplast genome in an assembly guided by several other chloroplast genomes.
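The idea of k-mer filtering can be sketched in a few lines of Python. The example assumes that chloroplast reads are strongly overrepresented in whole genome data, so their k-mers occur far more often than those from the rest of the genome. It is a simplified illustration and not the chloroExtractor implementation; the function names, the value of k, and the threshold are hypothetical.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def filter_reads(reads, k, min_count):
    """Keep reads whose median k-mer count is at least min_count.

    Simplification: reverse complements are not merged, and reads
    shorter than k are dropped.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    kept = []
    for read in reads:
        read_counts = [counts[km] for km in kmers(read, k)]
        if read_counts and median(read_counts) >= min_count:
            kept.append(read)
    return kept


# Toy data: one highly covered sequence and one rare sequence
reads = ["ACGTAC"] * 5 + ["TTGGCC"]
abundant = filter_reads(reads, k=3, min_count=3)
```

With real data, a suitable threshold would have to be derived from the k-mer count distribution of the sample rather than fixed in advance.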

Freely available at GitHub:  https://github.com/chloroExtractorTeam/chloroExtractor

Publication: http://joss.theoj.org/papers/eaceb6ac6723a3ea5749f7f50d4a4ad4

16S2Genome: Genomic traits for 16S rDNA microbiota studies

Molecular sequencing techniques help to understand microbial biodiversity with regard to species richness, assembly structure and function. In this context, available methods are barcoding, metabarcoding, genomics and metagenomics. The first two are restricted to taxonomic assignments, whilst genomics only refers to functional capabilities of a single organism. Metagenomics, by contrast, yields information about organismal and functional diversity of a community. However, it is currently very demanding in terms of labour and costs and thus not feasible for most laboratories. Here, we show in a proof-of-concept that computational approaches are able to retain functional information about microbial communities assessed through 16S rDNA (meta)barcoding by referring to reference genomes. We developed an automatic pipeline to show that such integration may infer preliminary or supplementary genomic content of a community.
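One simple way to picture such an integration is to weight the gene content of each reference genome by the abundance of the matching taxon. The Python sketch below shows this under strong simplifying assumptions; it is not the method of the 16S2Genome pipeline, and the taxon names, gene names, numbers, and the function name community_gene_content are hypothetical.

```python
def community_gene_content(abundances, genome_genes):
    """Estimate abundance-weighted gene counts for a community.

    abundances maps taxon -> read count from the 16S survey.
    genome_genes maps taxon -> {gene: copy number} from a reference genome.
    Returns (gene totals, abundance of taxa without a reference genome).
    """
    totals = {}
    unmapped = 0
    for taxon, abundance in abundances.items():
        genes = genome_genes.get(taxon)
        if genes is None:
            unmapped += abundance
            continue
        for gene, copies in genes.items():
            totals[gene] = totals.get(gene, 0) + abundance * copies
    return totals, unmapped


# Placeholder data for illustration only
abundances = {"taxon_1": 10, "taxon_2": 5, "taxon_3": 2}
genome_genes = {
    "taxon_1": {"gene_x": 1, "gene_y": 2},
    "taxon_2": {"gene_y": 1},
}
totals, unmapped = community_gene_content(abundances, genome_genes)
```

The share of unmapped abundance indicates how much of the community the reference genomes fail to cover, which is one reason such inferred content is best treated as preliminary.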

Reference: Keller A, Horn H, Förster F, Schultz J. (2014) Computational integration of genomic traits into 16S rDNA microbiota sequencing studies. Gene 549(1): 186–191.

GitHub: https://github.com/molbiodiv/16S2Genome