phyloFlash v3.0 beta 1 by Harald Gruber-Vodicka, Elmar Pruesse, Brandon Seah

High throughput phylogenetic screening using SSU rRNA gene(s) abundance(s)

Library name: test

Graphical Summary

Mapping identity (%): Read-mapping %identity of reads vs. reference database. Lower %identity hits may indicate the presence of divergent taxa not represented in the database.
Insert size (bp): Insert sizes for read pairs. The distribution should generally be unimodal; more than one peak may indicate contamination from other libraries.
Reads mapped (99.995% of pairs mapped; categories: mapped pair, mapped single, mapped bad pair): Proportion of reads mapped to the SSU rRNA database. Typically < 1% for metagenomes, ca. 20% for metatranscriptomes without rRNA depletion or poly-A selection.
Reads assembled (categories: assembled, unassembled): Proportion of reads assembled to full-length sequences. A high proportion of unassembled reads suggests either assembly failure or a high diversity of organisms with low coverage.
Taxonomic summary from reads mapped (taxon, reads):
Eukaryota;Opisthokonta;Holozoa;Metazoa 88901
Bacteria;Actinobacteria;Actinobacteria;Micrococcales 1523
Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales 661
Bacteria;Proteobacteria;Gammaproteobacteria;Xanthomonadales 660
Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales 187
Bacteria;Firmicutes;Bacilli;Lactobacillales 181
Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales 142
Bacteria;Proteobacteria;Alphaproteobacteria;Rhodospirillales 70
Bacteria;Acidobacteria;Acidobacteria;Subgroup 6 59
Bacteria;Spirochaetae;Spirochaetes;W27 55
Bacteria;Firmicutes;Bacilli;Bacillales 26
Bacteria;Firmicutes;Clostridia;Clostridiales 8
Eukaryota;Amoebozoa;Discosea;Longamoebia 6
Bacteria;Actinobacteria;Actinobacteria;Corynebacteriales 5
Eukaryota;Opisthokonta;Nucletmycea;Fungi 4
Eukaryota;Amoebozoa;Dictyostelia;Dictyostelium 4
Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales 4
Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales 3
Bacteria;Proteobacteria;Betaproteobacteria;Rhodocyclales 3
Bacteria;Acidobacteria;Acidobacteria;Subgroup 4 2
Bacteria;Planctomycetes;Phycisphaerae;S-70 2
Bacteria;Proteobacteria;Gammaproteobacteria;Chromatiales 2
Bacteria;Actinobacteria;Thermoleophilia;Solirubrobacterales 1
Bacteria;Proteobacteria;Betaproteobacteria;Methylophilales 1
Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales 1
Eukaryota;SAR;Stramenopiles;Ochrophyta 1
Eukaryota;Excavata;Discoba;Discicristata 1
Eukaryota;Archaeplastida;Chloroplastida;Chlorophyta 1
Bacteria;Actinobacteria;Actinobacteria;Actinomycetales 1
Bacteria;Proteobacteria;Alphaproteobacteria;Rhizobiales 1
Eukaryota;SAR;Rhizaria;Cercozoa 1
Bacteria;Actinobacteria;Actinobacteria;Actinopolysporales 1
Coverage on 18S model: Coverage evenness across the eukaryotic 18S rRNA gene model from Barrnap, using Nhmmer on a random subsample of mapped reads. This helps to detect contamination from tag sequencing libraries (sharp coverage peaks). For the eukaryotic model it is normal to see one or two regions with low coverage because of variable regions in the 18S rRNA gene that are not present in all organisms.
Coverage on 16S model: Coverage evenness across the prokaryotic 16S rRNA gene model from Barrnap, using Nhmmer on a random subsample of mapped reads. This helps to detect contamination from tag sequencing libraries (sharp coverage peaks).

Tree of full-length assembled sequences

Full-length assembled SSU rRNA sequences along with their closest hits from the SILVA database, in an alignment guide tree produced by MAFFT. This tree helps to visualize the relatedness of sequences in the library to known relatives. Colored circles have areas proportional to the number of SSU rRNA reads that map to each respective sequence (re-mapping is done separately for the SPAdes and EMIRGE full-length sequence sets). An additional circle in the lower right corner represents the proportion of SSU rRNA reads that were not assembled.

Color key: Full-length sequences assembled by SPAdes, Full-length sequences reconstructed by EMIRGE, Closest-matching reference sequence from SILVA database

Reads mapped per sequence (tree labels are sequence IDs suffixed with the SPAdes coverage or EMIRGE ratio):
test.PFemirge_59_0.018622 (EMIRGE): 1188 reads
AY547556.1.1532 Bacteria;Actinobacteria;Actinobacteria;Micrococcales;Microbacteriaceae;Microbacterium;Microbacterium sp. CME1 (SILVA reference)
test.PFspades_3_304.197 (SPAdes): 65695 reads
EU196000.1.1674 Eukaryota;Opisthokonta;Holozoa;Metazoa;Animalia;Nematoda;Chromadorea;Rhabditidae;Caenorhabditis sp. JU727 (SILVA reference)
test.PFspades_2_1.86115 (SPAdes): 582 reads
DQ213024.1.1496 Bacteria;Proteobacteria;Gammaproteobacteria;Xanthomonadales;Xanthomonadaceae;uncultured;Xanthomonas sp. B05-08.04.0214 (SILVA reference)
test.PFspades_1_4.08069 (SPAdes): 1355 reads
test.PFemirge_0_0.981378 (EMIRGE): 75803 reads
Unassembled SSU reads: 10681

Input parameters

Input command: phyloFlash.pl -CPUs 8 -lib test -everything -read1 test_F.fq.gz -read2 test_R.fq.gz
Forward read file: test_F.fq.gz
Reverse read file: test_R.fq.gz
Minimum mapping identity: 70%
Working folder: /data/db/phyloFlash_dev/phyloFlash/test_002
Database used: /data/db/phyloFlash_dev/phyloFlash/119

Results

Mapping statistics

Input PE-reads: 60000
Mapped SSU read pairs: 59997
Mapping ratio: 99.995%
Fraction assembled: 88.455%
Detected median insert size: 562
Used insert size: 562
Insert size standard deviation: 154
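
The two percentages above can be cross-checked against counts given elsewhere in this report. The sketch below is illustrative Python, not phyloFlash code, and the function names are hypothetical. It assumes that the mapping ratio is mapped pairs over input pairs, and that the fraction assembled is one minus the share of unassembled SSU reads (10681) among all taxonomically assigned reads (92518, the sum of the per-taxon read counts in the graphical summary). Both assumptions reproduce the reported values, but neither is confirmed by the report itself.

```python
def mapping_ratio(mapped_pairs: int, input_pairs: int) -> float:
    """Percentage of input read pairs that mapped to the SSU rRNA database."""
    return 100.0 * mapped_pairs / input_pairs


def fraction_assembled(unassembled_reads: int, total_mapped_reads: int) -> float:
    """Percentage of mapped reads accounted for by full-length sequences.

    Assumption: assembled reads = total mapped reads minus unassembled reads.
    """
    return 100.0 * (1.0 - unassembled_reads / total_mapped_reads)


# Counts taken from this report.
print(f"Mapping ratio: {mapping_ratio(59997, 60000):.3f}%")                 # 99.995%
print(f"Fraction assembled: {fraction_assembled(10681, 92518):.3f}%")       # 88.455%
```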

Output files

FASTA file of alignment of all full-length sequences: test.SSU.collection.alignment.fasta
FASTA file of all full-length sequences and their closest database hits: test.SSU.collection.fasta
Newick guide tree from MAFFT alignment of all full-length sequences and closest database hits: test.SSU.collection.fasta.tree
SVG graphic of guide tree from MAFFT alignment of all full-length sequences and closest database hits: test.SSU.collection.fasta.tree.svg
FASTA file of full-length SSU sequences: test.all.final.fasta
Mapping identity histogram from BBmap: test.idhistogram
Insert size histogram from BBmap: test.inserthistogram
phyloFlash report in plain text: test.phyloFlash
NTU abundances from initial mapping, in CSV format: test.phyloFlash.NTUabundance.csv
SVG graphic of taxonomic composition from initial read mapping: test.phyloFlash.NTUabundance.csv.svg
Taxonomic classification of full-length sequences, in CSV format: test.phyloFlash.extractedSSUclassifications.csv
phyloFlash report in CSV format: test.phyloFlash.report.csv
Taxonomic composition of unassembled SSU reads, in CSV format: test.phyloFlash.unassembled.NTUabundance.csv
Reads (fwd) mapping to SSU rRNA database: test.test_F.fq.gz.SSU.1.fq
Reads (rev) mapping to SSU rRNA database: test.test_F.fq.gz.SSU.2.fq
SAM file of initial read mapping to SSU rRNA database: test.test_F.fq.gz.SSU.sam

Taxonomic affiliation of SSU rRNA reads in library

Approximate overview of taxonomic composition of ALL reads, based on mapping hits to SILVA SSU rRNA database using BBmap.

NTUs observed once: 10
NTUs observed twice: 3
NTUs observed three or more times: 19
NTU Chao1 richness estimate: 35.6666666666667
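
As a worked check of the richness estimate: the classic Chao1 estimator is S_obs + F1^2 / (2 * F2), where F1 and F2 are the numbers of NTUs seen once and twice, and S_obs is the total number of NTUs observed. The Python sketch below (hypothetical function name, not phyloFlash code) evaluates that form for the counts above. The reported value of about 35.67 is what the formula yields when only the 19 NTUs seen three or more times are used as the base term; using all 32 observed NTUs gives about 48.67. Which base the tool intends is not stated in this report, so treat the comparison as an observation about the arithmetic only.

```python
def chao1(base_richness: int, singletons: int, doubletons: int) -> float:
    """Classic Chao1 form: base + F1^2 / (2 * F2). Requires doubletons > 0."""
    if doubletons <= 0:
        raise ValueError("classic Chao1 is undefined when there are no doubletons")
    return base_richness + singletons ** 2 / (2 * doubletons)


once, twice, three_plus = 10, 3, 19  # counts from this report

# Base term = NTUs seen three or more times: matches the reported estimate.
print(round(chao1(three_plus, once, twice), 4))                 # 35.6667
# Base term = all observed NTUs (10 + 3 + 19 = 32), as in the textbook formula.
print(round(chao1(once + twice + three_plus, once, twice), 4))  # 48.6667
```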

Taxonomy summarized at level 4. Only displaying taxa with > 3 reads mapped.

Taxon Reads
Eukaryota;Opisthokonta;Holozoa;Metazoa 88901
Bacteria;Actinobacteria;Actinobacteria;Micrococcales 1523
Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales 661
Bacteria;Proteobacteria;Gammaproteobacteria;Xanthomonadales 660
Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales 187
Bacteria;Firmicutes;Bacilli;Lactobacillales 181
Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales 142
Bacteria;Proteobacteria;Alphaproteobacteria;Rhodospirillales 70
Bacteria;Acidobacteria;Acidobacteria;Subgroup 6 59
Bacteria;Spirochaetae;Spirochaetes;W27 55
Bacteria;Firmicutes;Bacilli;Bacillales 26
Bacteria;Firmicutes;Clostridia;Clostridiales 8
Eukaryota;Amoebozoa;Discosea;Longamoebia 6
Bacteria;Actinobacteria;Actinobacteria;Corynebacteriales 5
Eukaryota;Amoebozoa;Dictyostelia;Dictyostelium 4
Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales 4
Eukaryota;Opisthokonta;Nucletmycea;Fungi 4
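
The summary rule stated above the table can be sketched in three steps: truncate each semicolon-delimited taxonomy path to its first four ranks, sum the reads per truncated path, and keep only taxa with more than three reads. The Python below is a minimal illustration of that logic; the function names and the in-memory input format are assumptions, not phyloFlash's implementation. The input rows reuse taxonomy strings and read counts that appear in this report.

```python
from collections import Counter


def truncate_taxonomy(path: str, level: int = 4) -> str:
    """Keep the first `level` ranks of a semicolon-delimited taxonomy path."""
    return ";".join(path.split(";")[:level])


def summarize(hits, level=4, min_reads=3):
    """Sum reads per truncated taxon; keep taxa with more than `min_reads` reads."""
    counts = Counter()
    for path, reads in hits:
        counts[truncate_taxonomy(path, level)] += reads
    kept = {taxon: n for taxon, n in counts.items() if n > min_reads}
    # Sort by descending read count, as in the report tables.
    return sorted(kept.items(), key=lambda item: -item[1])


# Illustrative input: (taxonomy path, read count) pairs taken from this report.
hits = [
    ("Eukaryota;Opisthokonta;Holozoa;Metazoa;Animalia;Nematoda;Chromadorea;"
     "Rhabditidae;Caenorhabditis sp. JU727", 65695),
    ("Bacteria;Actinobacteria;Actinobacteria;Micrococcales;Microbacteriaceae;"
     "Microbacterium;Microbacterium sp. CME1", 1355),
    ("Bacteria;Planctomycetes;Phycisphaerae;S-70", 2),  # dropped: not > 3 reads
]

for taxon, reads in summarize(hits):
    print(taxon, reads)
```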

SSU rRNA assembly-based taxa

Full-length SSU rRNA sequences assembled by SPAdes, matched to SILVA database with Vsearch.

OTU | Reads mapped | Coverage | DB hit | Taxonomy | % identity | Alignment length | E-value
test.PFspades_3 | 65695 | 304.197 | EU196000.1.1674 | Eukaryota;Opisthokonta;Holozoa;Metazoa;Animalia;Nematoda;Chromadorea;Rhabditidae;Caenorhabditis sp. JU727 | 100.0 | 1494 | -1
test.PFspades_1 | 1355 | 4.08069 | AY547556.1.1532 | Bacteria;Actinobacteria;Actinobacteria;Micrococcales;Microbacteriaceae;Microbacterium;Microbacterium sp. CME1 | 99.9 | 1505 | -1
test.PFspades_2 | 582 | 1.86115 | DQ213024.1.1496 | Bacteria;Proteobacteria;Gammaproteobacteria;Xanthomonadales;Xanthomonadaceae;uncultured;Xanthomonas sp. B05-08.04.0214 | 100.0 | 1496 | -1

SSU rRNA reconstruction-based taxa

Full-length SSU rRNA sequences reconstructed by EMIRGE, matched to SILVA database with Vsearch.

OTU | Reads mapped | Ratio | DB hit | Taxonomy | % identity | Alignment length | E-value
test.PFemirge_0 | 75803 | 0.981378 | EU196000.1.1674 | Eukaryota;Opisthokonta;Holozoa;Metazoa;Animalia;Nematoda;Chromadorea;Rhabditidae;Caenorhabditis sp. JU727 | 99.6 | 1680 | -1
test.PFemirge_59 | 1188 | 0.018622 | AY547556.1.1532 | Bacteria;Actinobacteria;Actinobacteria;Micrococcales;Microbacteriaceae;Microbacterium;Microbacterium sp. CME1 | 99.7 | 1508 | -1

Taxonomic affiliation of unassembled SSU rRNA reads

Approximate overview of taxonomic composition for reads that did NOT assemble into full-length sequences, based on mapping hits to SILVA SSU rRNA database with BBmap.

Taxonomy summarized at level 4. Only displaying taxa with > 3 reads mapped.

Taxon Reads
Eukaryota;Opisthokonta;Holozoa;Metazoa 9005
Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales 651
Bacteria;Actinobacteria;Actinobacteria;Micrococcales 232
Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales 187
Bacteria;Firmicutes;Bacilli;Lactobacillales 181
Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales 142
Bacteria;Proteobacteria;Gammaproteobacteria;Xanthomonadales 88
Bacteria;Proteobacteria;Alphaproteobacteria;Rhodospirillales 70
Bacteria;Spirochaetae;Spirochaetes;W27 55
Bacteria;Firmicutes;Bacilli;Bacillales 26
Eukaryota;Amoebozoa;Dictyostelia;Dictyostelium 4
Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales 4
Eukaryota;Opisthokonta;Nucletmycea;Fungi 4

Interactive treemap of mapping-based taxonomic read classification

Based on read-mapping hits to reference database, provides an approximate overview of taxonomic composition.

Please cite...

Gruber-Vodicka H., Pruesse E., Seah B.K.B. 2017. phyloFlash v3.0 beta 1. Online: https://github.com/HRGV/phyloFlash.

Cite dependencies when used