Skip to content

Species composition

5 samples from the human microbiome project were were classified with kraken2 and bracken. Taxa with at least at 200 reads were kept and used as input to both MeSS and CAMISIM.

Use this nextflow pipeline to generate the fastqs.

Results

microViz was used for the ordination plots and statistical tests.

Bray-curtis

bray

Samples from the same bodysite cluster together. In addition, simulated samples cluster well with real samples (gold_standard and gs_filtered).

PERMANOVA

Null hypothesis : No significant difference in species composition between simulated and non simulated samples

Code
perm <- dist_permanova(mdist,
    variables = "origin:simulated+body_site",
    n_perms = 999, 
    n_processes = 3
)
                 Df SumOfSqs      R2       F Pr(>F)
body_site         3   12.153 0.37843 15.6933  0.001 ***
origin:simulated  3    1.117 0.03479  1.4429  0.067 .
Residual         73   18.844 0.58678
Total            79   32.115 1.00000

Significant difference between body sites. No significant difference between simulated and real samples

Beta dispersion

Null hypothesis : No significant difference in dispersion between samples of different origin

Fit: aov(formula = distances ~ group, data = df)

$group
                                   diff         lwr        upr     p adj
gs_filtered-gold_standard  2.249163e-03 -0.03593552 0.04043384 0.9986690
camisim-gold_standard     -2.310968e-02 -0.06129435 0.01507500 0.3905351
mess-gold_standard        -2.308946e-02 -0.06127414 0.01509522 0.3913195
camisim-gs_filtered       -2.535884e-02 -0.06354352 0.01282584 0.3082419
mess-gs_filtered          -2.533862e-02 -0.06352330 0.01284606 0.3089344
mess-camisim               2.021632e-05 -0.03816446 0.03820490 1.0000000

No significant difference between filtered and non-filtered samples, simulated and real samples.

Conclusions

  • Same species composition between original and filtered samples
  • Same species composition between MeSS and CAMISIM