
1) Setup inputs

All mess commands take one or more tables as input.

This guide uses mess test to run every workflow step on a minimal example (shown below).

minimal_test.tsv
taxon   nb  cov_sim sample
staphylococcus_aureus   1   0.1 sample1
1290    1   0.1 sample2

The input file contains:

  • taxon (taxon or accession) and nb columns for genome download
  • A third column (here cov_sim) used to calculate genome coverage for read simulation
  • A sample column that maps each genome to its sample
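
As a minimal sketch (not part of mess itself), the Python snippet below parses the table above and prints one line per genome. It assumes the file is tab-separated with a header row and that nb is an integer count; the helper name read_input_table is hypothetical.

```python
import csv
import io

# Contents of minimal_test.tsv from above, assumed to be tab-separated.
MINIMAL_TSV = (
    "taxon\tnb\tcov_sim\tsample\n"
    "staphylococcus_aureus\t1\t0.1\tsample1\n"
    "1290\t1\t0.1\tsample2\n"
)


def read_input_table(text):
    """Parse a tab-separated input table into a list of row dicts (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    for row in rows:
        # Assumption: nb is an integer count, so convert it from text.
        row["nb"] = int(row["nb"])
    return rows


rows = read_input_table(MINIMAL_TSV)
for row in rows:
    print(row["sample"], row["taxon"], row["nb"], row["cov_sim"])
```

The taxon column is kept as text because it can hold either a name (staphylococcus_aureus) or a numeric identifier (1290).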

Other examples

Table

See coverage calculation for more details

Example

taxon nb bases
1280 1 28213610
pseudomonas_aeruginosa 1 62644040

taxon nb reads
1280 1 94045
pseudomonas_aeruginosa 1 208813

taxon nb tax_abundance
1280 1 0.5
pseudomonas_aeruginosa 1 0.5

taxon nb seq_abundance
1280 1 0.32
pseudomonas_aeruginosa 1 0.68

taxon nb cov_sim
1280 1 10
pseudomonas_aeruginosa 1 10
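
The example tables above differ only in the name of their third column. As an illustrative sketch, the snippet below reads a header line and reports which of those column names it uses. The list of names is taken from the examples on this page and may not be exhaustive; the function third_column_kind is hypothetical and is not a mess API.

```python
# Third-column names that appear in the example tables on this page.
KNOWN_THIRD_COLUMNS = ("bases", "reads", "tax_abundance", "seq_abundance", "cov_sim")


def third_column_kind(header_line):
    """Return the third column name if it is one of the names shown in the examples."""
    # str.split() with no argument splits on any whitespace, so it handles tabs and spaces.
    columns = header_line.split()
    if len(columns) < 3:
        raise ValueError(f"expected at least 3 columns, got {columns}")
    name = columns[2]
    if name not in KNOWN_THIRD_COLUMNS:
        raise ValueError(f"unrecognized third column: {name}")
    return name


print(third_column_kind("taxon nb bases"))
print(third_column_kind("taxon\tnb\tcov_sim\tsample"))
```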

Directory

If you want to simulate multiple samples at once, you can point to a directory with multiple tables (one for each sample, with the sample name in the file name).

Example

📂sequencing_run
┣ 📜sample1.tsv
┣ 📜sample2.tsv
┗ 📜sample3.tsv
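
To illustrate the layout, the sketch below recreates this directory in a temporary location and lists the tables it contains. It assumes the sample name is the file name without its .tsv extension, which is a guess at the convention rather than confirmed mess behavior; list_sample_tables is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def list_sample_tables(directory):
    """Map sample name -> table path, using the file stem as the sample name (assumption)."""
    return {path.stem: path for path in sorted(Path(directory).glob("*.tsv"))}


# Recreate the example layout in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "sequencing_run"
    run_dir.mkdir()
    for name in ("sample1", "sample2", "sample3"):
        # Each file only gets a header row here; real tables would list genomes.
        (run_dir / f"{name}.tsv").write_text("taxon\tnb\tcov_sim\n")
    samples = list_sample_tables(run_dir)
    print(list(samples))
```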

Alternatively, you can aggregate all sample information in a single table with a sample column.

Example

taxon nb cov_sim sample
staphylococcus_aureus 1 10 sample1
1290 1 10 sample2
562 1 10 sample3
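
The aggregated table carries the same information as one table per sample. As a rough sketch, the snippet below groups the rows of the table above by their sample column. It assumes tab-separated input, and group_by_sample is a hypothetical helper rather than part of mess.

```python
import csv
import io
from collections import defaultdict

# The aggregated example table from above, assumed to be tab-separated.
AGGREGATED_TSV = (
    "taxon\tnb\tcov_sim\tsample\n"
    "staphylococcus_aureus\t1\t10\tsample1\n"
    "1290\t1\t10\tsample2\n"
    "562\t1\t10\tsample3\n"
)


def group_by_sample(text):
    """Group table rows by the sample column, dropping that column from each row."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        sample = row.pop("sample")
        grouped[sample].append(row)
    return dict(grouped)


by_sample = group_by_sample(AGGREGATED_TSV)
for sample, sample_rows in by_sample.items():
    print(sample, [r["taxon"] for r in sample_rows])
```

Each value in by_sample has the same columns as a per-sample table in the directory layout (taxon, nb, cov_sim).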