1) Setup inputs

All mess commands take one or more tables as input.

In this guide, we use mess test to run every workflow step on the minimal example shown below.

minimal_test.tsv

taxon	nb	cov_sim	sample
staphylococcus_aureus	1	0.1	sample1
1290	1	0.1	sample2

The input file contains:

taxon (or accession) and nb columns, used for genome download.
A third column (cov_sim in this example), used to calculate genome coverage for read simulation.
A sample column, which maps each genome to its sample.
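
Before running the workflow, a table like the one above can be sanity-checked with a few lines of Python. This is a minimal sketch and not part of mess: the function name check_input_table, the set of required columns, and the requirement that nb is an integer are assumptions based only on the example shown here.

```python
import csv
import io

# Assumption: every table needs these two columns (taken from the example above).
REQUIRED_COLUMNS = {"taxon", "nb"}


def check_input_table(tsv_text):
    """Parse a tab-separated table and return its rows as dictionaries.

    Raises ValueError if a required column is missing, a taxon is empty,
    or nb is not an integer.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - header
    if missing:
        raise ValueError(f"missing column(s): {sorted(missing)}")
    rows = list(reader)
    for line_number, row in enumerate(rows, start=2):
        if not row["taxon"]:
            raise ValueError(f"line {line_number}: empty taxon")
        int(row["nb"] or "")  # raises ValueError if nb is missing or not an integer
    return rows


# Content of minimal_test.tsv, written with explicit tab separators.
minimal_test = (
    "taxon\tnb\tcov_sim\tsample\n"
    "staphylococcus_aureus\t1\t0.1\tsample1\n"
    "1290\t1\t0.1\tsample2\n"
)
rows = check_input_table(minimal_test)
print([row["sample"] for row in rows])
```

The check only looks at the table structure. It does not verify that a taxon name or identifier actually exists.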

Other examples

Table

See coverage calculation for more details.

Example

The tables below show each supported third column, in order: bases, reads, tax_abundance, seq_abundance, coverage (cov_sim).

taxon	nb	bases
1280	1	28213610
pseudomonas_aeruginosa	1	62644040

taxon	nb	reads
1280	1	94045
pseudomonas_aeruginosa	1	208813

taxon	nb	tax_abundance
1280	1	0.5
pseudomonas_aeruginosa	1	0.5

taxon	nb	seq_abundance
1280	1	0.32
pseudomonas_aeruginosa	1	0.68

taxon	nb	cov_sim
1280	1	10
pseudomonas_aeruginosa	1	10
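
The bases and reads examples above appear to describe the same simulation in two units: each reads value equals the corresponding bases value divided by 300 and rounded down. The sketch below checks that arithmetic. The factor of 300 bases per read (for instance, a 150 bp read pair) is inferred from the example numbers only; it is an assumption, not a documented mess default, and reads_from_bases is a hypothetical helper.

```python
# Values copied from the bases and reads example tables above.
bases = {"1280": 28213610, "pseudomonas_aeruginosa": 62644040}
reads = {"1280": 94045, "pseudomonas_aeruginosa": 208813}

# Assumption inferred from the example numbers, not a documented default.
BASES_PER_READ = 300


def reads_from_bases(total_bases, bases_per_read=BASES_PER_READ):
    """Return how many whole reads fit in the requested number of bases."""
    return total_bases // bases_per_read


for taxon, total_bases in bases.items():
    # Print the computed read count next to the value from the reads table.
    print(taxon, reads_from_bases(total_bases), reads[taxon])
```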

Directory

If you want to simulate multiple samples at once, you can point to a directory with multiple tables (one for each sample, with the sample name in the file name).

Example

📂sequencing_run
┣ 📜sample1.tsv
┣ 📜sample2.tsv
┗ 📜sample3.tsv
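
To preview which samples a directory like this would yield, the file names can be listed with Python. This is a rough sketch under the assumption that the sample name is the file name without its .tsv extension; sample_tables is a hypothetical helper, and the directory is created in a temporary location purely for illustration.

```python
import tempfile
from pathlib import Path


def sample_tables(directory):
    """Map each sample name (file name without .tsv) to its table path."""
    return {path.stem: path for path in sorted(Path(directory).glob("*.tsv"))}


# Build a throwaway copy of the example layout and list its samples.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "sequencing_run"
    run_dir.mkdir()
    for name in ("sample1", "sample2", "sample3"):
        (run_dir / f"{name}.tsv").write_text("taxon\tnb\tcov_sim\n1280\t1\t10\n")
    print(list(sample_tables(run_dir)))
```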

Alternatively, you can aggregate all samples in a single table that includes a sample column.

Example

taxon	nb	cov_sim	sample
staphylococcus_aureus	1	10	sample1
1290	1	10	sample2
562	1	10	sample3
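
The sketch below illustrates how the sample column assigns each row of an aggregated table to a sample. It is only an illustration of the table layout, not how mess reads its input; taxa_by_sample is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict


def taxa_by_sample(tsv_text):
    """Group the taxon values of a tab-separated table by its sample column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["sample"]].append(row["taxon"])
    return dict(grouped)


# The aggregated example table, written with explicit tab separators.
aggregated = (
    "taxon\tnb\tcov_sim\tsample\n"
    "staphylococcus_aureus\t1\t10\tsample1\n"
    "1290\t1\t10\tsample2\n"
    "562\t1\t10\tsample3\n"
)
print(taxa_by_sample(aggregated))
```

Each key of the resulting dictionary corresponds to what would otherwise be a separate per-sample file.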