1) Setup inputs
All mess commands take one or more tables as input.
In this guide we use mess test to run all the workflow steps on a minimal example (shown below).
The input file contains:
- taxon/accession and nb columns for genome download
- a third column used to calculate genome coverage for read simulation
- a sample column to map each genome to its sample
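The column layout described above can be checked before running a workflow. The following is a minimal sketch, not part of mess itself: the coverage column names are taken from the example tables in this section, while the `accession` header spelling, the rule of exactly one coverage column, and the `check_header` helper are assumptions made for illustration.

```python
# Coverage column names as they appear in the example tables of this guide.
COVERAGE_COLUMNS = {"bases", "reads", "tax_abundance", "seq_abundance", "cov_sim"}
# "accession" is an assumed header spelling for accession-based tables.
ID_COLUMNS = {"taxon", "accession"}


def check_header(header):
    """Return a list of problems found in a table header (empty if none)."""
    columns = {name.strip() for name in header}
    problems = []
    if not ID_COLUMNS & columns:
        problems.append("missing taxon/accession column")
    if "nb" not in columns:
        problems.append("missing nb column")
    found = COVERAGE_COLUMNS & columns
    # Assumption: a table carries exactly one coverage-related column.
    if len(found) != 1:
        problems.append(f"expected one coverage column, found {sorted(found)}")
    return problems


# A header with an optional sample column, then one lacking a coverage column.
print(check_header(["taxon", "nb", "cov_sim", "sample"]))
print(check_header(["taxon", "nb"]))
```

The helper only inspects the header row; it does not validate taxon names, accessions, or numeric values.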
Other examples
Table
See coverage calculation for details on how the third column is used.
Example
With a bases column:

taxon | nb | bases |
---|---|---|
1280 | 1 | 28213610 |
pseudomonas_aeruginosa | 1 | 62644040 |

With a reads column:

taxon | nb | reads |
---|---|---|
1280 | 1 | 94045 |
pseudomonas_aeruginosa | 1 | 208813 |

With a tax_abundance column:

taxon | nb | tax_abundance |
---|---|---|
1280 | 1 | 0.5 |
pseudomonas_aeruginosa | 1 | 0.5 |

With a seq_abundance column:

taxon | nb | seq_abundance |
---|---|---|
1280 | 1 | 0.32 |
pseudomonas_aeruginosa | 1 | 0.68 |

With a cov_sim column:

taxon | nb | cov_sim |
---|---|---|
1280 | 1 | 10 |
pseudomonas_aeruginosa | 1 | 10 |
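The bases, reads, and cov_sim examples appear to describe the same simulation in different units. The sketch below shows arithmetic that is consistent with those numbers, assuming 10x coverage, paired 150 bp reads, and a reads column that counts read pairs. The genome sizes are inferred from the example tables (bases divided by 10) rather than taken from the tool, and the function names are hypothetical. How mess rounds or truncates is not stated here.

```python
def bases_for_coverage(genome_size, coverage):
    """Total bases needed to cover a genome at the given depth."""
    return genome_size * coverage


def reads_for_bases(bases, read_length=150, paired=True):
    """Number of reads (read pairs if paired) that yield the given bases."""
    bases_per_fragment = read_length * (2 if paired else 1)
    return bases // bases_per_fragment  # truncation is an assumption


# Genome sizes inferred from the example tables, not reported by the tool.
genome_sizes = {"1280": 2_821_361, "pseudomonas_aeruginosa": 6_264_404}

for taxon, size in genome_sizes.items():
    bases = bases_for_coverage(size, coverage=10)
    print(taxon, bases, reads_for_bases(bases))
```

Under these assumptions the printed values match the bases and reads columns of the example tables.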
Directory
To simulate multiple samples at once, point to a directory containing one table per sample, with the sample name in the file name.
Alternatively, aggregate all samples in a single table with a sample column.
Example
taxon | nb | cov_sim | sample |
---|---|---|---|
staphylococcus_aureus | 1 | 10 | sample1 |
1290 | 1 | 10 | sample2 |
562 | 1 | 10 | sample3 |
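The two layouts described in this section (one table per sample in a directory, or a single table with a sample column) can be converted into each other. The following sketch merges per-sample tables into one aggregated table. It assumes tab-separated files with a `.tsv` extension whose file stem is the sample name. The `combine_sample_tables` helper is hypothetical and not part of mess.

```python
import csv
from pathlib import Path


def combine_sample_tables(directory, output_path):
    """Merge per-sample TSV tables into one table with a sample column."""
    rows = []
    fieldnames = []
    for table in sorted(Path(directory).glob("*.tsv")):
        with table.open(newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            # Keep columns in first-seen order across all tables.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                row["sample"] = table.stem  # sample name from the file name
                rows.append(row)
    if "sample" not in fieldnames:
        fieldnames.append("sample")
    with open(output_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=fieldnames,
            delimiter="\t",
            restval="",
            extrasaction="ignore",
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling `combine_sample_tables("samples", "combined.tsv")` on a directory holding `sample1.tsv` and `sample2.tsv` would write a single table whose last column is the sample name. Write the output outside the input directory so it is not picked up as a sample on a later run.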