Output Files & Sharing Data

Output files

Nomadic produces several files that describe the quality of your sequencing run and the variants that were detected. They can be found in the results directory (results/<expt_name>) and are described briefly below:

summary.read_mapping.csv

The summary.read_mapping.csv file contains information about read mapping for each sample.

Each row corresponds to a sample, which can be identified by its barcode (e.g. barcode01). The values come from samtools flagstat and are counts of read-mapping flags in the sample's BAM file. Together they summarise how well your sequencing reads mapped to your reference genome.

Column Description
barcode Sample barcode.
sample_id Sample ID.
n_total Total number of read alignments.
n_mapped Number of mapped read alignments.
n_unmapped Number of unmapped reads.
n_primary Number of reads mapping uniquely.
n_secondary Number of reads mapping to more than one location.
n_supplementary Number of reads mapping as chimeras.

For more information about read mapping, please see Understanding the Dashboard.
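
As a rough illustration of how these counts can be used, the sketch below computes the percentage of alignments that mapped for each barcode. It assumes the file is comma-delimited, has a header row matching the column names above, and stores the counts as integers; the function name mapping_rates and the example rows are made up for illustration and are not part of Nomadic.

```python
import csv
import io


def mapping_rates(csv_file):
    """Return {barcode: percentage of alignments that mapped}."""
    rates = {}
    for row in csv.DictReader(csv_file):
        n_total = int(row["n_total"])
        n_mapped = int(row["n_mapped"])
        # A barcode with no reads would otherwise cause a division by zero.
        rates[row["barcode"]] = 100 * n_mapped / n_total if n_total else 0.0
    return rates


# Made-up rows for illustration only; real files live in results/<expt_name>.
example = io.StringIO(
    "barcode,sample_id,n_total,n_mapped,n_unmapped,"
    "n_primary,n_secondary,n_supplementary\n"
    "barcode01,S1,1000,950,50,900,30,20\n"
    "barcode02,S2,0,0,0,0,0,0\n"
)
for barcode, percent in mapping_rates(example).items():
    print(f"{barcode}: {percent:.1f}% mapped")
```

To run this on a real experiment, pass an open file handle instead of the example, e.g. open("results/<expt_name>/summary.read_mapping.csv", newline="").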

summary.region_coverage.csv

The summary.region_coverage.csv file contains information about sequencing coverage over each amplicon in each sample.

Each row contains information about coverage over a specific amplicon in a specific sample. The sample is indicated by its barcode (e.g. barcode01) and the amplicon is indicated by its name (as well as position and length). The values provided come from samtools bedcov.

Column Description
barcode Sample barcode.
sample_id Sample ID.
chrom Chromosome of amplicon.
start Start position of amplicon.
end End position of amplicon.
length Length of amplicon.
name Name of amplicon. This comes from the fourth column of the BED file used when running nomadic realtime (e.g. passed with the -b flag).
n_reads Number of reads mapping to the amplicon.
mean_cov Mean coverage over the amplicon.
cov_gr100 Number of positions within the amplicon having >100x coverage.
per_cov_gr100 Percentage of positions within the amplicon having >100x coverage.
total_cov Total coverage over the amplicon, i.e. mean coverage multiplied by length.

For more information about region coverage, please see Understanding the Dashboard.
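
One possible use of this file is to list the amplicons that fall short of a coverage target in each sample. The sketch below assumes a comma-delimited file with the column names above and assumes per_cov_gr100 is expressed on a 0 to 100 scale; the function name low_coverage_amplicons, the 90% default threshold, and the example rows are illustrative choices, not Nomadic defaults.

```python
import csv
import io


def low_coverage_amplicons(csv_file, min_percent=90.0):
    """Return (barcode, name, per_cov_gr100) for amplicons below min_percent."""
    flagged = []
    for row in csv.DictReader(csv_file):
        percent = float(row["per_cov_gr100"])
        if percent < min_percent:
            flagged.append((row["barcode"], row["name"], percent))
    return flagged


# Made-up rows for illustration only (total_cov = mean_cov * length).
example = io.StringIO(
    "barcode,sample_id,chrom,start,end,length,name,n_reads,"
    "mean_cov,cov_gr100,per_cov_gr100,total_cov\n"
    "barcode01,S1,chr1,1000,2000,1000,ampA,300,250.0,980,98.0,250000\n"
    "barcode01,S1,chr2,5000,5800,800,ampB,70,60.0,120,15.0,48000\n"
)
for barcode, name, percent in low_coverage_amplicons(example):
    print(f"{barcode} {name}: only {percent:.1f}% of positions above 100x")
```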

summary.variants.csv

The summary.variants.csv file contains preliminary information about the variants identified in each sample.

Each row contains information about the genotype, depth, quality, and within-sample allele frequency (WSAF) of a specific single-nucleotide polymorphism (SNP) in a specific sample. The same set of SNPs is described for every sample: all SNPs where at least one sample carried the alternative allele. Note that this file is only generated when the nomadic realtime ... --call flag is used, which is the default.

Column Description
barcode Sample barcode.
sample_id Sample ID.
chrom Chromosome of SNP.
pos Position of SNP.
ref Reference nucleotide for SNP.
alt Alternative nucleotide for SNP.
qual Variant quality score of SNP.
mut_type Type of mutation caused by SNP, e.g. synonymous or non-synonymous.
aa_change Amino acid change caused by SNP. For synonymous mutations, the unchanged amino acid is still reported (e.g. V380V).
aa_pos Amino acid number containing the SNP.
strand Strand of gene containing the SNP.
amplicon Name of amplicon containing SNP. This comes from the fourth column of the BED file used when running nomadic realtime (e.g. passed with the -b flag).
gt Called SNP genotype for the sample. Can be reference (0/0), heterozygous (0/1), homozygous (1/1) or failed QC (./.). Note: these are from bcftools call and assume a diploid genome.
gq SNP genotype quality for the sample. Note that this differs from variant quality (qual): it refers to the quality of the genotype call, rather than whether or not the site is variable.
dp Sequencing depth over the SNP. Equivalent to coverage.
wsaf Within-sample alternative allele frequency (ad_alt / (ad_alt + ad_ref)), where ad_ref and ad_alt are the depths of the reference and alternative allele.

For more information about variant calling, please see Understanding the Dashboard.
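
The WSAF formula above can be written out directly, and the wsaf column can then be used to pick out calls with an intermediate allele frequency. This is a minimal sketch, assuming a comma-delimited file with the column names above and assuming that a missing WSAF appears as an empty field; the function names, the 0.1 and 0.9 bounds, and the example rows are illustrative only.

```python
import csv
import io


def wsaf(ad_ref, ad_alt):
    """Within-sample alternative allele frequency from the two allele depths."""
    total = ad_ref + ad_alt
    # With no reads supporting either allele, the frequency is undefined.
    return ad_alt / total if total else None


def intermediate_calls(csv_file, low=0.1, high=0.9):
    """Return rows that passed QC and have low < wsaf < high."""
    kept = []
    for row in csv.DictReader(csv_file):
        # Skip genotypes that failed QC and rows without a WSAF value.
        if row["gt"] == "./." or row["wsaf"] == "":
            continue
        if low < float(row["wsaf"]) < high:
            kept.append(row)
    return kept


# Made-up rows for illustration only, using a subset of the columns.
example = io.StringIO(
    "barcode,chrom,pos,gt,dp,wsaf\n"
    "barcode01,chr1,1500,0/1,200,0.45\n"
    "barcode02,chr1,1500,1/1,180,0.99\n"
    "barcode03,chr1,1500,./.,5,0.50\n"
)
print(wsaf(ad_ref=80, ad_alt=20))
for row in intermediate_calls(example):
    print(row["barcode"], row["chrom"], row["pos"], row["wsaf"])
```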

Sharing data

The results folder for an experiment (results/<expt_name>) contains all the outputs from Nomadic.

Only the CSV files whose names start with summary and the metadata folder are required to relaunch the dashboard with the nomadic dashboard command (see Basic Usage).
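
If you only want to share what the dashboard needs, those files can be copied into a separate folder. The sketch below is one way to do this with the Python standard library; it assumes the metadata folder is literally named metadata inside results/<expt_name> and that the copied files should keep the same layout. The function name copy_dashboard_inputs is hypothetical and not part of Nomadic.

```python
import shutil
from pathlib import Path


def copy_dashboard_inputs(expt_dir, dest_dir):
    """Copy the summary CSV files and the metadata folder into dest_dir."""
    expt_dir, dest_dir = Path(expt_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # CSV files whose names start with "summary".
    for csv_path in sorted(expt_dir.glob("summary*.csv")):
        shutil.copy2(csv_path, dest_dir / csv_path.name)
        copied.append(csv_path.name)
    # Assumed folder name; adjust if your results directory differs.
    metadata = expt_dir / "metadata"
    if metadata.is_dir():
        shutil.copytree(metadata, dest_dir / "metadata", dirs_exist_ok=True)
        copied.append("metadata")
    return copied
```

For example, copy_dashboard_inputs("results/<expt_name>", "to_share/<expt_name>") would return the names of the items it copied, leaving larger files in the results folder untouched.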