Filter
Filters candidate variants called by ProDuSe based upon the following characteristics:
- Minimum molecule counts: The number of molecules supporting a variant at a given position must deviate from the number expected from simple noise in order to call a variant.
- Molecule type: Stronger (duplex, strong bases) molecules will be examined first for support (or lack of support) for a variant. If too many strong molecules do not support a variant when support is expected, the variant will be discarded.
- (If designated) Dual strand support: Both the positive and negative strand must support a variant confidently before it is called.
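The order in which these checks are applied can be illustrated with a minimal Python sketch. This is not ProDuSe's implementation: the `Candidate` fields, the `noise_rate` model, the `expect_strong_depth` cutoff, and all default values are assumptions made up for illustration, and "confident" strand support is simplified to at least one supporting molecule per strand.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-variant summary; field names are illustrative only."""
    strong_alt: int      # strong molecules supporting the variant
    strong_total: int    # all strong molecules covering the position
    weak_alt: int        # weak molecules supporting the variant
    weak_total: int      # all weak molecules covering the position
    pos_strand_alt: int  # supporting molecules on the positive strand
    neg_strand_alt: int  # supporting molecules on the negative strand


def passes_filter(c, noise_rate=0.001, strong_min=1, weak_min=2,
                  expect_strong_depth=20, require_dual_strand=False):
    """Return True if the candidate survives the (assumed) checks."""
    # Optional dual strand support: both strands must carry the variant.
    if require_dual_strand and (c.pos_strand_alt == 0 or c.neg_strand_alt == 0):
        return False

    # Strong molecules are examined first. The required count grows with
    # depth so that it stays above the count expected from noise alone.
    strong_needed = max(strong_min, math.ceil(c.strong_total * noise_rate))
    if c.strong_alt >= strong_needed:
        return True

    # Plenty of strong molecules, yet too few support the variant: discard.
    if c.strong_total >= expect_strong_depth:
        return False

    # Too few strong molecules to judge, so fall back to weak molecules.
    weak_needed = max(weak_min, math.ceil(c.weak_total * noise_rate))
    return c.weak_alt >= weak_needed
```

In this sketch, a variant with no strong support is still kept when strong coverage is too shallow to expect any, provided enough weak molecules support it.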
Parameters
- -c --config: A configuration file which can supply any of the parameters described below.
- -i --input: Input VCF file listing candidate variants and molecule counts.
- -m --molecule_stats: A tab-delimited file listing total depth and the abundance of each molecule type across the entire capture space. Generated by snv.py.
- -o --output: Path and name to use for the output VCF file.
- -sb --strand_bias_threshold: Strand bias p-value threshold. If the strand bias p-value of a variant is below this value, the variant will be discarded.
- -st --strong_base_threshold: Minimum number of strong bases required to call a variant. Note that this threshold will increase depending on locus and overall library characteristics for each molecule type.
- -wt --weak_base_threshold: Minimum number of weak bases required to call a variant. Note that this threshold will increase depending on locus and capture space characteristics for each molecule type.
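To show how the options fit together, the following sketch declares the same flags with Python's standard `argparse` module and parses an example command line. It is not ProDuSe's actual parser: the file names and the values 0.05, 1, and 2 are hypothetical placeholders, not documented defaults.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented flags (not ProDuSe's own)."""
    parser = argparse.ArgumentParser(description="Sketch of the filter options")
    parser.add_argument("-c", "--config", help="Configuration file supplying any option below")
    parser.add_argument("-i", "--input", help="Input VCF of candidate variants")
    parser.add_argument("-m", "--molecule_stats", help="Tab-delimited molecule statistics file")
    parser.add_argument("-o", "--output", help="Output VCF path")
    parser.add_argument("-sb", "--strand_bias_threshold", type=float,
                        help="Discard variants whose strand bias p-value is below this")
    parser.add_argument("-st", "--strong_base_threshold", type=int,
                        help="Minimum number of strong bases")
    parser.add_argument("-wt", "--weak_base_threshold", type=int,
                        help="Minimum number of weak bases")
    return parser


# Hypothetical invocation; every value here is a placeholder.
args = build_parser().parse_args([
    "-i", "candidates.vcf",
    "-m", "molecule_stats.tsv",
    "-o", "filtered.vcf",
    "-sb", "0.05",
    "-st", "1",
    "-wt", "2",
])
print(args.input, args.strand_bias_threshold, args.weak_base_threshold)
```

Options left off the command line (here `--config`) parse as `None`, which is one simple way a tool could decide to fall back to values from a configuration file.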
Additional Considerations
If the duplicate rate of a library is extremely low, variants will be called based upon weak bases. Thus, a large number of false positive calls may be generated.