Filter

Filters candidate variants called by ProDuSe based upon the following characteristics:

  • Minimum molecule counts: For a variant to be called, the number of molecules supporting it at a given position must deviate from the number expected from simple noise.
  • Molecule type: Stronger (duplex, strong bases) molecules will be examined first for support (or lack of support) for a variant. If too many strong molecules do not support a variant when support is expected, the variant will be discarded.
  • (If designated) Dual strand support: Both the positive and negative strands must confidently support a variant before it is called.
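
The criteria above can be sketched in a few lines of Python. This is an illustrative approximation only, not ProDuSe's actual implementation: the Candidate fields, the noise_rate parameter, the default thresholds, and the rule used to raise a threshold above the expected noise are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-variant counts; field names are illustrative only."""
    strong_alt: int    # strong molecules supporting the variant
    strong_depth: int  # strong molecules covering the position
    weak_alt: int      # weak molecules supporting the variant
    weak_depth: int    # weak molecules covering the position
    plus_alt: int      # supporting molecules on the positive strand
    minus_alt: int     # supporting molecules on the negative strand


def min_required(base_threshold: int, depth: int, noise_rate: float) -> int:
    """Raise the base threshold when noise alone would already reach it."""
    expected_noise = depth * noise_rate
    return max(base_threshold, math.floor(expected_noise) + 1)


def keep_variant(c: Candidate, strong_threshold: int = 1, weak_threshold: int = 2,
                 noise_rate: float = 0.001, require_both_strands: bool = False) -> bool:
    """Return True if the candidate passes this simplified filter."""
    # Optional dual strand support: both strands must carry the variant.
    if require_both_strands and (c.plus_alt == 0 or c.minus_alt == 0):
        return False
    # Strong molecules are examined first; if they cover the position but
    # do not support the variant often enough, the variant is discarded.
    if c.strong_depth > 0:
        return c.strong_alt >= min_required(strong_threshold, c.strong_depth, noise_rate)
    # With no strong molecules at all, fall back to weak molecules.
    return c.weak_alt >= min_required(weak_threshold, c.weak_depth, noise_rate)
```

In this sketch, a position covered by 5500 weak molecules with an assumed noise rate of 0.001 needs at least 6 supporting weak molecules, even though the base weak threshold is 2.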

Run Using

produse filter

or

python /path/to/ProDuSe/ProDuSe/filter_produse.py

Parameters

-c --config:
 A configuration file which can supply any of the parameters described below.
-i --input:
 Input VCF file, listing candidate variants and molecule counts.
-m --molecule_stats:
 A tab-delimited file listing total depth and different molecule abundances across the entire capture space. Generated by snv.py.
-o --output:
 Path and name to use for the output VCF file.
-sb --strand_bias_threshold:
 Strand bias p-value threshold. If the strand bias p-value of a variant is below this value, the variant will be discarded.
-st --strong_base_threshold:
 Minimum number of strong bases required to call a variant. Note that this threshold will increase depending on locus and overall library characteristics for each molecule type.
-wt --weak_base_threshold:
 Minimum number of weak bases required to call a variant. Note that this threshold will increase depending on locus and capture space characteristics for each molecule type.
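
As an illustration of how these parameters fit together, the following Python sketch assembles a produse filter command line. It assumes the long options are spelled with two leading hyphens; the helper name build_filter_command, the file names, and the threshold value are hypothetical and chosen only for the example.

```python
from typing import List, Optional


def build_filter_command(
    input_vcf: str,
    molecule_stats: str,
    output_vcf: str,
    strand_bias_threshold: Optional[float] = None,
    strong_base_threshold: Optional[int] = None,
    weak_base_threshold: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list for `produse filter` (hypothetical helper)."""
    cmd = [
        "produse", "filter",
        "--input", input_vcf,
        "--molecule_stats", molecule_stats,
        "--output", output_vcf,
    ]
    # Thresholds are only added when a value is supplied explicitly.
    optional = {
        "--strand_bias_threshold": strand_bias_threshold,
        "--strong_base_threshold": strong_base_threshold,
        "--weak_base_threshold": weak_base_threshold,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# Placeholder file names and threshold, for illustration only.
cmd = build_filter_command(
    "candidates.vcf", "molecule_stats.tsv", "filtered.vcf",
    strand_bias_threshold=0.05,
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a system where ProDuSe is installed.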

Additional Considerations

If the duplicate rate of a library is extremely low, variants will be called based upon weak bases, which may generate a large number of false-positive calls.