Compute various statistics describing sequence polymorphism. Two aligned sets of “species” are compared, and the numbers of polymorphic and fixed sites are computed:
F: Number of sites fixed in the two sets
FF: Number of sites fixed in the two sets, yet with a distinct state
P: Number of sites that are polymorphic in the two sets
FP: Number of sites that are fixed in set 1 but polymorphic in set 2
PF: Number of sites that are fixed in set 2 but polymorphic in set 1
Positions containing a gap or an unresolved character in one set are considered ambiguous. Such positions are counted separately in the following quantities:
X: Number of sites unresolved in the two sets
FX: Number of sites fixed in set 1 and unresolved in set 2
XF: Number of sites fixed in set 2 and unresolved in set 1
PX: Number of sites polymorphic in set 1 and unresolved in set 2
XP: Number of sites polymorphic in set 2 and unresolved in set 1
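The categories above can be illustrated with a minimal Python sketch. It is not the program's implementation: the function names `set_state` and `polymorphism_counts` are hypothetical, the alphabet is assumed to be unambiguous nucleotides (A, C, G, T), and F is assumed to count only sites fixed for the same state in both sets, with FF counting sites fixed for different states.

```python
from collections import Counter

# Assumption: only A, C, G and T count as resolved; gaps, N and any
# other symbol make the position unresolved for that set.
RESOLVED = set("ACGT")

CATEGORIES = ("F", "FF", "P", "FP", "PF", "X", "FX", "XF", "PX", "XP")


def set_state(column):
    """Classify one alignment column restricted to one set of sequences.

    Returns ("X", None) if any character is a gap or unresolved,
    ("F", state) if all sequences share a single state,
    ("P", None) if at least two states are present.
    """
    chars = [c.upper() for c in column]
    if any(c not in RESOLVED for c in chars):
        return "X", None
    if len(set(chars)) == 1:
        return "F", chars[0]
    return "P", None


def polymorphism_counts(set1, set2):
    """Count site categories for two non-empty lists of aligned sequences.

    All sequences are expected to have the same length.
    """
    counts = Counter({key: 0 for key in CATEGORIES})
    # zip(*seqs) iterates over alignment columns.
    for col1, col2 in zip(zip(*set1), zip(*set2)):
        state1, allele1 = set_state(col1)
        state2, allele2 = set_state(col2)
        if state1 == "F" and state2 == "F":
            # Assumption: F = same fixed state, FF = distinct fixed states.
            key = "F" if allele1 == allele2 else "FF"
        elif state1 == state2:
            key = state1            # "P" or "X" in both sets
        else:
            key = state1 + state2   # first letter refers to set 1
        counts[key] += 1
    return dict(counts)


if __name__ == "__main__":
    set1 = ["ACGTA-", "ACGAAC"]
    set2 = ["ACTTCA", "AGTTCA"]
    print(polymorphism_counts(set1, set2))
```

In this toy alignment, the last column contains a gap in set 1 and is fixed in set 2, so it falls in the XF category rather than in any of the resolved ones.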
maf.filter=                                 \
    [...],                                  \
    SequenceStatistics(                     \
        statistics=(                        \
            [...],                          \
            SiteFrequencySpectrum(          \
                bounds=(-0.5, 0.5, 1.5),    \
                ingroup=(pop1, pop2, pop3), \
                outgroup=species2,          \
            [...]),                         \
        ref_species=pop1,                   \
        file=data.statistics.csv),          \
    [...]
species1={list}: A list of species for set 1.
species2={list}: A list of species for set 2.
Note that the “species” terminology relates to multispecies alignments, as originally implemented in the MultiZ aligner. These statistics will, however, be most relevant when the aligned sequences actually come from individuals of the same population or species. The term “species” is therefore to be taken here in a technical sense (a sequence id in the alignment), not a biological one.