Paralog diversity
Following Innan (Genetics 2003 163:803-810), these statistics are
specifically designed to compute nucleotide diversity for each paralog
in a multigene family (\(\pi_w\)), as well as between-paralog
divergence for all pairs (\(\pi_b\)). They are available from the
function stats.paralog_pi().
For a given paralog, we have:
\[\pi_w = \sum_{i=1}^{L} \frac{2}{n_i (n_i-1)} k_i\]
where \(L\) is the number of sites, \(n_i\) is the number of exploitable samples for this paralog at site \(i\), and \(k_i\) is the number of pairwise differences at this site.
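The within-paralog formula can be illustrated with a minimal pure-Python sketch. It is not the implementation behind stats.paralog_pi(). The function name `paralog_pi_within`, the input layout (one aligned string per sample), and the set of characters treated as non-exploitable are all assumptions made for illustration.

```python
from itertools import combinations

# Assumption: these characters mark non-exploitable (missing) data.
MISSING = {"-", "?", "N"}


def paralog_pi_within(sequences):
    """Sum over sites of 2 * k_i / (n_i * (n_i - 1)) for one paralog.

    `sequences` is a list of equal-length strings, one per sample
    (hypothetical input layout).
    """
    if not sequences:
        return 0.0
    total = 0.0
    for i in range(len(sequences[0])):
        # Keep only exploitable samples at this site.
        column = [seq[i] for seq in sequences if seq[i] not in MISSING]
        n_i = len(column)
        if n_i < 2:
            continue  # no pair can be compared at this site
        # k_i: number of sample pairs that differ at this site.
        k_i = sum(1 for a, b in combinations(column, 2) if a != b)
        total += 2.0 * k_i / (n_i * (n_i - 1))
    return total


# Three samples; the third has missing data at the third site.
pi_w = paralog_pi_within(["ACGT", "ACGA", "AC-A"])
```

Only the last site is variable here: two of the three pairs differ, so the site contributes \(2 \times 2 / (3 \times 2) = 2/3\).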
And for a given pair of paralogs:
\[\pi_b = \sum_{i=1}^{L} \frac{d_i}{n_{ai} n_{bi}}\]
where \(d_i\) is the number of differences between the two paralogs at site \(i\), and \(n_{ai}\) and \(n_{bi}\) are the respective numbers of exploitable samples for the two paralogs at this site.
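The between-paralog formula can be sketched in the same way. This is an illustration under stated assumptions, not the code used by stats.paralog_pi(): the function name `paralog_pi_between` and the missing-data characters are hypothetical, and \(d_i\) is assumed to count every differing comparison between a sample of the first paralog and a sample of the second, which is what the \(n_{ai} n_{bi}\) denominator suggests.

```python
# Assumption: these characters mark non-exploitable (missing) data.
MISSING = {"-", "?", "N"}


def paralog_pi_between(paralog_a, paralog_b):
    """Sum over sites of d_i / (n_ai * n_bi) for a pair of paralogs.

    Each argument is a list of equal-length strings, one per sample
    (hypothetical input layout); both paralogs share the same alignment.
    """
    if not paralog_a or not paralog_b:
        return 0.0
    total = 0.0
    for i in range(len(paralog_a[0])):
        # Exploitable samples for each paralog at this site.
        col_a = [seq[i] for seq in paralog_a if seq[i] not in MISSING]
        col_b = [seq[i] for seq in paralog_b if seq[i] not in MISSING]
        n_ai, n_bi = len(col_a), len(col_b)
        if n_ai == 0 or n_bi == 0:
            continue  # nothing to compare at this site
        # d_i: differing comparisons across the two paralogs (assumed definition).
        d_i = sum(1 for x in col_a for y in col_b if x != y)
        total += d_i / (n_ai * n_bi)
    return total


# Two samples per paralog; one sample of paralog b is missing at site 2.
pi_b = paralog_pi_between(["ACG", "ACT"], ["AGG", "A-G"])
print(pi_b)  # 1.5
```

In this example the second site contributes \(2 / (2 \times 1) = 1\) and the third contributes \(2 / (2 \times 2) = 0.5\).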