# Paralog diversity

Following Innan (Genetics 2003 163:803-810), these statistics compute the nucleotide diversity within each paralog of a multigene family ($$\pi_w$$), as well as the divergence between every pair of paralogs ($$\pi_b$$). They are available from the function stats.paralog_pi().

For a given paralog, we have:

$\pi_w = \sum_{i=1}^{L} \frac{2}{n_i (n_i-1)} k_i$

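with $$L$$ the number of sites, $$n_i$$ the number of exploitable samples for this paralog at site $$i$$, and $$k_i$$ the number of pairwise differences at this site. The factor $$\frac{2}{n_i (n_i-1)}$$ is the inverse of the number of sample pairs at site $$i$$, so each term of the sum is the average number of differences per pair at that site.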
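As an illustration of the formula for $$\pi_w$$, here is a minimal pure-Python sketch. It does not use or reproduce stats.paralog_pi(); the function name `pi_within`, the set of characters treated as missing data, and the choice to skip sites with fewer than two exploitable samples are all assumptions made for this example. Input sequences are assumed to be aligned and of equal length.

```python
from collections import Counter

# Assumption: these characters mark a non-exploitable sample at a site.
MISSING = frozenset("N?-")


def pi_within(seqs, missing=MISSING):
    """Sum over sites of 2 * k_i / (n_i * (n_i - 1)) for one paralog.

    seqs: aligned sequences (strings of equal length) sampled for one paralog.
    """
    total = 0.0
    for column in zip(*seqs):  # iterate over alignment columns (sites)
        alleles = [c for c in column if c not in missing]
        n = len(alleles)  # n_i: exploitable samples at this site
        if n < 2:
            continue  # assumption: a site with no possible pair is skipped
        counts = Counter(alleles)
        # k_i: number of sample pairs carrying different alleles
        k = (n * n - sum(c * c for c in counts.values())) // 2
        total += 2 * k / (n * (n - 1))
    return total


# Hypothetical toy alignment: sites 2 and 4 each contribute 2/3.
print(pi_within(["ACGT", "ACGA", "ATGA"]))  # about 1.333
```

Counting $$k_i$$ from allele frequencies avoids enumerating every pair of samples, which matters only for large samples but keeps the per-site cost linear in $$n_i$$.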

And for a given pair of paralogs:

$\pi_b = \sum_{i=1}^{L} \frac{d_i}{n_{ai} n_{bi}}$

with $$d_i$$ the number of differences between the two paralogs at site $$i$$, and $$n_{ai}$$ and $$n_{bi}$$ the respective numbers of exploitable samples for the two paralogs at this site.
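The formula for $$\pi_b$$ can be sketched in the same way. This example is not the implementation behind stats.paralog_pi(). It assumes that $$d_i$$ counts every pair made of one sample from each paralog that differs at site $$i$$, that sites where either paralog has no exploitable sample are skipped, and that both alignments cover the same sites. The name `pi_between` and the set of missing-data characters are hypothetical.

```python
from collections import Counter

# Assumption: these characters mark a non-exploitable sample at a site.
MISSING = frozenset("N?-")


def pi_between(seqs_a, seqs_b, missing=MISSING):
    """Sum over sites of d_i / (n_ai * n_bi) for a pair of paralogs.

    seqs_a, seqs_b: aligned sequences (same sites) for paralogs a and b.
    """
    total = 0.0
    for col_a, col_b in zip(zip(*seqs_a), zip(*seqs_b)):
        counts_a = Counter(c for c in col_a if c not in missing)
        counts_b = Counter(c for c in col_b if c not in missing)
        n_a = sum(counts_a.values())  # n_ai
        n_b = sum(counts_b.values())  # n_bi
        if n_a == 0 or n_b == 0:
            continue  # assumption: no cross-paralog pair exists, skip site
        # Pairs (one sample from each paralog) that carry the same allele
        same = sum(counts_a[x] * counts_b[x] for x in counts_a)
        d = n_a * n_b - same  # d_i: cross-paralog pairs that differ
        total += d / (n_a * n_b)
    return total


# Hypothetical toy data: site 2 contributes 1, site 4 contributes 0.5.
a = ["ACGT", "ACGA"]
b = ["ATGA", "ATGA"]
print(pi_between(a, b))  # 1.5
```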