GenWin

GenWin is an R package that defines window (bin) boundaries for the analysis of genomic data. Boundaries are placed at the inflection points of a cubic smoothing spline fitted to the raw data. Along with defining boundaries, the package provides a technique for evaluating results obtained from unequally sized windows. Applications are particularly pertinent for, though not limited to, genome scans for selection based on variability between populations (e.g. using Wright’s fixation index, Fst, which measures variability in subpopulations relative to the total population).

GenWin is available on CRAN, the Comprehensive R Archive Network.
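
The idea of placing boundaries at spline inflection points can be sketched in base R. This is only an illustration under my own assumptions, not GenWin's code or API: the positions and statistic are simulated, the variable names (`pos`, `fst`, `breaks`, `window_id`) are made up for the example, and the smoothing level is left to the default of `smooth.spline()`.

```r
# Simulated data (hypothetical): marker positions and a noisy per-marker statistic
set.seed(42)
pos <- seq(1, 1000, by = 10)
fst <- 0.1 + 0.05 * sin(pos / 80) + rnorm(length(pos), sd = 0.02)

# Fit a cubic smoothing spline and evaluate its second derivative at each marker
fit <- smooth.spline(pos, fst)
d2 <- predict(fit, pos, deriv = 2)$y

# An inflection point lies between two markers where the second derivative changes sign
flip <- which(diff(sign(d2)) != 0)
breaks <- (pos[flip] + pos[flip + 1]) / 2   # approximate boundary positions

# Assign each marker to a window and summarise the statistic per window
window_id <- findInterval(pos, breaks) + 1
window_means <- tapply(fst, window_id, mean)
```

The resulting windows differ in width and in marker count, which is why results from them need to be evaluated with window size taken into account.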



D'2_IS

Ohta.D.Stats

The Ohta.D.Stats R function calculates Tomoko Ohta’s partitioning of linkage disequilibrium, termed D-statistics, for pairs of loci. The code is written so that it can be scaled up to a genome-wide test by calling the function repeatedly across pairs of loci in a genotype table. See our Heredity paper for an example of this function in action.
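
The "repeat across pairs of loci" pattern might look roughly like the sketch below. It does not call Ohta.D.Stats and does not compute Ohta's D-statistics: `pair_stat()` is a hypothetical stand-in (squared genotype correlation), and the genotype table is simulated, purely to show how a pairwise function can be applied to every pair of columns.

```r
# Hypothetical genotype table: individuals in rows, loci in columns (0/1/2 coding)
set.seed(7)
geno <- matrix(sample(0:2, 40, replace = TRUE), nrow = 10,
               dimnames = list(NULL, paste0("locus", 1:4)))

# Stand-in pairwise statistic; NOT Ohta's D-statistics
pair_stat <- function(a, b) cor(a, b)^2

# Enumerate every unordered pair of loci and apply the statistic to each pair
pairs <- combn(colnames(geno), 2)
results <- data.frame(
  locus1 = pairs[1, ],
  locus2 = pairs[2, ],
  stat   = apply(pairs, 2, function(p) pair_stat(geno[, p[1]], geno[, p[2]]))
)
```

With many loci the number of pairs grows quadratically, so a genome-wide run is usually split into chunks or parallelised.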




Useful Scripts

DriftSimulator.R is an R function for conducting simulations of genetic drift at a single locus. Initial frequency, number of generations, and population demographics can all be manipulated, and plotting is simple. Documentation is in the header of the file. Load into R with “source()”, or by copy-pasting the text of the script.
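
For readers who want the gist before opening the file, single-locus drift can be sketched as binomial sampling of allele copies each generation (a Wright-Fisher model). This is an assumption about the general approach, not a copy of DriftSimulator.R; the function name `simulate_drift()` and its arguments are hypothetical.

```r
# Minimal Wright-Fisher sketch (hypothetical helper, not DriftSimulator.R itself)
simulate_drift <- function(p0, n_gen, pop_size) {
  freq <- numeric(n_gen + 1)
  freq[1] <- p0
  for (g in seq_len(n_gen)) {
    # Draw 2N allele copies for the next generation from the current frequency
    freq[g + 1] <- rbinom(1, size = 2 * pop_size, prob = freq[g]) / (2 * pop_size)
  }
  freq
}

set.seed(1)
traj <- simulate_drift(p0 = 0.5, n_gen = 100, pop_size = 50)

# Plot the trajectory when running interactively
if (interactive()) {
  plot(0:100, traj, type = "l", ylim = c(0, 1),
       xlab = "Generation", ylab = "Allele frequency")
}
```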

DriftSimulatorWithBottlenecks.R is a variant of DriftSimulator.R that also lets the user specify a bottleneck event. Documentation is in the header of the file. Load into R with “source()”, or by copy-pasting the text of the script.
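
One way to think about a bottleneck is as a per-generation population size that drops for a stretch of generations. The sketch below makes that assumption; it is not the code in DriftSimulatorWithBottlenecks.R, and `simulate_drift_bottleneck()` and its arguments (`bn_start`, `bn_length`, `bn_size`) are hypothetical names.

```r
# Wright-Fisher drift with one bottleneck (hypothetical sketch, not the script itself)
simulate_drift_bottleneck <- function(p0, n_gen, pop_size,
                                      bn_start, bn_length, bn_size) {
  stopifnot(bn_start >= 1, bn_start + bn_length - 1 <= n_gen)
  # Population size per generation, reduced during the bottleneck
  N <- rep(pop_size, n_gen)
  N[bn_start:(bn_start + bn_length - 1)] <- bn_size
  freq <- numeric(n_gen + 1)
  freq[1] <- p0
  for (g in seq_len(n_gen)) {
    # Sample 2N allele copies using that generation's population size
    freq[g + 1] <- rbinom(1, size = 2 * N[g], prob = freq[g]) / (2 * N[g])
  }
  list(freq = freq, N = N)
}

set.seed(2)
out <- simulate_drift_bottleneck(p0 = 0.5, n_gen = 100, pop_size = 500,
                                 bn_start = 40, bn_length = 10, bn_size = 5)
```

Frequency changes are typically much larger during generations 40 to 49 here, because sampling variance scales with 1/(2N).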

VectorFst.R is a simple R function that can be used to calculate locus-by-locus \(F_{ST}\) values from allele frequency data. Basic documentation is included in the header of the file. Load into R with “source()”, or by copy-pasting the text of the script.
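
As a rough illustration of a locus-by-locus calculation, the sketch below uses the heterozygosity form \(F_{ST} = (H_T - H_S)/H_T\) for biallelic loci with equally weighted populations. The estimator, the input layout (populations in rows, loci in columns), and the name `fst_per_locus()` are my assumptions for the example; VectorFst.R may use a different estimator or layout, so check its header.

```r
# Per-locus Fst from allele frequencies (hypothetical sketch, not VectorFst.R)
fst_per_locus <- function(freqs) {
  # freqs: matrix of allele frequencies, populations in rows, loci in columns
  p_bar <- colMeans(freqs)                    # mean allele frequency per locus
  h_t <- 2 * p_bar * (1 - p_bar)              # expected heterozygosity, total
  h_s <- colMeans(2 * freqs * (1 - freqs))    # mean within-population heterozygosity
  (h_t - h_s) / h_t                           # NaN if the locus is monomorphic overall
}

freqs <- rbind(pop1 = c(0.5, 0.1, 1.0),
               pop2 = c(0.5, 0.9, 0.0))
fst <- fst_per_locus(freqs)
```

In this toy input the first locus has identical frequencies in both populations and the third is fixed for alternate alleles, so the values span the full range from no differentiation to complete differentiation.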

ModifiedRogersDistanceFunction.R is a basic function for calculating the modified Rogers’ genetic distance between individuals. The calculation is simple, but I’m not aware of other implementations in R. Apply it to a data frame with individuals in rows and markers in columns. There should be two columns per marker (one column for each allele), coded as 0, 1, or 2.
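
To make the calculation concrete, here is a small sketch of the modified Rogers’ distance, \(d = \sqrt{\frac{1}{2m}\sum_{i=1}^{m}\sum_{j}(p_{ij}-q_{ij})^2}\), where \(m\) is the number of markers and \(p_{ij}\), \(q_{ij}\) are the frequencies of allele \(j\) at marker \(i\) in the two individuals. I am assuming each column holds the count (0, 1, or 2) of one allele in a diploid individual; the function name `modified_rogers()` and the toy data are hypothetical, and the script itself may differ in detail.

```r
# Modified Rogers' distance between individuals (hypothetical sketch)
modified_rogers <- function(geno) {
  x <- as.matrix(geno) / 2     # allele counts -> within-individual allele frequencies
  m <- ncol(x) / 2             # two columns per marker
  dist(x) / sqrt(2 * m)        # Euclidean distance scaled by sqrt(2m)
}

# Toy data: three individuals, two markers (A and B), two allele columns each
geno <- data.frame(A1 = c(2, 0, 1), A2 = c(0, 2, 1),
                   B1 = c(2, 0, 1), B2 = c(0, 2, 1))
d <- as.matrix(modified_rogers(geno))
```

Individuals 1 and 2 are opposite homozygotes at both markers, and individual 3 is heterozygous at both, so it sits halfway between them.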