R Package

We have implemented our ‘Hierarchical Boosting’ framework as a freely available R package. Boosting is a supervised learning algorithm that estimates logistic regression functions (which we call boosting functions) of the input variables (summary statistics or selection tests) so as to maximize the difference between two competing scenarios (e.g. selection vs. neutrality). Our method applies different boosting functions sequentially in a hierarchical classification scheme to classify genomic regions into different selection regimes.
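To make the idea of a boosting function concrete, here is a minimal sketch of component-wise gradient boosting with a logistic loss, written in base R. It is an illustration only and not the package's implementation. The function name boost_logistic, the parameters mstop and nu, and the toy statistics stat_a, stat_b and stat_c are hypothetical names chosen for this example.

```r
# Illustrative sketch only: component-wise gradient boosting for a 0/1 outcome.
# X: numeric matrix of summary statistics (one column per statistic)
# y: 0/1 labels (e.g. 1 = selection, 0 = neutrality)
boost_logistic <- function(X, y, mstop = 200, nu = 0.1) {
  X <- scale(X)                                  # center and scale each statistic
  beta <- setNames(numeric(ncol(X)), colnames(X))
  intercept <- qlogis(mean(y))                   # start from the overall log-odds
  ss <- colSums(X^2)                             # sum of squares of each column
  for (m in seq_len(mstop)) {
    f <- intercept + drop(X %*% beta)            # current linear score
    u <- y - plogis(f)                           # negative gradient of the logistic loss
    slopes <- drop(crossprod(X, u)) / ss         # least-squares slope per statistic
    # The statistic with the smallest residual sum of squares is the one
    # that maximizes slope^2 * sum(x^2).
    j <- which.max(slopes^2 * ss)
    beta[j] <- beta[j] + nu * slopes[j]          # small step along that statistic
  }
  list(intercept = intercept, beta = beta)
}

# Toy data (hypothetical): only stat_a carries signal.
set.seed(1)
n <- 200
X <- cbind(stat_a = rnorm(n), stat_b = rnorm(n), stat_c = rnorm(n))
y <- rbinom(n, 1, plogis(2 * X[, "stat_a"]))
fit <- boost_logistic(X, y)
print(round(fit$beta, 3))
```

Each iteration updates the coefficient of a single statistic, so uninformative statistics tend to keep coefficients near zero, and the fitted linear score separates the two simulated scenarios.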

The framework relies on the results of several selection tests, which must be computed beforehand on both simulated and empirical data. We do not provide the software needed to compute the selection tests or to run the simulations. Avoid highly correlated summary statistics or selection tests (e.g. r² > 0.8), since strong correlation prevents the coefficients of the boosting algorithm from converging.
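One way to screen for highly correlated inputs before fitting is sketched below in base R. It assumes that r² means the squared Pearson correlation and that the statistics are stored as numeric columns of a data frame. The helper high_correlation_pairs and the column names are hypothetical and not part of the package.

```r
# Illustrative helper (not part of the package): list pairs of statistics
# whose squared Pearson correlation exceeds a threshold.
high_correlation_pairs <- function(stats, threshold = 0.8) {
  r2 <- cor(stats, use = "pairwise.complete.obs")^2
  # Keep each pair once by looking only at the upper triangle.
  idx <- which(upper.tri(r2) & r2 > threshold, arr.ind = TRUE)
  data.frame(
    stat_1 = colnames(r2)[idx[, "row"]],
    stat_2 = colnames(r2)[idx[, "col"]],
    r2 = r2[idx],
    stringsAsFactors = FALSE
  )
}

# Toy data (hypothetical): stat_b is almost a copy of stat_a.
set.seed(42)
a <- rnorm(100)
stats <- data.frame(
  stat_a = a,
  stat_b = a + rnorm(100, sd = 0.1),
  stat_c = rnorm(100)
)
flagged <- high_correlation_pairs(stats)
print(flagged)
```

For every flagged pair, dropping one of the two statistics before training is a simple way to stay under the threshold.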

We advise you to read the manuscript (and the manual) before using the package.

Source code:

hierarchicalBoosting_1.0.tar.gz

Manual:

manual.pdf

GitHub repository:

https://github.com/marcpybus/hierarchicalBoosting

Installation (in R):

Install the required packages:

install.packages(c("mboost", "corpora", "gridExtra"), dependencies = TRUE)

Install ‘Hierarchical Boosting’ package (from source):

install.packages("hierarchicalBoosting_1.0.tar.gz", repos = NULL, type = "source")

Install ‘Hierarchical Boosting’ package (from GitHub):

library(devtools)
dev_mode()
install_github("marcpybus/hierarchicalBoosting")

The manual explains in detail how to use the package and includes an easy-to-follow example. Please run it to check that everything works.
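Before running the manual's example, a quick check that the installation succeeded can look like the sketch below. It assumes the installed package is named hierarchicalBoosting, as the tarball name suggests. The helper missing_packages is a hypothetical name, not a function of the package.

```r
# Illustrative check (not part of the package): report which of the
# required packages cannot be loaded in the current R library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Package name assumed from the tarball name hierarchicalBoosting_1.0.tar.gz.
required <- c("mboost", "corpora", "gridExtra", "hierarchicalBoosting")
not_found <- missing_packages(required)

if (length(not_found) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing packages: ", paste(not_found, collapse = ", "))
}
```

If any package is reported as missing, repeat the corresponding installation step above.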