An example R-package using the SPLITT library to calculate the likelihood of a phylogenetic mixed model (PMM).

Travis build status Coverage status CRAN_Status_Badge Downloads

PMMUsingSPLITT

The goal of PMMUsingSPLITT is to provide a minimal example of how to use the SPLITT C++ library in an R-package. The package implements parallelized log-likelihood calculation of the univariate phylogenetic mixed model (PMM). The PMM is used in the comparative analysis of biological data originating from a set of living and/or extinct species to estimate the rate of phenotypic evolution resulting from genetic drift. The calculation of the log-likelihood of the model parameters, given a phylogenetic tree and trait data at the tips, is done using a quadratic polynomial representation of the log-likelihood function described in the article ‘Parallel Likelihood Calculation for Phylogenetic Comparative Models: the ‘SPLITT’ C++ Library’. A preprint of the article is available from https://doi.org/10.1101/235739. The package provides an implementation in R (function ‘AbcPMMLogLik’) as well as a parallel implementation in C++ (function ‘AbcPMMLogLikCpp’) based on the ‘SPLITT’ library for serial and parallel lineage traversal of trees (https://venelin.github.io/SPLITT/index.html). The function ‘MiniBenchmark’ allows to compare the calculation times for different tree sizes. See this guide for a tutorial.

Installation

You can install the released version of PMMUsingSPLITT from CRAN with:

install.packages("PMMUsingSPLITT")

Examples

  • Calculating the likelihood of the PMM for a given tree, data and model parameters.
library(ape)
library(PMMUsingSPLITT)
#> Loading required package: Rcpp

set.seed(10)

N <- 1000
x0 <- 0.1
alpha <- 1
theta <- 10
sigma2 <- 0.25
sigmae2 <- 1

tree <- rtree(N)

g <- rTraitCont(tree, model = "OU", root.value = x0,
                alpha = alpha, sigma = sqrt(sigma2),
                ancestor = FALSE)

x <- g + rnorm(n = N, mean = 0, sd = sqrt(sigmae2))

cat("logLikelihood using R:", AbcPMMLogLik(x, tree, x0, sigma2, sigmae2), "\n")
#> logLikelihood using R: -1551.016
cat("logLikelihood using R:", AbcPMMLogLikCpp(x, tree, x0, sigma2, sigmae2), "\n")
#> logLikelihood using R: -1551.016
  • Performing a benchmark to measure the likelihood calculation times using different parallelization strategies:
# N specifies the size of the phylogenetic tree. 
# Ntests specifies the number of executions in one time measurement 
# (the more the better, but also slower).
MiniBenchmark(N = 100, Ntests = 100)
#> Performing a mini-benchmark of the PMM log-likelihood calculation with 
#>       a tree of size N= 100 ;
#> Calling each likelihood calculation Ntests= 100  times ...
#> CPU:  Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz 
#> OpenMP version:  201107 
#> Number of threads: 8 
#> Measuring calculation times...
#>    model                                            mode time.ms
#> 1    PMM                                      R (serial)    2.10
#> 2    PMM                                      C++ (AUTO)    0.04
#> 3    PMM              C++ (SINGLE_THREAD_LOOP_POSTORDER)    0.02
#> 4    PMM                 C++ (SINGLE_THREAD_LOOP_PRUNES)    0.02
#> 5    PMM                 C++ (SINGLE_THREAD_LOOP_VISITS)    0.02
#> 6    PMM                  C++ (MULTI_THREAD_LOOP_PRUNES)    0.23
#> 7    PMM                  C++ (MULTI_THREAD_LOOP_VISITS)    0.40
#> 8    PMM C++ (MULTI_THREAD_LOOP_VISITS_THEN_LOOP_PRUNES)    0.40
#> 9    PMM                  C++ (MULTI_THREAD_VISIT_QUEUE)    2.28
#> 10   PMM     C++ (MULTI_THREAD_LOOP_PRUNES_NO_EXCEPTION)    0.56
#> 11   PMM                        C++ (HYBRID_LOOP_PRUNES)    0.63
#> 12   PMM                        C++ (HYBRID_LOOP_VISITS)    0.39
#> 13   PMM       C++ (HYBRID_LOOP_VISITS_THEN_LOOP_PRUNES)    0.40
MiniBenchmark(N = 1000, Ntests = 10)
#> Performing a mini-benchmark of the PMM log-likelihood calculation with 
#>       a tree of size N= 1000 ;
#> Calling each likelihood calculation Ntests= 10  times ...
#> CPU:  Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz 
#> OpenMP version:  201107 
#> Number of threads: 8 
#> Measuring calculation times...
#>    model                                            mode time.ms
#> 1    PMM                                      R (serial)    21.0
#> 2    PMM                                      C++ (AUTO)     0.1
#> 3    PMM              C++ (SINGLE_THREAD_LOOP_POSTORDER)     0.1
#> 4    PMM                 C++ (SINGLE_THREAD_LOOP_PRUNES)     0.1
#> 5    PMM                 C++ (SINGLE_THREAD_LOOP_VISITS)     0.2
#> 6    PMM                  C++ (MULTI_THREAD_LOOP_PRUNES)     0.9
#> 7    PMM                  C++ (MULTI_THREAD_LOOP_VISITS)     0.6
#> 8    PMM C++ (MULTI_THREAD_LOOP_VISITS_THEN_LOOP_PRUNES)     2.6
#> 9    PMM                  C++ (MULTI_THREAD_VISIT_QUEUE)    16.5
#> 10   PMM     C++ (MULTI_THREAD_LOOP_PRUNES_NO_EXCEPTION)     0.7
#> 11   PMM                        C++ (HYBRID_LOOP_PRUNES)     1.0
#> 12   PMM                        C++ (HYBRID_LOOP_VISITS)     0.1
#> 13   PMM       C++ (HYBRID_LOOP_VISITS_THEN_LOOP_PRUNES)     0.1
MiniBenchmark(N = 10000, Ntests = 10)
#> Performing a mini-benchmark of the PMM log-likelihood calculation with 
#>       a tree of size N= 10000 ;
#> Calling each likelihood calculation Ntests= 10  times ...
#> CPU:  Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz 
#> OpenMP version:  201107 
#> Number of threads: 8 
#> Measuring calculation times...
#>    model                                            mode time.ms
#> 1    PMM                                      R (serial)   121.0
#> 2    PMM                                      C++ (AUTO)     0.6
#> 3    PMM              C++ (SINGLE_THREAD_LOOP_POSTORDER)     0.8
#> 4    PMM                 C++ (SINGLE_THREAD_LOOP_PRUNES)     0.6
#> 5    PMM                 C++ (SINGLE_THREAD_LOOP_VISITS)     0.8
#> 6    PMM                  C++ (MULTI_THREAD_LOOP_PRUNES)     0.3
#> 7    PMM                  C++ (MULTI_THREAD_LOOP_VISITS)     0.6
#> 8    PMM C++ (MULTI_THREAD_LOOP_VISITS_THEN_LOOP_PRUNES)     1.7
#> 9    PMM                  C++ (MULTI_THREAD_VISIT_QUEUE)   155.3
#> 10   PMM     C++ (MULTI_THREAD_LOOP_PRUNES_NO_EXCEPTION)     0.2
#> 11   PMM                        C++ (HYBRID_LOOP_PRUNES)     0.4
#> 12   PMM                        C++ (HYBRID_LOOP_VISITS)     1.5
#> 13   PMM       C++ (HYBRID_LOOP_VISITS_THEN_LOOP_PRUNES)     1.1