Sunday, June 18, 2017

R packages useful for general purpose development

This article gives a brief overview of R packages that can be useful for general-purpose programming in R. I am going to update it from time to time to add more useful packages.

  1. testthat. Write unit tests for R code. Alternative: RUnit.
  2. roxygen2. Generates package documentation automatically from specially formatted comments placed in-line with the R code.
  3. checkmate. Fast pre-condition checks and assertions for your code.
  4. logging. Consistent logging for R scripts and packages. The package lets you specify different handlers for log messages (e.g. to redirect them to a file, database, or console) and different logging levels (e.g. for debugging or production mode).
  5. lintr. Static code analyzer for R. It can be integrated with the testthat package to check code formatting and style during package compilation, which makes it possible to enforce a common coding style. Alternative: formatR.
  6. argparse. Simplifies parsing of command-line arguments for R scripts.
  7. cyclocomp. Calculates the cyclomatic complexity of R functions. This can be used to estimate the number of unit tests required to cover all execution paths within a function.
  8. covr. Calculates test coverage for R functions.
  9. TypeInfo. Prototype R package for specifying the types of function parameters and return values. This may be a good way to fight problems caused by R's weak typing.
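
As a small illustration of how two of these packages fit together, here is a minimal sketch that uses checkmate for an input assertion and testthat for a unit test. The function mean_return and its test data are hypothetical and exist only for this example; it assumes both packages are installed.

```r
library(checkmate)
library(testthat)

# Hypothetical helper, used only for illustration:
# mean log return of a price series.
mean_return <- function(prices) {
  # Pre-condition check: a numeric vector without NAs, at least 2 values
  assert_numeric(prices, any.missing = FALSE, min.len = 2)
  mean(diff(log(prices)))
}

test_that("mean_return validates input and computes log returns", {
  # Two consecutive 10% increases give a mean log return of log(1.1)
  expect_equal(mean_return(c(100, 110, 121)), log(1.1))
  # Invalid inputs are rejected by the checkmate assertion
  expect_error(mean_return(c("a", "b")))
  expect_error(mean_return(100))
})
```

In a real package the test would normally live under tests/testthat/ rather than next to the function.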


Comparing approaches to correlated random number generation

Introduction

Correlated random number generation is a crucial part of market data simulation, and thus one of the important functions within Monte Carlo risk engines. The most popular approaches are Cholesky decomposition, singular value decomposition (SVD), and eigendecomposition (also known as spectral decomposition). Each of these approaches has its own advantages and disadvantages. In this article I would like to make a small comparison of these methods on real-life data.


Approaches to Correlated Random Number Generation

In general, correlated random number generation consists of two steps:

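1. Decomposition of the correlation matrix C into a factor U such that:

$$C = U^{T} U$$

2. Then correlated random numbers can be generated by using the U matrix as follows, where R is the matrix of independent random numbers (one series per column):

$$R_{c} = R \, U$$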


So, let's imagine that we have a variable corr_matrix holding the correlation matrix:
corr_matrix = matrix(c(1.0, 0.3, 0.6, 0.3, 1.0, 0.4, 0.6, 0.4, 1.0),
                     nrow = 3, ncol = 3)

And we have a matrix rnd with 3 independent series of random numbers:
rnd = matrix(rnorm(10000 * 3), nrow = 10000, ncol = 3)
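
Why multiplying by a factor of the correlation matrix works can be sketched in a few lines. The notation here is an assumption made for this sketch: R stands for the matrix rnd, r for one of its rows (independent values with zero mean and unit variance), U for the factor, and C for the correlation matrix.

```latex
\begin{align*}
\mathbb{E}[r] &= 0, \qquad \mathbb{E}[r^{T} r] = I
  && \text{(one row of } R\text{)} \\
x &= r\,U
  && \text{(the corresponding correlated row)} \\
\operatorname{Cov}(x) &= \mathbb{E}[x^{T} x]
  = U^{T}\,\mathbb{E}[r^{T} r]\,U
  = U^{T} U
\end{align*}
```

So any matrix U with U^T U = C produces rows with covariance C. With rows multiplied from the right, as in rnd %*% u, it is U^T U rather than U U^T that has to equal C. This matters whenever the factor is not symmetric.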


Then, for Cholesky decomposition, this approach looks as follows:
u = chol(corr_matrix)
corr_rnd = rnd %*% u

For SVD:
svd_m = svd(corr_matrix)
u = svd_m$u %*% diag(sqrt(svd_m$d)) %*% t(svd_m$v)
corr_rnd = rnd %*% u

For Eigen decomposition:
e = eigen(corr_matrix, symmetric = TRUE)
# rnd is multiplied from the right, so u must satisfy t(u) %*% u == corr_matrix
u = diag(sqrt(e$values)) %*% t(e$vectors)
corr_rnd = rnd %*% u
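
A quick way to sanity-check the three factors is to verify that each one reproduces the correlation matrix, both exactly (t(u) %*% u) and empirically (the sample correlation of the generated numbers). The following is only a sketch: the seed and the variable names u_chol, u_svd, u_eig, recon_err and sample_err are arbitrary choices made for this example.

```r
set.seed(42)  # arbitrary seed, only to make the run reproducible

corr_matrix <- matrix(c(1.0, 0.3, 0.6,
                        0.3, 1.0, 0.4,
                        0.6, 0.4, 1.0), nrow = 3, ncol = 3)
rnd <- matrix(rnorm(10000 * 3), nrow = 10000, ncol = 3)

# Cholesky: chol() returns the upper triangular factor
u_chol <- chol(corr_matrix)

# SVD: symmetric square root of the correlation matrix
s <- svd(corr_matrix)
u_svd <- s$u %*% diag(sqrt(s$d)) %*% t(s$v)

# Eigendecomposition: sqrt(Lambda) %*% t(V), so that t(u) %*% u = V Lambda t(V)
e <- eigen(corr_matrix, symmetric = TRUE)
u_eig <- diag(sqrt(e$values)) %*% t(e$vectors)

factors <- list(cholesky = u_chol, svd = u_svd, eigen = u_eig)

# Exact check: largest absolute deviation of t(u) %*% u from corr_matrix
recon_err <- sapply(factors, function(u) max(abs(t(u) %*% u - corr_matrix)))

# Empirical check: largest deviation of the sample correlation
sample_err <- sapply(factors, function(u) max(abs(cor(rnd %*% u) - corr_matrix)))

print(recon_err)
print(sample_err)
```

The reconstruction errors should be at the level of floating-point noise, while the sample errors depend on the number of simulated rows and shrink as it grows.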