Efficient R: do.call / lapply

R

A distinctive coding style

shikokuchuo
05-28-2021

Use case

Combining do.call() with lapply() is a powerful way to leverage functional programming in R. In short, you write a function that performs some actions, apply it to a list of inputs with lapply(), and feed the resulting list into do.call() together with a function that combines everything into a single object.
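Before turning to real data, the pattern can be sketched with base R alone. The function 'summarise_input' and the inputs below are hypothetical, chosen only so that the example runs without any packages or API access:

```r
# Hypothetical per-input function: returns a named numeric vector
summarise_input <- function(x) {
  c(n = length(x), mean = mean(x), max = max(x))
}

# Illustrative inputs of differing lengths
inputs <- list(1:3, 4:9, c(10, 20))

# lapply() returns a list with one named vector per input;
# do.call() then passes those vectors to a single rbind() call
result <- do.call(rbind, lapply(inputs, summarise_input))

result
```

Each input becomes one row of 'result', and the column names are taken from the names of the vectors returned by the function.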

Let us take an example, where we would like to calculate the ichimoku clouds for a selection of the major world stock indices, but also preserve the volume data, all in one tidy object.

We use the ‘ichimoku’ package (footnote 1), which not only draws the ichimoku clouds but also provides an interface to the OANDA fxTrade API, a rich source of high-quality financial data (free, but requires registration).

Setup

We could set it up as per the below:

library(ichimoku)

tickers <- c("DE30_EUR", "JP225_USD", "SPX500_USD", "UK100_GBP")

process <- function(x, from, to) {
  # Use ichimoku::oanda() to retrieve data from the OANDA fxTrade API
  pxdata <- oanda(x, from = from, to = to)
  # Extract volume column
  volume <- pxdata$volume
  # Calculate the cloud by calling ichimoku::ichimoku()
  cloud <- ichimoku(pxdata, ticker = x)
  # Return a list of ticker, ichimoku cloud object, volume data
  list(x, cloud, volume)
}

We now want to apply our function to each element of ‘tickers’ in turn, and then for the results to be combined.

Loop

One way to achieve this would be to iterate over ‘tickers’ using a loop:

# Define a list to contain the loop output, specifying the length in advance as good practice
portfolio <- vector(mode = "list", length = length(tickers))

# Loop over each element in 'tickers' and save in pre-defined list
for (i in seq_along(tickers)) {
  portfolio[[i]] <- process(tickers[i], from = "2015-09-03", to = "2016-06-30")
}

# Create output matrix with a single rbind() call that takes every list element as an argument
portfolio <- do.call(rbind, portfolio)

portfolio
     [,1]         [,2]          [,3]       
[1,] "DE30_EUR"   ichimoku,2808 integer,209
[2,] "JP225_USD"  ichimoku,2856 integer,213
[3,] "SPX500_USD" ichimoku,2856 integer,213
[4,] "UK100_GBP"  ichimoku,2808 integer,209

This approach takes 3-4 lines of code.

Furthermore, ‘i’ remains as a leftover object in the global environment.

Somewhat messy.
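The leftover loop variable is easy to verify. The following is a minimal sketch using a toy computation (squaring numbers) in place of the API call, intended to be run in a fresh session:

```r
# Loop version: pre-allocate, then fill element by element
squares_loop <- vector(mode = "list", length = 3)
for (i in seq_len(3)) {
  squares_loop[[i]] <- i^2
}

# The loop variable outlives the loop and holds its final value
exists("i", inherits = FALSE)
i

# lapply() version: 'j' exists only inside the anonymous function
squares_apply <- lapply(seq_len(3), function(j) j^2)
exists("j", inherits = FALSE)

# Both approaches produce the same list
identical(squares_loop, squares_apply)
```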

do.call / lapply

Instead we can use a do.call / lapply() combination to achieve the same result in one line:

portfolio <- do.call(rbind, lapply(tickers, process, from = "2015-09-03", to = "2016-06-30"))

portfolio
     [,1]         [,2]          [,3]       
[1,] "DE30_EUR"   ichimoku,2808 integer,209
[2,] "JP225_USD"  ichimoku,2856 integer,213
[3,] "SPX500_USD" ichimoku,2856 integer,213
[4,] "UK100_GBP"  ichimoku,2808 integer,209

There are also no intermediate objects generated that clutter the global environment.

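To explain: lapply() applies ‘process’ to each element of ‘tickers’ in turn, passing ‘from’ and ‘to’ through to every call, and always returns a list. do.call() then calls rbind() once, with the elements of that list as its arguments, to produce the matrix.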
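The two steps can be separated out to see what each one does. In this sketch 'describe' is a hypothetical stand-in for ‘process’ that needs no API access, and 'prefix' and 'digits' play the role of ‘from’ and ‘to’:

```r
# Hypothetical stand-in for process(): one input plus two extra arguments
describe <- function(x, prefix, digits) {
  list(paste0(prefix, x), round(nchar(x) / 3, digits))
}

ids <- c("a", "bb", "ccc")

# Step 1: lapply() calls describe() once per element of 'ids',
# forwarding the extra named arguments to every call; the result is a list
pieces <- lapply(ids, describe, prefix = "id_", digits = 2)

# Step 2: do.call() calls rbind() once, using the list elements as arguments
combined <- do.call(rbind, pieces)

# Writing the same rbind() call out by hand
manual <- rbind(pieces[[1]], pieces[[2]], pieces[[3]])

identical(combined, manual)
```

The hand-written call only works when the number of elements is known in advance, whereas do.call() handles a list of any length.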

The use of do.call / lapply() provides for a far more succinct and distinctive coding style.

The added bonus is that, of the ‘apply’ family of functions, lapply() is almost always the fastest, as its output type is fixed (always a list) and it does not try to construct names or simplify the output structure.
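The point about a fixed output type can be illustrated by comparing lapply() with sapply(), which simplifies its result when it can. The helper 'evens' and the inputs are hypothetical, chosen so that the result length varies with the data:

```r
# Hypothetical helper whose result length depends on its input
evens <- function(x) x[x %% 2 == 0]

a <- list(1:4, 5:8)  # two even numbers in each element
b <- list(1:4, 5:7)  # two even numbers in the first, one in the second

# sapply() changes its return type depending on the data
is.matrix(sapply(a, evens))  # equal lengths are simplified to a matrix
is.list(sapply(b, evens))    # unequal lengths fall back to a list

# lapply() returns a list in both cases
is.list(lapply(a, evens))
is.list(lapply(b, evens))
```

Code downstream of lapply() therefore never has to check which shape it has been handed.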

For a more structured format rather than a list, the output of lapply() can be fed into do.call() with a combining function such as rbind(), as in the example above.
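Which combining function to use depends on the shape wanted. As a sketch, here are three base R functions, chosen for illustration, applied to the same toy lapply() output:

```r
# Toy lapply() output: a list of three named numeric vectors
pieces <- lapply(1:3, function(i) c(id = i, square = i^2))

# c() flattens everything into one vector
flat <- do.call(c, pieces)

# rbind() stacks each element as a row
by_row <- do.call(rbind, pieces)

# cbind() places each element in its own column
by_col <- do.call(cbind, pieces)

length(flat)
dim(by_row)
dim(by_col)
```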

The use of this type of combination is of particular benefit in programming where both performance and predictability of output types are paramount.

Tidy data output

portfolio
     [,1]         [,2]          [,3]       
[1,] "DE30_EUR"   ichimoku,2808 integer,209
[2,] "JP225_USD"  ichimoku,2856 integer,213
[3,] "SPX500_USD" ichimoku,2856 integer,213
[4,] "UK100_GBP"  ichimoku,2808 integer,209

‘portfolio’ is a tidy matrix with a row for each ticker, and a column for each data type.
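As an optional variation, not used in the example above, naming the elements of the returned list gives the matrix column names, so columns can be selected by name rather than position. The function 'process_named' is a hypothetical offline stand-in for ‘process’, and its values are made up purely for illustration:

```r
# Hypothetical stand-in for process(): no API call, made-up values
process_named <- function(x) {
  list(
    ticker = x,
    cloud  = seq_len(nchar(x)),  # placeholder for the ichimoku object
    volume = nchar(x) * 100L     # placeholder for the volume data
  )
}

demo <- do.call(rbind, lapply(c("AAA_USD", "BB_EUR"), process_named))

# Column names come from the names of the list elements
colnames(demo)

# Select a whole column, or a single element, by name
demo[, "ticker"]
demo[[2, "cloud"]]
```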

We can easily access any element of the matrix by specifying its index value, for example the ichimoku object for the S&P 500 Index by [3,2]. In the example below we run autostrat() on this object:

autostrat(portfolio[3, 2][[1]], n = 1)
                       [,1]               
Strategy               "cloudT > kijun"   
---------------------  "----------"       
Strategy cuml return % 7.76               
Per period mean ret %  0.0554             
Periods in market      74                 
Total trades           3                  
Average trade length   24.67              
Trade success %        100                
Worst trade ret %      0.81               
---------------------  "----------"       
Benchmark cuml ret %   3.7                
Per period mean ret %  0.0269             
Periods in market      135                
---------------------  "----------"       
Direction              "long"             
Start                  2015-12-21 22:00:00
End                    2016-06-29 22:00:00
Ticker                 "SPX500_USD"       
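The trailing [[1]] in the call above is needed because single-bracket indexing on a matrix of list elements returns a list of length one rather than the object itself. A minimal sketch with a hypothetical matrix 'm' of the same shape (the values are made up):

```r
# Hypothetical list-matrix: one row per item, three columns of mixed types
m <- rbind(
  list("A", 1:3, 10L),
  list("B", 4:6, 20L),
  list("C", 7:9, 30L)
)

# Single brackets return a list wrapping the element
cell <- m[3, 2]
is.list(cell)
length(cell)

# Double brackets with row and column indices return the element directly
m[[3, 2]]

# Both routes reach the same object
identical(m[3, 2][[1]], m[[3, 2]])
```

Under this sketch, portfolio[[3, 2]] would be an equivalent way to extract the ichimoku object.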

For further examples, see footnote 2.


  1. Gao, C. (2021), ichimoku: Visualization and Tools for Ichimoku Kinko Hyo Strategies. R package version 1.1.0, https://CRAN.R-project.org/package=ichimoku.

  2. Further examples: Youngju Nielsen of Sungkyunkwan University uses do.call / lapply to good effect in her course https://www.coursera.org/learn/the-fundamental-of-data-driven-investment/

Citation

For attribution, please cite this work as

shikokuchuo (2021, May 28). shikokuchuo{net}: Efficient R: do.call / lapply. Retrieved from https://shikokuchuo.net/posts/09-docall-lapply/

BibTeX citation

@misc{shikokuchuo2021efficient,
  author = {shikokuchuo, },
  title = {shikokuchuo{net}: Efficient R: do.call / lapply},
  url = {https://shikokuchuo.net/posts/09-docall-lapply/},
  year = {2021}
}