
Torch Integration

Custom serialization functions may be registered to handle reference objects that are backed by external pointers.

This allows tensors from the torch package to be used seamlessly in ‘mirai’ computations.

Setup Steps

  1. Register the serialization and unserialization functions as a list supplied to serialization(), specifying ‘class’ as ‘torch_tensor’ and ‘vec’ as TRUE.

  2. Set up daemons. This may be done before or after calling serialization().

  3. Use everywhere() to make the torch package available on all daemons for convenience (optional).

library(mirai)
library(torch)

serialization(refhook = list(torch:::torch_serialize, torch::torch_load),
              class = "torch_tensor",
              vec = TRUE)
daemons(1)
#> [1] 1
everywhere(library(torch))

Example Usage

The example below creates a convolutional neural network using torch::nn_module().

A set of model parameters is also specified.

The model specification and parameters are then passed to and initialized within a ‘mirai’.

model <- nn_module(
  initialize = function(in_size, out_size) {
    self$conv1 <- nn_conv2d(in_size, out_size, 5)
    self$conv2 <- nn_conv2d(in_size, out_size, 5)
  },
  forward = function(x) {
    x <- self$conv1(x)
    x <- nnf_relu(x)
    x <- self$conv2(x)
    x <- nnf_relu(x)
    x
  }
)

params <- list(in_size = 1, out_size = 20)

m <- mirai(do.call(model, params), model = model, params = params)

call_mirai(m)$data
#> An `nn_module` containing 1,040 parameters.
#> 
#> ── Modules ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#> • conv1: <nn_conv2d> #520 parameters
#> • conv2: <nn_conv2d> #520 parameters
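The parameter counts printed above can be checked by hand. The following is a minimal sketch in base R, assuming each convolution layer has one weight per (output channel, input channel, kernel row, kernel column) combination plus one bias per output channel, and that both layers are constructed with the same arguments as in the model definition. The variable names are illustrative only.

```r
# Values taken from `params` and from the kernel size passed to nn_conv2d()
in_size  <- 1
out_size <- 20
kernel   <- 5

# Weights: one per output channel x input channel x kernel row x kernel column
n_weights <- out_size * in_size * kernel * kernel   # 500

# Biases: one per output channel
n_biases <- out_size                                 # 20

# Parameters in a single convolution layer
per_layer <- n_weights + n_biases                    # 520

# conv1 and conv2 are built with identical arguments, so the total doubles
total <- 2 * per_layer                               # 1040
```

The weight count matches the tensor shape {20,1,5,5} reported for conv1.weight.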

The returned model is an object containing many tensor elements.

m$data$parameters$conv1.weight
#> torch_tensor
#> (1,1,.,.) = 
#>   0.1682  0.1571  0.0880  0.1221 -0.0901
#>   0.1598  0.0815  0.1215  0.0138 -0.1608
#>  -0.0636  0.1779 -0.0090 -0.0544  0.0573
#>   0.0618 -0.1297  0.1649 -0.0272 -0.1092
#>  -0.1108  0.0749 -0.1585 -0.1256  0.1463
#> 
#> (2,1,.,.) = 
#>   0.1313 -0.1575 -0.0257  0.1861 -0.0727
#>  -0.1925  0.1271  0.0439 -0.0086 -0.1037
#>   0.0715  0.0691  0.1181 -0.1048  0.0228
#>   0.0806  0.1871 -0.0866 -0.1398  0.1804
#>   0.1763 -0.1477 -0.1943  0.1088 -0.1668
#> 
#> (3,1,.,.) = 
#>   0.1692  0.1886  0.0159 -0.0523 -0.1667
#>  -0.0769  0.1226 -0.1706  0.0817 -0.1353
#>  -0.1113 -0.0852  0.0073 -0.0749 -0.1621
#>   0.0164 -0.0800  0.1728 -0.1919  0.1623
#>   0.0064 -0.0127  0.0406  0.1909 -0.0757
#> 
#> (4,1,.,.) = 
#>   0.1336 -0.0816 -0.1282 -0.0812  0.0911
#>  -0.1254 -0.1583 -0.1503 -0.1723  0.0713
#>  -0.0995  0.0038  0.1080 -0.0095 -0.1887
#>   0.0842  0.0617 -0.0039 -0.0643 -0.0611
#>  -0.1795  0.0070 -0.0341 -0.0909 -0.1070
#> 
#> (5,1,.,.) = 
#>   0.0207 -0.0057 -0.1417 -0.0102  0.1322
#> ... [the output was truncated (use n=-1 to disable)]
#> [ CPUFloatType{20,1,5,5} ][ requires_grad = TRUE ]

It is usual for model parameters to then be passed to an optimiser.

The optimiser can also be initialized within a ‘mirai’ process.

optim <- mirai(optim_rmsprop(params = params), params = m$data$parameters)

call_mirai(optim)$data
#> <optim_rmsprop>
#>   Inherits from: <torch_optimizer>
#>   Public:
#>     add_param_group: function (param_group) 
#>     clone: function (deep = FALSE) 
#>     defaults: list
#>     initialize: function (params, lr = 0.01, alpha = 0.99, eps = 1e-08, weight_decay = 0, 
#>     load_state_dict: function (state_dict, ..., .refer_to_state_dict = FALSE) 
#>     param_groups: list
#>     state: State, R6
#>     state_dict: function () 
#>     step: function (closure = NULL) 
#>     zero_grad: function () 
#>   Private:
#>     step_helper: function (closure, loop_fun)

daemons(0)
#> [1] 0

Above, tensors and complex objects containing tensors were passed seamlessly between host and daemon processes, in the same way as any other R object.

The custom serialization in mirai leverages R’s native ‘refhook’ mechanism to allow such completely transparent usage. It is designed to be fast and efficient: data copies are minimised and the ‘official’ serialization methods from the torch package are used directly.
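To illustrate the general idea of a ‘refhook’, the sketch below uses only base R's serialize() and unserialize(). It is not mirai's implementation. The `demo_handle` class and the functions `make_handle()`, `write_hook()` and `read_hook()` are hypothetical, and a classed environment stands in for an external pointer. In base R the outbound hook returns a character vector for references it wants to handle (or NULL otherwise), and the inbound hook rebuilds the object from that character vector.

```r
# Hypothetical stand-in for an external-pointer object: a classed environment
make_handle <- function(values) {
  h <- new.env()
  h$values <- values
  class(h) <- "demo_handle"
  h
}

# Outbound hook: called for reference objects met during serialization.
# Returns a character vector for objects it handles, NULL for all others.
write_hook <- function(x) {
  if (inherits(x, "demo_handle")) {
    paste(x$values, collapse = ",")
  } else {
    NULL
  }
}

# Inbound hook: receives the character vector and rebuilds the object
read_hook <- function(s) {
  make_handle(as.numeric(strsplit(s, ",", fixed = TRUE)[[1]]))
}

# A larger object that merely contains a handle, like a model with tensors
obj <- list(label = "weights", handle = make_handle(c(0.5, 1.5, 2.5)))

bytes    <- serialize(obj, NULL, refhook = write_hook)
restored <- unserialize(bytes, refhook = read_hook)

restored$handle$values
```

The hooks are applied wherever a matching reference object appears inside a larger structure, which is why an object that merely contains such elements can be passed around like any other R object.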