Concurrency and parallel code execution
{mirai} is a minimalist async evaluation framework for R,
released this week to CRAN.
未来 みらい mirai is Japanese for ‘future’.
It provides an extremely simple and lightweight method for concurrent / parallel code execution.
Whilst frameworks for parallelisation exist for R, {mirai} is designed for simplicity.
The package provides just 2 functions:
eval_mirai() to evaluate async
call_mirai() to call the result
{mirai} has a tiny pure R code base, relying on a single package - {nanonext}. {nanonext} itself is a lightweight wrapper for the NNG C library with zero package dependencies.
Background R processes are created and evaluation occurs independently. {mirai} employs nanonext/NNG as a concurrency framework - a blazing-fast, lightweight solution for moving data between these processes seamlessly. Crucially it provides a true cross-platform abstraction layer across Linux, Windows, macOS, the BSDs, Solaris, illumos etc. i.e. everywhere R can go. This means that we can just call the above two functions without worrying about the underlying system implementation.
There is therefore no need to specify core counts, devise work plans and the like beforehand, nor to separate writing code that is ready for parallel execution from how it is ultimately executed. Just wrap your expressions in eval_mirai() and run them in another process.
For scripts, this provides the ultimate control as you can map specific code to a specific process. For example if you have 8 model fits to run, you can send each one to its own process. This provides a simpler and more robust solution than leaving it to the system to decide, which also runs the risk of over-optimisation - you may wish to refer to this classic presentation on Python’s GIL (global interpreter lock): http://www.dabeaz.com/python/GIL.pdf. R is subject to a similar limitation, as its interpreter is single-threaded.
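As a minimal sketch of the 8 model fits case, assuming eval_mirai() and call_mirai() behave as described in this post (later versions of {mirai} may differ). The simulated datasets, the variable names and the choice of lm() are all made up for illustration.

```r
library(mirai)

# Hypothetical illustration data: 8 small data frames, one per model fit
datasets <- lapply(1:8, function(i) {
  x <- rnorm(100)
  data.frame(x = x, y = i * x + rnorm(100))
})

# Send each fit to its own background R process.
# Each eval_mirai() call returns a 'mirai' object straight away.
fits <- lapply(datasets, function(d) {
  eval_mirai(coef(lm(y ~ x, data = dat)), dat = d)
})

# Collect the results, waiting on each process in turn
coefs <- lapply(fits, function(m) call_mirai(m)$value)
```

All 8 processes are launched before any result is requested, so the fits run side by side rather than one after another.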
It can be equally handy for interactive work - if you have specified a model, are now ready to fit it and know this will take an hour to run, simply eval_mirai() and have it run in the background whilst you continue with your work. When you need the results just call_mirai() for the return value.
Minimise execution times by performing long-running tasks concurrently in separate processes.
Ensure execution flow of the main process is not blocked.
Multiple long computes (model fits etc.) run one after another take more time than if performed concurrently on available computing cores.
Use eval_mirai() to evaluate an expression in a separate R process asynchronously.
A ‘mirai’ object is returned immediately.
mirai <- eval_mirai(rnorm(n) + m, n = 1e8, m = runif(1))
mirai
#> < mirai >
#> ~ use call_mirai() to resolve
Continue running code concurrent to the async operation.
# do more...
Use call_mirai() to retrieve the evaluated result when required.
call_mirai(mirai)
mirai
#> < mirai >
#> - $value for evaluated result
str(mirai$value)
#> num [1:100000000] 1.485 -0.804 0.965 -0.128 -0.555 ...
When processing high-frequency real-time data, writing results to a file or database can be slow and may disrupt the execution flow.
Cache data in memory and use eval_mirai() to perform periodic write operations in a separate process.
A ‘mirai’ object is returned immediately.
mirai <- eval_mirai(write.csv(x, file = file), x = rnorm(1e8), file = tempfile())
Use call_mirai() to confirm the operation has completed.
call_mirai(mirai)$value
#> NULL
Above, the return value is called directly. NULL is the expected return value for write.csv().
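The 'cache in memory, write periodically' idea might look something like the sketch below, again assuming eval_mirai() and call_mirai() as described in this post. The loop, the batch size, the output directory and the file names are hypothetical stand-ins for a real data feed.

```r
library(mirai)

out_dir <- tempfile("ticks")   # hypothetical output directory
dir.create(out_dir)

buffer <- numeric(0)           # in-memory cache of incoming values
writes <- list()               # 'mirai' objects for the writes sent so far
batch_size <- 1000L            # hypothetical flush threshold

for (i in 1:3000) {
  buffer <- c(buffer, rnorm(1))   # stand-in for receiving one data point
  if (length(buffer) >= batch_size) {
    path <- file.path(out_dir, sprintf("batch_%03d.csv", length(writes) + 1L))
    # Hand the batch to a background process; the loop carries on immediately
    writes[[length(writes) + 1L]] <-
      eval_mirai(write.csv(x, file = path), x = buffer, path = path)
    buffer <- numeric(0)
  }
}

# Before finishing, confirm that every write has completed
invisible(lapply(writes, call_mirai))
written <- list.files(out_dir)
```

Keeping the 'mirai' objects in a list means the main process only waits on the writes at the very end, if at all.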
{mirai} website: https://shikokuchuo.net/mirai/
{mirai} on CRAN: https://cran.r-project.org/package=mirai
{nanonext} website: https://shikokuchuo.net/nanonext/
{nanonext} on CRAN: https://cran.r-project.org/package=nanonext
The following article provides an update: re-introducing mirai
For attribution, please cite this work as
shikokuchuo (2022, Feb. 18). shikokuchuo{net}: Introducing mirai - a minimalist async evaluation framework for R. Retrieved from https://shikokuchuo.net/posts/16-introducing-mirai/
BibTeX citation
@misc{shikokuchuo2022introducing,
  author = {shikokuchuo},
  title = {shikokuchuo{net}: Introducing mirai - a minimalist async evaluation framework for R},
  url = {https://shikokuchuo.net/posts/16-introducing-mirai/},
  year = {2022}
}