
The long-awaited promises will be released soon!

Can't wait to play with promises :)

Being as impatient as I am when it comes to new technology, I decided to play with the currently available implementation of promises that Joe Cheng recently shared and presented at the EARL conference in London.

In this article you’ll get to know the upcoming promises package, how to use it, and how it differs from the existing future package.

Promises/Futures are a concept used in almost every major programming language. We’ve used Tasks in C#, Futures in Scala, Promises in JavaScript and they all adhere to a common understanding of what a promise is.

If you are not familiar with the concept of Promises, asynchronous tasks or Futures, I advise you to take a longer moment and dive into the topic. If you’d like to dive deeper and achieve a higher level of understanding, read about Continuation Monads in Haskell. We’ll be comparing the new promises package with the future package, which has been around for a while, so if you haven’t used it before I suggest you first take a look at its overview vignette: https://cran.r-project.org/web/packages/future/vignettes/future-1-overview.html

Citing Joe Cheng, our aim is to:

  1. Execute long-running code asynchronously on separate thread.
  2. Be able to do something with the result (if success) or error (if failure), when the task completes, back on the main R thread.

A promise object represents the eventual result of an async task. A promise is an R6 object that knows:

  1. Whether the task is running, succeeded, or failed
  2. The result (if succeeded) or error (if failed)
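
To make those two points concrete, here is a minimal sketch that builds one promise that succeeds and one that fails, then reads back the result or the error. It assumes the promises package is installed; drain() and outcome are names I made up for this illustration, and the explicit later::run_now() loop is only needed in a script, since an interactive session runs pending callbacks on its own while idle.

```r
library(promises)

# Hypothetical helper for this sketch: run all pending callbacks.
drain <- function() while (!later::loop_empty()) later::run_now()

# One promise that is fulfilled with a value, one that is rejected with an error.
succeeded <- promise(function(resolve, reject) resolve(42))
failed <- promise(function(resolve, reject) reject(simpleError("task failed")))

outcome <- list()

# then() registers callbacks for the success and the failure case.
p1 <- then(succeeded, onFulfilled = function(value) outcome$value <<- value)
p2 <- then(failed, onRejected = function(error) {
  outcome$error <<- conditionMessage(error)
})

drain()
str(outcome)
```

The callbacks never run in the middle of other code; they are queued and executed on the main R thread when it is free.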

Without further ado, let’s get our hands on the code! You should be able to just copy-paste code into RStudio and run it.

R is single-threaded. This means that a user cannot interact with your Shiny app while a long-running task is executing on the server. Let’s take a look at an example:

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}

a <- longRunningFunction(1)
b <- longRunningFunction(2)
print("User interaction")                  # Time: 10s
c <- longRunningFunction(10)
print(a)
print(b)
sumAC <- a + c
sumBC <- b + c
print("User interaction")                  # Time: 15s
print(sumAC + sumBC)
print("User interaction")                  # Time: 15s

We’ll use a simplified version of user interaction while some additional computations are happening on the server. Let’s assume that we can’t put all the computations in a separate block of code and run it separately using the future package. There are many cases when it is very difficult or even almost impossible to gather all computations and run them elsewhere as one big block of code.

The user cannot interact with the app for 10 seconds until the first computations are finished, and then has to wait another 5 seconds for the next interaction. This is not where we would like to be. User interactions should be as fast as possible and the user shouldn’t have to wait unless it is required. Let’s fix that using the future package that we already know.

install.packages("future")
library(future)
plan(multisession)  # multiprocess is deprecated in newer releases of future

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}

a <- future(longRunningFunction(1))
b <- future(longRunningFunction(2))
print("User interaction")                  # Time: 0s
c <- future(longRunningFunction(10))
print(value(a))
print(value(b))
sumAC <- value(a) + value(c)
sumBC <- value(b) + value(c)
print("User interaction")                  # Time: 5s
print(sumAC + sumBC)
print("User interaction")                  # Time: 5s

Nice, now the first user interaction can happen in parallel! But the second interaction is still blocked - we have to wait for the values before we can print their sum. In order to fix that we’d like to chain the computation into the summing function instead of waiting synchronously for the result. We can’t do that using pure futures though (assuming we can’t just put all these computations in one single block of code and run it in parallel). Ideally we’d like to be able to write code similar to the one below:

library(future)
plan(multisession)

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}

a <- future(longRunningFunction(1))
b <- future(longRunningFunction(2))
print("User interaction")                  # Time: 0s
c <- future(longRunningFunction(10))
future(print(value(a)))
future(print(value(b)))
sumAC <- future(value(a) + value(c))
sumBC <- future(value(b) + value(c))
print("User interaction")                  # Time: 0s
future(print(value(sumAC) + value(sumBC)))
print("User interaction")                  # Time: 0s

Unfortunately the future package won’t allow us to do that.
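
What the future package does give us on its own is a non-blocking check, resolved(), next to the blocking value(). The sketch below is only an illustration under my own assumptions (a 2 second task and the variable names are made up); it shows that without chaining we are left with either polling or blocking.

```r
library(future)
plan(multisession)

# Start a short task in a background R session.
f <- future({
  Sys.sleep(2)
  42
})

# resolved() returns immediately and only tells us whether the task is done.
still_running <- !resolved(f)
print(still_running)

# value() blocks the main R process until the result is available.
v <- value(f)
print(v)

plan(sequential)  # shut the background workers down again
```

Polling with resolved() keeps the main process free, but we would have to keep checking ourselves; there is no way to say "run this once the value is ready".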


What we can do is use the promises package from RStudio!

devtools::install_github("rstudio/promises")

Let’s play with the promises! I simplified our example to let us focus on using promises first:

library(future)
plan(multisession)

library(tibble)

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}

a <- future(longRunningFunction(tibble(number = 1:100)))

print(value(a))

print("User interaction")                  # Time: 5s

We’d like to chain the result of longRunningFunction to a print function so that once the longRunningFunction is finished, its results are printed.

We can achieve that by using the %...>% operator. It works like the very popular %>% operator from magrittr. Think of %...>% as “sometime in the future, once I have the result of the operation, pass the result to the next function”. The three dots symbolise the fact that we have to wait and that the result will be passed on in the future; it’s not happening now.

library(future)
plan(multisession)
library(promises)
library(tibble)

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}

a <- future(longRunningFunction(tibble(number = 1:100)))

a %...>%
  print()                                  # Time: 5s

print("User interaction")                  # Time: 0s

Pure magic.
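
If you prefer to see what is behind the magic: as far as I can tell, the pipe is a shorthand for the then() function from promises. The sketch below is my own illustration, using promise_resolve() instead of a future so that it runs instantly; results and the later::run_now() loop are assumptions for running it as a script rather than in an interactive session.

```r
library(promises)

# A promise that is already fulfilled with a numeric vector.
p <- promise_resolve(c(1, 2, 3))

# Pipe style: once p has a value, pass it to sum().
piped <- p %...>% sum()

# The same chaining written with then() and an explicit callback.
explicit <- then(p, onFulfilled = function(x) sum(x))

# Collect both results so we can compare them.
results <- list()
r1 <- then(piped, onFulfilled = function(x) results$piped <<- x)
r2 <- then(explicit, onFulfilled = function(x) results$explicit <<- x)

# Run pending callbacks (an interactive session does this while idle).
while (!later::loop_empty()) later::run_now()

str(results)
```

Both chains return a new promise, which is why the operators can be stacked one after another.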

But what if I want to filter the result first and then print the processed data? Just keep on chaining:

library(future)
plan(multisession)
library(promises)
library(tibble)
library(dplyr)

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}


a <- future(longRunningFunction(tibble(number = 1:100)))

a %...>%
  filter(number %% 2 == 1) %...>%
  sum() %...>%
  print()

print("User interaction")

Neat. But how can I print the result of filtering and still pass it on to the sum function? There is a tee operator, %...T>%, the same as the one magrittr provides (but one that operates on a promise). It passes the incoming value to the function on its right and then forwards that same value down the chain, rather than whatever the function returned. So if you chain further after print(), the next function receives the filtered data, not the result of print(). Think of it as a railway switch: the value is printed on a side track, that run ends there, and the journey continues on the main track:

library(future)
plan(multisession)
library(promises)
library(tibble)
library(dplyr)

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}


a <- future(longRunningFunction(tibble(number = 1:100)))

a %...>%
  filter(number %% 2 == 1) %...T>%
  print() %...>%
  sum() %...>%
  print()

print("User interaction")
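
To convince yourself that the side track really is discarded, here is a small sketch of my own. record() and store() are hypothetical helpers invented for this illustration; record() deliberately returns a string that would break sum() if the tee passed it on.

```r
library(promises)

seen <- NULL   # value observed on the side track
total <- NULL  # value that reaches the end of the main track

# Side-track function: its return value should be ignored by the tee.
record <- function(x) {
  seen <<- x
  "return value that the tee discards"
}

# End of the main track: remember the final result.
store <- function(x) {
  total <<- x
}

chain <- promise_resolve(c(1, 3, 5)) %...T>%
  record() %...>%
  sum() %...>%
  store()

# Run pending callbacks (an interactive session does this while idle).
while (!later::loop_empty()) later::run_now()

print(seen)
print(total)
```

sum() receives the original numbers, not the string returned by record(), so the chain finishes without an error.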

What about errors? They are thrown somewhere other than the main thread, so how can I catch them? You guessed it - there is an operator for that as well. Use %...!% to handle errors:

library(future)
plan(multisession)
library(promises)
library(tibble)
library(dplyr)

longRunningFunction <- function(value) {
  stop("ERROR")
  return(value)
}


a <- future(longRunningFunction(tibble(number = 1:100)))

a %...>%
  filter(number %% 2 == 1) %...T>%
  print() %...>%
  sum() %...>%
  print() %...!%
  (function(error) {
     print(paste("Unexpected error:", error$message))
  })

print("User interaction")
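
Here is a stripped-down sketch of the same idea without futures, so you can see which steps run and which are skipped. It is my own illustration: the rejected promise is created by hand, and mark_ran() and handle() are hypothetical helpers.

```r
library(promises)

# A promise that fails right away with an error object.
failing <- promise(function(resolve, reject) {
  reject(simpleError("the task failed"))
})

step_ran <- FALSE
message_seen <- NULL

# A regular step: should be skipped because the promise was rejected.
mark_ran <- function(value) {
  step_ran <<- TRUE
  value
}

# The error handler: receives the error object.
handle <- function(error) {
  message_seen <<- conditionMessage(error)
}

handled <- failing %...>%
  mark_ran() %...!%
  handle()

# Run pending callbacks (an interactive session does this while idle).
while (!later::loop_empty()) later::run_now()

print(step_ran)
print(message_seen)
```

The error travels down the chain past every %...>% step until it reaches the first %...!% handler, much like an exception skipping code until it hits a catch block.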

But in our example we’re not just chaining one computation. There is a longRunningFunction call that eventually returns 1 and another one that eventually returns 2. We need to somehow join the two. Once both of them are ready, we’d like to use them and return the sum. We can use the promise_all function to achieve that. It takes several promises and returns a single promise that eventually resolves to a list containing the result of each of them.
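
A minimal sketch of promise_all on its own, using promises that are already fulfilled so nothing has to wait. The values 1 and 10 and the variable names are made up for this illustration, and I use base R's Reduce() here to keep it free of extra packages.

```r
library(promises)

# Two promises that are already fulfilled.
first <- promise_resolve(1)
second <- promise_resolve(10)

# A single promise that resolves to list(1, 10) once both inputs are ready.
both <- promise_all(first, second)

values_seen <- NULL
total <- NULL

done <- then(both, onFulfilled = function(values) {
  values_seen <<- values
  total <<- Reduce(`+`, values)
})

# Run pending callbacks (an interactive session does this while idle).
while (!later::loop_empty()) later::run_now()

str(values_seen)
print(total)
```

If any of the input promises fails, the combined promise fails as well, so one error handler at the end of the chain is enough.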

Perfect. We know the tools that we can use to chain asynchronous functions. Let’s use them in our example then:

library(future)
plan(multisession)
library(promises)
library(purrr)

longRunningFunction <- function(value) {
  Sys.sleep(5)
  return(value)
}

a <- future(longRunningFunction(1))
b <- future(longRunningFunction(2))
print("User interaction")                  # Time: 0s
c <- future(longRunningFunction(10))
a %...>% print()
b %...>% print()
sumAC <- promise_all(a, c) %...>% reduce(`+`)
sumBC <- promise_all(b, c) %...>% reduce(`+`)
print("User interaction")                  # Time: 0s
promise_all(sumAC, sumBC) %...>% reduce(`+`) %...>% print()
print("User interaction")                  # Time: 0s

A task for you - in the line sumAC <- promise_all(a, c) %...>% reduce(`+`), print the list of values from promises a and c before they are summed up.

A handful of useful notes:

[1] Support for promises has been implemented in Shiny, but neither the CRAN version nor the GitHub master branch includes it yet. Until that support is merged, you’ll have to install Shiny from the async branch:

devtools::install_github("rstudio/shiny@async")

[2] Beta-quality code at https://github.com/rstudio/promises

[3] Early drafts of docs temporarily hosted at: https://medium.com/@joe.cheng

[4] Joe Cheng’s talk at EARL 2017 in London - https://www.dropbox.com/s/2gf6tfk1t345lyf/async-talk-slides.pdf?dl=0

[5] The plan is to release everything on CRAN by the end of this year.

I hope you have as much fun playing with the promises as I did! I’m planning to play with shiny support for promises next.


Till next time!


Damian Rodziewicz



Appsilon Data Science Blog
