Proxy Futures

Last updated 1 November 2023

This guide walks through the use of the Store.future() interface and the associated Future class.

Note

Some familiarity with ProxyStore is assumed. Check out the Get Started guide and Concepts page to learn more about ProxyStore's core concepts.

Warning

The Store.future() and Future interfaces are experimental features and may change in future releases.

The Future interface enables a data producer to preemptively send a proxy to a data consumer before the target data has been created. The consumer will block the first time the proxy is used and resolved, and will remain blocked until the producer has created the target data.

Here is a trivial example using a Store and LocalConnector. The future.proxy() method creates a Proxy that will resolve to the result of the future.

example.py
from proxystore.connectors.local import LocalConnector
from proxystore.store import Store
from proxystore.store.future import Future

with Store('proxy-future-example', LocalConnector()) as store:
    # Create a future and a proxy that will resolve to its result.
    future: Future[str] = store.future()
    proxy = future.proxy()

    # Setting the result makes the target data available to the proxy.
    future.set_result('value')
    assert future.result() == 'value'
    assert proxy == 'value'

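The blocking behavior can be demonstrated by resolving the proxy from a separate thread before the result has been set. The following is a minimal sketch, assuming the single-process, threaded setup shown here exhibits the same blocking semantics described above; the thread and sleep are illustrative only.

blocking_example.py
import threading
import time

from proxystore.connectors.local import LocalConnector
from proxystore.store import Store

with Store('proxy-future-blocking-example', LocalConnector()) as store:
    future = store.future()
    proxy = future.proxy()

    def consumer() -> None:
        # The first use of the proxy triggers resolution and blocks
        # until the producer sets the result.
        print(proxy.upper())

    thread = threading.Thread(target=consumer)
    thread.start()

    time.sleep(1)  # The producer does other work before setting the result.
    future.set_result('value')
    thread.join()
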
Info

Not all Connector implementations are compatible with the Store.future() interface. The Connector instance used to initialize the Store must also implement the DeferrableConnector protocol; a NotImplementedError will be raised when calling Store.future() if the connector is not an instance of DeferrableConnector. Many of the out-of-the-box implementations, such as the EndpointConnector, FileConnector, and RedisConnector, implement the DeferrableConnector protocol.

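If the connector type is not known ahead of time, the NotImplementedError can be handled explicitly. The following is a minimal sketch of that defensive pattern; the fallback behavior is an assumption and would be application specific.

check_deferrable.py
from proxystore.connectors.local import LocalConnector
from proxystore.store import Store

with Store('deferrable-check-example', LocalConnector()) as store:
    try:
        # Raises NotImplementedError if the connector does not
        # implement the DeferrableConnector protocol.
        future = store.future()
    except NotImplementedError:
        # Fall back to a code path that does not rely on futures
        # (hypothetical; depends on the application).
        future = None
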
The power of Future comes when the data producer and consumer are executing independently in time and space (i.e., execution occurs in different processes, potentially on different systems, and in an undefined order). The Future enables the producer and consumer to share a data dependency, while allowing the consumer to eagerly start execution before the data dependencies are fully satisfied.

Consider the following example where we have a client that invokes two functions, foo() and bar(), on remote processes. foo() will produce an object needed by bar(), but we want to start executing foo() and bar() at the same time. (We could even start bar() before foo()!)

client.py
from proxystore.connectors.redis import RedisConnector
from proxystore.store import Store
from proxystore.store.future import Future

class MyData:
    ...

def foo(future: Future[MyData]) -> None:
    data: MyData = compute(...)
    future.set_result(data)

def bar(data: MyData) -> None:
    # Computation not involving data can execute freely.
    compute(...)
    # Computation using data will block until foo
    # sets the result of the future.
    compute(data)


with Store('proxy-future-example', RedisConnector(...)) as store:
    future: Future[MyData] = store.future()

    # The invoke_remote function will execute the function with
    # the provided arguments on an arbitrary remote process.
    foo_result_future = invoke_remote(foo, future)
    bar_result_future = invoke_remote(bar, future.proxy())

    # Wait on the functions to finish executing.
    foo_result_future.result()
    bar_result_future.result()

In this example, foo() and bar() start executing at the same time. This allows bar() to eagerly execute code that does not depend on the data produced by foo(). bar() will only block once the computation actually needs the data.
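
The invoke_remote function above is a stand-in for whatever task execution system the application uses. As one possibility, a minimal sketch using a concurrent.futures.ProcessPoolExecutor as the "remote" executor might look like the following; the name invoke_remote and the executor choice are assumptions, not part of the ProxyStore API.

invoke_remote.py
from concurrent.futures import Future as ExecutionFuture
from concurrent.futures import ProcessPoolExecutor
from typing import Any, Callable

executor = ProcessPoolExecutor()

def invoke_remote(function: Callable[..., Any], *args: Any) -> ExecutionFuture:
    # Submit the function to a worker process and return a
    # concurrent.futures.Future representing its completion.
    return executor.submit(function, *args)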