Globus Compute with ProxyStore
Last updated 20 April 2024
This guide walks through integrating ProxyStore into a
Globus Compute application.
A more complete example of using ProxyStore with Globus Compute can be found
in the examples/ directory.
Note
Some familiarity with using Globus Compute and ProxyStore is assumed. Check out the Globus Compute Quickstart and ProxyStore Get Started to learn more.
Installation
Create a new virtual environment of your choosing and install Globus Compute and ProxyStore.
Note
The versions below were the latest releases of these packages when this guide was written. These instructions should generally work with newer versions as well.
$ python -m venv venv
$ . venv/bin/activate
$ pip install globus-compute-sdk==2.18.1 globus-compute-endpoint==2.18.1 proxystore==0.6.5
Using Globus Compute
We will first configure and start a Globus Compute endpoint.
$ globus-compute-endpoint configure proxystore-example
$ globus-compute-endpoint start proxystore-example
After configuring the endpoint, you will get back an endpoint UUID which we will need in the next step.
Below is a modified example based on the example Globus Compute app from the Quickstart guide.
- Your endpoint's UUID.
- Define the function that will be executed remotely.
- Create the Globus Compute executor.
- Submit the function for execution.
- Wait on the result future.
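A minimal sketch of such a script, following the five annotations above, might look like the following. The function name total, the empty ENDPOINT_UUID placeholder, the lazy import inside main(), and the input of 50,000 ones (chosen only so that the sum is 50000) are assumptions made for illustration and may differ from the Quickstart code.

```python
"""Sketch: submit a simple function to a Globus Compute endpoint."""

# (1) Your endpoint's UUID. Left empty as a placeholder; paste in the
# value reported for your own endpoint.
ENDPOINT_UUID = ''


def total(x: list[int]) -> int:
    """(2) The function that will be executed remotely."""
    return sum(x)


def main() -> None:
    # Imported lazily so total() can be imported and tested on its own,
    # even where the SDK is not installed.
    from globus_compute_sdk import Executor

    # (3) Create the Globus Compute executor.
    with Executor(endpoint_id=ENDPOINT_UUID) as gce:
        # Assumed input: 50,000 ones, so the expected sum is 50000.
        x = [1] * 50000
        # (4) Submit the function for execution.
        future = gce.submit(total, x)
        # (5) Wait on the result future.
        print(future.result())


if __name__ == '__main__':
    if ENDPOINT_UUID:
        main()
    else:
        print('Set ENDPOINT_UUID to your endpoint UUID before running.')
```

The guard at the bottom only contacts the endpoint once a UUID has been filled in, so the file can be run or imported safely before the endpoint exists.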
Running this script will return 50000.
Using ProxyStore
Now we will update our script to use ProxyStore. This takes three steps:
- Initialize a Connector and Store. The Connector is the interface to the byte-level communication channel that will be used, and the Store is the high-level interface provided by ProxyStore.
- Register the Store instance globally. This is not strictly necessary, but it is an optimization that lets proxies share the same original Store instance, because the Store and Connector can have state (e.g., caches, open connections, etc.).
- Proxy the function inputs.
- Create a new store using the file system for mediated communication. Register the store instance so states (e.g., caches, etc.) can be shared.
- Proxy the input data.
- Close the Store to clean up any resources.
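The three steps can be sketched as follows. This is an illustrative sketch rather than the guide's exact script: the store name my-store, the STORE_DIR path, the function name total, and the input of 50,000 ones are assumptions. A file-based store also assumes the directory is visible to both the client and the endpoint.

```python
"""Sketch: pass a function input by proxy through a file-based Store."""

ENDPOINT_UUID = ''  # Placeholder: paste in your endpoint's UUID.
# Hypothetical path; it must be reachable by both client and endpoint.
STORE_DIR = './proxystore-cache'


def total(x: list[int]) -> int:
    """Remote function; x may be a plain list or a proxy of a list."""
    return sum(x)


def main() -> None:
    # Imported lazily so total() can be tested without these packages.
    from globus_compute_sdk import Executor
    from proxystore.connectors.file import FileConnector
    from proxystore.store import Store, register_store

    # (1) Create a store that uses the file system for mediated
    # communication, then register it so state can be shared.
    store = Store('my-store', FileConnector(STORE_DIR))
    register_store(store)

    with Executor(endpoint_id=ENDPOINT_UUID) as gce:
        x = [1] * 50000  # Assumed input.
        # (2) Proxy the input data; only the proxy is sent with the task.
        p = store.proxy(x)
        future = gce.submit(total, p)
        print(future.result())

    # (3) Close the Store to clean up any resources.
    store.close()


if __name__ == '__main__':
    if ENDPOINT_UUID:
        main()
    else:
        print('Set ENDPOINT_UUID to your endpoint UUID before running.')
```

The remote function is unchanged from a version that takes a plain list, because the proxy resolves to the original list when it is first used.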
Tip
The Store can also be used as a context manager that will automatically clean up resources.
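As a rough illustration of the context-manager form, the sketch below creates a Store, proxies a value, and uses it locally, with the Store closed on exit from the with block. The helper name sum_via_store and the ./proxystore-cache path are hypothetical, and no Globus Compute endpoint is involved.

```python
def sum_via_store(values: list[int]) -> int:
    """Proxy values through a temporary file-based Store and sum them."""
    # Imported lazily so this module loads without ProxyStore installed.
    from proxystore.connectors.file import FileConnector
    from proxystore.store import Store

    # The Store is closed automatically when the with block exits.
    with Store('my-store', FileConnector('./proxystore-cache')) as store:
        proxy = store.proxy(values)
        # The proxy resolves to the stored list on first use, so it must
        # be consumed before the Store is closed.
        return sum(proxy)
```

The value is consumed inside the with block on purpose: once the Store is closed, its resources may no longer be available to resolve outstanding proxies.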
We can also use ProxyStore to return data via the same communication method.
- Globus Compute functions will be executed in a different process so we must import inside the function.
- If our input data was communicated via a proxy, we get the same Store that created our input proxy, which we then use to proxy the output.
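A sketch of a script matching these two annotations is shown below. As before, the store name, STORE_DIR, the function name total, and the input of 50,000 ones are illustrative assumptions, and the client-side setup simply mirrors the earlier steps.

```python
"""Sketch: return the result by proxy through the input's Store."""

ENDPOINT_UUID = ''  # Placeholder: paste in your endpoint's UUID.
# Hypothetical path; it must be reachable by both client and endpoint.
STORE_DIR = './proxystore-cache'


def total(x: list[int]) -> int:
    # (1) Globus Compute functions are executed in a different process,
    # so the imports must happen inside the function.
    from proxystore.proxy import Proxy
    from proxystore.store import get_store

    y = sum(x)
    if isinstance(x, Proxy):
        # (2) The input arrived as a proxy: get the Store that created
        # it and use that Store to proxy the output.
        store = get_store(x)
        return store.proxy(y)
    return y


def main() -> None:
    from globus_compute_sdk import Executor
    from proxystore.connectors.file import FileConnector
    from proxystore.store import Store, register_store

    store = Store('my-store', FileConnector(STORE_DIR))
    register_store(store)

    with Executor(endpoint_id=ENDPOINT_UUID) as gce:
        future = gce.submit(total, store.proxy([1] * 50000))
        # The result is a proxy; printing it resolves the stored value.
        print(future.result())

    # Close only after the returned proxy has been used.
    store.close()


if __name__ == '__main__':
    if ENDPOINT_UUID:
        main()
    else:
        print('Set ENDPOINT_UUID to your endpoint UUID before running.')
```

Checking isinstance(x, Proxy) keeps the function usable with plain inputs as well, in which case the result is returned directly.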
Closing Thoughts
This example is trivial: the target function is still executed on
the local machine and the data sizes are small. The key takeaway is that
the Proxy model simplifies the process of moving
data via alternate means between the Globus Compute client and executors.
More complex applications, where the Globus Compute endpoints live elsewhere
(e.g., on an HPC cluster) or where larger data is moved, will benefit from the
various Connector implementations provided.
Check out the other Guides to learn about more advanced ProxyStore features.