Get Started
Figure 1: ProxyStore allows developers to communicate objects via proxies. Proxies act as lightweight references that resolve to a target object upon use. Communication via proxies gives applications the illusion that objects are moving through a specified path (e.g., through a network socket, cloud server, workflow engine, etc.) while the true path the data takes is different. Transporting the lightweight proxies through the application or systems can be far more efficient and reduce overheads.
Overview
ProxyStore provides a unique interface to object stores through transparent object proxies. The interface is designed to simplify the use of object stores for transferring large objects in distributed applications.
Proxies are used to intercept and redefine operations on a target object. A transparent proxy behaves identically to its target object because the proxy forwards all operations on itself to the target. A lazy proxy provides just-in-time resolution of the target object via a factory function. Factories return the target object when called, and a proxy, initialized with a factory, will delay calling the factory to retrieve the target object until the first time the proxy is accessed.
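The relationship between a factory and a lazy proxy can be illustrated with a minimal pure-Python sketch. The `LazyProxy` class and `expensive_factory` function are hypothetical names made up for this illustration; they are not ProxyStore's Proxy or Factory implementations, and only a handful of operations are forwarded here.

```python
from typing import Any, Callable, List


class LazyProxy:
    """Toy lazy proxy: forwards operations to a target built on first use."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._target = None
        self._resolved = False

    def _resolve(self) -> Any:
        # Call the factory at most once, the first time the proxy is used.
        if not self._resolved:
            self._target = self._factory()
            self._resolved = True
        return self._target

    def __getattr__(self, name: str) -> Any:
        # Only invoked for attributes that are not found on the proxy itself.
        return getattr(self._resolve(), name)

    def __len__(self) -> int:
        return len(self._resolve())

    def __getitem__(self, index: Any) -> Any:
        return self._resolve()[index]

    def __eq__(self, other: object) -> bool:
        return self._resolve() == other


calls: List[str] = []


def expensive_factory() -> List[int]:
    # Stand-in for retrieving a large object from somewhere remote.
    calls.append('called')
    return [1, 2, 3]


proxy = LazyProxy(expensive_factory)
assert calls == []            # creating the proxy does not call the factory
assert len(proxy) == 3        # first use resolves the target
assert proxy[0] == 1          # later uses reuse the resolved target
assert proxy.count(2) == 1    # ordinary methods are forwarded as well
assert len(calls) == 1        # the factory ran exactly once
```

A full transparent proxy would forward every special method, not just the few shown in this sketch.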
ProxyStore uses lazy transparent object proxies as the interface to object stores. When an object is proxied, the object is placed in the specified object store, a factory containing the information needed to retrieve the object from the store is created, and a proxy, initialized with the factory, is returned. The resulting proxy is essentially a lightweight reference to the target that will resolve itself to the target and behave as the target once the proxy is first used. Thus, proxies can be used anywhere in place of the true object and will resolve themselves without the program being aware.
ProxyStore provides the proxy interface to a number of commonly used object stores as well as the Proxy and Factory building blocks to allow developers to create powerful just-in-time resolution functionality for Python objects.
Usage
ProxyStore is intended to be used via the Store interface, which provides the Store.proxy() method for placing objects in stores and creating proxies that will resolve to the associated object in the store.
A Store is initialized with a Connector, which serves as the low-level interface to a byte-level object store. ProxyStore provides many Connector implementations, and third-party code can provide custom implementations provided they meet the Connector protocol specification.
The following example uses the RedisConnector to interface with an already running Redis server using proxies.
- A registered store can be retrieved by name.
- Stores have basic get/put functionality.
- Place an object in the store and return a proxy.
- The proxy, when used, will behave as the target.
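The four steps listed above can be walked through with a self-contained sketch that needs neither ProxyStore nor a Redis server. Everything here is a hypothetical stand-in: `ToyStore`, `ToyProxy`, and the `_REGISTRY` dict are invented for illustration, an in-memory dict plays the role of Redis, and none of these names are part of ProxyStore's API.

```python
import pickle
import uuid
from typing import Any, Dict

# Hypothetical registry standing in for ProxyStore's store registration.
_REGISTRY: Dict[str, 'ToyStore'] = {}


class ToyProxy:
    """Resolves to the stored object the first time it is used."""

    def __init__(self, store_name: str, key: str) -> None:
        self._store_name = store_name
        self._key = key
        self._target = None
        self._resolved = False

    def _resolve(self) -> Any:
        if not self._resolved:
            # The "factory" step: find the store by name, then fetch by key.
            self._target = _REGISTRY[self._store_name].get(self._key)
            self._resolved = True
        return self._target

    @property
    def __class__(self):
        # Report the target's class so isinstance() checks pass.
        return type(self._resolve())

    def __getattr__(self, name: str) -> Any:
        return getattr(self._resolve(), name)

    def __getitem__(self, index: Any) -> Any:
        return self._resolve()[index]


class ToyStore:
    """In-memory stand-in for a store backed by a byte-level object store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[str, bytes] = {}

    def put(self, obj: Any) -> str:
        key = uuid.uuid4().hex
        self._data[key] = pickle.dumps(obj)
        return key

    def get(self, key: str) -> Any:
        return pickle.loads(self._data[key])

    def proxy(self, obj: Any) -> ToyProxy:
        return ToyProxy(self.name, self.put(obj))


_REGISTRY['my-store'] = ToyStore('my-store')

store = _REGISTRY['my-store']             # retrieve a registered store by name
my_object = {'weights': [0.1, 0.2, 0.3]}

key = store.put(my_object)                # basic get/put functionality
assert store.get(key) == my_object

p = store.proxy(my_object)                # place the object, get a proxy back
assert isinstance(p, type(my_object))     # the proxy behaves as the target
assert p['weights'][1] == 0.2
```

The toy proxy holds only a store name and a key, which is why such a reference stays small no matter how large the stored object is.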
This proxy, p, can be cheaply serialized and communicated to any arbitrary Python process as if it were the target object itself. Once the proxy is used on the remote process, the underlying factory function will be executed to retrieve the target object from the Redis server.
Using the Store interface allows developers to write code without needing to worry about how data communication is handled and reduces the number of lines of code that need to be changed when adding or changing the communication methods.
For example, if you want to execute a function and the input data may be passed directly, via a key to an object in Redis, or as a filepath to a serialized object on disk, you will need boilerplate code that looks like:
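A minimal runnable sketch of this kind of boilerplate follows. It assumes the data is a list of floats, and it uses a hypothetical `FAKE_REDIS` dict in place of a real Redis client so that it runs without a server; the dispatch logic is illustrative rather than taken from any particular application.

```python
import os
import pickle
import tempfile
from typing import Dict, List, Union

# Hypothetical stand-in for a Redis server: maps keys to serialized objects.
FAKE_REDIS: Dict[str, bytes] = {}


def my_function(data_or_ref: Union[List[float], str]) -> float:
    # The consumer has to work out how the data was communicated.
    if isinstance(data_or_ref, str) and os.path.isfile(data_or_ref):
        with open(data_or_ref, 'rb') as f:
            data = pickle.load(f)
    elif isinstance(data_or_ref, str) and data_or_ref in FAKE_REDIS:
        data = pickle.loads(FAKE_REDIS[data_or_ref])
    elif isinstance(data_or_ref, list):
        data = data_or_ref
    else:
        raise ValueError(f'Unsupported input: {data_or_ref!r}')

    # Compute using the data.
    return sum(data)


values = [1.0, 2.0, 3.0]
FAKE_REDIS['my-key'] = pickle.dumps(values)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'values.pkl')
    with open(path, 'wb') as f:
        pickle.dump(values, f)
    # The same computation, reached through three communication methods.
    results = [my_function(values), my_function('my-key'), my_function(path)]

assert results == [6.0, 6.0, 6.0]
```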
This function is hard to annotate with types and must be extended every time a new communication method is used. With proxies, all of the boilerplate code can be removed because the proxy will contain within itself all of the necessary code to resolve the object.
- Always true even if input is a proxy.
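The simplified consumer might look like the sketch below. `MyDataType` is a hypothetical data class, and `ToyProxy` is a deliberately small stand-in written for this example rather than ProxyStore's proxy; it reports the target's class through `__class__`, which is one way an `isinstance` check can succeed for a proxy.

```python
from typing import Any, Callable, List


class MyDataType:
    """Hypothetical application data type."""

    def __init__(self, values: List[float]) -> None:
        self.values = values


class ToyProxy:
    """Toy lazy proxy that presents itself as its target's type."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._target = None
        self._resolved = False

    def _resolve(self) -> Any:
        if not self._resolved:
            self._target = self._factory()
            self._resolved = True
        return self._target

    @property
    def __class__(self):
        # isinstance() consults __class__, so report the target's class.
        return type(self._resolve())

    def __getattr__(self, name: str) -> Any:
        return getattr(self._resolve(), name)


def my_function(data: MyDataType) -> float:
    assert isinstance(data, MyDataType)  # holds for a proxy as well

    # Compute using the data.
    return sum(data.values)


direct = MyDataType([1.0, 2.0, 3.0])
proxied = ToyProxy(lambda: MyDataType([1.0, 2.0, 3.0]))

assert my_function(direct) == 6.0
assert my_function(proxied) == 6.0
```

The consumer has a single code path and a single type annotation regardless of how the producer chose to communicate the object.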
In this model, only the producer of the data needs to be aware of which ProxyStore backend to use, and no modifications to consumer code are ever required.
How is this more efficient?
The ProxyStore model can improve application performance in many ways:
- Unused proxies are not resolved, so no resources or time are wasted on communication.
- Object communication always takes place between the producer, the store, and the consumer, meaning communication is not wasted on intermediate processes which have a proxy but do not use it.
- Different backends can be used that are optimized for specific usage patterns.
- Proxies have built-in caching for frequently used objects.
See the Concepts page to learn more!