proxystore.serialize

Serialization functions.

BytesLike

Bases: Buffer, Sized, Protocol

Protocol for bytes-like objects.

SerializationError

Bases: Exception

Base Serialization Exception.

is_bytes_like

is_bytes_like(obj: Any) -> TypeGuard[BytesLike]

Check if the object is bytes-like.

Source code in proxystore/serialize.py
def is_bytes_like(obj: Any) -> TypeGuard[BytesLike]:
    """Check if the object is bytes-like."""
    if sys.version_info >= (3, 12):  # pragma: >=3.12 cover
        return isinstance(obj, BytesLike)
    else:  # pragma: <3.12 cover
        return isinstance(obj, (bytes, bytearray, memoryview))
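
As a rough illustration of what the check accepts, the sketch below mirrors only the pre-3.12 branch of the source above. The name `looks_bytes_like` is a hypothetical stand-in, not part of proxystore; on Python 3.12 and later the real function tests against the `BytesLike` protocol instead, which may accept additional buffer types.

```python
from typing import Any


def looks_bytes_like(obj: Any) -> bool:
    """Hypothetical stand-in that mirrors the pre-3.12 branch shown above."""
    return isinstance(obj, (bytes, bytearray, memoryview))


# The three built-in binary types pass the check.
accepted = [
    looks_bytes_like(b'abc'),
    looks_bytes_like(bytearray(b'abc')),
    looks_bytes_like(memoryview(b'abc')),
]

# Text and other non-binary objects do not.
rejected = [
    looks_bytes_like('abc'),
    looks_bytes_like(123),
    looks_bytes_like([1, 2, 3]),
]
```

Note that `str` is rejected, so text has to be encoded (or passed through `serialize()`) before it can be handed to `deserialize()`.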

serialize

serialize(obj: Any) -> bytes

Serialize object.

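Objects are serialized with different mechanisms depending on their type:

  • bytes types are not serialized.
  • str types are encoded to bytes.
  • numpy.ndarray types are serialized using numpy.save.
  • pandas.DataFrame types are serialized using to_pickle.
  • polars.DataFrame types are serialized using write_ipc.
  • Other types are pickled. If pickle fails, cloudpickle is used as a fallback.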

Parameters:

  • obj (Any) –

    Object to serialize.

Returns:

  • bytes –

    Bytes-like object that can be passed to deserialize().

Raises:

  • SerializationError

    If serializing the object fails with all available serializers. Cloudpickle is the last resort, so this error will typically be raised from a cloudpickle error.

Source code in proxystore/serialize.py
def serialize(obj: Any) -> bytes:
    """Serialize object.

    Objects are serialized with different mechanisms depending on their type.

      - [bytes][] types are not serialized.
      - [str][] types are encoded to bytes.
      - [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html){target=_blank}
        types are serialized using
        [numpy.save](https://numpy.org/doc/stable/reference/generated/numpy.save.html){target=_blank}.
      - [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html){target=_blank}
        types are serialized using
        [to_pickle](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html){target=_blank}.
      - [polars.DataFrame](https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/index.html){target=_blank}
        types are serialized using
        [write_ipc](https://docs.pola.rs/api/python/stable/reference/api/polars.DataFrame.write_ipc.html){target=_blank}.
      - Other types are
        [pickled](https://docs.python.org/3/library/pickle.html){target=_blank}.
        If pickle fails,
        [cloudpickle](https://github.com/cloudpipe/cloudpickle){target=_blank}
        is used as a fallback.

    Args:
        obj: Object to serialize.

    Returns:
        Bytes-like object that can be passed to \
        [`deserialize()`][proxystore.serialize.deserialize].

    Raises:
        SerializationError: If serializing the object fails with all available
            serializers. Cloudpickle is the last resort, so this error will
            typically be raised from a cloudpickle error.
    """
    last_exception: Exception | None = None
    for identifier, serializer in _SERIALIZERS.items():
        if serializer.supported(obj):
            try:
                buffer = io.BytesIO()
                buffer.write(identifier + b'\n')
                serializer.serialize(obj, buffer)
                return buffer.getvalue()
            except Exception as e:
                last_exception = e

    assert last_exception is not None
    raise SerializationError(
        f'Object of type {type(obj)} is not supported.',
    ) from last_exception
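
The source above writes an identifier line, then the serializer's payload, into a single buffer. The sketch below illustrates that framing with only the standard library. The identifiers `demo-str` and `demo-pickle` and the function `sketch_serialize` are assumptions made for illustration; the real identifiers live in the private `_SERIALIZERS` registry, which this page does not show.

```python
import io
import pickle
from typing import Any

# Hypothetical identifiers; the real values are not shown on this page.
STR_ID = b'demo-str'
PICKLE_ID = b'demo-pickle'


def sketch_serialize(obj: Any) -> bytes:
    """Write an identifier line followed by the serialized payload."""
    buffer = io.BytesIO()
    if isinstance(obj, str):
        # Strings are encoded to bytes rather than pickled.
        buffer.write(STR_ID + b'\n')
        buffer.write(obj.encode())
    else:
        # Everything else falls through to pickle in this sketch.
        buffer.write(PICKLE_ID + b'\n')
        pickle.dump(obj, buffer)
    return buffer.getvalue()


# Split on the first newline to separate the identifier from the payload.
str_header, _, str_payload = sketch_serialize('hello').partition(b'\n')
obj_header, _, obj_payload = sketch_serialize([1, 2, 3]).partition(b'\n')
```

Because the identifier occupies the first line, a reader only needs to consume up to the first newline to decide how to decode the rest, even if the payload itself contains newline bytes.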

deserialize

deserialize(buffer: BytesLike) -> Any

Deserialize object.

Warning

Pickled data is not secure, and a malicious pickled object can execute arbitrary code when unpickled. Only unpickle data you trust.

Parameters:

  • buffer (BytesLike) –

    Bytes-like object produced by serialize().

Returns:

  • Any

    The deserialized object.

Raises:

  • ValueError

    If buffer is not bytes-like.

  • SerializationError

If the identifier of buffer is missing or invalid. The identifier is prepended to the serialized bytes in serialize() to indicate which serialization method was used (e.g., no serialization, pickle, etc.).

  • SerializationError

If pickle or cloudpickle raises an exception when deserializing the object.

Source code in proxystore/serialize.py
def deserialize(buffer: BytesLike) -> Any:
    """Deserialize object.

    Warning:
        Pickled data is not secure, and a malicious pickled object can execute
        arbitrary code when unpickled. Only unpickle data you trust.

    Args:
        buffer: Bytes-like object produced by
            [`serialize()`][proxystore.serialize.serialize].

    Returns:
        The deserialized object.

    Raises:
        ValueError: If `buffer` is not bytes-like.
        SerializationError: If the identifier of `buffer` is missing or
            invalid. The identifier is prepended to the serialized bytes in
            [`serialize()`][proxystore.serialize.serialize] to indicate which
            serialization method was used (e.g., no serialization, pickle,
            etc.).
        SerializationError: If pickle or cloudpickle raises an exception
            when deserializing the object.
    """
    if not is_bytes_like(buffer):
        raise ValueError(
            f'Expected data to be a bytes-like type, not {type(buffer)}.',
        )

    with io.BytesIO(buffer) as buffer_io:
        identifier = buffer_io.readline().strip()
        if identifier not in _SERIALIZERS:
            raise SerializationError(
                f'Unknown identifier {identifier!r} for deserialization.',
            )

        serializer = _SERIALIZERS[identifier]
        try:
            return serializer.deserialize(buffer_io)
        except Exception as e:
            raise SerializationError(
                'Failed to deserialize object using the '
                f'{serializer.name} serializer.',
            ) from e
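
The dispatch and error handling above can be sketched with the standard library alone. Everything named here (`SketchSerializationError`, `_DEMO_DESERIALIZERS`, `sketch_deserialize`, and the `demo-*` identifiers) is hypothetical and stands in for proxystore's private registry; the sketch only follows the control flow visible in the source listing.

```python
import io
import pickle
from typing import Any, Callable, Dict


class SketchSerializationError(Exception):
    """Hypothetical stand-in for proxystore's SerializationError."""


# Hypothetical registry mapping identifiers to payload readers.
_DEMO_DESERIALIZERS: Dict[bytes, Callable[[io.BytesIO], Any]] = {
    b'demo-str': lambda f: f.read().decode(),
    b'demo-pickle': lambda f: pickle.load(f),
}


def sketch_deserialize(buffer: bytes) -> Any:
    """Read the identifier line, then hand the rest to the matching reader."""
    with io.BytesIO(buffer) as buffer_io:
        identifier = buffer_io.readline().strip()
        if identifier not in _DEMO_DESERIALIZERS:
            raise SketchSerializationError(
                f'Unknown identifier {identifier!r} for deserialization.',
            )
        try:
            return _DEMO_DESERIALIZERS[identifier](buffer_io)
        except Exception as e:
            # Wrap reader failures so callers catch a single error type.
            raise SketchSerializationError(
                'Failed to deserialize object.',
            ) from e


text = sketch_deserialize(b'demo-str\nhello')
mapping = sketch_deserialize(b'demo-pickle\n' + pickle.dumps({'a': 1}))

# An unrecognized identifier is rejected before any payload is decoded.
try:
    sketch_deserialize(b'bogus\npayload')
    unknown_rejected = False
except SketchSerializationError:
    unknown_rejected = True
```

Chaining with `from e` keeps the underlying pickle error available on `__cause__`, which helps when diagnosing corrupted or truncated data. Rejecting unknown identifiers does not make unpickling untrusted data safe; the warning above still applies.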