Performance Tracking¶
Last updated 20 April 2024
The Store can record metrics on executed operations (e.g., get and put).
Metric collection is disabled by default and can be enabled by passing metrics=True to the Store constructor.
Enabling Metrics¶
Metric tracking is not enabled by default.
Metrics are accessed via the Store.metrics property.
This property will be None when metrics are disabled.
Warning
Metrics are local to each Store instance.
In multi-process applications or applications that instantiate multiple Store instances, Store.metrics will only represent a partial view of the overall performance.
Warning
ProxyStore v0.6.4 and older have a bug that causes the conversion from nanoseconds to milliseconds in the Metrics class to be incorrect.
This was fixed in v0.6.5 (see PR #538).
Three types of metrics are collected.
- Attributes: arbitrary attributes associated with an operation.
- Counters: scalar counters that represent the number of times an event occurs.
- Times: durations of events.
A Simple Example¶
Consider executing a get and put operation on store.
We can inspect the metrics recorded for operations on key.
>>> import dataclasses
>>> metrics = store.metrics.get_metrics(key)
>>> tuple(field.name for field in dataclasses.fields(metrics))
('attributes', 'counters', 'times')
metrics is an instance of Metrics, a dataclass with three fields: attributes, counters, and times.
We can further inspect these fields.
>>> metrics.attributes
{'store.get.object_size': 219, 'store.put.object_size': 219}
>>> metrics.counters
{'store.get.cache_misses': 1}
>>> metrics.times
{
'store.put.serialize': TimeStats(
count=1, avg_time_ms=9.9, min_time_ms=9.9, max_time_ms=9.9
),
'store.put.connector': TimeStats(
count=1, avg_time_ms=36.9, min_time_ms=36.9, max_time_ms=36.9
),
'store.put': TimeStats(
count=1, avg_time_ms=53.4, min_time_ms=53.4, max_time_ms=53.4
),
'store.get.connector': TimeStats(
count=1, avg_time_ms=16.1, min_time_ms=16.1, max_time_ms=16.1
),
'store.get.deserialize': TimeStats(
count=1, avg_time_ms=7.6, min_time_ms=7.6, max_time_ms=7.6
),
'store.get': TimeStats(
count=1, avg_time_ms=45.6, min_time_ms=45.6, max_time_ms=45.6
),
}
Operations or events are represented by a hierarchical namespace.
E.g., store.get.object_size is the serialized object size from the call to Store.get().
In metrics.attributes, we see the serialized object was 219 bytes.
In metrics.counters, we see we had one cache miss when getting the object.
In metrics.times, we see statistics about the duration of each operation.
For example, store.get is the overall time Store.get() took, store.get.connector is the time spent calling Connector.get(), and store.get.deserialize is the time spent deserializing the object returned by Connector.get().
If we get the object again, we'll see the metrics change.
>>> store.get(key)
>>> metrics = store.metrics.get_metrics(key)
>>> metrics.counters
{'store.get.cache_hits': 1, 'store.get.cache_misses': 1}
>>> metrics.times['store.get']
TimeStats(count=2, avg_time_ms=24.4, min_time_ms=3.2, max_time_ms=45.6)
The average and minimum times for store.get dropped significantly because the second get was a cache hit.
Attributes of a TimeStats instance can be accessed directly.
Metrics with Proxies¶
Metrics are also tracked on proxy operations.
>>> proxy = store.proxy(target)
>>> # Access the proxy to force it to resolve.
>>> assert proxy[0] == 0
>>> metrics = store.metrics.get_metrics(proxy)
>>> metrics.times
{
'factory.call': TimeStats(...),
'factory.resolve': TimeStats(...),
'store.get': TimeStats(...),
'store.get.connector': TimeStats(...),
'store.get.deserialize': TimeStats(...),
'store.proxy': TimeStats(...),
'store.put': TimeStats(...),
'store.put.connector': TimeStats(...),
'store.put.serialize': TimeStats(...),
}
Store.proxy() internally called Store.put().
Accessing the proxy internally resolved the factory, so we also see metrics about the factory and store.get.
Warning
For metrics to be tracked appropriately when a proxy is resolved, the Store needs to be registered globally by setting register=True in the constructor or by manually registering it with register_store().
Otherwise, the factory will initialize a second Store when the proxy is resolved and record its metrics to that second instance.
Metrics for Batch Operations¶
For batch Store operations, metrics are recorded for the entire batch.
I.e., the batch of keys is treated as a single super key.
>>> keys = store.put_batch(['value1', 'value2', 'value3'])
>>> metrics = store.metrics.get_metrics(keys)
>>> metrics.times
{
'store.put_batch.serialize': TimeStats(...),
'store.put_batch.connector': TimeStats(...),
'store.put_batch': TimeStats(...),
}
Aggregating Metrics¶
Rather than accessing metrics associated with a specific key (or batched key), time statistics can be aggregated over all keys.
>>> store.metrics.aggregate_times()
{
'factory.call': TimeStats(...),
'factory.resolve': TimeStats(...),
'store.get': TimeStats(...),
'store.get.connector': TimeStats(...),
'store.get.deserialize': TimeStats(...),
'store.proxy': TimeStats(...),
'store.put': TimeStats(...),
'store.put.connector': TimeStats(...),
'store.put.serialize': TimeStats(...),
'store.put_batch': TimeStats(...),
'store.put_batch.connector': TimeStats(...),
'store.put_batch.serialize': TimeStats(...),
}
Each TimeStats represents the aggregate over all keys.
The Python code used to generate the above examples can be found at github.com/proxystore/proxystore/examples/store_metrics.py.