bencher.result_collector

Result collection and storage for benchmarking.

This module provides the ResultCollector class for managing benchmark results, including xarray dataset operations, caching, and metadata management.

Attributes

`logger`
`_MEDIA_RESULT_TYPES`

Classes

ResultCollector

Manages benchmark result collection, storage, and caching.

Functions

`_sentinel_for_result_var`(rv)	Return the sentinel value used for 'missing' entries of this result type.
`_null_old_entries`(dataset, rv, var_limit)	Null out over_time entries older than var_limit for a single result variable.
`set_xarray_multidim`(→ xarray.DataArray)	Set a value in a multi-dimensional xarray at the specified index position.
`_set_result_value`(→ None)	Write a single result value, using pre-cached numpy arrays when available.

Module Contents

bencher.result_collector.logger

bencher.result_collector._MEDIA_RESULT_TYPES

bencher.result_collector._sentinel_for_result_var(rv)

Return the sentinel value used for ‘missing’ entries of this result type.

Thin wrapper over the single source of truth in bencher.variables.results (result_missing_fill); kept as a local alias for the over_time aging path.

bencher.result_collector._null_old_entries(dataset, rv, var_limit)

Null out over_time entries older than var_limit for a single result variable.

Mutates *dataset* in-place by writing sentinel values directly into the backing numpy arrays of the affected data variables.

For media types (images, videos, .rrd files), the referenced files are collected for deferred deletion. Returns a list of file paths to delete; the caller is responsible for removing them after the dataset is cached so that a cache-write failure does not leave orphaned sentinel values.
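The in-place aging and deferred deletion described above can be sketched with plain numpy. This is only an illustration, not the module's implementation: the function name, the `collect_paths` flag, the choice of the last axis as the time axis, and the `None` sentinel for media are all assumptions made for the example.

```python
import numpy as np

def null_old_entries(values, var_limit, sentinel, collect_paths=False):
    """Overwrite all but the newest `var_limit` events, in place.

    Assumes the time axis is the last axis of `values` (an assumption of
    this sketch). Returns the overwritten file paths when `collect_paths`
    is True, otherwise an empty list.
    """
    n_old = max(values.shape[-1] - var_limit, 0)
    old = values[..., :n_old]  # basic slicing yields a view, not a copy
    paths = []
    if collect_paths:
        paths = [p for p in old.ravel().tolist() if isinstance(p, str)]
    old[...] = sentinel  # writes through the view into `values`
    return paths

# Scalar results: 2 sample points x 5 time events, keep the newest 2.
scalars = np.arange(10, dtype=float).reshape(2, 5)
null_old_entries(scalars, var_limit=2, sentinel=np.nan)

# Media results: paths are collected, not deleted, so the caller can
# remove the files only after the dataset has been cached successfully.
media = np.array([["a.png", "b.png", "c.png"]], dtype=object)
to_delete = null_old_entries(media, var_limit=1, sentinel=None, collect_paths=True)
```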

bencher.result_collector.set_xarray_multidim(data_array: xarray.DataArray, index_tuple: tuple[int, Ellipsis], value: Any) → xarray.DataArray

Set a value in a multi-dimensional xarray at the specified index position.

This function sets a value in an N-dimensional xarray using dynamic indexing that works for any number of dimensions.

Parameters:

data_array (xr.DataArray) – The data array to modify
index_tuple (tuple[int, ...]) – The index coordinates as a tuple
value (Any) – The value to set at the specified position

Returns:

The modified data array

Return type:

xr.DataArray
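A minimal sketch of tuple-based indexing on a DataArray, assuming positional (integer) indices. `set_value_at` is a hypothetical stand-in written for this example; it is not claimed to match the body of set_xarray_multidim.

```python
from typing import Any

import numpy as np
import xarray as xr

def set_value_at(data_array: xr.DataArray, index_tuple: tuple[int, ...], value: Any) -> xr.DataArray:
    """Assign `value` at the position given by `index_tuple`."""
    # A tuple key is interpreted positionally, one entry per dimension,
    # so the same line works for any number of dimensions.
    data_array[index_tuple] = value
    return data_array

da = xr.DataArray(np.zeros((2, 3, 4)), dims=("x", "y", "repeat"))
set_value_at(da, (1, 2, 3), 42.0)
```

Because the index is passed as one tuple, the caller does not need to know the dimensionality of the dataset ahead of time.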

bencher.result_collector._set_result_value(bench_res: bencher.results.bench_result.BenchResult, rv_arrays: dict[str, numpy.ndarray] | None, name: str, idx: tuple, value: Any) → None

Write a single result value, using pre-cached numpy arrays when available.

class bencher.result_collector.ResultCollector(cache_size: int = DEFAULT_CACHE_SIZE_BYTES)

Manages benchmark result collection, storage, and caching.

This class handles the initialization of xarray datasets for storing benchmark results, storing results from worker jobs, managing caches, and adding metadata.

cache_size

Maximum size of the cache in bytes

Type: int

ds_dynamic

Dictionary for storing unstructured vector datasets

Type: dict

cache_size = 0

ds_dynamic: dict

_benchmark_cache: diskcache.Cache | None = None

_history_cache: diskcache.Cache | None = None

get_benchmark_cache() → diskcache.Cache

Return the persistent benchmark_inputs Cache, creating it on first access.

get_history_cache() → diskcache.Cache

Return the persistent history Cache, creating it on first access.

close_caches() → None

Close any open cache instances. Safe to call multiple times.

__enter__() → ResultCollector

__exit__(*exc_info) → None
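The create-on-first-access and context-manager behaviour can be illustrated with a stand-in class. Everything below is hypothetical: `FakeCache` replaces diskcache.Cache so the example has no dependencies, `LazyCacheOwner` is not ResultCollector, and the assumption that `__exit__` delegates to `close_caches()` is not confirmed by this page.

```python
class FakeCache:
    """Hypothetical stand-in for diskcache.Cache; tracks only open/closed state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    def close(self) -> None:
        self.closed = True


class LazyCacheOwner:
    """Sketch of lazy cache creation plus idempotent cleanup."""

    def __init__(self) -> None:
        self._benchmark_cache = None

    def get_benchmark_cache(self) -> FakeCache:
        # Create the cache only when it is first requested.
        if self._benchmark_cache is None:
            self._benchmark_cache = FakeCache("benchmark_inputs")
        return self._benchmark_cache

    def close_caches(self) -> None:
        # Resetting to None makes repeated calls harmless.
        if self._benchmark_cache is not None:
            self._benchmark_cache.close()
            self._benchmark_cache = None

    def __enter__(self) -> "LazyCacheOwner":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close_caches()  # assumption: exit closes the caches


with LazyCacheOwner() as owner:
    cache = owner.get_benchmark_cache()
    same_cache = owner.get_benchmark_cache()

owner.close_caches()  # second close is a no-op
```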

setup_dataset(bench_cfg: bencher.bench_cfg.BenchCfg, time_src: datetime.datetime | str) → tuple[bencher.results.bench_result.BenchResult, zip, list[str], int]

Initialize an n-dimensional xarray dataset from benchmark configuration parameters.

This function creates the data structures needed to store benchmark results based on the provided configuration. It sets up the xarray dimensions, coordinates, and variables based on input variables and result variables.

Parameters:

bench_cfg (BenchCfg) – Configuration defining the benchmark parameters, inputs, and results
time_src (datetime | str) – Timestamp or event name for the benchmark run

Returns:

A BenchResult object with the initialized dataset
A lazy iterator of function input tuples (index, value pairs)
A list of dimension names for the dataset
The total number of jobs (Cartesian product size)

Return type:

tuple[BenchResult, zip, list[str], int]
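One way to picture the returned iterator and job count is a Cartesian product over the input sweeps, zipped as (index tuple, value tuple) pairs. The sweep names and values here are made up, and the exact shape of the tuples that setup_dataset yields is an assumption of this sketch.

```python
from itertools import product
from math import prod

# Hypothetical input sweeps: dimension name -> sampled values.
inputs = {"size": [10, 100, 1000], "algo": ["quick", "merge"]}

dim_names = list(inputs)
index_tuples = product(*(range(len(v)) for v in inputs.values()))
value_tuples = product(*inputs.values())

# zip is lazy, so no job is materialised until it is consumed.
jobs = zip(index_tuples, value_tuples)
total_jobs = prod(len(v) for v in inputs.values())

first_job = next(jobs)
remaining_jobs = list(jobs)
```

Computing `total_jobs` from the sweep lengths avoids consuming the iterator just to count it.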

define_extra_vars(bench_cfg: bencher.bench_cfg.BenchCfg, repeats: int, time_src: datetime.datetime | str) → list[bencher.variables.inputs.IntSweep]

Define extra meta variables for tracking benchmark execution details.

This function creates variables that aren’t passed to the worker function but are stored in the n-dimensional array to provide context about the benchmark, such as the number of repeat measurements and timestamps.

Parameters:

bench_cfg (BenchCfg) – The benchmark configuration to add variables to
repeats (int) – The number of times each sample point should be measured
time_src (datetime | str) – Either a timestamp or a string event name for temporal tracking

Returns:

A list of additional parameter variables to include in the benchmark

Return type:

list[IntSweep]

static precompute_result_arrays(bench_res: bencher.results.bench_result.BenchResult) → dict[str, numpy.ndarray]

Pre-fetch the underlying numpy arrays for all result variables.

This avoids repeated xarray Dataset.__getitem__ lookups (which trigger _construct_dataarray) during the per-job store loop. The returned arrays are views into the dataset, so writes go directly into bench_res.ds.
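The view semantics can be demonstrated on a small in-memory dataset. This is a sketch, assuming numpy-backed float variables; the variable names are hypothetical and the dictionary comprehension is an illustration of the idea, not the method's actual body.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "latency": (("x", "repeat"), np.full((2, 3), np.nan)),
        "throughput": (("x", "repeat"), np.full((2, 3), np.nan)),
    }
)

# Look up each variable once, outside the per-job loop.
rv_arrays = {name: ds[name].values for name in ds.data_vars}

# Writing into the cached array updates the dataset itself, because the
# array shares memory with the dataset variable.
rv_arrays["latency"][1, 2] = 0.5
stored = float(ds["latency"][1, 2])
```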

store_results(job_result: bencher.job.JobFuture, bench_res: bencher.results.bench_result.BenchResult, worker_job: bencher.worker_job.WorkerJob, bench_run_cfg: bencher.bench_cfg.BenchRunCfg, rv_arrays: dict[str, numpy.ndarray] | None = None) → None

Store the results from a benchmark worker job into the benchmark result dataset.

This method handles unpacking the results from worker jobs and placing them in the correct locations in the n-dimensional result dataset. It supports different types of result variables including scalars, vectors, references, and media.

Parameters:

job_result (JobFuture) – The future containing the worker function result
bench_res (BenchResult) – The benchmark result object to store results in
worker_job (WorkerJob) – The job metadata needed to index the result
bench_run_cfg (BenchRunCfg) – Configuration for how results should be handled
rv_arrays (dict, optional) – Pre-computed numpy arrays from precompute_result_arrays(). Falls back to dataset lookup if None.

Raises:

RuntimeError – If an unsupported result variable type is encountered

cache_results(bench_res: bencher.results.bench_result.BenchResult, bench_cfg_hash: str, bench_cfg_hashes: list[str]) → None

Cache benchmark results for future retrieval.

This method stores benchmark results in the disk cache using the benchmark configuration hash as the key. It temporarily removes non-pickleable objects from the benchmark result before caching.

Parameters:

bench_res (BenchResult) – The benchmark result to cache
bench_cfg_hash (str) – The hash value to use as the cache key
bench_cfg_hashes (list[str]) – List to append the hash to (modified in place)
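The strip, store, restore sequence might look like the following. It is a sketch under stated assumptions: a plain dict stands in for the disk cache, the result is modelled as a dict, and the `callback` key is a hypothetical example of a non-pickleable member.

```python
import pickle

def cache_result(result: dict, key: str, cache: dict, key_log: list) -> None:
    """Store a pickled copy of `result` under `key`, then record the key."""
    stashed = result.pop("callback", None)  # hypothetical non-pickleable member
    try:
        cache[key] = pickle.dumps(result)
    finally:
        # Restore the member even if pickling fails.
        if stashed is not None:
            result["callback"] = stashed
    key_log.append(key)  # modified in place, like bench_cfg_hashes

result = {"values": [1.0, 2.0], "callback": lambda x: x}
cache, keys = {}, []
cache_result(result, "abc123", cache, keys)
restored = pickle.loads(cache["abc123"])
```

The `try`/`finally` keeps the live object intact for the rest of the run regardless of whether the cache write succeeds.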

_load_history_record(cache: diskcache.Cache, bench_cfg_hash: str) → dict | None

Fetch and normalize one history record, or None when absent/unreadable.

Bare xr.Dataset values (hand-seeded or pre-record entries) are wrapped into the record shape with no column metadata, which the reconciler treats as adopt-in-place.

load_history_cache(dataset: xarray.Dataset, bench_cfg_hash: str, clear_history: bool, max_time_events: int | None = None, result_vars: list | None = None, *, on_history_reset: str = 'warn', bench_name: str | None = None, tag: str | None = None, config_summary: dict | None = None) → xarray.Dataset

Load, reconcile, and persist historical benchmark data.

The history key excludes result variables, so the stored record is a superset of every column ever measured under this benchmark’s input space; result-var differences are reconciled per column (retained, retired, resumed, or born — see bencher.history) and consumers receive a projection onto exactly the current result_vars columns. If clear_history is True, existing history is ignored (a fresh series starts and is written back).

Parameters:

dataset (xr.Dataset) – Freshly calculated benchmark data for the current run
bench_cfg_hash (str) – History key — the benchmark identity hash computed with include_result_vars=False
clear_history (bool) – If True, clears historical data instead of loading it
max_time_events (int | None) – Maximum number of over_time events to retain. Oldest events are trimmed. None means unlimited.
result_vars (list | None) – Result variable instances defining the served columns. Also used for per-variable max_time_events aging. When None, column reconciliation and projection are skipped entirely.
on_history_reset (str) – Policy for lossy schema events: “warn”, “error” (raise HistoryResetError before persisting), or “ignore”.
bench_name (str | None) – Benchmark name for the last-seen index; enables full-reset detection when the history key moves.
tag (str | None) – Benchmark tag for the last-seen index.
config_summary (dict | None) – bencher.history.config_summary of the current config, stored in the last-seen index and diffed on resets.

Returns:

The current config’s view of the accumulated history: historical plus current data, projected onto the current columns.

Return type:

xr.Dataset
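The superset-store, trim, and project flow can be sketched with dicts of lists in place of xarray datasets. This is a simplified illustration only: `merge_history`, the padding with `None`, and the sample columns are hypothetical, and the real per-column rules live in bencher.history.

```python
def merge_history(stored, current, current_columns, max_time_events=None):
    """Append one new event to every column, trim, and project.

    `stored` and `current` map column name -> list of values over time.
    Padding missing values with None is an assumption of this sketch.
    """
    n_hist = max((len(v) for v in stored.values()), default=0)
    merged = {}
    for name in sorted(set(stored) | set(current)):
        past = stored.get(name, [None] * n_hist)  # new column: no past values
        now = current.get(name, [None])           # unmeasured column this run
        merged[name] = past + now
    if max_time_events is not None:
        # Drop the oldest events, keeping at most max_time_events.
        merged = {k: v[max(len(v) - max_time_events, 0):] for k, v in merged.items()}
    # Consumers see only the columns the current config asks for.
    view = {name: merged[name] for name in current_columns}
    return merged, view

stored = {"latency": [1.0, 2.0], "memory": [5, 6]}
current = {"latency": [3.0], "cpu": [0.7]}
record, view = merge_history(stored, current, ["latency", "cpu"], max_time_events=2)
```

The persisted `record` keeps the `memory` column even though the current run does not measure it, while `view` exposes only `latency` and `cpu`.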

add_metadata_to_dataset(bench_res: bencher.results.bench_result.BenchResult, input_var: Any) → None

Add variable metadata to the xarray dataset for improved visualization.

This method adds metadata like units, long names, and descriptions to the xarray dataset attributes, which helps visualization tools properly label axes and tooltips.

Parameters:

bench_res (BenchResult) – The benchmark result object containing the dataset to annotate
input_var – The variable to extract metadata from
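Attaching metadata to a variable uses the standard xarray `attrs` dictionary. The sketch below assumes a hypothetical `size` input variable and made-up metadata values; which attributes the method actually writes is not specified beyond units, long names, and descriptions.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"latency": (("size",), np.zeros(3))},
    coords={"size": [10, 100, 1000]},
)

# Hypothetical metadata for the `size` input variable.
ds["size"].attrs.update(
    {
        "units": "elements",
        "long_name": "Input size",
        "description": "Number of elements passed to the worker",
    }
)

units = ds["size"].attrs["units"]
```

Plotting libraries built on xarray commonly read `units` and `long_name` to label axes, which is the benefit the method description refers to.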

report_results(bench_res: bencher.results.bench_result.BenchResult, print_xarray: bool, print_pandas: bool) → None

Display the calculated benchmark data in various formats.

This method provides options to display the benchmark results as xarray data structures or pandas DataFrames for debugging and inspection.

Parameters:

bench_res (BenchResult) – The benchmark result containing the dataset to display
print_xarray (bool) – If True, log the raw xarray Dataset structure
print_pandas (bool) – If True, log the dataset converted to a pandas DataFrame