bencher

Attributes

`SUBSAMPLING_DIVISIONS_SAMPLES`
`SCALAR_RESULT_TYPES`
`DEFAULT_CACHE_SIZE_BYTES`
`VideoResult`
`_DEPRECATED_ALIASES`

Exceptions

`RegressionError`	Raised when regression detection finds regressions and regression_fail is True.
`HistoryResetError`	Raised when history is reset or loses a column and on_history_reset='error'.

Classes

`Bench`	A server for displaying plots of benchmark results
`BenchCfg`	Complete configuration for a benchmark protocol.
`BenchRunCfg`	Configuration class for benchmark execution parameters.
`ShowMode`	Display mode for benchmark reports.
`BenchRunner`	A class to manage running multiple benchmarks in groups, or running the same benchmark but at multiple resolutions.
`Chrome`	Optional page header content (title, provenance, and CI nav links).
`ReportLayout`	Where per-benchmark artifacts live under the reports directory.
`ScorecardConfig`	Project-specific inputs to the scorecard renderer.
`BenchPlotServer`	A server for displaying plots of benchmark results
`IntSweep`	A class representing a parameter sweep for integer values.
`FloatSweep`	A class representing a parameter sweep for floating point values.
`StringSweep`	A class representing a parameter sweep for string values.
`EnumSweep`	A class representing a parameter sweep for enum values.
`BoolSweep`	A class representing a parameter sweep for boolean values.
`SweepBase`	Base `Parameter` type to hold any type of Python object.
`YamlSweep`	Sweep over configurations stored in a YAML file.
`TimeSnapshot`	A class to capture a time snapshot of benchmark values. Time is represented as a continuous value, i.e. a datetime that is converted into a np.datetime64. To represent time as a discrete value, use the TimeEvent class. The distinction exists because holoviews and plotly code make different assumptions about discrete vs. continuous variables.
`ResultFloat`	A class to represent continuous float result variables and the desired optimisation direction.
`ResultVar`	Deprecated: use ResultFloat instead.
`ResultBool`	A result type for binary outcomes (success/failure, pass/fail, reachable/unreachable).
`ResultVec`	A class to represent a fixed-size vector result variable.
`ResultHmap`	A class to represent a holomap return type.
`ResultPath`	Parameter that can be set to a string specifying the path of a file.
`ResultVideo`	Parameter that can be set to a string specifying the path of a file.
`ResultImage`	Parameter that can be set to a string specifying the path of a file.
`ResultString`	A String Parameter with optional regular expression (regex) validation.
`ResultContainer`	Base `Parameter` type to hold any type of Python object.
`ResultRerun`	Result type for rerun .rrd spatial visualizations.
`ResultReference`	Use this class to save arbitrary objects that are not picklable or native to panel. You can pass a container callback that takes the object and returns a panel pane to be displayed
`ResultVolume`	Base `Parameter` type to hold any type of Python object.
`OptDir`	StrEnum is a Python `enum.Enum` that inherits from `str`. The default
`ResultDataSet`	Base `Parameter` type to hold any type of Python object.
`ComposeType`	StrEnum is a Python `enum.Enum` that inherits from `str`. The default
`ComposableContainerBase`	A base class for renderer backends. A composable renderer
`PaneLayout`	Controls how multi-dimensional data is laid out in panel displays.
`ComposableContainerVideo`	A base class for renderer backends. A composable renderer
`RenderCfg`	Configuration class for video rendering options.
`ComposableContainerPanel`	A base class for renderer backends. A composable renderer
`ComposableContainerDataset`	A base class for renderer backends. A composable renderer
`BoxWhiskerResult`	A class for creating box and whisker plots from benchmark results.
`ViolinResult`	A class for creating violin plots from benchmark results.
`ScatterResult`	A class for creating scatter plots from benchmark results.
`ScatterJitterResult`	A class for creating scatter jitter plots from benchmark results.
`BarResult`	A class for creating bar chart visualizations from benchmark results.
`LineResult`	A class for creating line plot visualizations from benchmark results.
`CurveResult`	A class for creating curve plots with optional standard-deviation spread.
`HeatmapResult`	A class for creating heatmap visualizations from benchmark results.
`BandResult`	Percentile band plot showing distribution spread over a continuous axis.
`SurfaceResult`	A class for creating 3D surface plots from benchmark results.
`TabulatorResult`
`TableResult`
`VolumeResult`
`HistogramResult`
`ExplorerResult`
`DataSetResult`
`RegressionResult`	Result of regression detection for a single variable.
`RegressionReport`	Aggregates regression results for all variables in a benchmark.
`MethodCells`	Per-method rendering of a single regression result.
`HistoryEvent`	One schema-affecting event detected while loading over_time history.
`PerfTracker`	Context-manager based phase timer.
`PerfReport`	Collection of phase timings from a benchmark run.
`VarRange`	A VarRange represents bounded and unbounded ranges of integers. This class is used to define filters for various variable types. For example, after defining cat_var = VarRange(0,0), calling matches(0) returns true, but any other integer does not match. Ranges can also be unbounded: VarRange(2,None) matches 2, 3, 4, ... up to infinity. By default the lower and upper bounds are set to -1, so that matches() returns false no matter what value is passed. matches() only takes 0 and positive integers.
`PlotFilter`	A class for representing the types of results a plot is able to represent.
`ParametrizedSweep`	Parent class for all Sweep types that need a custom hash
`ParametrizedSweepSingleton`	A minimal per-subclass singleton for ParametrizedSweep.
`SampleOrder`	Controls the sampling traversal order for plot_sweep.
`CacheDirStats`	Statistics for a single cache or media directory.
`CacheStats`	Aggregate cache statistics.
`BenchResult`	Contains the results of the benchmark and has methods to cast the results to various datatypes and graphical representations
`OptimizeResult`	Wraps an `optuna.Study` with bencher-friendly accessors.
`PaneResult`
`ReduceType`	Generic enumeration.
`HoloviewResult`
`BenchReport`	A server for displaying plots of benchmark results
`GithubPagesCfg`
`Publisher`	Generic publisher protocol for benchmark reports.
`Executors`	Enumeration of available execution strategies for benchmark jobs.
`SweepTimings`	Timing data for a single run_sweep() call.
`VideoWriter`
`ClassEnum`	A string-based enum class that maps enum values to corresponding class instances.
`ExampleEnum`	An example implementation of ClassEnum.
`BenchData`	Frozen value type handed to plot plugins. The stable public contract surface for
`CacheHandle`	Plugin-accessible memoization surface. Bencher core supplies a concrete handle;
`PlotPlugin`	Stable public contract for plot plugins.
`PluginRegistry`	In-process registry of plot plugins, keyed by (name, backend).
`RunMeta`
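
The matching rule described for `VarRange` above can be sketched in plain Python. This is a minimal illustration, not bencher's implementation: the class name `VarRangeSketch`, its attribute names, and the choice to raise `ValueError` for negative inputs are assumptions made for this example.

```python
from typing import Optional


class VarRangeSketch:
    """Hypothetical stand-in that mimics the documented VarRange matching rule."""

    def __init__(self, lower_bound: Optional[int] = -1, upper_bound: Optional[int] = -1) -> None:
        # None means "unbounded" on that side; the -1 defaults match nothing.
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound

    def matches(self, val: int) -> bool:
        # Assumption: reject negatives, since only 0 and positive integers are valid inputs.
        if val < 0:
            raise ValueError("val must be 0 or a positive integer")
        lower_ok = self.lower_bound is None or val >= self.lower_bound
        upper_ok = self.upper_bound is None or val <= self.upper_bound
        return lower_ok and upper_ok


exact_zero = VarRangeSketch(0, 0)       # matches only 0
two_or_more = VarRangeSketch(2, None)   # matches 2, 3, 4, ...
never = VarRangeSketch()                # default bounds of -1 match nothing
```

With the default bounds, every valid input is greater than the upper bound of -1, which is why the default range never matches.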

Functions

`render_report`(→ pathlib.Path)	Render a collected result to an HTML report.
`save_result`(→ pathlib.Path)	Persist a collected `BenchResult` to path via pickle.
`load_result`(→ bencher.results.bench_result.BenchResult)	Load a `BenchResult` previously written by `save_result()`.
`result_to_dict`(→ dict)	Build the stable, JSON-serializable contract for a single result.
`result_to_json`(→ pathlib.Path)	Write `result_to_dict()` for bench_res to path as JSON.
`series_for_var`(→ list[dict])	Per-time-event mean/std/n for a scalar result var across the over_time axis.
`compare_results`(→ dict)	Diff two independently-collected results into an A/B comparison contract.
`comparison_to_json`(→ pathlib.Path)	Write `compare_results()` for the two results to path as JSON.
`sparkline_svg`(→ str)	Return an inline SVG sparkline: ±std band, mean line, a node per run.
`generate_scorecard`(→ pathlib.Path)	Render the scorecard for all summaries under reports_dir.
`hash_sha1`(→ str)	A hash function that avoids the PYTHONHASHSEED 'feature' which returns a different hash value each time the program is run.
`box`(→ FloatSweep)	Create a FloatSweep parameter centered around a value with a given width.
`p`(→ dict[str, ...)	Deprecated: use `bn.sweep()` instead.
`sweep`(→ dict[str, ...)	Create a parameter specification for use in plot_sweep input_vars.
`with_subsampling_divisions`(→ list)	Apply subsampling_divisions-based sampling to a list of values.
`curve`(→ holoviews.Curve)
`hmap_canonical_input`(→ tuple)	From a dictionary of kwargs, return a hashable representation (tuple) that is always the same for the same inputs and retains the order of the input arguments, e.g. {x=1,y=2} -> (1,2) and {y=2,x=1} -> (1,2). This is used so that keyword arguments can be hashed and converted to the tuple keys that are used for holomaps.
`get_nearest_coords`(→ dict)	Find the nearest coordinates in an xarray dataset based on provided coordinate values.
`make_namedtuple`(→ collections.namedtuple)	Convenience method for making a named tuple
`gen_path`(→ str)	Generate a path for a file in the cache directory.
`gen_image_path`(→ str)	Generate a unique path for an image file in the cache directory.
`gen_video_path`(→ str)	Generate a unique path for a video file in the cache directory.
`gen_rerun_data_path`(→ str)	Generate a unique path for a rerun data file in the cache directory.
`lerp`(→ float)	Linear interpolation between two ranges.
`tabs_in_markdown`(→ str)	Given a string containing tabs, convert them to &ensp;, which is a double space in markdown.
`publish_file`(→ str)	Publish a file to an orphan git branch.
`github_content`(remote, branch_name, filename)
`publish_and_view_rrd`(file_path, remote, branch_name, ...)
`rrd_to_pane`(url[, width, height, version])	Display an .rrd file from a URL using the hosted rerun web viewer.
`rrd_file_to_pane`(file_path[, width, height, ...])	Create a rerun viewer pane from an .rrd file path.
`run_file_server`([directory, port])	Start a background HTTP file server (daemon thread).
`method_cells`(→ MethodCells)	Build the per-method cell bundle for a `RegressionResult`.
`git_time_event`(→ str)	Return a time-event label combining wall-clock time and short commit hash.
`cache_stats`(→ CacheStats)	Collect statistics for all managed caches and media directories.
`print_cache_stats`(→ None)	Print a human-readable cache statistics summary.
`clear_all`(→ None)	Remove the entire cache directory tree.
`clear_media`(→ tuple[int, int])	Delete all files in media directories.
`clean_orphaned_media`(→ tuple[list[str], int])	Find and optionally delete per-job-key media dirs with no cache entry.
`cleanup_job_media`(→ int)	Delete the per-job-key media directories for job_key.
`ensure_cache_version`(→ None)	Check the cache version file; clear everything on mismatch.
`add_image`(→ str)	Creates a file on disk from a numpy array and returns the created image path
`create_bench`(→ bencher.bencher.Bench)	Create a Bench instance from a ParametrizedSweep.
`create_bench_runner`(→ bencher.bench_runner.BenchRunner)	Create a BenchRunner instance from a ParametrizedSweep.
`run`(→ list[bencher.bench_cfg.BenchCfg])	Run a benchmark target with sensible defaults.
`get_registry`(→ PluginRegistry)
`plot_plugin`(...)	Wrap a function as a plot plugin and (by default) register it with the global
`register_plugin`(→ bencher.plugins.plugin.PlotPlugin)
`unregister_plugin`(→ None)
`__getattr__`(name)
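
Two of the helpers listed above, `hash_sha1` and `hmap_canonical_input`, are about producing keys that stay stable between program runs. The sketch below shows one way to get that behaviour using only the standard library. The function names `stable_sha1` and `canonical_input`, hashing the string form of the value, and sorting by keyword name are all assumptions for illustration; the source does not show how bencher implements either helper.

```python
import hashlib


def stable_sha1(value: object) -> str:
    """Hypothetical helper: hash the string form of value with SHA-1."""
    # hashlib digests do not depend on PYTHONHASHSEED, unlike the built-in hash() on str.
    return hashlib.sha1(str(value).encode("utf-8")).hexdigest()


def canonical_input(**kwargs: object) -> tuple:
    """Hypothetical helper: order values by keyword name so call order is irrelevant."""
    return tuple(value for _, value in sorted(kwargs.items()))


key_a = canonical_input(x=1, y=2)
key_b = canonical_input(y=2, x=1)   # same key despite the different argument order
digest = stable_sha1(key_a)         # 40 hex characters, identical on every run
```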

Package Contents

class bencher.Bench(bench_name: str | None = None, worker: Callable | bencher.variables.parametrised_sweep.ParametrizedSweep | None = None, worker_input_cfg: bencher.variables.parametrised_sweep.ParametrizedSweep | None = None, run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, report: bencher.bench_report.BenchReport | None = None)

Bases: bencher.bench_plot_server.BenchPlotServer

A server for displaying plots of benchmark results

bench_name = None

cache_size = 0

_worker_mgr

_executor

_collector

run_cfg = None

results = []

bench_cfg_hashes = []

last_run_cfg = None

input_vars = None

result_vars = None

const_vars = None

plot_callbacks = []

plot = True

close() → None: Close sample and collector caches so on-disk resources are released.

property sample_cache: Access the sample cache from the executor (for backward compatibility).

property ds_dynamic: Access the dynamic dataset from the collector (for backward compatibility).

add_plot_callback(callback: Callable[[bencher.results.bench_result.BenchResult], panel.panel], **kwargs) → None

Add a plotting callback to be called on benchmark results.

This method registers a plotting function that will be automatically called on any BenchResult produced when running a sweep. You can pass additional arguments to the plotting function using keyword arguments.

Parameters:

callback (Callable[[BenchResult], pn.panel]) – A function that takes a BenchResult and returns a panel object. For example: BenchResult.to_video_grid
**kwargs – Additional keyword arguments to pass to the callback function

Examples

>>> bench.add_plot_callback(BenchResult.to_video_grid, width=800)
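
How a callback and its keyword arguments might be stored and later applied can be sketched without bencher installed. Everything here is hypothetical: `CallbackRegistrySketch`, its `render` method, and the `describe` function stand in for the bench object and a plotting method, and binding the arguments with `functools.partial` is an assumption about one reasonable approach, not a description of the library's internals.

```python
from functools import partial
from typing import Any, Callable


class CallbackRegistrySketch:
    """Hypothetical stand-in for the plot-callback list kept on a bench object."""

    def __init__(self) -> None:
        self.plot_callbacks = []

    def add_plot_callback(self, callback: Callable[..., Any], **kwargs: Any) -> None:
        # Bind the extra keyword arguments now so each callback later needs only the result.
        self.plot_callbacks.append(partial(callback, **kwargs))

    def render(self, result: Any) -> list:
        # Call every registered callback on the same result, in registration order.
        return [callback(result) for callback in self.plot_callbacks]


def describe(result: str, width: int = 400) -> str:
    # Placeholder for a plotting method such as BenchResult.to_video_grid.
    return f"{result} rendered at width={width}"


registry = CallbackRegistrySketch()
registry.add_plot_callback(describe, width=800)
panes = registry.render("sweep result")
```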

set_worker(worker: Callable | bencher.variables.parametrised_sweep.ParametrizedSweep, worker_input_cfg: bencher.variables.parametrised_sweep.ParametrizedSweep | None = None) → None

Set the benchmark worker function and its input configuration.

This method sets up the worker function to be benchmarked. The worker can be either a callable function that takes a ParametrizedSweep instance or a ParametrizedSweep instance with a __call__ method. In the latter case, worker_input_cfg is not needed.

Parameters:

worker (Callable | ParametrizedSweep) – Either a function that will be benchmarked or a ParametrizedSweep instance with a __call__ method. When a ParametrizedSweep is provided, its __call__ method becomes the worker function.
worker_input_cfg (ParametrizedSweep, optional) – The class defining the input parameters for the worker function. Only needed if worker is a function rather than a ParametrizedSweep instance. Defaults to None.

Raises:

RuntimeError – If worker is a class type instead of an instance.
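
The rules above (an instance supplies its own `__call__`, a plain function needs `worker_input_cfg`, and a class object is rejected) can be sketched as follows. `SweepLike` and `resolve_worker` are hypothetical names, and returning the instance's type as the input configuration is an assumption made for this example.

```python
import inspect
from typing import Any, Optional


class SweepLike:
    """Placeholder for a ParametrizedSweep subclass that defines __call__."""

    def __call__(self, **kwargs: Any) -> dict:
        return {"out": kwargs.get("x", 0) * 2}


def resolve_worker(worker: Any, worker_input_cfg: Optional[type] = None) -> tuple:
    """Hypothetical helper mirroring the documented set_worker rules."""
    if inspect.isclass(worker):
        # A class object was passed where an instance or a plain function is expected.
        raise RuntimeError("worker must be an instance or a function, not a class")
    if isinstance(worker, SweepLike):
        # The instance supplies both the callable and its input configuration.
        return worker.__call__, type(worker)
    # A plain function relies on the separately supplied input configuration.
    return worker, worker_input_cfg


fn, cfg = resolve_worker(SweepLike())

try:
    resolve_worker(SweepLike)   # passing the class itself is rejected
    class_rejected = False
except RuntimeError:
    class_rejected = True
```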

sweep_sequential(title: str = '', input_vars: list[bencher.variables.parametrised_sweep.ParametrizedSweep] | None = None, result_vars: list[bencher.variables.parametrised_sweep.ParametrizedSweep] | None = None, const_vars: list[bencher.variables.parametrised_sweep.ParametrizedSweep] | None = None, optimise_var: bencher.variables.parametrised_sweep.ParametrizedSweep | None = None, run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, group_size: int = 1, iterations: int = 1, relationship_cb: Callable | None = None, plot_callbacks: list[Callable] | bool | None = None, aggregate: bool | int | list[str] | None = None, agg_fn: str = 'mean') → list[bencher.results.bench_result.BenchResult]

Run a sequence of benchmarks by sweeping through groups of input variables.

This method performs sweeps on combinations of input variables, potentially using the optimal value from each sweep as constants for the next iteration.

Parameters:

title (str, optional) – Base title for all the benchmark sweeps. Defaults to “”.
input_vars (list[ParametrizedSweep], optional) – Input variables to sweep through. If None, defaults to all input variables from the worker class instance.
result_vars (list[ParametrizedSweep], optional) – Result variables to collect. Defaults to None.
const_vars (list[ParametrizedSweep], optional) – Variables to keep constant. Defaults to None.
optimise_var (ParametrizedSweep, optional) – Variable to optimize on each sweep iteration. The optimal value will be used as constant input for subsequent sweeps. Defaults to None.
run_cfg (BenchRunCfg, optional) – Run configuration. Defaults to None.
group_size (int, optional) – Number of input variables to sweep together in each run. Defaults to 1.
iterations (int, optional) – Number of optimization iterations to perform. Defaults to 1.
relationship_cb (Callable, optional) – Function to determine how to group variables for sweeping. Defaults to itertools.combinations if None.
plot_callbacks (list[Callable] | bool, optional) – Callbacks for plotting or bool to enable/disable. Defaults to None.

Returns:

A list of results from all the sweep runs

Return type:

list[BenchResult]
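
Since `relationship_cb` defaults to `itertools.combinations`, the grouping of input variables can be previewed with the standard library alone. The variable names below are hypothetical, and calling the grouping function as `relationship_cb(input_vars, group_size)` is an assumption inferred from the `combinations` signature.

```python
from itertools import combinations

input_names = ["kp", "ki", "kd"]   # hypothetical input variable names

# group_size=1 (the default): each variable is swept on its own.
singles = list(combinations(input_names, 1))

# group_size=2: every unordered pair of variables is swept together.
pairs = list(combinations(input_names, 2))
```

With three inputs, `group_size=1` yields three sweeps and `group_size=2` yields three pairwise sweeps; the count grows combinatorially with the number of inputs.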

plot_sweep(title: str | None = None, input_vars: list[bencher.variables.parametrised_sweep.ParametrizedSweep] | None = None, result_vars: list[bencher.variables.parametrised_sweep.ParametrizedSweep] | None = None, const_vars: list[bencher.variables.parametrised_sweep.ParametrizedSweep] | None = None, time_src: datetime.datetime | None = None, description: str | None = None, post_description: str | None = None, pass_repeat: bool = False, tag: str = '', run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, plot_callbacks: list[Callable] | bool | None = None, sample_order: bencher.sample_order.SampleOrder = SampleOrder.INORDER, aggregate: bool | int | list[str] | None = None, agg_fn: str = 'mean', auto_plot: bool | None = None) → bencher.results.bench_result.BenchResult

The all-in-one function for benchmarking and results plotting.

This is the main function for performing benchmark sweeps. It handles all the setup, execution, and visualization of benchmarks based on the input parameters.

When input_vars, result_vars, and const_vars are all None (the default), bencher auto-discovers all sweep inputs and result variables from the ParametrizedSweep class definition. This means a bare bench.plot_sweep() call with no arguments will sweep every input and collect every result.

Parameters:

title (str, optional) – The title of the benchmark. If None, a title will be generated based on the input variables. Defaults to None.
input_vars (list[ParametrizedSweep], optional) – Variables to sweep through in the benchmark. If None and worker_class_instance exists, auto-discovers all input sweep variables from the class. Defaults to None.
result_vars (list[ParametrizedSweep], optional) – Variables to collect results for. If None and worker_class_instance exists, auto-discovers all result variables from the class. Defaults to None.
const_vars (list[ParametrizedSweep], optional) – Variables to keep constant with specified values. If None and worker_class_instance exists, uses default input values. Defaults to None.
time_src (datetime, optional) – The timestamp for the benchmark. Used for time-series benchmarks. Defaults to None, which will use the current time.
description (str, optional) – A description displayed before the benchmark plots. Defaults to None.
post_description (str, optional) – A description displayed after the benchmark plots. Defaults to a generic message recommending to set a custom description.
pass_repeat (bool, optional) – If True, passes the ‘repeat’ parameter to the worker function. Defaults to False.
tag (str, optional) – Tag to group different benchmarks together. Defaults to “”.
run_cfg (BenchRunCfg, optional) – Configuration for how the benchmarks are run. If None, uses the instance’s run_cfg or creates a default one. Defaults to None.
plot_callbacks (list[Callable] | bool, optional) – Callbacks for plotting results. If True, uses default plotting. If False, disables plotting. If a list, uses the provided callbacks. Defaults to None.
sample_order (SampleOrder, optional) – Controls the traversal order of sampling only. Defaults to SampleOrder.INORDER. Plotting and dataset dimension order are unchanged.
auto_plot (bool, optional) – Whether to build the holoviews/panel report immediately after the sweep. None (default) respects run_cfg.auto_plot (itself True by default), so behaviour is unchanged unless a caller opts out. False collects samples and computes regression detection WITHOUT constructing any plotting objects — the returned BenchResult is fully populated (dataset + regression_report) and can be rendered later, in a separate process, via bencher.render_report(). Useful when the collecting process holds foreign C-extension state (e.g. ROS/rclpy) that makes in-process holoviews/bokeh garbage collection unsafe. See also Bench.collect(). Because None defers to run_cfg, setting run_cfg.auto_plot = False once disables plotting for every plot_sweep call that uses that config — including calls nested inside benchmark functions you don’t control.

Returns:

An object containing all the benchmark data and results

Return type:

BenchResult

Raises:

RuntimeError – If an unsupported input variable type is provided
TypeError – If variable parameters are not of the correct type
FileNotFoundError – If only_plot=True and no cached results exist
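
The way `auto_plot=None` defers to `run_cfg.auto_plot` can be expressed as a small resolution rule. `resolve_auto_plot` is a hypothetical helper, and treating an explicit `True` or `False` as overriding the run configuration is an assumption based on the parameter description.

```python
from typing import Optional


def resolve_auto_plot(auto_plot: Optional[bool], run_cfg_auto_plot: bool = True) -> bool:
    """Hypothetical helper: an explicit argument wins, None defers to the run config."""
    return run_cfg_auto_plot if auto_plot is None else auto_plot


default_behaviour = resolve_auto_plot(None)           # run config default applies
config_opt_out = resolve_auto_plot(None, False)       # run config disables plotting
call_site_opt_out = resolve_auto_plot(False, True)    # the call itself disables plotting
```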

collect(*args, **kwargs) → bencher.results.bench_result.BenchResult

Run a sweep and collect results WITHOUT building any plots.

Equivalent to plot_sweep() with auto_plot=False: it executes the sweep, merges over-time history, and computes regression detection, but constructs no holoviews/panel/bokeh objects. The returned BenchResult is fully populated (dataset + regression_report) and is the safe artifact to persist (bencher.save_result()) and render later — in a separate, clean process — via bencher.render_report().

This is the collection half of a collect/render split, intended for callers whose process holds foreign C-extension state (e.g. ROS/rclpy/DDS) where in-process holoviews/bokeh allocation and the resulting garbage collection can segfault. Accepts the same arguments as plot_sweep() (auto_plot is forced to False).

Returns: Fully-populated result with no plots built.
Return type: BenchResult
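
The shape of the collect/render split can be illustrated with a pickle round trip. This sketch does not call bencher: `collect_phase` and `render_phase` are hypothetical stand-ins for collecting and saving a result and for loading and rendering it, and the dictionary is a placeholder for a `BenchResult`. In real use the render step would run in a separate process; both run in one process here for brevity.

```python
import pickle
import tempfile
from pathlib import Path


def collect_phase(path: Path) -> Path:
    # Stand-in for collecting a result and persisting it: plain data only, no plot objects.
    result = {"title": "demo sweep", "samples": [0.12, 0.15, 0.11]}
    path.write_bytes(pickle.dumps(result))
    return path


def render_phase(path: Path) -> str:
    # Stand-in for loading the saved result and rendering a report from it.
    result = pickle.loads(path.read_bytes())
    mean = sum(result["samples"]) / len(result["samples"])
    return f"<h1>{result['title']}</h1><p>mean={mean:.3f}</p>"


with tempfile.TemporaryDirectory() as tmp:
    saved = collect_phase(Path(tmp) / "result.pkl")
    html = render_phase(saved)
```

The only thing shared between the two phases is the file on disk, which is what allows the rendering side to run in a clean process.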

static filter_overridable_params(bench_cfg: bencher.bench_cfg.BenchCfg, run_cfg: bencher.bench_cfg.BenchRunCfg) → tuple[dict, list[str], list[str]]

Filter run_cfg parameters to only those that can override bench_cfg.

Param 2.3 enforces constant Parameters (e.g. the implicit name), which cannot be overridden. This helper identifies which parameters from run_cfg can be applied to bench_cfg and reports any that must be skipped.

Parameters:

bench_cfg – The benchmark configuration to be updated
run_cfg – The run configuration providing override values

Returns:

A tuple of (valid_params, missing_keys, constant_keys) where

valid_params: dict of parameters that can be applied
missing_keys: list of run_cfg keys not found on bench_cfg
constant_keys: list of run_cfg keys that are constant on bench_cfg

Return type:

tuple[dict, list[str], list[str]]
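
The three-way partition into applicable, missing, and constant keys can be sketched with plain dictionaries. `filter_overridable` is a hypothetical helper that takes the set of constant names explicitly, and the parameter names and values below are illustrative only; the real method inspects the `BenchCfg` and `BenchRunCfg` objects instead.

```python
def filter_overridable(target_params: dict, constant_names: set, overrides: dict) -> tuple:
    """Hypothetical helper: split overrides into applicable, missing, and constant keys."""
    valid_params, missing_keys, constant_keys = {}, [], []
    for key, value in overrides.items():
        if key not in target_params:
            missing_keys.append(key)       # the target has no such parameter
        elif key in constant_names:
            constant_keys.append(key)      # the target forbids overriding it
        else:
            valid_params[key] = value      # safe to apply
    return valid_params, missing_keys, constant_keys


# Illustrative parameter names and values, not taken from the real configuration classes.
bench_params = {"name": "cfg_a", "repeats": 1, "over_time": False}
run_overrides = {"name": "cfg_b", "repeats": 5, "unknown_flag": True}
valid, missing, constant = filter_overridable(bench_params, {"name"}, run_overrides)
```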

run_sweep(bench_cfg: bencher.bench_cfg.BenchCfg, run_cfg: bencher.bench_cfg.BenchRunCfg, time_src: datetime.datetime | None = None, sample_order: bencher.sample_order.SampleOrder = SampleOrder.INORDER) → bencher.results.bench_result.BenchResult

Execute a benchmark sweep based on the provided configuration.

This method handles the caching, execution, and post-processing of a benchmark sweep according to the provided configurations. It’s typically called by plot_sweep rather than directly by users.

Parameters:

bench_cfg (BenchCfg) – Configuration defining inputs, results, and other benchmark parameters
run_cfg (BenchRunCfg) – Configuration for how the benchmark should be executed
time_src (datetime, optional) – The timestamp for the benchmark. Used for time-series benchmarks. Defaults to None, which will use the current time.
sample_order (SampleOrder, optional) – Controls the traversal order of sampling only. Defaults to SampleOrder.INORDER.

Returns:

An object containing all benchmark data, results, and visualization

Return type:

BenchResult

Raises:

FileNotFoundError – If only_plot=True and no cached results exist

_append_result_via_split(bench_res: bencher.results.bench_result.BenchResult) → None

Append a result to the report through the collect/render split.

Used only when the BENCHER_FORCE_SPLIT_RENDER env var is set. Instead of rendering bench_res in-process, it round-trips the result through pickle (bencher.save_result() / bencher.load_result()) and rebuilds the report tab from the deserialized copy — the same serialize then render-from-loaded steps that bencher.render_report() performs out of process.

This lets a dedicated CI job re-run the entire existing test/example suite with the split pipeline forced on, so any divergence between in-process and split rendering (unpicklable result types, render paths that relied on live state) surfaces in the existing assertions. The round-trip stays in-process here so the full suite remains fast; a separate test covers the subprocess boundary.

convert_vars_to_params(variable: param.Parameter | str | dict | tuple, var_type: str, run_cfg: bencher.bench_cfg.BenchRunCfg | None) → param.Parameter: Convert various input formats (str, dict, tuple) to param.Parameter objects.

cache_results(bench_res: bencher.results.bench_result.BenchResult, bench_cfg_hash: str) → None: Cache benchmark results to disk using the config hash as key.

load_history_cache(dataset: xarray.Dataset, bench_cfg_hash: str, clear_history: bool, max_time_events: int | None = None, result_vars: list | None = None, *, on_history_reset: str = 'warn', bench_name: str | None = None, tag: str | None = None, config_summary: dict | None = None) → xarray.Dataset: Load, reconcile, and persist historical benchmark data from cache.

setup_dataset(bench_cfg: bencher.bench_cfg.BenchCfg, time_src: datetime.datetime | str) → tuple[bencher.results.bench_result.BenchResult, zip, list[str], int]: Initialize n-dimensional xarray dataset for storing benchmark results.

define_const_inputs(const_vars: list[tuple[param.Parameter, Any]]) → dict | None: Convert constant variable tuples into a name-value dictionary.

define_extra_vars(bench_cfg: bencher.bench_cfg.BenchCfg, repeats: int, time_src: datetime.datetime | str) → list[bencher.variables.inputs.IntSweep]: Define meta variables (repeat count, timestamps) for benchmark tracking.

calculate_benchmark_results(bench_cfg: bencher.bench_cfg.BenchCfg, time_src: datetime.datetime | str, bench_cfg_sample_hash: str, bench_run_cfg: bencher.bench_cfg.BenchRunCfg, sample_order: bencher.sample_order.SampleOrder = SampleOrder.INORDER, timings: bencher.sweep_timings.SweepTimings | None = None) → bencher.results.bench_result.BenchResult

Execute the benchmark runs and collect results into an n-dimensional array.

This method handles the core benchmark execution process. It sets up the dataset, initializes worker jobs, submits them to the sample cache for execution or retrieval, and collects and stores the results.

Parameters:

bench_cfg (BenchCfg) – Configuration defining the benchmark parameters
time_src (datetime | str) – Timestamp or event name for the benchmark run
bench_cfg_sample_hash (str) – Hash of the benchmark configuration without repeats
bench_run_cfg (BenchRunCfg) – Configuration for how the benchmark should be executed
timings (SweepTimings, optional) – Timing collector to populate. Defaults to None.

Returns:

An object containing all the benchmark data and results

Return type:

BenchResult

store_results(job_result: bencher.job.JobFuture, bench_res: bencher.results.bench_result.BenchResult, worker_job: bencher.worker_job.WorkerJob, bench_run_cfg: bencher.bench_cfg.BenchRunCfg, rv_arrays: dict[str, numpy.ndarray] | None = None) → None: Store worker job results into the n-dimensional result dataset.

init_sample_cache(run_cfg: bencher.bench_cfg.BenchRunCfg) → bencher.job.FutureCache: Initialize the FutureCache for storing benchmark function results.

clear_tag_from_sample_cache(tag: str, run_cfg: bencher.bench_cfg.BenchRunCfg) → None: Clear all cached samples matching a specific tag.

add_metadata_to_dataset(bench_res: bencher.results.bench_result.BenchResult, input_var: bencher.variables.parametrised_sweep.ParametrizedSweep) → None: Add units, long names, and descriptions to xarray dataset attributes.

report_results(bench_res: bencher.results.bench_result.BenchResult, print_xarray: bool, print_pandas: bool) → None: Log benchmark results as xarray or pandas DataFrame.

clear_call_counts() → None: Clear the worker and cache call counts, to help debug and assert caching is happening properly

get_result(index: int = -1) → bencher.results.bench_result.BenchResult

Get a specific benchmark result from the results list.

Parameters: index (int, optional) – Index of the result to retrieve. Negative indices are supported, with -1 (default) returning the most recent result.
Returns: The benchmark result at the specified index
Return type: BenchResult

get_ds(index: int = -1) → xarray.Dataset

Get the xarray Dataset from a specific benchmark result.

This is a convenience method that retrieves a result and returns its dataset.

Parameters: index (int, optional) – Index of the result to retrieve the dataset from. Negative indices are supported, with -1 (default) returning the most recent result.
Returns: The xarray Dataset from the benchmark result
Return type: xr.Dataset

publish(remote_callback: Callable[[str], str]) → str

Publish the benchmark report to a remote location.

Uses the provided callback to publish the benchmark report to a remote location such as a GitHub Pages site.

Parameters: remote_callback (Callable[[str], str]) – A function that takes a branch name and publishes the report, returning the URL where it’s published
Returns: The URL where the report has been published
Return type: str

get_result_vars(as_str: bool = True) → list[str | bencher.variables.parametrised_sweep.ParametrizedSweep]

Retrieve the result variables from the worker class instance.

Parameters: as_str (bool) – If True, the result variables are returned as strings. If False, they are returned in their original form. Default is True.
Returns: A list of result variables, either as strings or in their original form.
Return type: list[str | ParametrizedSweep]
Raises: RuntimeError – If the worker class instance is not set.

optimize(title: str | None = None, input_vars=None, result_vars=None, const_vars=None, n_trials: int = 100, sampler: optuna.samplers.BaseSampler | None = None, warm_start: bool = True, aggregate: bool | int | list[str] | None = None, agg_fn: str = 'mean', repeats: int = 1, tag: str = '', run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, plot: bool = True, catch: tuple[type[Exception], Ellipsis] = ()) → bencher.results.optimize_result.OptimizeResult | None

Run optuna optimization directly — no full grid sweep required.

Parameters:

title – Study name. Auto-generated when None.
input_vars – Input variables to optimize over. Detected from worker_class_instance when None.
result_vars – Result variables (objectives). Detected when None.
const_vars – Constant variables. Detected when None.
n_trials – Number of new optuna trials to run.
sampler – Optuna sampler. Defaults to TPESampler.
warm_start – Seed the study with previously cached evaluations.
aggregate – Dimensions to aggregate inside the objective function. Same semantics as plot_sweep: True aggregates all but the first input dim, an int N aggregates the last N dims, or a list[str] names specific dims. Aggregated dims are looped over internally so Optuna sees the aggregated value.
agg_fn – Aggregation function name ("mean", "sum", "max", "min", "median"). Applied when aggregate is set or repeats > 1.
repeats – Number of times to evaluate each parameter combination. Each repeat uses a different repeat index and gets its own cache key. Results are aggregated with agg_fn.
tag – Cache tag (same semantics as plot_sweep).
run_cfg – Run configuration. Defaults to BenchRunCfg().
plot – If True, append visualisation to self.report.
catch – Exception types that should not abort the study. Forwarded to optuna.Study.optimize: a trial whose worker raises one of these is recorded as FAILED and the study continues with the remaining trials. Use for flaky or expensive workers (simulator cold starts, network calls). The default () preserves fail-fast behaviour: any worker exception aborts the whole study.

Returns:

OptimizeResult wrapping the completed optuna.Study.
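
The three forms of `aggregate` described above (True, an int N, or a list of names) can be sketched as a function that splits input dimensions into tuned and aggregated groups. `split_dims` and `AGG_FUNCS` are hypothetical names, the dimension names are illustrative, and mapping the `agg_fn` strings onto standard-library functions is an assumption for this example rather than bencher's implementation.

```python
import statistics
from typing import Union

# Assumed mapping from the documented agg_fn names to standard-library reducers.
AGG_FUNCS = {
    "mean": statistics.mean,
    "sum": sum,
    "max": max,
    "min": min,
    "median": statistics.median,
}


def split_dims(input_names: list, aggregate: Union[bool, int, list, None]) -> tuple:
    """Hypothetical helper: return (tuned_dims, aggregated_dims) for an aggregate spec."""
    if aggregate is None or aggregate is False:
        return list(input_names), []
    if aggregate is True:
        # True: aggregate every dimension except the first.
        return input_names[:1], input_names[1:]
    if isinstance(aggregate, int):
        # int N: aggregate the last N dimensions.
        cut = max(len(input_names) - aggregate, 0)
        return input_names[:cut], input_names[cut:]
    # list[str]: aggregate exactly the named dimensions, keeping the original order.
    tuned = [name for name in input_names if name not in aggregate]
    aggregated = [name for name in input_names if name in aggregate]
    return tuned, aggregated


dims = ["x", "y", "z"]
all_but_first = split_dims(dims, True)
last_one = split_dims(dims, 1)
named = split_dims(dims, ["y"])
```

The bool checks come before the int check because `bool` is a subclass of `int` in Python, so `True` would otherwise be read as N = 1.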

_resolve_optimize_vars(input_vars, result_vars, const_vars, run_cfg): Deep-copy and convert variable lists to param.Parameter objects.

static _build_cache_key(inputs: dict, tag: str) → str: Build a deterministic cache key from an input dict and tag.

_warm_from_results(study: optuna.Study) → int: Seed study from in-memory BenchResult objects. Returns count added.

_warm_from_sample_cache(study: optuna.Study, bench_cfg: bencher.bench_cfg.BenchCfg, input_vars: list, constant_inputs: dict, target_names: list[str]) → int: Seed study from the on-disk sample cache. Returns count added.

_warm_start_from_cache(study: optuna.Study, bench_cfg: bencher.bench_cfg.BenchCfg, input_vars: list, constant_inputs: dict, target_names: list[str]) → int: Seed study with cached evaluations. Returns count of added trials.

static _split_optuna_and_agg_vars(input_vars, aggregate): Partition input_vars into Optuna-tuned and aggregated lists.

_run_optuna_job(trial, job_args, repeat, constant_inputs, tag): Submit a single worker evaluation and return the result dict.

_make_optuna_objective(input_vars, constant_inputs, target_names, tag, agg_vars=None, agg_callable=None, repeats=1): Return an objective function compatible with study.optimize().
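
As a rough sketch of how repeats and agg_fn combine inside an objective, the snippet below evaluates a worker once per repeat index and reduces the values with the named function. The names make_objective_sketch and AGG_FNS are hypothetical, and the objective takes a plain parameter dict instead of an optuna trial, so this shows the idea rather than bencher's implementation.

```python
import statistics

# Aggregation names taken from the agg_fn documentation; the mapping itself is assumed.
AGG_FNS = {
    "mean": statistics.fmean,
    "sum": sum,
    "max": max,
    "min": min,
    "median": statistics.median,
}


def make_objective_sketch(worker, repeats=1, agg_fn="mean"):
    """Return an objective that evaluates `worker` once per repeat and aggregates."""
    agg = AGG_FNS[agg_fn]

    def objective(params):
        # Each repeat passes a different repeat index to the worker.
        values = [worker(repeat=r, **params) for r in range(repeats)]
        return agg(values)

    return objective
```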

to(result_type: bencher.results.bench_result.BenchResult, result_var: param.Parameter | None = None, override: bool = True, **kwargs: Any) → bencher.results.bench_result.BenchResult

Return the current instance of BenchResult.

Returns:: The current instance of the benchmark result
Return type:: BenchResult

add(result_type: bencher.results.bench_result.BenchResult, result_var: param.Parameter | None = None, override: bool = True, **kwargs: Any)

class bencher.BenchCfg(**params: Any)

Bases: BenchRunCfg

Complete configuration for a benchmark protocol.

This class extends BenchRunCfg and provides a comprehensive set of parameters for configuring benchmark runs. It maintains a unique hash value based on its configuration to ensure that benchmark results can be consistently referenced and that plots are uniquely identified across runs.

The class handles input variables, result variables, constant values, meta variables, and various presentation options. It also provides methods for generating descriptive summaries and visualizations of the benchmark configuration.

input_vars

A list of ParametrizedSweep variables to perform a parameter sweep over

Type:: list

result_vars

A list of ParametrizedSweep results to collect and plot

Type:: list

const_vars

Variables held constant at a value that differs from their default

Type:: list

result_hmaps

A list of holomap results

Type:: list

meta_vars

Meta variables such as recording time and repeat id

Type:: list

all_vars

Stores a list of both the input_vars and meta_vars

Type:: list

iv_time

Parameter for sampling the same inputs over time

Type:: list[TimeSnapshot | TimeEvent]

over_time

Controls whether the function is sampled over time

Type:: bool

name

The name of the BenchCfg

Type:: str

title

The title of the benchmark

Type:: str

raise_duplicate_exception

Used for debugging filename generation uniqueness

Type:: bool

bench_name

The name of the benchmark and save folder

Type:: str

description

A longer description of the benchmark function

Type:: str

post_description

Comments on the output of the graphs

Type:: str

has_results

Whether this config has results

Type:: bool

pass_repeat

Whether to pass the ‘repeat’ kwarg to the benchmark function

Type:: bool

tag

Tags for grouping different benchmarks

Type:: str

hash_value

Stored hash value of the config

Type:: str

plot_callbacks

Callables that take a BenchResult and return a panel representation

Type:: list

input_vars

result_vars

const_vars

result_hmaps

meta_vars

all_vars

iv_time

name: str | None

title: str | None

raise_duplicate_exception: bool

bench_name: str | None

description: str | None

post_description: str | None

has_results: bool

pass_repeat: bool

tag: str

hash_value: str

plot_callbacks

agg_over_dims

agg_fn

plot_lib = None

hmap_kdims = None

iv_repeat = None

hash_persistent(include_repeats: bool, include_result_vars: bool = True) → str

Generate a persistent hash for the benchmark configuration.

Overrides the default hash function because the default hash function does not return the same value for the same inputs. This method references only stable variables that are consistent across instances of BenchCfg with the same configuration.

input_vars are folded in list order because their order determines the dimension layout of the result arrays. result_vars and const_vars contribute as an unordered set (their per-var digests are sorted before hashing): result vars become name-keyed xarray data variables and const order only affects the title string, so reordering either is a presentation change that must not move the cache key.

Parameters:

include_repeats (bool) – Whether to include repeats as part of the hash (True by default except when using the sample cache)
include_result_vars (bool) – Whether result variables contribute to the hash. True for the benchmark-level result cache, where a cached result must match the exact result-var set. False for the over_time history key, so the history survives result-var changes and per-column reconciliation can retain, retire, or backfill individual columns (see bencher.history).

Returns:

A persistent hash value for the benchmark configuration

Return type:

str
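
The ordering rules described above (inputs hashed in list order, result and constant variables hashed as an unordered set) can be sketched with hashlib. Everything here is an assumption made for illustration: persistent_hash_sketch and _digest are hypothetical names, variables are reduced to their repr, SHA-256 is an arbitrary choice, and include_repeats is left out.

```python
import hashlib


def _digest(item) -> str:
    """Stable per-variable digest (repr-based, unlike the built-in hash())."""
    return hashlib.sha256(repr(item).encode("utf-8")).hexdigest()


def persistent_hash_sketch(input_vars, result_vars, const_vars, include_result_vars=True):
    h = hashlib.sha256()
    # Inputs are folded in list order: their order fixes the array layout.
    for var in input_vars:
        h.update(_digest(var).encode("ascii"))
    # Constants and results contribute as unordered sets: sort digests first.
    groups = [("const", const_vars)]
    if include_result_vars:
        groups.append(("result", result_vars))
    for label, group in groups:
        h.update(label.encode("ascii"))
        for d in sorted(_digest(item) for item in group):
            h.update(d.encode("ascii"))
    return h.hexdigest()
```

In this sketch, reordering result_vars leaves the key unchanged, while swapping two input_vars produces a different key.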

inputs_as_str() → list[str]

Get a list of input variable names.

Returns:: List of the names of input variables
Return type:: list[str]

to_latex() → panel.pane.LaTeX | None

Convert benchmark configuration to LaTeX representation.

Returns:: LaTeX representation of the benchmark configuration
Return type:: pn.pane.LaTeX | None

to_cartesian_animation() → str | None

Render an animation of the Cartesian product data collection.

Delegates to bencher.results.manim_cartesian.render_animation(), which currently uses a PIL-based renderer. Returns the filesystem path to the generated animated PNG (or other format, depending on the renderer), or None on failure so callers can degrade gracefully.

Returns:: Path to the rendered animation file, or None on failure.
Return type:: str | None

describe_sweep(width: int = 800, accordion: bool = True) → panel.pane.Markdown | panel.Column

Produce a markdown summary of the sweep settings.

Parameters:

width (int) – Width of the markdown panel in pixels. Defaults to 800.
accordion (bool) – Whether to wrap the description in an accordion. Defaults to True.

Returns:

Panel containing the sweep description

Return type:

pn.pane.Markdown | pn.Column

sweep_sentence() → panel.pane.Markdown

Generate a concise summary sentence of the sweep configuration.

Returns:: A panel containing a markdown summary sentence
Return type:: pn.pane.Markdown

describe_benchmark() → str

Generate a detailed string summary of the inputs and results from a BenchCfg.

Returns:: Comprehensive summary of BenchCfg
Return type:: str

to_title(panel_name: str | None = None) → panel.pane.Markdown

Create a markdown panel with the benchmark title.

Parameters:: panel_name (str | None) – The name for the panel. Defaults to the benchmark title.
Returns:: A panel with the benchmark title as a heading
Return type:: pn.pane.Markdown

to_description(width: int = 800) → panel.pane.Markdown

Create a markdown panel with the benchmark description.

Parameters:: width (int) – Width of the markdown panel in pixels. Defaults to 800.
Returns:: A panel with the benchmark description
Return type:: pn.pane.Markdown

to_post_description(width: int = 800) → panel.pane.Markdown

Create a markdown panel with the benchmark post-description.

Parameters:: width (int) – Width of the markdown panel in pixels. Defaults to 800.
Returns:: A panel with the benchmark post-description
Return type:: pn.pane.Markdown

to_sweep_summary(name: str | None = None, description: bool = True, describe_sweep: bool = True, results_suffix: bool = True, title: bool = True) → panel.Column

Produce panel output summarising the title, description and sweep setting.

Parameters:

name (str | None) – Name for the panel. Defaults to benchmark title or “Data Collection Parameters” if title is False.
description (bool) – Whether to include the benchmark description. Defaults to True.
describe_sweep (bool) – Whether to include the sweep description. Defaults to True.
results_suffix (bool) – Whether to add a “Results:” heading. Defaults to True.
title (bool) – Whether to include the benchmark title. Defaults to True.

Returns:

A panel with the benchmark summary

Return type:

pn.Column

static partition_input_vars(vars_) → tuple[list, list]: Split variables into (optimized, non-optimized) based on the optimize flag.

property optimized_input_vars: list: Return input variables where optimize=True (suggested by Optuna).

property non_optimized_input_vars: list: Return input variables where optimize=False (swept/aggregated, not suggested).

optuna_targets(as_var: bool = False) → list[Any]

Get the list of result variables that are optimization targets.

Parameters:: as_var (bool) – If True, return the variable objects rather than their names. Defaults to False.
Returns:: List of result variable names or objects that are optimization targets
Return type:: list[Any]

class bencher.BenchRunCfg(**params: Any)

Bases: BenchPlotSrvCfg

Configuration class for benchmark execution parameters.

This class extends BenchPlotSrvCfg to provide comprehensive control over benchmark execution, including caching behavior, reporting options, visualization settings, and execution strategy. It defines numerous parameters that control how benchmark runs are performed, cached, and displayed to the user.

Quick-start examples:

# Use defaults — each variable uses its own ``samples`` setting:
run_cfg = BenchRunCfg()

# Set subsampling_divisions (geometrically increasing sample counts):
run_cfg = BenchRunCfg(subsampling_divisions=5)        # 9 samples per variable
run_cfg = BenchRunCfg(subsampling_divisions=8)        # 65 samples per variable

# Or set an exact sample count directly:
run_cfg = BenchRunCfg(samples_per_var=20)

Subsampling Divisions-to-samples mapping

repeats

The number of times to sample the inputs

Type:: int

over_time

If True, each call to the function plots a time series of the historical results together with the latest result

Type:: bool

use_optuna

Show optuna plots

Type:: bool

summarise_constant_inputs

Print the inputs that are kept constant when describing the sweep parameters

Type:: bool

print_bench_inputs

Print the inputs to the benchmark function every time it is called

Type:: bool

print_bench_results

Print the results of the benchmark function every time it is called

Type:: bool

clear_history

Clear historical results

Type:: bool

max_time_events

Maximum number of over_time events to retain. None means unlimited.

Type:: int | None

max_slider_points

Maximum time points in the over_time slider. Defaults to 10, None means all.

Type:: int | None

show_aggregated_time_tab

Show the aggregated tab for over_time plots. Defaults to False.

Type:: bool

show_aggregate_plots

Show aggregated BandResult plots when aggregate is set.

Type:: bool

print_pandas

Print a pandas summary of the results to the console

Type:: bool

print_xarray

Print an xarray summary of the results to the console

Type:: bool

serve_pandas

Serve a pandas summary on the results webpage

Type:: bool

serve_pandas_flat

Serve a flattened pandas summary on the results webpage

Type:: bool

serve_xarray

Serve an xarray summary on the results webpage

Type:: bool

auto_plot

Automatically deduce the best type of plot for the results

Type:: bool

raise_duplicate_exception

Used to debug unique plot names

Type:: bool

cache_results

Benchmark level cache for completed benchmark results

Type:: bool

clear_cache

Clear the cache of saved input->output mappings

Type:: bool

cache_samples

Enable per-sample caching

Type:: bool

only_hash_tag

Use only the tag hash for cache identification

Type:: bool

clear_sample_cache

Clear the per-sample cache

Type:: bool

overwrite_sample_cache

Recalculate and overwrite cached sample values

Type:: bool

only_plot

Do not calculate benchmarks if no results are found in cache

Type:: bool

use_holoview

Use HoloViews for plotting

Type:: bool

nightly

Run a more extensive set of tests for a nightly benchmark

Type:: bool

time_event

String representation of a sequence over time

Type:: str

headless

Run the benchmarks headlessly

Type:: bool

dry_run

Preview sweep grid without executing the benchmark function

Type:: bool

subsampling_divisions

Method of defining the number of samples to sweep over

Type:: int

samples_per_var

Explicit sample count per variable (overrides subsampling_divisions)

Type:: int | None

run_tag

Tag for isolating cached results

Type:: str

run_date

Date the benchmark run was performed

Type:: datetime

executor

Executor for running the benchmark

Type:: Executors

plot_size

Sets both width and height of the plot

Type:: int

plot_width

Sets width of the plots

Type:: int

plot_height

Sets height of the plot

Type:: int

repeats: int

subsampling_divisions: int

samples_per_var: int | None

executor

nightly: bool

headless: bool

dry_run: bool

cache_results: bool

cache_samples: bool

clear_cache: bool

clear_sample_cache: bool

overwrite_sample_cache: bool

only_hash_tag: bool

only_plot: bool

cache_size: int

print_bench_inputs: bool

print_bench_results: bool

summarise_constant_inputs: bool

print_pandas: bool

print_xarray: bool

serve_pandas: bool

serve_pandas_flat: bool

serve_xarray: bool

auto_plot: bool

use_holoview: bool

use_optuna: bool

plot_size: int | None

plot_width: int | None

plot_height: int | None

raise_duplicate_exception: bool

pane_layout

backend

over_time: bool

clear_history: bool

on_history_reset: str

max_time_events: int | None

max_slider_points: int | None

show_aggregated_time_tab: bool

show_aggregate_plots: bool

time_event: str | None

run_tag: str

run_date: datetime.datetime

regression_detection: bool

regression_method: str

regression_min_history: int

regression_mad: float

regression_percentage: float

regression_delta: float

regression_absolute: float

regression_overrides: dict

regression_fail: bool

static from_cmd_line() → BenchRunCfg

Create a BenchRunCfg by parsing command line arguments.

Parses command line arguments to create a configuration for benchmark runs.

Returns:: Configuration object with settings from command line arguments
Return type:: BenchRunCfg

static subsampling_divisions_to_samples(subsampling_divisions: int, max_subsampling_divisions: int = 12) → int

Return the number of samples-per-variable for a given subsampling_divisions.

Parameters:

subsampling_divisions – Subsampling divisions value (1-12).
max_subsampling_divisions – Cap applied before lookup. Defaults to 12.

Returns:

The sample count for this subsampling_divisions.

Raises:

ValueError – If subsampling_divisions is out of range.

Example:

>>> BenchRunCfg.subsampling_divisions_to_samples(5)
9
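
A minimal sketch of such a mapping follows. Only two data points come from this page (5 gives 9, 8 gives 65). The closed form 2**(n - 2) + 1, the value returned for 1, and the order of the range check and the cap are assumptions chosen to reproduce those two points, so the real lookup may differ elsewhere. The name divisions_to_samples_sketch is hypothetical.

```python
def divisions_to_samples_sketch(divisions: int, max_divisions: int = 12) -> int:
    """Hypothetical stand-in for subsampling_divisions_to_samples()."""
    if not 1 <= divisions <= 12:
        raise ValueError(f"subsampling_divisions must be in 1..12, got {divisions}")
    capped = min(divisions, max_divisions)  # cap applied before the lookup
    if capped == 1:
        return 1  # assumed value for the smallest setting
    # Assumed closed form: the gap between samples halves at each step.
    return 2 ** (capped - 2) + 1
```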

static level_to_samples(level: int, max_level: int = 12) → int: Deprecated: use subsampling_divisions_to_samples() instead.

deep()

classmethod with_defaults(run_cfg=None, **defaults)

Merge defaults into run_cfg, creating a new instance when needed.

When run_cfg is None a fresh BenchRunCfg is created with defaults. When run_cfg is provided, a shallow copy is made and each default is applied only if the corresponding field is still at its param-level default value (i.e. the caller did not explicitly set it). The original run_cfg is never mutated. This lets benchmark functions declare sensible defaults while still allowing callers to override:

run_cfg = bn.BenchRunCfg.with_defaults(run_cfg, repeats=5, subsampling_divisions=4)

Raises:: ValueError – If any key in defaults is not a recognised parameter.
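
The merge rule can be illustrated with a small dataclass. RunCfgSketch is a hypothetical stand-in with placeholder fields and defaults, not the real BenchRunCfg, and it detects "still at default" by comparing against the declared default, which is only an approximation of checking whether the caller set the field.

```python
import copy
from dataclasses import dataclass, fields


@dataclass
class RunCfgSketch:
    # Placeholder fields and defaults, chosen for illustration only.
    repeats: int = 1
    subsampling_divisions: int = 2
    over_time: bool = False

    @classmethod
    def with_defaults(cls, run_cfg=None, **defaults):
        declared = {f.name: f.default for f in fields(cls)}
        unknown = set(defaults) - set(declared)
        if unknown:
            raise ValueError(f"Unrecognised parameters: {sorted(unknown)}")
        if run_cfg is None:
            return cls(**defaults)
        merged = copy.copy(run_cfg)  # shallow copy; the original is not mutated
        for name, value in defaults.items():
            # Apply a default only where the field still holds its declared default.
            if getattr(merged, name) == declared[name]:
                setattr(merged, name, value)
        return merged
```

With this sketch, a caller who passes RunCfgSketch(repeats=10) keeps repeats=10, while the untouched subsampling_divisions picks up the benchmark's default.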

class bencher.ShowMode

Bases: strenum.LowercaseStrEnum

Display mode for benchmark reports.

LIVE

HTML

PUBLISHED

NONE

A class to manage running multiple benchmarks in groups, or running the same benchmark but at multiple resolutions.

BenchRunner provides a framework for organizing, configuring, and executing multiple benchmark runs with different parameters. It supports progressive refinement of benchmark resolution, caching of results, and publication of results to various formats.

bench_fns = []

run_cfg

publisher = None

results = []

servers = []

_generate_name() → str

Generate a unique name for the BenchRunner instance.

Returns:: A unique name based on timestamp, object id, and random value
Return type:: str

static setup_run_cfg(run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, subsampling_divisions=UNSET, cache_samples: bool = False, over_time: bool | None = None, level: int | None = None) → bencher.bench_cfg.BenchRunCfg

Configure benchmark run settings with reasonable defaults.

Creates a copy of the provided configuration with the specified subsampling_divisions and caching behavior settings applied.

Parameters:

run_cfg (BenchRunCfg, optional) – Base configuration to modify. Defaults to None.
subsampling_divisions (int, optional) – Sampling resolution, expressed as subsampling divisions. Defaults to 2.
cache_samples (bool, optional) – Whether to enable sample caching. Defaults to False.
over_time (bool, optional) – Enable time-series benchmarking. None preserves run_cfg value.
level (int, optional) – Deprecated. Use subsampling_divisions instead.

Returns:

A new configuration object with the specified settings

Return type:

BenchRunCfg

static from_parametrized_sweep(class_instance: bencher.variables.parametrised_sweep.ParametrizedSweep, run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, report: bencher.bench_report.BenchReport | None = None) → bencher.bencher.Bench

Create a Bench instance from a ParametrizedSweep class.

Parameters:

class_instance (ParametrizedSweep) – The parametrized sweep class instance to benchmark
run_cfg (BenchRunCfg, optional) – Configuration for benchmark execution. Defaults to None.
report (BenchReport, optional) – Report to store benchmark results. Defaults to None.

Returns:

A configured Bench instance ready to run the benchmark

Return type:

Bench

add(bench_fn: Benchable) → BenchRunner

Add a benchmark function to be executed by this runner.

Parameters:: bench_fn (Benchable) – A callable that implements the Benchable protocol
Returns:: Self for method chaining
Return type:: BenchRunner

add_run(bench_fn: Benchable) → None

Add a benchmark function to be executed by this runner.

Deprecated: use add() instead.

Parameters:: bench_fn (Benchable) – A callable that implements the Benchable protocol

add_bench(class_instance: bencher.variables.parametrised_sweep.ParametrizedSweep) → None

Add a parametrized sweep class instance as a benchmark.

Creates and adds a function that will create a Bench instance from the provided parametrized sweep class when executed.

Parameters:: class_instance (ParametrizedSweep) – The parametrized sweep to benchmark

_merge_reports(target: bencher.bench_report.BenchReport, source: bencher.bench_report.BenchReport | None) → None: Append all tabs from source report into the target report.

_execute_bench_fn(bench_fn: Benchable, run_cfg: bencher.bench_cfg.BenchRunCfg, report: bencher.bench_report.BenchReport | None) → tuple[bencher.bench_cfg.BenchCfg, bencher.bench_report.BenchReport | None]: Execute a bench function handling legacy and new signatures.

run(subsampling_divisions=UNSET, repeats: int = 1, max_subsampling_divisions: int | None = None, max_repeats: int | None = None, min_level: int | None = None, start_repeats: int | None = None, run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, publish: bool = False, debug: bool = False, show: bool | str | bencher.bench_cfg.ShowMode = False, save: bool = False, grouped: bool = False, cache_samples: bool | None = None, over_time: bool | None = None, backend: str | None = None, **kwargs) → list[bencher.bench_cfg.BenchCfg]

Unified interface for running benchmarks.

This function provides a single entry point for benchmark runs:

- Single runs: Use the subsampling_divisions and repeats parameters only
- Progressive runs: Set max_subsampling_divisions and/or max_repeats for automatic progression

Parameters:

subsampling_divisions (int) – Starting benchmark subsampling_divisions. Defaults to 2.
repeats (int) – Starting number of repeats. Defaults to 1.
max_subsampling_divisions (int, optional) – Maximum subsampling_divisions for progression. If None, uses single subsampling_divisions.
max_repeats (int, optional) – Maximum repeats for progression. If None, uses single repeat count.
min_level (int, optional) – DEPRECATED - use ‘subsampling_divisions’ parameter instead.
start_repeats (int, optional) – DEPRECATED - use ‘repeats’ parameter instead.
run_cfg (BenchRunCfg, optional) – Benchmark run configuration. Defaults to None.
publish (bool, optional) – Publish the results to git; requires a publish URL to be set up. Defaults to False.
debug (bool, optional) – Enable debug output during publishing. Defaults to False.
show (bool | str | ShowMode, optional) – How to display results. True/ShowMode.LIVE starts a Panel server (blocks); ShowMode.HTML saves HTML and opens it in the browser (returns); ShowMode.PUBLISHED opens the published URL (requires publish=True); False/ShowMode.NONE displays nothing. Defaults to False.
save (bool, optional) – Save the results to disk in index.html. Defaults to False.
grouped (bool, optional) – Produce a single html page with all the benchmarks included. Defaults to False.
cache_samples (bool | None, optional) – Use the sample cache to reuse previous results. None (default) auto-enables for progressive runs. Pass False to disable even for progressive runs.
over_time (bool, optional) – Enable time-series benchmarking. None preserves run_cfg value.
backend (str, optional) – Visualization backend (‘panel’ or ‘rerun’). None preserves run_cfg value.

Returns:

A list of benchmark configuration objects with results

Return type:

list[BenchCfg]

show_publish(report: bencher.bench_report.BenchReport, show: bool | str | bencher.bench_cfg.ShowMode, publish: bool, save: bool, debug: bool) → None

Handle publishing, saving, and displaying of a benchmark report.

Parameters:

report (BenchReport) – The benchmark report to process
show (bool | str | ShowMode) – How to display the report. See bencher.bench_cfg.normalize_show() for accepted values.
publish (bool) – Whether to publish the report
save (bool) – Whether to save the report to disk
debug (bool) – Whether to enable debug mode for publishing
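
The accepted show values can be pictured with a small normaliser. This is a sketch under assumptions, not the code of bencher.bench_cfg.normalize_show(): ShowModeSketch and normalize_show_sketch are hypothetical names, the lowercase member values are inferred from the LowercaseStrEnum base, and the case-insensitive string handling is assumed.

```python
from enum import Enum


class ShowModeSketch(str, Enum):
    # Lowercase values assumed from the LowercaseStrEnum base class.
    LIVE = "live"
    HTML = "html"
    PUBLISHED = "published"
    NONE = "none"


def normalize_show_sketch(show):
    """Map bool | str | ShowModeSketch onto a ShowModeSketch member."""
    if isinstance(show, ShowModeSketch):
        return show
    if isinstance(show, bool):
        # True behaves like LIVE, False like NONE.
        return ShowModeSketch.LIVE if show else ShowModeSketch.NONE
    # Strings are matched by value; an unknown string raises ValueError.
    return ShowModeSketch(str(show).lower())
```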

show(report: bencher.bench_report.BenchReport | None = None, show: bool | str | bencher.bench_cfg.ShowMode = True, publish: bool = False, save: bool = False, debug: bool = False) → None

Display or publish a specific benchmark report.

This is a convenience method to show, publish, or save a specific report. If no report is provided, it will use the most recent result.

Parameters:

report (BenchReport, optional) – The report to process. Defaults to None (most recent).
show (bool | str | ShowMode, optional) – How to display. See run() for accepted values. Defaults to True.
publish (bool, optional) – Whether to publish the report. Defaults to False.
save (bool, optional) – Whether to save to disk. Defaults to False.
debug (bool, optional) – Enable debug mode for publishing. Defaults to False.

Raises:

RuntimeError – If no report is specified and no results are available

shutdown() → None

Stop all running panel servers launched by this benchmark runner.

This method ensures that any web servers started to display benchmark results are properly shut down.

__del__() → None

Destructor that ensures proper cleanup of resources.

Automatically calls shutdown() to stop any running servers when the BenchRunner instance is garbage collected.

bencher.render_report(result_or_path: bencher.results.bench_result.BenchResult | str | pathlib.Path, output_dir: str | pathlib.Path, *, report: bencher.bench_report.BenchReport | None = None, filename: str | None = None, in_html_folder: bool = False, portable: bool = False, emit_json: bool | str = False) → pathlib.Path

Render a collected result to an HTML report.

Reconstructs the holoviews/panel report from a result produced by Bench.collect() (or a path to one saved with save_result()) and writes it under output_dir. This is the only step that constructs plotting objects, and it is designed to run in a process free of foreign C-extension state.

The result already carries its regression_report (computed during collection), so no sweep re-execution happens here.

Parameters:

result_or_path – A BenchResult, or a path to a saved one.
output_dir – Directory to write the report into.
report – An existing BenchReport to append to. A new one is created (named after the benchmark) when omitted.
filename – Output HTML filename. Defaults to <bench_name>.html.
in_html_folder – Forwarded to BenchReport.save().
portable – Forwarded to BenchReport.save() (base64-inline assets).
emit_json – Forwarded to BenchReport.save(); when truthy also writes a machine-readable result.json next to the HTML.

Returns:

The path to the saved report.

bencher.save_result(bench_res: bencher.results.bench_result.BenchResult, path: str | pathlib.Path) → pathlib.Path

Persist a collected BenchResult to path via pickle.

Mirrors how bencher already caches results internally: the (potentially non-pickleable) object_index is stripped before writing and restored afterwards, so the live object is unchanged.

Parameters:

bench_res – A result from Bench.collect() / plot_sweep.
path – Destination file path.

Returns:

The path written.
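
The strip-then-restore pattern described above can be sketched as follows. A types.SimpleNamespace stands in for a BenchResult, and save_result_sketch and load_result_sketch are hypothetical names. Replacing object_index with an empty list is an assumption about what "stripped" means.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_result_sketch(res, path):
    """Pickle `res` without its object_index, then put the index back."""
    path = Path(path)
    stashed = res.object_index
    res.object_index = []  # strip the part that may not pickle
    try:
        with path.open("wb") as fh:
            pickle.dump(res, fh)
    finally:
        res.object_index = stashed  # restore, so the live object is unchanged
    return path


def load_result_sketch(path):
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Round trip with a non-pickleable entry (a lambda) in object_index.
live = SimpleNamespace(data=[1, 2, 3], object_index=[lambda: None])
with tempfile.TemporaryDirectory() as tmp:
    written = save_result_sketch(live, Path(tmp) / "result.pkl")
    loaded = load_result_sketch(written)
```

The try/finally ensures the live object gets its index back even if pickling fails partway through.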

bencher.load_result(path: str | pathlib.Path) → bencher.results.bench_result.BenchResult: Load a BenchResult previously written by save_result().

bencher.result_to_dict(bench_res: bencher.results.bench_result.BenchResult, *, include_series: bool = False) → dict

Build the stable, JSON-serializable contract for a single result.

Parameters:

bench_res – A collected BenchResult (e.g. from plot_sweep(auto_plot=False) / Bench.collect()).
include_series – When True and the result carries an over_time axis, attach a per-time-event series (series_for_var()) to each scalar metric — the trend behind the regression verdict, for callers that render sparklines. Off by default so the base contract stays byte-stable.

Returns:

A dict with schema_version, bench_name, provenance, input_vars, over_time, metrics, and regressions.

bencher.result_to_json(bench_res: bencher.results.bench_result.BenchResult, path: str | pathlib.Path, *, indent: int = 2, include_series: bool = False) → pathlib.Path: Write result_to_dict() for bench_res to path as JSON.

bencher.series_for_var(ds: xarray.Dataset, var_name: str) → list[dict]

Per-time-event mean/std/n for a scalar result var across the over_time axis.

Reduces over every dim except over_time (the sweep inputs + repeat) with NaN-aware reductions, mirroring the history reduction used elsewhere. The over_time coordinate labels can carry embedded newlines (long labels are wrapped in place), so strip them back to single-line strings.

Returns one {time_event, mean, std, n} record per over-time event, with mean/std coerced finite-or-None so the output stays strict-JSON safe.
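
A dependency-free sketch of the record shape is shown below. It takes plain lists instead of an xarray.Dataset, and series_sketch is a hypothetical name. Using the population standard deviation and joining wrapped labels with single spaces are assumptions.

```python
import math


def series_sketch(events):
    """events: list of (label, values) pairs, one pair per over_time event."""

    def finite_or_none(x):
        # Strict JSON has no NaN or Infinity, so map those to None.
        return x if x is not None and math.isfinite(x) else None

    records = []
    for label, values in events:
        kept = [v for v in values if not math.isnan(v)]  # NaN-aware: skip NaN samples
        n = len(kept)
        mean = sum(kept) / n if n else None
        # Population std (ddof=0) is an assumption.
        std = math.sqrt(sum((v - mean) ** 2 for v in kept) / n) if n else None
        records.append(
            {
                # Collapse embedded newlines back to a single-line label.
                "time_event": " ".join(str(label).split()),
                "mean": finite_or_none(mean),
                "std": finite_or_none(std),
                "n": n,
            }
        )
    return records
```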

bencher.compare_results(baseline: bencher.results.bench_result.BenchResult, candidate: bencher.results.bench_result.BenchResult, *, run_cfg=None) → dict

Diff two independently-collected results into an A/B comparison contract.

Stacks baseline and candidate on a synthetic 2-point over_time axis (baseline first, candidate last) and runs the regular detect_regressions() over it, so the A/B verdict uses identical direction/threshold logic to the over-time path.

Parameters:

baseline – The reference result.
candidate – The result being compared against the baseline.
run_cfg – Optional BenchRunCfg controlling the detector. When omitted, a percentage comparison (regression_method='percentage') is used — the natural choice for a two-point A/B.

Returns:

A dict with schema_version, baseline/candidate provenance, per-metric metrics (with a verdict), and a summary count.

Raises:

ValueError – when the two results share no comparable scalar metric.

bencher.comparison_to_json(baseline: bencher.results.bench_result.BenchResult, candidate: bencher.results.bench_result.BenchResult, path: str | pathlib.Path, *, run_cfg=None, indent: int = 2) → pathlib.Path: Write compare_results() for the two results to path as JSON.

bencher.sparkline_svg(means: list[float | None], stds: list[float | None], *, width: int = 120, height: int = 28, pad: int = 3) → str

Return an inline SVG sparkline: ±std band, mean line, a node per run.

Pure-numeric input (no caller strings interpolated), so the output is safe to render unescaped. Auto-scales to the band extent; degenerates gracefully to a single node when only one finite point exists, and to an empty <svg> when none do.

The SVG carries a viewBox and preserveAspectRatio="none" so CSS can stretch it to whatever width its container ends up at. vector-effect keeps the line/band stroke crisp under that non-uniform scaling.

A small node marks every event on the line so the eye can see where the individual runs sit; nodes are drawn identically (the latest is simply the rightmost) and the sparkline itself stays uncolored — the caller (e.g. a cell background) owns any verdict color.

With more than one point, a narrow right-margin column collapses every run’s mean onto the (shared) value axis as identical alpha-blended dots, so value regions where runs cluster read darker — surfacing the run-to-run spread, and any bimodality, that the mean line alone hides.

means is the trend and drives the x-axis; stds is the noise band and is paired positionally. The two are zipped with itertools.zip_longest() so a length mismatch degrades gracefully rather than silently dropping trailing points: a missing std collapses that point’s band to zero, and a surplus std (no matching mean) is ignored.
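
The positional pairing rule can be sketched on its own, without any SVG output. pair_points is a hypothetical helper, and treating a None std the same as a missing one is an assumption.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel, so a genuine None in the inputs is not confused with padding


def pair_points(means, stds):
    """Pair each mean with its std; means drive the length of the result."""
    points = []
    for mean, std in zip_longest(means, stds, fillvalue=_MISSING):
        if mean is _MISSING:
            break  # surplus std with no matching mean: ignored
        if std is _MISSING or std is None:
            std = 0.0  # missing std: the band collapses to zero at this point
        points.append((mean, std))
    return points
```

A plain zip() would instead truncate to the shorter list and silently drop trailing means.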

class bencher.Chrome

Optional page header content (title, provenance, and CI nav links).

Every field is optional; each nav link renders only when supplied, so the default template carries CI-flavored links harmlessly for callers that leave them blank.

title: str = 'Benchmark Health Scorecard'

commit_sha: str = ''

branch: str = ''

pr_number: str = ''

run_url: str = ''

repo_url: str = ''

nightly_url: str = ''

main_url: str = ''

stable_url: str = ''

class bencher.ReportLayout

Where per-benchmark artifacts live under the reports directory.

root is the subdirectory holding one folder per benchmark tag ("" means the reports directory itself). link_pattern builds the relative href to a benchmark’s HTML report; {root}, {tag} and {bench_name} are substituted.

root: str = ''

link_pattern: str = '{root}/{tag}/{bench_name}.html'

link(tag: str, bench_name: str) → str
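
The placeholder substitution can be sketched with str.format. ReportLayoutSketch is a hypothetical stand-in. This page does not say how an empty root is handled, so dropping the leading slash it would leave behind is an assumption made to keep the href relative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportLayoutSketch:
    root: str = ""
    link_pattern: str = "{root}/{tag}/{bench_name}.html"

    def link(self, tag: str, bench_name: str) -> str:
        href = self.link_pattern.format(root=self.root, tag=tag, bench_name=bench_name)
        # Assumption: strip the leading slash that an empty root leaves behind.
        return href.lstrip("/")
```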

class bencher.ScorecardConfig

Project-specific inputs to the scorecard renderer.

Parameters:

registry – tag -> (category, display_name, description) for known benchmarks. Unregistered tags fall back to an auto-generated name in other_category.
aliases – raw_metric_name -> canonical_name so equivalent metrics from different benchmarks share one column.
percent_metrics – metric names whose value is a 0..1 fraction to be rendered as a percentage rather than a bare number.
layout – on-disk report layout (see ReportLayout).
other_category – fallback category for unregistered tags.

registry: Mapping[str, tuple[str, str, str]]

aliases: Mapping[str, str]

percent_metrics: frozenset[str]

layout: ReportLayout

other_category: str = 'Other'

category_order() → list[str]: Category display order: first-appearance in the registry, Other last.
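As an illustration of "first appearance, Other last", the ordering could be computed as below. This is a sketch under the assumption that the fallback category is always appended at the end; the standalone function and the registry contents are hypothetical.

```python
def category_order_sketch(registry, other_category="Other"):
    """Order categories by first appearance in the registry, fallback last."""
    order = []
    for category, _display_name, _description in registry.values():
        if category != other_category and category not in order:
            order.append(category)
    order.append(other_category)  # assumption: always present, always last
    return order


registry = {
    "bench_io": ("Storage", "Disk I/O", "Sequential read throughput"),
    "bench_misc": ("Other", "Misc", "Uncategorised checks"),
    "bench_net": ("Network", "Round trip", "Ping latency"),
    "bench_cache": ("Storage", "Cache", "Cache hit rate"),
}
print(category_order_sketch(registry))
```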

bencher.generate_scorecard(reports_dir: pathlib.Path | str, config: bencher.scorecard.config.ScorecardConfig | None = None, *, chrome: bencher.scorecard.config.Chrome | None = None, output_name: str = 'index.html') → pathlib.Path

Render the scorecard for all summaries under reports_dir.

Parameters:

reports_dir – Directory containing <layout.root>/<tag>/*.summary.json.
config – Project specifics (registry, aliases, layout, …). Defaults to a zero-config ScorecardConfig (auto-named benchmarks).
chrome – Optional page header / CI nav content.
output_name – File written under reports_dir (the scorecard is usually published as index.html so it is the landing page).

Returns:

The path to the written HTML file.
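The directory convention named under reports_dir can be explored with the standard library alone. The sketch below only shows how files matching <layout.root>/<tag>/*.summary.json could be discovered; `find_summaries` is a hypothetical helper, and the real renderer may locate its inputs differently.

```python
import tempfile
from pathlib import Path


def find_summaries(reports_dir, root=""):
    """Return summary files one folder below <reports_dir>/<root>, sorted."""
    base = Path(reports_dir) / root  # root == "" leaves the path unchanged
    return sorted(base.glob("*/*.summary.json"))


with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway reports directory with one folder per benchmark tag.
    for tag in ("bench_a", "bench_b"):
        tag_dir = Path(tmp) / tag
        tag_dir.mkdir()
        (tag_dir / f"{tag}.summary.json").write_text("{}")
    found = [p.relative_to(tmp).as_posix() for p in find_summaries(tmp)]

print(found)
```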

class bencher.BenchPlotServer

A server for displaying plots of benchmark results.

plot_server(bench_name: str, plot_cfg: bencher.bench_cfg.BenchPlotSrvCfg | None = None, plots_instance=None) → threading.Thread

Load previously calculated benchmark data from the database and start a plot server to display it

Parameters:

bench_name (str) – The name of the benchmark and output folder for the figures
plot_cfg (BenchPlotSrvCfg, optional) – Options for the plot server. Defaults to None.

Raises:

FileNotFoundError – No data was found in the database to plot

load_data_from_cache(bench_name: str) → tuple[bencher.bench_cfg.BenchCfg, list[panel.panel]] | None

Load previously calculated benchmark data from the database.

Parameters:: bench_name (str) – The name of the benchmark and output folder for the figures
Returns:: benchmark result data and any additional panels
Return type:: tuple[BenchCfg, list[pn.panel]] | None
Raises:: FileNotFoundError – No data was found in the database to plot

static _find_free_port() → int

Find a free port by testing random ports in the dynamic/private range.

Using port=0 with Tornado/Bokeh can fail on some Linux kernels (notably 6.x) because the kernel deterministically assigns the same ephemeral port, causing EADDRINUSE when a previous server is still running. Picking a random port from the IANA dynamic range avoids this.

Note: there is an inherent TOCTOU race between probing the port here and the actual bind() inside Panel/Bokeh. In practice the window is very small and the random selection makes collisions unlikely, but callers should be prepared for a rare OSError on server start.
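The probing strategy described above can be sketched with the standard library. This is an illustration under assumptions (loopback interface, 50 attempts, the 49152 to 65535 range), not the method's actual body, and it carries the same TOCTOU caveat: the port may be taken between the probe and the real bind.

```python
import random
import socket


def find_free_port_sketch(attempts=50, low=49152, high=65535):
    """Probe random ports in the IANA dynamic/private range until one binds."""
    for _ in range(attempts):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # port in use (or not bindable): try another
            return port  # the socket is closed on exit, freeing the port
    raise OSError(f"no free port found after {attempts} attempts")


port = find_free_port_sketch()
print(port)
```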

serve(bench_name: str, plots_instance: list[panel.panel], port: int | None = None, show: bool = True) → threading.Thread

Launch a panel server to view results

Parameters:

bench_name (str) – The name of the benchmark whose results are served
plots_instance (list[pn.panel]) – list of panel objects to display
port (int, optional) – use a fixed port to launch the server. Defaults to None.

static _rrd_extra_patterns() → list

Return Tornado route patterns for serving .rrd files with CORS headers.

Mounts cachedir/rrd/ at /rrd_static/ so that the local rerun viewer can fetch .rrd files from the Panel server. See the module docstring in utils_rerun.py for the full architecture explanation.

bencher.hash_sha1(var: Any) → str

A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run.

Converts input to a consistent SHA1 hash string.

Parameters:: var (Any) – The variable to hash
Returns:: A hexadecimal SHA1 hash of the string representation of the variable
Return type:: str
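A minimal sketch of the idea, assuming the string representation is taken with `str()` and encoded as UTF-8 (the encoding is an assumption; the function name is hypothetical). Unlike the built-in `hash()`, whose result for strings varies between processes unless PYTHONHASHSEED is fixed, the digest below is stable across runs.

```python
import hashlib
from typing import Any


def hash_sha1_sketch(var: Any) -> str:
    """Hex SHA1 digest of the string representation of var."""
    return hashlib.sha1(str(var).encode("utf-8")).hexdigest()


print(hash_sha1_sketch("abc"))
print(hash_sha1_sketch((1, 2.5, "x")) == hash_sha1_sketch((1, 2.5, "x")))
```

Because the digest depends on `str(var)`, two objects hash equal only if their string representations match exactly.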

bencher.SUBSAMPLING_DIVISIONS_SAMPLES = [0, 1, 2, 3, 5, 9, 17, 33, 65, 129, 257, 513, 1025, 2049]
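The listed values follow a simple pattern: from index 2 onward, entry d equals 2^(d-2) + 1. The sketch below reproduces the table from that formula. It is an observation about the values shown above, not a statement about how the library builds the list, and the function name is hypothetical.

```python
def samples_for_divisions(divisions: int) -> int:
    """Sample count for a given number of subsampling divisions."""
    if divisions < 2:
        return divisions  # 0 -> 0 samples, 1 -> 1 sample
    return 2 ** (divisions - 2) + 1


table = [samples_for_divisions(d) for d in range(14)]
print(table)
```

Each step doubles the number of intervals between the end points, so an evenly spaced grid at one step contains every point of the grid at the previous step.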

class bencher.IntSweep(units: str = 'ul', samples: int | None = None, sample_values: list[int] | None = None, optimize: bool = True, **params)

Bases: param.Integer, bencher.variables.sweep_base.SweepBase

A class representing a parameter sweep for integer values.

This class extends both Integer and SweepBase to provide parameter sweeping capabilities specifically for integer values within specified bounds or with custom sample values.

units

The units of measurement for the parameter

Type:: str

samples

The number of samples to take from the range

Type:: int

sample_values

Specific integer values to use as samples instead of generating them from bounds. If provided, overrides the samples parameter.

Type:: list[int], optional

__slots__ = ['units', 'samples', 'optimize', 'sample_values']

units = 'ul'

optimize = True

_coerce_bound(value): Override in subclasses to coerce bound values to the correct type.

_sweep_identity() → tuple: Include bounds and sample_values so a reshaped sweep busts the cache.

values() → list[int]

Return all the values for the parameter sweep.

If sample_values is provided, returns those values. Otherwise generates values within the specified bounds.

Returns:: A list of integer values to sweep through
Return type:: list[int]

_validate_value(value, allow_None)

Validate the parameter value against constraints.

Parameters:

value (Any) – The value to be validated.
allow_None (bool) – Whether None is allowed as a valid value.

Raises:

ValueError – If the value does not meet the parameter’s constraints.

_validate_step(val, step)

class bencher.FloatSweep(units: str = 'ul', samples: int = 10, sample_values: list[float] | None = None, step: float | None = None, optimize: bool = True, **params)

Bases: param.Number, bencher.variables.sweep_base.SweepBase

A class representing a parameter sweep for floating point values.

This class extends both Number and SweepBase to provide parameter sweeping capabilities specifically for floating point values within specified bounds or with custom sample values.

units

The units of measurement for the parameter

Type:: str

samples

The number of samples to take from the range

Type:: int

sample_values

Specific float values to use as samples instead of generating them from bounds. If provided, overrides the samples parameter.

Type:: list[float], optional

step

Step size between samples when generating values from bounds

Type:: float, optional

__slots__ = ['units', 'samples', 'optimize', 'sample_values']

units = 'ul'

optimize = True

sample_values

_coerce_bound(value): Override in subclasses to coerce bound values to the correct type.

_sweep_identity() → tuple: Include bounds, sample_values, and step so a reshaped sweep busts the cache.

values() → list[float]

Return all the values for the parameter sweep.

If sample_values is provided, returns those values. Otherwise generates values within the specified bounds, either using linspace (when step is None) or arange.

Returns:: A list of float values to sweep through
Return type:: list[float]
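The two generation modes can be sketched in pure Python. This is an illustration only: the function name is hypothetical, and the step branch is assumed to mirror numpy.arange by excluding the upper bound, which the library may or may not do.

```python
import math


def float_sweep_values(low, high, samples=10, step=None):
    """Generate sweep values from bounds, linspace-style or arange-style."""
    if step is None:
        # linspace-style: `samples` evenly spaced points including both ends.
        if samples == 1:
            return [low]
        spacing = (high - low) / (samples - 1)
        return [low + i * spacing for i in range(samples)]
    # arange-style (assumption): fixed step, upper bound excluded.
    count = max(0, math.ceil((high - low) / step))
    return [low + i * step for i in range(count)]


print(float_sweep_values(0.0, 1.0, samples=5))
print(float_sweep_values(0.0, 1.0, step=0.25))
```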

class bencher.StringSweep(string_list: list[str], units: str = 'ul', samples: int | None = None, optimize: bool = True, **params)

Bases: SweepSelector

A class representing a parameter sweep for string values.

This class extends SweepSelector to provide parameter sweeping capabilities specifically for a list of string values.

units

The units of measurement for the parameter

Type:: str

samples

The number of samples to take from the available strings

Type:: int

classmethod dynamic(*, placeholder: str | None = None, units: str = 'ul', doc: str | None = None, **params) → StringSweep

Create a StringSweep intended for later population.

Parameters:

placeholder – Optional text to show before real values are loaded. Defaults to the sentinel’s displayed text.
units – Units label (optional, passed through).
doc – Documentation string for the parameter.
params – Additional param overrides.

Returns:

A sweep with a single sentinel placeholder value.

Return type:

StringSweep

class bencher.EnumSweep(enum_type: enum.Enum | list[enum.Enum], units: str = 'ul', samples: int | None = None, optimize: bool = True, **params)

Bases: SweepSelector

A class representing a parameter sweep for enum values.

This class extends SweepSelector to provide parameter sweeping capabilities specifically for enumeration types, supporting both enum types and lists of enum values.

units

The units of measurement for the parameter

Type:: str

samples

The number of samples to take from the available enum values

Type:: int

__slots__ = ['units', 'samples', 'optimize']

class bencher.BoolSweep(units: str = 'ul', samples: int | None = None, default: bool = True, optimize: bool = True, **params)

Bases: SweepSelector

A class representing a parameter sweep for boolean values.

This class extends SweepSelector to provide parameter sweeping capabilities specifically for boolean values (True and False).

units

The units of measurement for the parameter

Type:: str

samples

The number of samples to take (typically 2 for booleans)

Type:: int

class bencher.SweepBase(default: Any = ..., *, doc: str | None = None, label: str | None = None, precedence: float | None = None, instantiate: bool = False, constant: bool = False, readonly: bool = False, pickle_default_value: bool = True, allow_None: Literal[False] = False, per_instance: bool = True, allow_refs: bool = False, nested_refs: bool = False, default_factory: collections.abc.Callable[[], Any] | None = None, metadata: dict[str, Any] | None = None)

Bases: param.Parameter

Base Parameter type to hold any type of Python object.

Parameters are a special kind of class attribute implemented as descriptor. Setting a Parameterized class attribute to a Parameter instance enables enhanced functionality, including type and range validation at assignment, support for constant and read-only parameters, documentation strings and dynamic parameter values.

Parameters can only be used as class attributes of Parameterized classes. Using them in standalone contexts or with non-Parameterized classes will not provide the described behavior.

Notes

Parameters provide lots of features.

Dynamic Behavior:

Parameters provide support for dynamic values, type validation, and range checking.
Parameters can be declared as constant or read-only.

Automatic Initialization:

Parameters can be set during object construction using keyword arguments. For example: myfoo = Foo(alpha=0.5); print(myfoo.alpha).
If custom constructors are implemented, they can still pass keyword arguments to the superclass to allow Parameter initialization.

Inheritance:

Parameterized classes automatically inherit parameters from their superclasses. Attributes can be selectively overridden.

Subclassing:

The Parameter class can be subclassed to create custom behavior, such as validating specific ranges or generating values dynamically.

GUI Integration:

Parameters provide sufficient metadata for auto-generating property sheets in graphical user interfaces, enabling user-friendly parameter editing.

Examples

Define a Parameterized class with parameters:

>>> import param
>>> class Foo(param.Parameterized):
...     alpha = param.Parameter(default=0.1, doc="The starting value.")
...     beta = param.Parameter(default=0.5, doc="The standard deviation.", constant=True)

When no initial value is provided the default is used:

>>> Foo().alpha
0.1

When an initial value is provided it is used:

>>> foo = Foo(alpha=0.5)
>>> foo.alpha
0.5

Constant parameters cannot be modified:

>>> foo.beta = 0.1  # Cannot be changed since it's constant
...
TypeError: Constant parameter 'beta' cannot be modified

property sweep_bounds: tuple | None

Return the sweep range (low, high).

FloatSweep/IntSweep store user-supplied bounds as param softbounds (not hard bounds) so that values outside the range are not rejected. This property provides a single access point.

abstract values() → list[Any]

All sweep classes must implement this method. It generates sample values based on the parameter's bounds and sample count.

Returns:: A list of samples from the variable
Return type:: list[Any]

_sweep_hash_exclude: tuple[str, Ellipsis] = ('optimize',)

_sweep_identity() → tuple

Return the tuple of values that uniquely identifies this sweep for the benchmark-level and over_time history caches.

Subclasses MUST override and call super()._sweep_identity() + (...) to append any shape-affecting fields: bounds, sample_values, step, objects, etc. Any field that changes the set of sampled values or the coordinate labels of the resulting xarray must contribute here, otherwise the benchmark-level cache and over_time history will silently serve stale data for a reshaped sweep.

The class name is included so that different Sweep subclasses with the same identity tuple do not collide. The variable name is also included: the stored history keys its dims/coords by name, so two same-shaped sweeps with different names are different experiments — without the name in the key, renaming an input var would silently concat a fresh dataset against history whose dimension no longer exists, broadcasting both dims and fabricating never-measured points.

Note: the sample cache is keyed solely by concrete input values (see bencher.worker_job.WorkerJob) and is unaffected by this hash, so widening a sweep range still reuses per-sample cache entries for overlapping inputs.

hash_persistent() → str

Deterministic hash based on _sweep_identity().

Avoids Python’s per-process hash randomisation so two Bench runs (or two processes) compute identical cache keys for equivalent sweeps.
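The override contract for _sweep_identity() and its use in hash_persistent() can be sketched with two toy classes. Everything here is hypothetical (class names, fields, and the use of SHA1 over `repr`); it only illustrates why the class name, the variable name, and every shape-affecting field belong in the identity tuple.

```python
import hashlib


class SweepSketch:
    """Toy base class; not the real SweepBase."""

    def __init__(self, name, samples):
        self.name = name
        self.samples = samples

    def _sweep_identity(self):
        # Class name and variable name are part of the identity.
        return (type(self).__name__, self.name, self.samples)

    def hash_persistent(self):
        text = repr(self._sweep_identity())
        return hashlib.sha1(text.encode("utf-8")).hexdigest()


class BoundedSweepSketch(SweepSketch):
    def __init__(self, name, samples, bounds):
        super().__init__(name, samples)
        self.bounds = bounds

    def _sweep_identity(self):
        # Append shape-affecting fields to the parent's tuple.
        return super()._sweep_identity() + (self.bounds,)


a = BoundedSweepSketch("theta", 5, (0.0, 1.0))
b = BoundedSweepSketch("theta", 5, (0.0, 1.0))
wider = BoundedSweepSketch("theta", 5, (0.0, 2.0))
renamed = BoundedSweepSketch("phi", 5, (0.0, 1.0))
print(a.hash_persistent() == b.hash_persistent())
print(a.hash_persistent() == wider.hash_persistent())
```

Equivalent sweeps share a key, while a change of bounds or of the variable name produces a different key, so stale history is not reused.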

sampling_str() → str: Generate a string representation of the sampling procedure

as_slider() → panel.widgets.slider.DiscreteSlider

Given a sweep variable (self), return the range of values as a panel slider.

Parameters:: debug (bool, optional) – passed to the sweep variable; when debug=True a reduced number of sweep values is produced instead of the full set. Defaults to False.
Returns:: A panel slider with the values() of the sweep variable
Return type:: pn.widgets.slider.DiscreteSlider

as_dim(compute_values=False) → holoviews.Dimension

Takes a sweep variable and turns it into a HoloViews dimension

Return type:: hv.Dimension

indices_to_samples(desires_num_samples, sample_values)

with_samples(samples: int) → SweepBase

_coerce_bound(value): Override in subclasses to coerce bound values to the correct type.

with_bounds(low: float, high: float, samples: int | None = None) → SweepBase

Create a copy with overridden sweep bounds (and optionally sample count).

Parameters:

low – Lower bound of the sweep range.
high – Upper bound of the sweep range.
samples – Number of samples. When None the existing sample count is kept.

Returns:

A new sweep with the specified bounds.

Return type:

SweepBase

Raises:

ValueError – If low >= high or the sweep has no bounds attributes.
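The copy-and-validate behaviour described for with_bounds can be sketched with a frozen dataclass. The class below is hypothetical and much simpler than a real sweep; it only shows the documented contract: reject low >= high, keep the existing sample count when samples is None, and leave the original untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BoundedSketch:
    """Toy stand-in for a bounded sweep; not the library class."""

    low: float
    high: float
    samples: int

    def with_bounds(self, low, high, samples=None):
        if low >= high:
            raise ValueError(f"low ({low}) must be less than high ({high})")
        new_samples = self.samples if samples is None else samples
        return replace(self, low=low, high=high, samples=new_samples)


original = BoundedSketch(0.0, 1.0, samples=5)
zoomed = original.with_bounds(0.4, 0.6)
dense = original.with_bounds(0.4, 0.6, samples=20)
print(original, zoomed, dense, sep="\n")
```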

with_sample_values(sample_values: list) → SweepBase

__call__(values: list | None = None, *, samples: int | None = None, bounds: tuple[float, float] | None = None) → SweepBase

Shorthand for creating a sweep with specific values, sample count, or bounds.

Usage:

Cfg.param.theta([0, 0.5, 1.0])            # explicit values
Cfg.param.theta(samples=5)                  # override sample count
Cfg.param.theta(bounds=(0, 1))              # override range
Cfg.param.theta(bounds=(0, 1), samples=10)  # override range and count

Parameters:

values – Explicit list of values to sweep through.
samples – Number of samples to take from the sweep range.
bounds – (low, high) tuple to override the sweep range.

Returns:

A copy of this sweep with the specified configuration.

Return type:

SweepBase

with_const(const_value: Any) → tuple[SweepBase, Any]

Create a new instance of SweepBase with a constant value.

Parameters:: const_value (Any) – The constant value to be associated with the new instance.
Returns:: A tuple containing the new instance of SweepBase and the constant value.
Return type:: tuple[SweepBase, Any]

with_subsampling_divisions(subsampling_divisions: int = 1, max_subsampling_divisions: int = 12) → SweepBase

with_level(level: int = 1, max_level: int = 12) → SweepBase: Deprecated: use with_subsampling_divisions() instead.

class bencher.YamlSweep(yaml_path: str | pathlib.Path, units: str = 'ul', samples: int | None = None, default_key: str | None = None, optimize: bool = True, **params)

Bases: SweepSelector

Sweep over configurations stored in a YAML file.

Loads the YAML mapping once during initialisation and exposes each top-level key as a sweep choice. Each sampled value is a YamlSelection instance that exposes the underlying YAML content via the value attribute (and dict-like helpers).

__slots__ = ['units', 'samples', 'optimize', 'yaml_path', '_entries', 'default_key']

_sweep_hash_exclude = ('yaml_path', '_entries', 'default_key')

yaml_path = ''

_entries

default_key = None

static _load_yaml(path: pathlib.Path) → Any

keys() → list[str]

items() → list[tuple[str, Any]]

values() → list[Any]

Return all the values for the parameter sweep.

Returns:: A list of parameter values to sweep through
Return type:: list[Any]

key_for_value(value: Any) → str | None

class bencher.TimeSnapshot(datetime_src: datetime.datetime | str, units: str = 'time', samples: int | None = None, **params)

Bases: TimeBase

A class to capture a time snapshot of benchmark values. Time is represented as a continuous value, i.e. a datetime that is converted into a np.datetime64. To represent time as a discrete value, use the TimeEvent class. The distinction exists because HoloViews and Plotly code make different assumptions about discrete vs continuous variables.

__slots__ = ['units', 'samples', 'optimize']

units = 'time'

optimize = False

bencher.box(name: str, center: float, width: float) → FloatSweep

Create a FloatSweep parameter centered around a value with a given width.

This is a convenience function to create a bounded FloatSweep parameter with bounds centered on a specific value, extending by the width in both directions.

Parameters:

name (str) – The name of the parameter
center (float) – The center value of the parameter
width (float) – The distance from the center to the bounds in both directions

Returns:

A FloatSweep parameter with the specified name, default, and bounds

Return type:

FloatSweep
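Since width is the distance from the center to each bound, the resulting range spans twice the width. A tiny sketch of the arithmetic, using a hypothetical helper that returns only the bounds rather than a FloatSweep:

```python
def box_bounds(center: float, width: float) -> tuple[float, float]:
    """Bounds extending `width` on either side of `center`."""
    return (center - width, center + width)


print(box_bounds(1.0, 0.5))
print(box_bounds(0.0, 2.0))
```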

bencher.p(name: str | bencher.variables.sweep_base.SweepBase, values: list[Any] | None = None, *, samples: int | None = None, bounds: tuple[float, float] | None = None, max_subsampling_divisions: int | None = None) → dict[str, Any] | bencher.variables.sweep_base.SweepBase: Deprecated: use bn.sweep() instead.

Create a parameter specification for use in plot_sweep input_vars.

Accepts either a string parameter name (returns a dict for deferred lookup) or a SweepBase parameter object (returns a configured sweep directly).

Examples:

bn.sweep("theta", [0, 0.5, 1.0])                  # explicit values
bn.sweep("theta", samples=5)                        # override sample count
bn.sweep("theta", bounds=(0, 1))                    # override range
bn.sweep("theta", bounds=(0, 1), samples=10)        # override range + count
bn.sweep(Cfg.param.theta, bounds=(0, 1), samples=5) # SweepBase object

Parameters:

name – The parameter name (str) or a param object (e.g. Cfg.param.theta).
values – A list of values for the parameter.
samples – The number of samples. Must be > 0 if provided.
bounds – (low, high) tuple to override the sweep range.
max_subsampling_divisions – The maximum subsampling_divisions. Must be > 0 if provided.

Returns:

A parameter dict (for string names) or configured sweep object.

Return type:

dict[str, Any] | SweepBase

bencher.with_subsampling_divisions(arr: list, subsampling_divisions: int) → list

Apply subsampling_divisions-based sampling to a list of values.

Uses the same subsampling_divisions→sample-count table as SweepBase.with_subsampling_divisions and picks evenly spaced items from arr by index.

Parameters:

arr (list) – list of values to sample from
subsampling_divisions (int) – The number of subsampling divisions to apply (higher values yield more samples)

Returns:

The subsampled values

Return type:

list
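One way to pick evenly spaced items by index is sketched below. It is an assumption-laden illustration rather than the library's implementation: the function name is hypothetical, the sample count is capped at the list length, and indices are obtained by rounding.

```python
SAMPLES_TABLE = [0, 1, 2, 3, 5, 9, 17, 33, 65, 129, 257, 513, 1025, 2049]


def subsample_by_divisions(arr, subsampling_divisions):
    """Pick evenly spaced items from arr for the given division count."""
    count = min(SAMPLES_TABLE[subsampling_divisions], len(arr))
    if count <= 0:
        return []
    if count == 1:
        return [arr[0]]  # assumption: a single sample takes the first item
    stride = (len(arr) - 1) / (count - 1)
    return [arr[round(i * stride)] for i in range(count)]


values = list(range(0, 90, 10))  # nine items: 0, 10, ..., 80
print(subsample_by_divisions(values, 3))
print(subsample_by_divisions(values, 4))
```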

class bencher.ResultFloat(units='ul', direction: OptDir = OptDir.minimize, share_axis=True, max_time_events=None, default=float('nan'), meaning_version=1, **params)

Bases: param.Number

A class to represent continuous float result variables and the desired optimisation direction.

For boolean (success/failure) outcomes, use ResultBool instead — it locks bounds to [0, 1] and produces correct boolean-style plots.

__slots__ = ['units', 'direction', 'share_axis', 'max_time_events', 'meaning_version']

_hash_exclude = ('direction', 'share_axis', 'max_time_events')

units = 'ul'

meaning_version = 1

default

direction

share_axis = True

max_time_events = None

as_dim() → holoviews.Dimension

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.ResultVar(*args, **kwargs)

Bases: ResultFloat

Deprecated: use ResultFloat instead.

class bencher.ResultBool(units='ratio', direction: OptDir = OptDir.minimize, default=float('nan'), **params)

Bases: ResultFloat

A result type for binary outcomes (success/failure, pass/fail, reachable/unreachable).

Bounds are locked to [0, 1] and plots use boolean-style rendering. For continuous scalar metrics (time, distance, score), use ResultFloat instead.

default

bounds = (0, 1)

_validate_bounds(val, bounds, inclusive_bounds)

bencher.SCALAR_RESULT_TYPES

class bencher.ResultVec(size, units='ul', direction: OptDir = OptDir.minimize, max_time_events=None, default=float('nan'), **params)

Bases: param.List

A class to represent a fixed-size vector result variable.

__slots__ = ['units', 'direction', 'size', 'max_time_events']

_hash_exclude = ('max_time_events',)

units = 'ul'

default

direction

size

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

index_name(idx: int) → str

Given the index of the vector, return the corresponding column name.

Parameters:: idx (int) – index of the result vector
Returns:: column name of the vector for the xarray dataset
Return type:: str

index_names() → list[str]

Returns a list of all the xarray column names for the result vector

Returns:: column names
Return type:: list[str]

class bencher.ResultHmap(default: Any = ..., *, doc: str | None = None, label: str | None = None, precedence: float | None = None, instantiate: bool = False, constant: bool = False, readonly: bool = False, pickle_default_value: bool = True, allow_None: Literal[False] = False, per_instance: bool = True, allow_refs: bool = False, nested_refs: bool = False, default_factory: collections.abc.Callable[[], Any] | None = None, metadata: dict[str, Any] | None = None)

Bases: param.Parameter

A class to represent a holomap return type.

Note: this class has no __slots__, so _hash_slots hashes only the class name. Every ResultHmap instance produces the same hash. This is intentional — there are no configuration attributes that would differentiate instances. If a slot is added in the future, _hash_slots will automatically include it.
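The slot-based hashing described in this note, together with the _hash_exclude tuples listed on the other result classes, can be sketched as follows. The helper `hash_slots_sketch` and both toy classes are hypothetical; the real _hash_slots may combine the values differently.

```python
import hashlib


def hash_slots_sketch(obj):
    """Hash the class name plus every slot value not listed in _hash_exclude."""
    cls = type(obj)
    parts = [cls.__name__]
    for slot in getattr(cls, "__slots__", ()):
        if slot not in getattr(cls, "_hash_exclude", ()):
            parts.append(f"{slot}={getattr(obj, slot)!r}")
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


class WithSlots:
    __slots__ = ["units", "max_time_events"]
    _hash_exclude = ("max_time_events",)

    def __init__(self, units, max_time_events=None):
        self.units = units
        self.max_time_events = max_time_events


class NoSlots:
    """No __slots__: only the class name contributes to the hash."""


print(hash_slots_sketch(NoSlots()) == hash_slots_sketch(NoSlots()))
print(hash_slots_sketch(WithSlots("s", 1)) == hash_slots_sketch(WithSlots("s", 99)))
print(hash_slots_sketch(WithSlots("s")) == hash_slots_sketch(WithSlots("m")))
```

Excluded slots such as max_time_events can change without altering the hash, while a change to an included slot such as units produces a different hash.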

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.ResultPath(default=None, units='path', max_time_events=None, **params)

Bases: param.Filename

Parameter that can be set to a string specifying the path of a file.

The string should be specified in UNIX style, but it will be returned in the format of the user’s operating system.

The specified path can be absolute, or relative to either:

any of the paths specified in the search_paths attribute (if search_paths is not None);
any of the paths searched by resolve_path() (if search_paths is None).

__slots__ = ['units', 'max_time_events']

_hash_exclude = ('max_time_events',)

units = 'path'

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

to_container(): Returns a partial function for creating a FileDownload widget with embedding enabled. This function is used to create a panel container to represent the ResultPath object

class bencher.ResultVideo(default=None, units='path', max_time_events=None, **params)

Bases: param.Filename

Parameter that can be set to a string specifying the path of a file.

The string should be specified in UNIX style, but it will be returned in the format of the user’s operating system.

The specified path can be absolute, or relative to either:

any of the paths specified in the search_paths attribute (if search_paths is not None);
any of the paths searched by resolve_path() (if search_paths is None).

__slots__ = ['units', 'max_time_events']

_hash_exclude = ('max_time_events',)

units = 'path'

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.ResultImage(default=None, units='path', max_time_events=None, **params)

Bases: param.Filename

Parameter that can be set to a string specifying the path of a file.

The string should be specified in UNIX style, but it will be returned in the format of the user’s operating system.

The specified path can be absolute, or relative to either:

any of the paths specified in the search_paths attribute (if search_paths is not None);
any of the paths searched by resolve_path() (if search_paths is None).

__slots__ = ['units', 'max_time_events']

_hash_exclude = ('max_time_events',)

units = 'path'

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.ResultString(default=None, units='str', max_time_events=None, **params)

Bases: param.String

A String Parameter with optional regular expression (regex) validation.

The String class extends the Parameter class to specifically handle string values and provides additional support for validating values against a regular expression.

Parameters:

default (str, optional) – The default value of the parameter. Default is an empty string ("").
regex (str or None, optional) – A regular expression used to validate the string value. If None, no regex validation is applied. Default is None.

Examples

Define a String parameter with regex validation:

>>> import param
>>> class MyClass(param.Parameterized):
...     user_name = param.String(default="John Doe", regex=r"^[A-Za-z ]+$", doc="Name of a person.")
>>> instance = MyClass()

Access the default value:

>>> instance.user_name
'John Doe'

Set a valid value:

>>> instance.user_name = "Jane Smith"
>>> instance.user_name
'Jane Smith'

Attempt to set an invalid value (non-alphabetic characters):

>>> instance.user_name = "Jane123"
...
ValueError: String parameter 'MyClass.user_name' value 'Jane123' does not match regex '^[A-Za-z ]+$'.

__slots__ = ['units', 'max_time_events']

_hash_exclude = ('max_time_events',)

units = 'str'

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.ResultContainer(default=None, units='container', max_time_events=None, **params)

Bases: param.Parameter

Base Parameter type to hold any type of Python object.

__slots__ = ['units', 'max_time_events']

_hash_exclude = ('max_time_events',)

units = 'container'

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.ResultRerun(default=None, units='rerun', width=600, height=600, max_time_events=None, **params)

Bases: ResultContainer

Result type for rerun .rrd spatial visualizations.

Stores a path to an .rrd file (like ResultContainer) but carries viewer sizing metadata and provides a dedicated to_container() that renders the file with the rerun web viewer.

Usage in a ParametrizedSweep:

out_rerun = ResultRerun(width=600, height=600)

def benchmark(self):
    rr.log("boxes", rr.Boxes2D(half_sizes=[self.theta, 1]))
    self.out_rerun = bn.capture_rerun_window(width=600, height=600)

__slots__ = ['width', 'height']

_hash_exclude = ('width', 'height')

width = 600

height = 600

to_container(): Return a callable that renders an .rrd file path as a rerun viewer pane.

class bencher.ResultReference(obj: Any | None = None, container: Callable[Any, panel.pane.panel] | None = None, default: Any | None = None, units: str = 'container', max_time_events=None, **params)

Bases: param.Parameter

Use this class to save arbitrary objects that are not picklable or native to panel. You can pass a container callback that takes the object and returns a panel pane to be displayed.

__slots__ = ['units', 'obj', 'container', 'max_time_events']

_hash_exclude = ('obj', 'container', 'max_time_events')

units = 'container'

obj = None

container = None

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.ResultVolume(obj=None, default=None, units='container', max_time_events=None, **params)

Bases: param.Parameter

Base Parameter type to hold any type of Python object.

__slots__ = ['units', 'obj', 'max_time_events']

_hash_exclude = ('obj', 'max_time_events')

units = 'container'

obj = None

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

class bencher.OptDir

Bases: strenum.StrEnum

StrEnum is a Python enum.Enum that inherits from str. The default auto() behavior uses the member name as its value.

Example usage:

class Example(StrEnum):
    UPPER_CASE = auto()
    lower_case = auto()
    MixedCase = auto()

assert Example.UPPER_CASE == "UPPER_CASE"
assert Example.lower_case == "lower_case"
assert Example.MixedCase == "MixedCase"

minimize

maximize

none

class bencher.ResultDataSet(obj: Any | None = None, default: Any | None = None, units: str = 'dataset', max_time_events=None, **params)

Bases: param.Parameter

Base Parameter type to hold any type of Python object.

__slots__ = ['units', 'obj', 'max_time_events']

_hash_exclude = ('obj', 'max_time_events')

units = 'dataset'

obj = None

max_time_events = None

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run

bencher.curve(x_vals: list[float], y_vals: list[float], x_name: str, y_name: str, label: str | None = None, **kwargs) → holoviews.Curve

class bencher.ComposeType

Bases: strenum.StrEnum

right

down

sequence

overlay

flip()

static from_horizontal(horizontal: bool)
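The reference does not describe flip() or from_horizontal(). One plausible reading, given the member names, is sketched below with the standard library only. `ComposeTypeSketch` is a stand-in class, and both method bodies are assumptions rather than the library's confirmed behavior.

```python
from enum import Enum


class ComposeTypeSketch(str, Enum):
    """Stand-in for ComposeType with the four documented members."""

    right = "right"
    down = "down"
    sequence = "sequence"
    overlay = "overlay"

    def flip(self) -> "ComposeTypeSketch":
        # Assumption: flipping swaps the two spatial directions only.
        if self is ComposeTypeSketch.right:
            return ComposeTypeSketch.down
        if self is ComposeTypeSketch.down:
            return ComposeTypeSketch.right
        raise ValueError(f"cannot flip {self.value!r}")

    @staticmethod
    def from_horizontal(horizontal: bool) -> "ComposeTypeSketch":
        # Assumption: horizontal layouts grow rightwards, vertical ones downwards.
        return ComposeTypeSketch.right if horizontal else ComposeTypeSketch.down
```

Under these assumptions, `ComposeTypeSketch.from_horizontal(True).flip()` yields the `down` member.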

class bencher.ComposableContainerBase

A base class for renderer backends. A composable renderer collects objects via append() and combines them in render() according to its ComposeType.

compose_method: ComposeType

container: list[Any] = []

label_len: int = 0

static label_formatter(var_name: str, var_value: int | float | str) → str

Take a variable name and value and return a formatted label of approximately fixed width.

Parameters:

var_name (str) – The name of the variable, usually a dimension
var_value (int | float | str) – The value of the dimension

Returns:

Pretty string representation with fixed width

Return type:

str
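The exact output format of label_formatter is not given here. A minimal sketch of the idea, assuming a `name=value` layout and four decimal places for floats (both assumptions), keeps labels for swept float values at a similar width:

```python
from __future__ import annotations


def label_formatter_sketch(var_name: str, var_value: int | float | str) -> str:
    """Format `name=value`, rounding floats so labels have similar widths.

    Hypothetical stand-in; the separator and precision are assumptions.
    """
    if isinstance(var_value, float):
        var_value = f"{var_value:.4f}"
    return f"{var_name}={var_value}"
```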

append(obj: Any) → None

Add an object to the container. The relationship between the objects is defined by the ComposeType

Parameters: obj (Any) – Object to add to the container

render()

Return a representation of the container that can be composed with other render() results. This function can also be used to defer layout and rendering options until all the information about the container content is known. You may need to override this method depending on the container. See composable_container_video as an example.

Returns: Visual representation of the container that can be combined with other containers
Return type: Any
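The append()/render() contract can be illustrated with a toy text backend. `TextContainerSketch` is hypothetical and does not subclass the real base class. It only shows how layout is deferred until render() and how one container's rendered output can be appended to another.

```python
from dataclasses import dataclass, field


@dataclass
class TextContainerSketch:
    """Hypothetical text backend: collects strings, joins them on render()."""

    compose_method: str = "right"  # "right" or "down", mirroring ComposeType names
    container: list = field(default_factory=list)

    def append(self, obj: str) -> None:
        self.container.append(obj)

    def render(self) -> str:
        # Layout is deferred until render(), when all content is known.
        sep = " | " if self.compose_method == "right" else "\n"
        return sep.join(self.container)


row = TextContainerSketch(compose_method="right")
row.append("a")
row.append("b")

col = TextContainerSketch(compose_method="down")
col.append(row.render())  # rendered output composes with other containers
col.append("c")
```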

class bencher.PaneLayout

Bases: strenum.StrEnum

Controls how multi-dimensional data is laid out in panel displays.

grid: Use rows/columns for all dimensions (default, existing behavior)
tabs: Use tabs for all outer dimensions; only the innermost uses grid
tabs_and_grid: Use tabs for the outermost dimension, grid for inner dimensions

grid

tabs

tabs_and_grid

classmethod all() → list[PaneLayout]: Return all layout values. Use this instead of hard-coded name lists.

class bencher.ComposableContainerVideo

Bases: bencher.results.composable_container.composable_container_base.ComposableContainerBase

A base class for renderer backends. A composable renderer

append(obj: moviepy.VideoClip | moviepy.ImageClip | str | numpy.ndarray) → None

Appends an image or video to the container

Parameters: obj (VideoClip | ImageClip | str | np.ndarray) – Any representation of an image or video
Raises: RuntimeWarning – if file format is not recognised

calculate_duration(frames, render_cfg: RenderCfg)

render(render_cfg: RenderCfg | None = None, **kwargs) → moviepy.CompositeVideoClip

Composes the images/videos into a single image/video based on the type of compose method

Parameters: compose_method (ComposeType, optional) – optionally override the default compose type. Defaults to None.
Returns: A composite video clip containing the images/videos added via append()
Return type: CompositeVideoClip

to_video(render_args: RenderCfg | None = None) → str

Returns the composite video clip as a webm file path

Returns: webm filepath
Return type: str

deep()

extend_clip(clip: moviepy.VideoClip, desired_duration: float)

class bencher.RenderCfg

Configuration class for video rendering options.

This class controls how videos and images are composed and rendered together. It provides options for timing, layout, appearance, and labeling of the output.

compose_method

Method to compose multiple clips (sequence, right, down, overlay). Defaults to ComposeType.sequence.

Type: ComposeType

var_name

Variable name for labeling. Defaults to None.

Type: str, optional

var_value

Variable value for labeling. Defaults to None.

Type: str, optional

background_col

RGB color for background. Defaults to white (255, 255, 255).

Type: tuple[int, int, int]

duration

Target duration for the composed video in seconds. Defaults to 10.0.

Type: float

default_duration

Fallback duration when duration is None. Defaults to 10.0.

Type: float

duration_target

If True, tries to match target duration while respecting frame duration constraints. If False, uses exact duration. Defaults to True.

Type: bool

min_frame_duration

Minimum duration for each frame in seconds. Defaults to 1/30.

Type: float

max_frame_duration

Maximum duration for each frame in seconds. Defaults to 2.0.

Type: float

margin

Margin size in pixels to add around clips. Defaults to 0.

Type: int

compose_method: bencher.results.composable_container.composable_container_base.ComposeType

var_name: str | None = None

var_value: str | None = None

background_col: tuple[int, int, int] = (255, 255, 255)

duration: float = 10.0

default_duration: float = 10.0

duration_target: bool = True

min_frame_duration: float = 0.03333333333333333

max_frame_duration: float = 2.0

margin: int = 0
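How duration, duration_target and the frame-duration limits interact can be sketched as below. `RenderCfgSketch` and `frame_timing` are hypothetical names. The clamping logic is one reading of the attribute descriptions above, not the confirmed body of calculate_duration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RenderCfgSketch:
    """Subset of the documented RenderCfg fields, with the documented defaults."""

    duration: float | None = 10.0
    default_duration: float = 10.0
    duration_target: bool = True
    min_frame_duration: float = 1.0 / 30.0
    max_frame_duration: float = 2.0


def frame_timing(frames: int, cfg: RenderCfgSketch) -> tuple[float, float]:
    """Return (total_duration, frame_duration) for `frames` equal-length frames."""
    duration = cfg.default_duration if cfg.duration is None else cfg.duration
    frame_duration = duration / frames
    if cfg.duration_target:
        # Treat `duration` as a target: clamp each frame into the allowed range.
        frame_duration = min(max(frame_duration, cfg.min_frame_duration), cfg.max_frame_duration)
        duration = frame_duration * frames
    return duration, frame_duration
```

With the defaults, two frames cannot fill 10 seconds because each is capped at 2.0 seconds, so the sketch returns a 4.0 second clip. With duration_target=False the same input yields exactly 10.0 seconds at 5.0 seconds per frame.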

class bencher.ComposableContainerPanel

Bases: bencher.results.composable_container.composable_container_base.ComposableContainerBase

A base class for renderer backends. A composable renderer

name: str | None = None

var_name: str | None = None

var_value: str | None = None

width: int | None = None

background_col: str | None = None

horizontal: bool | None = None

__post_init__() → None

append(obj)

Add an object to the container. The relationship between the objects is defined by the ComposeType

Parameters: obj (Any) – Object to add to the container

render()

Return a representation of the container that can be composed with other render() results. This function can also be used to defer layout and rendering options until all the information about the container content is known. You may need to override this method depending on the container. See composable_container_video as an example.

Returns: Visual representation of the container that can be combined with other containers
Return type: Any

class bencher.ComposableContainerDataset

Bases: bencher.results.composable_container.composable_container_base.ComposableContainerBase

A base class for renderer backends. A composable renderer

var_name: str | None = None

var_value: str | None = None

render(**kwargs)

Return a representation of the container that can be composed with other render() results. This function can also be used to defer layout and rendering options until all the information about the container content is known. You may need to override this method depending on the container. See composable_container_video as an example.

Returns: Visual representation of the container that can be combined with other containers
Return type: Any

class bencher.BoxWhiskerResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.distribution_result.distribution_result.DistributionResult

A class for creating box and whisker plots from benchmark results.

Box and whisker plots are useful for visualizing the distribution of data, including the median, quartiles, and potential outliers. This class provides methods to generate these plots from benchmark data, particularly useful for comparing distributions across different categorical variables or between different repetitions of the same benchmark.

Box plots show:

- The median (middle line in the box)
- The interquartile range (IQR) as a box (25th to 75th percentile)
- Whiskers extending to the furthest data points within 1.5*IQR
- Outliers as individual points beyond the whiskers
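The quantities a box plot draws can be computed by hand, which helps when checking a rendered plot against raw data. The sketch below uses only the standard library. `box_stats` is a hypothetical helper, and the "inclusive" quartile convention is an assumption, since plotting libraries differ in how they interpolate quartiles.

```python
import statistics


def box_stats(values: list[float]) -> dict:
    """Compute the quantities a box-and-whisker plot draws."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        # Whiskers end at the furthest points still inside the fences.
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in data if v < lo_fence or v > hi_fence],
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

For this input the box spans 3.25 to 7.75, the whiskers reach 1 and 9, and 100 is reported as the only outlier.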

to_plot(result_var: param.Parameter | None = None, override: bool = True, **kwargs: Any) → panel.panel | None

Generates a box and whisker plot from benchmark data.

This method applies filters to ensure the data is appropriate for a box plot and then passes the filtered data to to_boxplot_ds for rendering.

Parameters:

result_var (Parameter | None) – The result variable to plot. If None, uses the default.
override (bool) – Whether to override filter restrictions. Defaults to True.
**kwargs (Any) – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the box plot if data is appropriate, otherwise returns filter match results.

Return type:

pn.panel | None

to_boxplot_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs: Any) → holoviews.BoxWhisker

Creates a box and whisker plot from the provided dataset.

Given a filtered dataset, this method generates a box and whisker visualization showing the distribution of values for a result variable, potentially grouped by a categorical variable.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot.
**kwargs (Any) – Additional keyword arguments for plot customization, such as:
- box_fill_color: Color for the box
- whisker_color: Color for the whiskers
- outlier_color: Color for outlier points
- line_width: Width of lines in the plot

Returns:

A HoloViews BoxWhisker plot of the benchmark data.

Return type:

hv.BoxWhisker

class bencher.ViolinResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.distribution_result.distribution_result.DistributionResult

A class for creating violin plots from benchmark results.

Violin plots combine aspects of box plots with kernel density plots, showing the distribution shape of the data. This class provides methods to generate these plots from benchmark data, which is particularly useful for visualizing the distribution of metrics across different configurations or repetitions.

Violin plots display:

- The full probability density of the data (the width of the “violin” at each point)
- Summary statistics like median and interquartile ranges
- The overall distribution shape, revealing features like multi-modality that box plots might miss

to_plot(result_var: param.Parameter | None = None, override: bool = True, **kwargs: Any) → panel.panel | None

Generates a violin plot from benchmark data.

This method applies filters to ensure the data is appropriate for a violin plot and then passes the filtered data to to_violin_ds for rendering.

Parameters:

result_var (Parameter | None) – The result variable to plot. If None, uses the default.
override (bool) – Whether to override filter restrictions. Defaults to True.
**kwargs (Any) – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the violin plot if data is appropriate, otherwise returns filter match results.

Return type:

pn.panel | None

to_violin_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs: Any) → holoviews.Violin

Creates a violin plot from the provided dataset.

Given a filtered dataset, this method generates a violin plot visualization showing the distribution of values for a result variable, potentially grouped by a categorical variable.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot.
**kwargs (Any) – Additional keyword arguments for plot customization, such as:
- violin_color: Color for the violin body
- inner_color: Color for inner statistics markers
- line_width: Width of outline lines
- bandwidth: Controls the smoothness of the density estimate

Returns:

A HoloViews Violin plot of the benchmark data.

Return type:

hv.Violin

class bencher.ScatterResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

A class for creating scatter plots from benchmark results.

Scatter plots are useful for visualizing the distribution of individual data points and identifying patterns, clusters, or outliers. This class generates scatter plots that can be grouped by categorical variables.

to_plot(**kwargs) → panel.panel | None: Creates a scatter plot. See to_scatter for parameters.

to_scatter(result_var: param.Parameter | None = None, override: bool = True, **kwargs) → panel.panel | None

Creates a standard scatter plot from benchmark data.

Parameters:

result_var (Parameter, optional) – The result variable to plot.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
**kwargs – Additional keyword arguments passed to the scatter plot options.

Returns:

A panel containing the scatter plot, or filter match results.

Return type:

pn.panel | None

_to_scatter_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs) → panel.panel | None

Creates a scatter plot from the provided dataset.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot.
**kwargs – Additional keyword arguments passed to the scatter plot.

Returns:

A scatter plot visualization.

Return type:

pn.panel | None

class bencher.ScatterJitterResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.distribution_result.distribution_result.DistributionResult

A class for creating scatter jitter plots from benchmark results.

Scatter jitter plots display individual data points with slight random offsets to avoid overlapping, making it easier to visualize the distribution of data. This is particularly useful for smaller datasets where showing individual points provides more insight than aggregate statistics, or alongside box plots to show the actual data distribution.

Key features:

- Displays individual data points rather than statistical summaries
- Applies controlled random offsets to avoid point overlap
- Useful for revealing the actual sample size and distribution
- Complements statistical plots like box plots or violin plots
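The jitter idea itself is small enough to sketch without any plotting library. `jitter_positions` is a hypothetical helper. The uniform distribution and the seeded generator are assumptions made for reproducibility, and the library may draw its offsets differently.

```python
import random


def jitter_positions(categories: list[int], jitter: float = 0.1, seed: int = 0) -> list[float]:
    """Offset each integer category position by a uniform value in [-jitter, jitter]."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    return [c + rng.uniform(-jitter, jitter) for c in categories]


# Three repeats at category 0 and three at category 1 no longer overlap on x.
xs = jitter_positions([0, 0, 0, 1, 1, 1], jitter=0.1)
```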

to_plot(result_var: param.Parameter | None = None, override: bool = True, jitter: float = 0.1, target_dimension: int | None = None, **kwargs: Any) → panel.panel | None

Generates a scatter jitter plot from benchmark data.

This method applies filters to ensure the data is appropriate for a scatter plot and then passes the filtered data to to_scatter_jitter_ds for rendering.

Parameters:

result_var – The result variable to plot. If None, uses the default.
override – Whether to override filter restrictions. Defaults to True.
jitter – Amount of jitter to apply to points. Defaults to 0.1.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the scatter jitter plot if data is appropriate, otherwise returns filter match results.

to_scatter_jitter_ds(dataset: xarray.Dataset, result_var: param.Parameter, jitter: float = 0.1, **kwargs: Any) → holoviews.Scatter

Creates a scatter jitter plot from the provided dataset.

Given a filtered dataset, this method generates a scatter visualization showing individual data points with random jitter to avoid overlapping, making the distribution of values more visible.

Parameters:

dataset – The dataset containing benchmark results.
result_var – The result variable to plot.
jitter – Amount of jitter to apply to points. Defaults to 0.1.
**kwargs – Additional keyword arguments for plot customization, such as:
- color: Color for data points
- size: Size of data points
- alpha: Transparency of data points
- marker: Shape of data points (‘o’, ‘s’, ‘d’, etc.)

Returns:

A HoloViews Scatter plot of the benchmark data with jittered points.

class bencher.BarResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

A class for creating bar chart visualizations from benchmark results.

Bar charts are effective for comparing values across categorical variables or discrete data points. This class provides methods to generate bar charts that display benchmark results, particularly useful for comparing performance metrics between different configurations or categories.

to_plot(result_var: param.Parameter | None = None, override: bool = True, **kwargs) → panel.panel | None

to_bar(result_var: param.Parameter | None = None, override: bool = True, target_dimension: int = 2, **kwargs) → panel.panel | None

Generates a bar chart from benchmark data.

This method applies filters to ensure the data is appropriate for a bar chart and then passes the filtered data to to_bar_ds for rendering.

Parameters:

result_var (Parameter, optional) – The result variable to plot. If None, uses the default.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
target_dimension (int, optional) – The target dimensionality for data filtering. Defaults to 2.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the bar chart if data is appropriate, otherwise returns filter match results.

Return type:

pn.panel | None

to_bar_ds(dataset: xarray.Dataset, result_var: param.Parameter | None = None, **kwargs)

Creates a bar chart from the provided dataset.

Given a filtered dataset, this method generates a bar chart visualization showing values of the result variable, potentially grouped by categorical variables.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter, optional) – The result variable to plot. If None, uses the default.
**kwargs – Additional keyword arguments passed to the bar chart options.

Returns:

A bar chart visualization of the benchmark data.

Return type:

hvplot.element.Bars | hv.HoloMap

class bencher.LineResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

A class for creating line plot visualizations from benchmark results.

Line plots are effective for visualizing trends in data over a continuous variable. This class provides methods to generate interactive line plots from benchmark data, with options for adding interactive tap functionality to display detailed information about specific data points.

to_plot(**kwargs) → panel.panel | None: Generates a line plot. See to_line for parameters.

to_line(result_var: param.Parameter | None = None, tap_var=None, tap_container: panel.pane.panel = None, target_dimension=2, override: bool = True, use_tap: bool = _USE_TAP, **kwargs) → panel.panel | None

Generates a line plot from benchmark data.

Parameters:

result_var (Parameter, optional) – The result variable to plot.
tap_var – Variables to display when tapping on line plot points.
tap_container (pn.pane.panel, optional) – Container to hold tapped information.
target_dimension (int, optional) – Target dimensionality for the plot. Defaults to 2.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
use_tap (bool, optional) – Whether to enable tap functionality.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the line plot, or filter match results.

Return type:

pn.panel | None

to_line_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs)

Creates a basic line plot from the provided dataset.

When over_time is active with multiple time points, creates an hv.HoloMap with a slider.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot.
**kwargs – Additional keyword arguments passed to the line plot options.

Returns:

A line plot visualization.

Return type:

hvplot.element.Curve | pn.Column

_to_line_tap_ds(dataset: xarray.Dataset, result_var: param.Parameter, result_var_plots: list[param.Parameter] | None = None, container: panel.pane.panel = pn.pane.panel, **kwargs) → panel.Row

Creates an interactive line plot with tap functionality.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The primary result variable to plot.
result_var_plots (list[Parameter], optional) – Additional result variables to display when a point is tapped.
container (pn.pane.panel, optional) – Container to display tapped information.
**kwargs – Additional keyword arguments passed to the line plot options.

Returns:

A panel row containing the interactive line plot and tap info.

Return type:

pn.Row

class bencher.CurveResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

A class for creating curve plots with optional standard-deviation spread.

Curve plots show the relationship between a continuous input variable and a result variable. When multiple benchmark repetitions are available, standard deviation bounds are displayed using an hv.Spread overlay.
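The data behind a curve with a spread overlay is a mean and a standard deviation per x position, taken across repetitions. The standard-library sketch below shows that reduction. `mean_and_spread` is a hypothetical helper, and the choice of population standard deviation is an assumption.

```python
import statistics


def mean_and_spread(repeats: list[list[float]]) -> list[tuple[float, float, float]]:
    """For each x position return (mean, mean - std, mean + std) across repeats.

    `repeats[r][i]` is the result at x position i in repetition r.
    """
    bands = []
    for samples in zip(*repeats):
        mean = statistics.fmean(samples)
        # Population standard deviation; 0.0 when there is a single repeat.
        std = statistics.pstdev(samples)
        bands.append((mean, mean - std, mean + std))
    return bands


bands = mean_and_spread([[1.0, 2.0], [3.0, 2.0]])
```

The first x position has samples 1.0 and 3.0, giving a band from 1.0 to 3.0 around a mean of 2.0. The second has identical samples, so its band collapses onto the curve.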

to_plot(**kwargs) → holoviews.Curve | None: Generates a curve plot. See to_curve for parameters.

to_curve(result_var: param.Parameter | None = None, override: bool = True, **kwargs) → holoviews.Curve | None

Generates a curve plot from benchmark data.

Parameters:

result_var (Parameter, optional) – The result variable to plot.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A curve plot, or filter match results.

Return type:

hv.Curve | None

to_curve_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs) → holoviews.Curve | None

Creates a curve plot from the provided dataset.

Generates a curve with optional standard deviation spread overlay.

When over_time is active with multiple time points, builds per-time-point curves inside an hv.HoloMap so the slider controls the time dimension.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot.
**kwargs – Additional keyword arguments passed to the curve plot options.

Returns:

A curve plot with optional standard deviation spread.

Return type:

hv.Curve | None

class bencher.HeatmapResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

A class for creating heatmap visualizations from benchmark results.

Heatmaps are effective for visualizing the relationship between two input variables and a result variable by using color intensity to represent the result values. This class provides methods for generating interactive heatmaps that can display additional information when hovering over or selecting points on the heatmap.

to_plot(**kwargs) → panel.panel | None: Generates a heatmap visualization. See to_heatmap for parameters.

to_heatmap(result_var: param.Parameter | None = None, tap_var=None, tap_container: panel.pane.panel = None, tap_container_direction: panel.Column | panel.Row | None = None, target_dimension=2, override: bool = True, use_tap: bool = _USE_TAP, **kwargs) → panel.panel | None

Generates a heatmap visualization from benchmark data.

Parameters:

result_var (Parameter, optional) – The result variable to plot.
tap_var – Variables to display when tapping on heatmap points.
tap_container (pn.pane.panel, optional) – Container to hold tapped information.
tap_container_direction (pn.Column | pn.Row, optional) – Layout direction for tap containers.
target_dimension (int, optional) – Target dimensionality. Defaults to 2.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
use_tap (bool, optional) – Whether to enable tap functionality.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the heatmap, or filter match results.

Return type:

pn.panel | None

_pick_xy_axes() → tuple[str, str]: Pick x/y axis names, preferring float vars then falling back to cat vars.
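The preference order described for _pick_xy_axes can be expressed as a simple concatenation. The function below is a hypothetical standalone version that takes the variable names as arguments. How the real method obtains them from the benchmark configuration, and how it handles fewer than two variables, is not shown in this reference.

```python
def pick_xy_axes(float_vars: list[str], cat_vars: list[str]) -> tuple[str, str]:
    """Pick two axis names, taking float variables first, then categorical ones."""
    candidates = list(float_vars) + list(cat_vars)
    if len(candidates) < 2:
        # Assumption: fewer than two inputs is treated as an error here.
        raise ValueError("a heatmap needs at least two input variables")
    return candidates[0], candidates[1]
```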

to_heatmap_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs) → holoviews.HeatMap | holoviews.HoloMap | None

Creates a basic heatmap from the provided dataset.

When over_time is active with multiple time points, creates an hv.HoloMap with a slider.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot.
**kwargs – Additional keyword arguments passed to the heatmap options.

Returns:

A heatmap visualization, or None if the dataset has fewer than 2 dimensions.

Return type:

hv.HeatMap | hv.HoloMap | None

_to_heatmap_tap_ds(dataset: xarray.Dataset, result_var: param.Parameter, result_var_plots: list[param.Parameter] | None = None, container: panel.pane.panel = None, tap_container_direction: panel.Column | panel.Row | None = None, **kwargs) → panel.Row

Creates an interactive heatmap with tap functionality.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The primary result variable to plot.
result_var_plots (list[Parameter], optional) – Additional result variables to display when a point is tapped.
container (pn.pane.panel, optional) – Container to display tapped information.
tap_container_direction (pn.Column | pn.Row, optional) – Layout direction for tap containers.
**kwargs – Additional keyword arguments passed to the heatmap options.

Returns:

A panel row containing the interactive heatmap and tap info.

Return type:

pn.Row

to_heatmap_tap(result_var: param.Parameter, reduce: bencher.results.bench_result_base.ReduceType = ReduceType.AUTO, width=800, height=800, **kwargs)

Creates a tappable heatmap that shows details when tapped.

Uses hv.streams.Tap for static click coordinates rather than PointerXY hover tracking.

Parameters:

result_var (Parameter) – The result variable to plot.
reduce (ReduceType, optional) – How to reduce the data. Defaults to ReduceType.AUTO.
width (int, optional) – Width of the plot in pixels. Defaults to 800.
height (int, optional) – Height of the plot in pixels. Defaults to 800.
**kwargs – Additional keyword arguments.

Returns:

A layout containing the heatmap and a dynamically updated detail view.

Return type:

hv.Layout

class bencher.BandResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

Percentile band plot showing distribution spread over a continuous axis.

Displays nested shaded bands (e.g. 10th-90th and 25th-75th percentiles) with a median line and individual scatter points, giving a richer view of the distribution than mean +/- std. Particularly useful with agg_over_dims to show how a high-dimensional sweep’s distribution evolves over time.

to_plot(result_var: param.Parameter | None = None, override: bool = True, **kwargs) → holoviews.Overlay | None

to_band(result_var: param.Parameter | None = None, override: bool = True, **kwargs)

to_band_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs) → holoviews.Overlay | None

Create a percentile band plot from the provided dataset.

Flattens all dimensions except the continuous axis (typically over_time) into a sample pool and computes percentiles across those samples.
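The percentile step can be sketched in plain Python, assuming the samples at each x position have already been pooled into one list. `percentile` and `percentile_bands` are hypothetical helpers. Linear interpolation between ranks is an assumption about the percentile convention.

```python
def percentile(sorted_vals: list[float], q: float) -> float:
    """Linear-interpolation percentile (q in 0..100) of a pre-sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def percentile_bands(values: list[list[float]]) -> dict[str, list[float]]:
    """`values[i]` is the pooled sample list at x position i."""
    names = {"p10": 10, "p25": 25, "p50": 50, "p75": 75, "p90": 90}
    bands = {name: [] for name in names}
    for samples in values:
        pool = sorted(samples)
        for name, q in names.items():
            bands[name].append(percentile(pool, q))
    return bands


bands = percentile_bands([list(range(11)), [5.0] * 4])
```

The outer band is then drawn between p10 and p90, the inner band between p25 and p75, and the median line follows p50.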

_band_over_time(dataset: xarray.Dataset, var: str, title: str | None, units: str = '', **kwargs) → holoviews.Overlay | None: Build percentile bands with time on x-axis.

_band_static(dataset: xarray.Dataset, var: str, title: str | None, agg_over_dims: list[str] | None, units: str = '', **kwargs) → holoviews.Overlay | None: Build percentile bands over a non-time continuous axis.

static _build_scatter_data(x_coords, values, **kwargs) → tuple[numpy.ndarray | None, numpy.ndarray | None]

Build scatter arrays from the 2-D values grid, with optional downsampling.

Parameters:

x_coords – 1-D array of x-axis coordinates.
values – 2-D array of shape (n_x, n_samples).
**kwargs – Optional max_scatter_points (int, default 50_000) to cap the number of scatter points, and enable_scatter (bool, default True) to disable the scatter layer entirely.

Returns:

(scatter_x, scatter_y) arrays, or (None, None) when scatter is disabled.

static _build_band_overlay(x_coords, p10, p25, p50, p75, p90, scatter_x, scatter_y, var: str, title: str, x_dim: str = 'x', units: str = '', **_kwargs) → holoviews.Overlay

Construct the overlay of Area bands + median Curve + scatter points.

Parameters:: x_dim – Name of the x-axis dimension, used as the kdim label so that axis labels reflect the original coordinate (e.g. over_time).

class bencher.SurfaceResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

A class for creating 3D surface plots from benchmark results.

This class provides methods to visualize benchmark data as 3D surface plots, which are useful for showing relationships between two input variables and a result variable. Surface plots can also display standard deviation bounds when benchmark runs include multiple repetitions.

to_plot(result_var: param.Parameter | None = None, override: bool = True, **kwargs) → panel.pane.Pane | None

Generates a 3D surface plot from benchmark data.

This is a convenience method that calls to_surface() with the same parameters.

Parameters:

result_var (Parameter, optional) – The result variable to plot. If None, uses the default.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the surface plot if data is appropriate, otherwise returns filter match results.

Return type:

pn.pane.Pane | None

to_surface(result_var: param.Parameter | None = None, override: bool = True, target_dimension: int = 2, **kwargs) → panel.pane.Pane | None

Generates a 3D surface plot from benchmark data.

This method applies filters to ensure the data is appropriate for a surface plot and then passes the filtered data to to_surface_ds for rendering.

Parameters:

result_var (Parameter, optional) – The result variable to plot. If None, uses the default.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
target_dimension (int, optional) – The target dimensionality for data filtering. Defaults to 2.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the surface plot if data is appropriate, otherwise returns filter match results.

Return type:

pn.pane.Pane | None

to_surface_ds(dataset: xarray.Dataset, result_var: param.Parameter, override: bool = True, alpha: float = 0.3, width: int = 600, height: int = 600) → panel.panel | None

Creates a 3D surface plot from the provided dataset.

Uses plotly directly (like VolumeResult) to avoid HoloViews backend contamination issues while ensuring reliable 3D rendering. Coordinates are sorted to guarantee monotonic x/y grids for plotly.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
alpha (float, optional) – The transparency for std-dev surfaces. Defaults to 0.3.
width (int, optional) – Plot width in pixels. Defaults to 600.
height (int, optional) – Plot height in pixels. Defaults to 600.

Returns:

A panel containing the surface plot if data matches criteria, otherwise returns filter match results.

Return type:

pn.panel | None

class bencher.TabulatorResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

to_plot(**kwargs) → panel.widgets.Tabulator

Create an interactive table visualization of the data.

Passes the data to the panel Tabulator type to display an interactive table. See https://panel.holoviz.org/reference/widgets/Tabulator.html for extra options.

Parameters:: **kwargs – Additional parameters to pass to the Tabulator constructor.
Returns:: An interactive table widget.
Return type:: pn.widgets.Tabulator

to_tabulator(result_var: param.Parameter | None = None, **kwargs) → panel.widgets.Tabulator

Generates a Tabulator widget from benchmark data.

This is a convenience method that calls to_tabulator_ds() with the same parameters.

Parameters:

result_var (Parameter, optional) – The result variable to include in the table. If None, uses the default.
**kwargs – Additional keyword arguments passed to the Tabulator constructor.

Returns:

An interactive table widget.

Return type:

pn.widgets.Tabulator

to_tabulator_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs) → panel.widgets.Tabulator

Creates a Tabulator widget from the provided dataset.

Given a filtered dataset, this method generates an interactive table visualization.

Parameters:

dataset (xr.Dataset) – The filtered dataset to visualize.
result_var (Parameter) – The result variable to include in the table.
**kwargs – Additional keyword arguments passed to the Tabulator constructor.

Returns:

An interactive table widget.

Return type:

pn.widgets.Tabulator

class bencher.TableResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

to_plot(**kwargs) → holoviews.Table

Convert the dataset to a Table visualization.

Returns:: A HoloViews Table object.
Return type:: hv.Table

class bencher.VolumeResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.bench_result_base.BenchResultBase

to_plot(result_var: param.Parameter | None = None, override: bool = True, **kwargs: Any) → panel.panel | None

Generates a 3D volume plot from benchmark data.

Parameters:

result_var (Parameter | None) – The result variable to plot. If None, uses the default.
override (bool) – Whether to override filter restrictions. Defaults to True.
**kwargs (Any) – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the volume plot if data is appropriate, otherwise returns filter match results.

Return type:

pn.panel | None

to_volume(result_var: param.Parameter | None = None, override: bool = True, target_dimension: int = 3, **kwargs)

Generates a 3D volume plot from benchmark data.

This method applies filters to ensure the data is appropriate for a volume plot and then passes the filtered data to to_volume_ds for rendering.

Parameters:

result_var (Parameter, optional) – The result variable to plot. If None, uses the default.
override (bool, optional) – Whether to override filter restrictions. Defaults to True.
target_dimension (int, optional) – The target dimensionality for data filtering. Defaults to 3.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the volume plot if data is appropriate, otherwise returns filter match results.

Return type:

pn.pane.Plotly | None

to_volume_ds(dataset: xarray.Dataset, result_var: param.Parameter, width=600, height=600) → panel.pane.Plotly | None

Given a dataset, generate a 3D volume plot.

Returns:: A 3D volume plot in a pane
Return type:: pn.pane.Plotly

class bencher.HistogramResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.holoview_results.holoview_result.HoloviewResult

to_plot(result_var: param.Parameter | None = None, target_dimension: int = 2, **kwargs) → panel.pane.Pane | None

Generates a histogram plot from benchmark data.

This method applies filters to ensure the data is appropriate for a histogram and then passes the filtered data to to_histogram_ds for rendering.

Parameters:

result_var (Parameter, optional) – The result variable to plot. If None, uses the default.
target_dimension (int, optional) – The target dimensionality for data filtering. Defaults to 2.
**kwargs – Additional keyword arguments passed to the plot rendering.

Returns:

A panel containing the histogram if data is appropriate, otherwise returns filter match results.

Return type:

pn.pane.Pane | None

_make_histogram(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs): Render a single histogram from a dataset (no over_time handling).

to_histogram_ds(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs)

Creates a histogram from the provided dataset.

Given a filtered dataset, this method generates a histogram visualization showing the distribution of values for the result variable. When over_time is active with multiple time points, produces per-time-point and pooled-aggregate tabs.

Parameters:

dataset (xr.Dataset) – The dataset containing benchmark results.
result_var (Parameter) – The result variable to plot in the histogram.
**kwargs – Additional keyword arguments passed to the histogram plot options.

Returns:

A histogram visualization of the benchmark data distribution.

Return type:

hvplot.element.Histogram

class bencher.ExplorerResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.pane_result.PaneResult

to_plot(**kwargs) → panel.pane.Pane

Produces an hvplot explorer instance for exploring the generated dataset. See: https://hvplot.holoviz.org/getting_started/explorer.html

Returns:: A dynamic pane for exploring a dataset
Return type:: pn.pane.Pane

class bencher.DataSetResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.bench_result_base.BenchResultBase

to_plot(result_var: param.Parameter | None = None, hv_dataset=None, target_dimension: int = 0, container=None, subsampling_divisions: int | None = None, **kwargs) → panel.pane.panel | None

bencher.hmap_canonical_input(dic: dict) → tuple

From a dictionary of kwargs, return a hashable representation (tuple) that is always the same for the same inputs, whatever order the arguments are passed in, e.g., {x=1,y=2} -> (1,2) and {y=2,x=1} -> (1,2). This is used so that keyword arguments can be hashed and converted to the tuple keys that are used for holomaps.

Parameters:: dic (dict) – dictionary with keyword arguments and values in any order
Returns:: values of the dictionary always in the same order and hashable
Return type:: tuple
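A minimal sketch of the idea, assuming the order-independence comes from sorting by key name (the source does not state the mechanism, and `canonical_key` is a hypothetical helper rather than the bencher function):

```python
def canonical_key(dic: dict) -> tuple:
    """Return the dict's values ordered by key name, as a hashable tuple."""
    # Keys are unique, so sorting the (key, value) pairs never compares values.
    return tuple(value for _, value in sorted(dic.items()))


a = canonical_key({"x": 1, "y": 2})
b = canonical_key({"y": 2, "x": 1})
```

Both calls yield the same tuple, so either can serve as a dictionary or holomap key.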

bencher.get_nearest_coords(dataset: xarray.Dataset, collapse_list: bool = False, **kwargs) → dict

Find the nearest coordinates in an xarray dataset based on provided coordinate values.

Given an xarray dataset and kwargs of key-value pairs of coordinate values, return a dictionary of the nearest coordinate name-value pair that was found in the dataset.

Parameters:

dataset (xr.Dataset) – The xarray dataset to search in
collapse_list (bool, optional) – If True, when a coordinate value is a list, only the first item is returned. Defaults to False.
**kwargs – Key-value pairs where keys are coordinate names and values are points to find the nearest match for

Returns:

Dictionary of coordinate name-value pairs with the nearest values found in the dataset

Return type:

dict
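The nearest-match lookup can be sketched in plain Python. The helper `nearest_coords` and the dict-of-lists layout are assumptions made for illustration; with a real xarray dataset the comparable lookup is usually written with `Dataset.sel(..., method="nearest")`.

```python
def nearest_coords(coords: dict, **targets) -> dict:
    """For each named coordinate, pick the stored value closest to the target."""
    return {
        name: min(coords[name], key=lambda c: abs(c - target))
        for name, target in targets.items()
    }


# Hypothetical coordinate grid standing in for a dataset's coords.
grid = {"x": [0.0, 0.5, 1.0], "y": [10, 20, 30]}
found = nearest_coords(grid, x=0.4, y=27)
```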

bencher.make_namedtuple(class_name: str, **fields) → collections.namedtuple

Convenience method for making a named tuple

Parameters:: class_name (str) – name of the named tuple
Returns:: a named tuple with the fields as values
Return type:: namedtuple
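One way such a helper could be written with the standard library is shown below. It is a sketch only; `make_record` is a hypothetical name and the real function may differ in details.

```python
from collections import namedtuple


def make_record(class_name: str, **fields):
    """Build a namedtuple type from the keyword names and fill it with the values."""
    record_type = namedtuple(class_name, list(fields))
    return record_type(**fields)


point = make_record("Point", x=1, y=2)
```

The result supports both attribute access (`point.x`) and positional unpacking.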

bencher.gen_path(filename: str, folder: str = 'generic', suffix: str = '.dat') → str

Generate a path for a file in the cache directory.

When called inside a benchmark sweep, files are placed in a per-job-key subdirectory so that cache overwrites can cleanly delete old media. Outside a sweep, falls back to UUID-based naming.

Parameters:

filename (str) – Base name for the file
folder (str, optional) – Subfolder within cachedir. Defaults to “generic”.
suffix (str, optional) – File extension. Defaults to “.dat”.

Returns:

Absolute path to a file location

Return type:

str

bencher.gen_image_path(image_name: str = 'img', filetype: str = '.png') → str

Generate a unique path for an image file in the cache directory.

Parameters:

image_name (str, optional) – Base name for the image file. Defaults to “img”.
filetype (str, optional) – Image file extension. Defaults to “.png”.

Returns:

Absolute path to a unique image file location

Return type:

str

bencher.gen_video_path(video_name: str = 'vid', extension: str = '.mp4') → str

Generate a unique path for a video file in the cache directory.

Parameters:

video_name (str, optional) – Base name for the video file. Defaults to “vid”.
extension (str, optional) – Video file extension. Defaults to “.mp4”.

Returns:

Absolute path to a unique video file location

Return type:

str

bencher.gen_rerun_data_path(rrd_name: str = 'rrd', filetype: str = '.rrd') → str

Generate a unique path for a rerun data file in the cache directory.

Parameters:

rrd_name (str, optional) – Base name for the rerun data file. Defaults to “rrd”.
filetype (str, optional) – File extension. Defaults to “.rrd”.

Returns:

Absolute path to a unique rerun data file location

Return type:

str

bencher.lerp(value: float, input_low: float, input_high: float, output_low: float, output_high: float) → float

Linear interpolation between two ranges.

Maps a value from one range [input_low, input_high] to another range [output_low, output_high].

Parameters:

value (float) – The input value to interpolate
input_low (float) – The lower bound of the input range
input_high (float) – The upper bound of the input range
output_low (float) – The lower bound of the output range
output_high (float) – The upper bound of the output range

Returns:

The interpolated value in the output range

Return type:

float
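The mapping is the standard linear rescaling: compute how far `value` sits through the input range, then apply that fraction to the output range. The sketch below uses a hypothetical name, `lerp_sketch`, and assumes `input_high != input_low`.

```python
def lerp_sketch(value, input_low, input_high, output_low, output_high):
    """Map value from [input_low, input_high] onto [output_low, output_high]."""
    fraction = (value - input_low) / (input_high - input_low)
    return output_low + fraction * (output_high - output_low)


mid = lerp_sketch(5.0, 0.0, 10.0, 0.0, 100.0)    # halfway through the input range
flipped = lerp_sketch(2.0, 0.0, 10.0, 1.0, 0.0)  # the output range may be descending
```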

bencher.tabs_in_markdown(regular_str: str, spaces: int = 2) → str

Given a string containing tab characters, convert them to &ensp;, which is a double space in markdown

Parameters:

regular_str (str) – A string with tabs in it
spaces (int) – the number of spaces per tab

Returns:

A string in which each tab is replaced by HTML space entities so that the indentation survives markdown rendering

Return type:

str

bencher.publish_file(filepath: str, remote: str, branch_name: str) → str

Publish a file to an orphan git branch:

def publish_args(branch_name) -> tuple[str, str]:
    return (
        "https://github.com/blooop/bencher.git",
        f"https://github.com/blooop/bencher/blob/{branch_name}")

Parameters:: remote (Callable) – A function that returns a tuple of the publishing urls. It must follow the signature def publish_args(branch_name) -> tuple[str, str]. The first url is the git repo name, the second url needs to match the format for viewable html pages on your git provider. The second url can use the argument branch_name to point to the file on a specified branch.
Returns:: the url of the published file
Return type:: str

bencher.github_content(remote: str, branch_name: str, filename: str)

bencher.publish_and_view_rrd(file_path: str, remote: str, branch_name: str, content_callback: callable, version: str | None = None)

bencher.rrd_to_pane(url: str, width: int = 500, height: int = 600, version: str | None = None): Display an .rrd file from a URL using the hosted rerun web viewer.

bencher.rrd_file_to_pane(file_path, width: int = 300, height: int = 300, viewer_version: str | None = None, report_dir: str | pathlib.Path | None = None)

Create a rerun viewer pane from an .rrd file path.

Uses an HTML iframe to display the .rrd file. By default the viewer is loaded from the @rerun-io/web-viewer CDN at the installed rerun-sdk version. The viewer page and the .rrd data are both served from the Panel server’s /rrd_static/ route, keeping everything on a single origin (no CORS, no extra ports).

The file must be located under cachedir/rrd/.

Parameters:

file_path – Path to the .rrd file (must be under cachedir/rrd/).
width – Width of the viewer iframe in pixels.
height – Height of the viewer iframe in pixels.
viewer_version – Rerun web-viewer version to load from CDN. Defaults to the installed rerun-sdk version. Set explicitly when the .rrd was recorded with a different SDK version.
report_dir – When set, copies the .rrd and viewer HTML into this directory and uses relative URLs in the iframe. This makes the report portable — it works when served from any HTTP origin without a live Panel server.

bencher.run_file_server(directory=None, port=8001)

Start a background HTTP file server (daemon thread).

If port is already in use the existing server is assumed to be running and the function returns None instead of raising.

Returns the ThreadingHTTPServer so callers can query server.server_address[1] for the actual port (useful when port=0).
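The behaviour described above (daemon thread, `None` when the port is taken, `port=0` for an OS-chosen port) can be sketched with the standard library alone. `start_file_server` is a hypothetical stand-in, and binding to 127.0.0.1 is an assumption of this sketch.

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_file_server(directory=".", port=0):
    """Serve `directory` over HTTP from a daemon thread; None if the port is taken."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    try:
        server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    except OSError:
        # Binding failed, most likely because something already listens here.
        return None
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_file_server(port=0)  # port=0 lets the OS choose a free port
actual_port = server.server_address[1]
server.shutdown()      # stop the serve_forever loop
server.server_close()  # release the socket
```

Because the thread is a daemon, it does not keep the interpreter alive once the main program exits.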

class bencher.RegressionResult

Result of regression detection for a single variable.

variable: str

method: str

regressed: bool

current_value: float

baseline_value: float

change_percent: float

threshold: float

direction: str

details: str

band_lower: float | None = None

band_upper: float | None = None

percent_band_lower: float | None = None

percent_band_upper: float | None = None

historical: numpy.ndarray | None = None

current_samples: numpy.ndarray | None = None

historical_all: numpy.ndarray | None = None

historical_all_x: numpy.ndarray | None = None

historical_x: numpy.ndarray | None = None

current_x: numpy.ndarray | None = None

young_baseline: bool = False

render_png(historical: numpy.ndarray | None = None, current: numpy.ndarray | float | None = None, path: str | pathlib.Path | None = None, figsize: tuple[float, float] = (8.0, 5.0), dpi: int = 100) → str: Render this result as a diagnostic PNG (see render_regression_png()).

render_overlay(historical: numpy.ndarray | None = None, current: numpy.ndarray | float | None = None): Build a holoviews.Overlay of this result (see build_regression_overlay()).

to_dict() → dict

Return a JSON-serializable summary of this result.

Emits only scalar fields — the numpy historical/current_samples arrays (kept for replotting) are intentionally omitted. Non-finite floats (NaN/inf, e.g. a zero-baseline percent change) become None so the output is strict, json.dumps-able JSON.
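The non-finite handling matters because `json.dumps` emits `NaN` and `Infinity` by default, which strict JSON parsers reject. A minimal sketch of the sanitising step follows; `finite_or_none` and the sample record are hypothetical.

```python
import json
import math


def finite_or_none(value):
    """Return value unchanged unless it is a non-finite float, which becomes None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    return value


# A zero baseline makes the percent change infinite.
raw = {"variable": "latency", "change_percent": float("inf"), "baseline_value": 0.0}
clean = {key: finite_or_none(val) for key, val in raw.items()}
encoded = json.dumps(clean, allow_nan=False)  # strict mode would raise on NaN/inf
```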

class bencher.RegressionReport

Aggregates regression results for all variables in a benchmark.

results: list[RegressionResult] = []

property has_regressions: bool

property has_blocking_regressions: bool

True when any regression has a mature baseline and may fail the run.

Regressions on young baselines (see regression_min_history) are notify-only: reported in the summary/export but never blocking.
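The distinction between any regression and a blocking regression can be sketched as follows. `MiniResult` is a hypothetical stand-in carrying only the `regressed` and `young_baseline` flags, and the predicate is an assumption inferred from the description above.

```python
from dataclasses import dataclass


@dataclass
class MiniResult:
    """Hypothetical stand-in carrying only the two flags used below."""

    variable: str
    regressed: bool
    young_baseline: bool = False


def has_blocking(results):
    """True when some regression sits on a mature (not young) baseline."""
    return any(r.regressed and not r.young_baseline for r in results)


notify_only = [MiniResult("latency", regressed=True, young_baseline=True)]
blocking = notify_only + [MiniResult("memory", regressed=True)]
```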

property regressed_variables: list[RegressionResult]

summary() → str

to_markdown() → str: Return a nicely formatted Markdown summary of all regression results.

to_dict() → dict

Return a JSON-serializable summary of all regression results.

Mirrors to_markdown()/summary() but emits structured data for agents and CI to consume instead of prose.

append_to_report(report) → None: Append a formatted regression summary to a BenchReport.

prepend_to_result(report, bench_res) → None: Insert a formatted regression summary at the top of bench_res’s tab.

exception bencher.RegressionError

Bases: Exception

Raised when regression detection finds regressions and regression_fail is True.

class bencher.MethodCells

Per-method rendering of a single regression result.

Each detector has a different gate — percent ratio, MAD-sigma, absolute delta, hard limit — so the report cells must describe it in its own units. This bundle is the single source of truth consumed by both the built-in text summary and the markdown table, and is exposed as public API so downstream report builders can produce their own layouts (custom columns, non-markdown output, templated HTML, GitHub PR comments with status decoration, etc.) without reimplementing method dispatch and drifting when new detection methods are added.

Example — building a minimal custom row from a RegressionResult:

from bencher import method_cells
cells = method_cells(result)
row = f"{result.variable}: {cells.change} (gate {cells.threshold})"

change: Change column (markdown) — gated quantity in its own units.

baseline: Baseline column (markdown) — em-dash for absolute (no historical baseline exists).

threshold: Threshold column (markdown) — carries the gate’s native units (±T%, Tσ, ±T, or a direction-aware inequality).

summary_lead: First clause of the summary line, before the details parenthesis. Captures the gated quantity in sentence form.

summary_standalone: When True, the summary line skips the (baseline=…, current=…, threshold=…) tail because summary_lead already contains the relevant values. Used by the absolute method (no baseline, limit is in the lead).

change: str

baseline: str

threshold: str

summary_lead: str

summary_standalone: bool = False

bencher.method_cells(r: RegressionResult) → MethodCells

Build the per-method cell bundle for a RegressionResult.

Returns a MethodCells with pre-rendered display strings for the result’s change, baseline, and threshold, plus the summary lead clause. Dispatches on r.method so each gate describes itself in its native units. Safe to call on any RegressionResult — unknown methods fall back to the percentage-style rendering.

Intended for consumers that want to embed regression results in a custom layout while staying consistent with how the built-in RegressionReport.summary() and RegressionReport.to_markdown() present each method.

Notes on the absolute branch: baseline_value and threshold both hold the limit for this detector (see detect_absolute()); the code reads from threshold to make the intent (“this is the gate value”) explicit.

exception bencher.HistoryResetError

Bases: Exception

Raised when history is reset or loses a column and on_history_reset='error'.

class bencher.HistoryEvent

One schema-affecting event detected while loading over_time history.

kind: str

detail: str

column: str | None = None

property lossy: bool: True when the event removes data from what consumers will see.

class bencher.PerfTracker

Context-manager based phase timer.

Usage:

tracker = PerfTracker()
with tracker.phase("setup"):
    do_setup()
with tracker.phase("compute"):
    do_compute()
report = tracker.report()

_phases: list[PhaseTime] = []

phase(name: str): Time a block and record it as a named phase.

report() → PerfReport: Build a PerfReport from all recorded phases.

log_summary() → None: Log the report summary at INFO level.
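For readers curious how a context-manager phase timer of this kind can work, here is a small standalone sketch. `PhaseTimer` is hypothetical and much simpler than PerfTracker; it only records names and durations.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Hypothetical minimal phase timer; not the bencher implementation."""

    def __init__(self):
        self.phases = []  # list of (name, duration in seconds)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the phase even if the timed block raises.
            self.phases.append((name, time.perf_counter() - start))

    def total_s(self):
        return sum(duration for _, duration in self.phases)


timer = PhaseTimer()
with timer.phase("setup"):
    sum(range(1000))
with timer.phase("compute"):
    sorted(range(1000), reverse=True)
```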

class bencher.PerfReport

Collection of phase timings from a benchmark run.

phases: list[PhaseTime] = []

property total_s: float

property total_ms: float

get_phase(name: str) → PhaseTime | None: Return the first phase matching name, or None.

summary() → str: Human-readable summary of all phases.

to_dict() → dict: Return phases as {name: duration_ms} plus a total.

bencher.git_time_event(repo_path: str | None = None) → str

Return a time-event label combining wall-clock time and short commit hash.

Example return value: "2024-06-15 14:59 abc1234"

The SHA portion is git’s canonical abbreviated hash (rev-parse --short), which is typically 7 characters but may be longer in large repositories to avoid ambiguity.

Intended to be used with BenchRunCfg(over_time=True, time_event=...) so the over-time slider shows when and which commit produced the data.

Wall-clock time is used instead of commit time so that multiple benchmark runs on the same commit produce distinct labels.

For fork-safety in multithreaded environments (ROS 2, DDS, etc.), call this at module level before starting background threads:

_TIME_EVENT = bn.git_time_event()  # safe: no threads yet

Falls back to "<timestamp> unknown" if not inside a git repository or git is unavailable, keeping the label format consistent.
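A label of this shape can be assembled from the wall clock and `git rev-parse --short HEAD`. The sketch below is an approximation under assumptions: `time_event_label` is a hypothetical name, and the timestamp format is inferred from the example return value above.

```python
import subprocess
from datetime import datetime


def time_event_label(repo_path=None):
    """Return '<YYYY-MM-DD HH:MM> <short sha>' or '<timestamp> unknown'."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    try:
        sha = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_path,
            capture_output=True,
            text=True,
            check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # git is missing, or the directory is not inside a repository.
        sha = ""
    return f"{stamp} {sha or 'unknown'}"


label = time_event_label()
```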

class bencher.VarRange(lower_bound: int = 0, upper_bound: int = -1)

A VarRange represents bounded and unbounded ranges of integers. This class is used to define filters for various variable types. For example, after defining cat_var = VarRange(0,0), calling matches(0) will return True, but any other integer will not match. Ranges can also be unbounded: for example, VarRange(2,None) matches 2, 3, 4… up to infinity. By default the lower bound is 0 and the upper bound is -1, so that matches() returns False no matter what value is passed. matches() only takes 0 and positive integers.

lower_bound = 0

upper_bound = -1

matches(val: int) → bool

Checks that a value is within the variable range. lower_bound and upper_bound are inclusive (lower_bound <= val <= upper_bound).

Parameters:: val (int) – A positive integer representing a number of items
Returns:: True if the number of items is within the range, False otherwise.
Return type:: bool
Raises:: ValueError – If val < 0
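The inclusive check, the unbounded `None` side, and the match-nothing defaults can be reproduced in a few lines. `RangeSketch` is a hypothetical re-creation for illustration, not the VarRange source.

```python
class RangeSketch:
    """Hypothetical re-creation of the inclusive range check described above."""

    def __init__(self, lower_bound=0, upper_bound=-1):
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound

    def matches(self, val: int) -> bool:
        if val < 0:
            raise ValueError("val must be >= 0")
        # None on either side means that side is unbounded.
        if self.lower_bound is not None and val < self.lower_bound:
            return False
        if self.upper_bound is not None and val > self.upper_bound:
            return False
        return True


only_zero = RangeSketch(0, 0)
two_or_more = RangeSketch(2, None)
match_nothing = RangeSketch()  # defaults 0 and -1 form an empty range
```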

matches_info(val: int, name: str) → tuple[bool, str]

Get matching info for a value with a descriptive name.

Parameters:

val (int) – A positive integer to check against the range
name (str) – A descriptive name for the value being checked, used in the output string

Returns:

A tuple containing:

bool: True if the value matches the range, False otherwise
str: A formatted string describing the match result

Return type:

tuple[bool, str]

__str__() → str

class bencher.PlotFilter

A class for representing the types of results a plot is able to represent.

float_range: VarRange

cat_range: VarRange

vector_len: VarRange

result_vars: VarRange

panel_range: VarRange

repeats_range: VarRange

input_range: VarRange

classmethod match_all() → PlotFilter

A filter that matches every sweep shape.

The default PlotFilter() ranges are restrictive (VarRange() matches nothing), which suits plots that opt in to specific shapes. Plugins that do their own internal shape handling should use this instead.

matches_result(plt_cnt_cfg: bencher.plotting.plt_cnt_cfg.PltCntCfg, plot_name: str, override: bool) → PlotMatchesResult

Checks if the result data signature matches the type of data the plot is able to display.

Parameters:

plt_cnt_cfg (PltCntCfg) – Configuration containing counts of different plot elements
plot_name (str) – Name of the plot being checked
override (bool) – Whether to override filter matching rules

Returns:

Object containing match results and information

Return type:

PlotMatchesResult

class bencher.ParametrizedSweep(**params)

Bases: param.Parameterized

Parent class for all Sweep types that need a custom hash

static param_hash(param_type: param.Parameterized, hash_value: bool = True) → int

A custom hash function for parametrised types with options for hashing the value of the type and hashing metadata

Parameters:

param_type (Parameterized) – A parameter
hash_value (bool, optional) – use the value as part of the hash. Defaults to True.
hash_meta (#) – use metadata as part of the hash. Defaults to False.

Returns:

a hash

Return type:

int

hash_persistent() → str: A hash function that avoids the PYTHONHASHSEED ‘feature’ which returns a different hash value each time the program is run
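Python randomises the built-in `hash()` of strings per process unless PYTHONHASHSEED is fixed, so cache keys need a digest that does not depend on it. A sketch of one common approach follows; `persistent_hash` is hypothetical and assumes the object's `repr` is itself stable between runs.

```python
import hashlib


def persistent_hash(obj) -> str:
    """Hash the repr of obj with SHA-1 so the digest is stable across runs."""
    return hashlib.sha1(repr(obj).encode("utf-8")).hexdigest()


digest = persistent_hash(("x", 1.5, "units"))
```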

update_params_from_kwargs(**kwargs) → None: Given a dictionary of kwargs, set the parameters of the passed class ‘self’ to the values in the dictionary.

classmethod get_input_and_results(include_name: bool = False) → tuple[dict, dict]

Get dictionaries of input parameters and result parameters

Parameters:

cls – A parametrised class
include_name (bool) – Include the name parameter that all parametrised classes have. Default False

Returns:

A tuple containing the inputs and result parameters as dictionaries

Return type:

tuple[dict, dict]

get_inputs_as_dict() → dict: Get the key:value pairs for all the input variables

get_results_values_as_dict(holomap=None) → dict: Get a dictionary of result variables with the name and the current value

classmethod get_inputs_only() → list[param.Parameter]

Return a list of input parameters

Returns:: A list of input parameters
Return type:: list[param.Parameter]

static filter_fn(item, p_name)

classmethod get_input_defaults(override_defaults: list | None = None) → list[tuple[param.Parameter, Any]]

classmethod get_input_defaults_override(**kwargs) → dict[str, Any]

classmethod get_results_only() → list[param.Parameter]

Return a list of result parameters

Returns:: A list of result parameters
Return type:: list[param.Parameter]

classmethod get_inputs_as_dims(compute_values=False, remove_dims: str | list[str] | None = None) → list[holoviews.Dimension]

to_dynamic_map(callback=None, name=None, remove_dims: str | list[str] | None = None, result_var: str | None = None) → holoviews.DynamicMap

to_gui(result_var: str | None = None, **kwargs)

to_holomap(callback, remove_dims: str | list[str] | None = None) → holoviews.DynamicMap

__call__(**kwargs) → dict

Dispatch to benchmark() if overridden, otherwise use legacy path.

Returns:: a dictionary with all the result variables as named key value pairs.
Return type:: dict

benchmark()

Override this with your benchmark logic.

When called, all sweep parameters (self.x, etc.) are already set. Set result variables (self.result, etc.) directly on self. No need to call update_params_from_kwargs or super().__call__().

plot_hmap(**kwargs)

to_bench(run_cfg=None, report=None, name=None): Create a Bench instance from this ParametrizedSweep.

to_optimize(n_trials=100, run_cfg=None, **kwargs)

Create a Bench and run optimization in one call.

Parameters:

n_trials – Number of optuna trials.
run_cfg – Optional BenchRunCfg.
**kwargs – Forwarded to Bench.optimize().

Returns:

OptimizeResult wrapping the completed study.

to_bench_runner(run_cfg=None, name=None)

Create a BenchRunner instance from this ParametrizedSweep.

Enables fluent chaining like:

MyConfig().to_bench_runner().add(func).run(subsampling_divisions=2, max_subsampling_divisions=4)

class bencher.ParametrizedSweepSingleton(**params)

Bases: bencher.variables.parametrised_sweep.ParametrizedSweep

A minimal per-subclass singleton for ParametrizedSweep.

Repeated construction returns the same instance for each subclass.
Ensures the Parametrized __init__ chain runs only once.
init_singleton() returns a result that is truthy once per subclass and doubles as a context manager for automatic rollback on failure.
reset_singleton() explicitly clears singleton state for a subclass.
Thread-safe: all shared state is protected by _lock.

_instances

_seen

_lock

_singleton_inited = True

classmethod init_singleton() → _SingletonInitResult

Mark cls as seen and return a _SingletonInitResult.

The result is truthy the first time a subclass calls this and falsy on every subsequent call — identical to the previous boolean return value.

It can also be used as a context manager:

with self.init_singleton() as is_first:
    if is_first:
        self._fallible_setup()

If the with block raises during a first-time init, the singleton bookkeeping is rolled back so the next construction can retry cleanly.

classmethod reset_singleton() → None: Clear singleton state for cls, allowing re-initialisation.
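The per-subclass behaviour (one instance per subclass, guarded by a lock, resettable) can be illustrated with a stripped-down sketch. `PerSubclassSingleton`, `Camera` and `Lidar` are hypothetical, and the rollback context manager of init_singleton() is deliberately left out.

```python
import threading


class PerSubclassSingleton:
    """Hypothetical sketch: one shared instance per concrete subclass."""

    _instances = {}
    _lock = threading.RLock()

    def __new__(cls):
        with cls._lock:
            if cls not in cls._instances:
                cls._instances[cls] = super().__new__(cls)
            return cls._instances[cls]

    @classmethod
    def reset_singleton(cls):
        # Forget the stored instance so the next construction builds a new one.
        with cls._lock:
            cls._instances.pop(cls, None)


class Camera(PerSubclassSingleton):
    pass


class Lidar(PerSubclassSingleton):
    pass


first, second, other = Camera(), Camera(), Lidar()
Camera.reset_singleton()
fresh = Camera()
```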

class bencher.SampleOrder

Bases: strenum.StrEnum

Controls the sampling traversal order for plot_sweep.

INORDER: Traverse inputs in the natural Cartesian product order (right-most dimension varies fastest).
REVERSED: Traverse the same set of samples in the reverse order.

Note: This only affects sampling order, not plotting or dataset dimension order.
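The two traversal orders can be pictured with `itertools.product`, which also varies the right-most input fastest. The input lists here are made up for illustration.

```python
from itertools import product

xs = [0, 1]
ys = ["a", "b"]

# product() advances the right-most iterable fastest.
inorder = list(product(xs, ys))
# Same samples, visited back to front.
reversed_order = list(reversed(inorder))
```

Both lists contain the same set of samples; only the visiting order differs.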

INORDER

REVERSED

bencher.DEFAULT_CACHE_SIZE_BYTES = 0

class bencher.CacheDirStats

Statistics for a single cache or media directory.

path: str

entries: int

size_bytes: int

size_limit_bytes: int | None = None

summary_line() → str

class bencher.CacheStats

Aggregate cache statistics.

managed: list[CacheDirStats]

media: list[CacheDirStats]

total_bytes: int

summary() → str

bencher.cache_stats(cachedir: str = 'cachedir') → CacheStats: Collect statistics for all managed caches and media directories.

bencher.print_cache_stats(cachedir: str = 'cachedir') → None: Print a human-readable cache statistics summary.

bencher.clear_all(cachedir: str = 'cachedir') → None: Remove the entire cache directory tree.

bencher.clear_media(cachedir: str = 'cachedir') → tuple[int, int]

Delete all files in media directories.

Returns (files_deleted, bytes_freed).

bencher.clean_orphaned_media(cachedir: str = 'cachedir', dry_run: bool = True) → tuple[list[str], int]

Find and optionally delete per-job-key media dirs with no cache entry.

Walks the media tree looking for job-key subdirectories. If the key is not present in the sample cache, the directory is an orphan (its cache entry was evicted by LRU or cleared).

Parameters:

cachedir – Root cache directory.
dry_run – If True, only report orphans without deleting.

Returns:

(orphan_dirs, total_bytes) — list of orphaned directory paths and their combined size.
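
The orphan check described above might look roughly like the following. This is a hedged sketch rather than bencher's code: find_orphans is a hypothetical helper, the sample cache is stood in for by a plain set of keys, and the media tree is assumed to be flat (one subdirectory per job key).

```python
import shutil
import tempfile
from pathlib import Path


def find_orphans(media_root: Path, cached_keys: set[str], dry_run: bool = True):
    """Return (orphan_dirs, total_bytes) for subdirectories with no cache key."""
    orphans, total = [], 0
    for job_dir in sorted(p for p in media_root.iterdir() if p.is_dir()):
        if job_dir.name in cached_keys:
            continue  # still referenced by the cache
        files = [f for f in job_dir.rglob("*") if f.is_file()]
        total += sum(f.stat().st_size for f in files)
        orphans.append(str(job_dir))
        if not dry_run:
            shutil.rmtree(job_dir)
    return orphans, total


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for key in ("job_a", "job_b"):
        (root / key).mkdir()
        (root / key / "frame.png").write_bytes(b"x" * 10)
    report = find_orphans(root, cached_keys={"job_a"})  # report only
    deleted = find_orphans(root, cached_keys={"job_a"}, dry_run=False)
    remaining = sorted(p.name for p in root.iterdir())
```

Running the dry run first and comparing its report with the deleting run is a cheap way to confirm what will be removed.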

bencher.cleanup_job_media(job_key: str, cachedir: str = 'cachedir') → int

Delete the per-job-key media directories for job_key.

Called automatically before a cache entry is overwritten so that stale media files from the previous run are removed.

Returns the number of directories removed.

bencher.ensure_cache_version(cachedir: str = 'cachedir') → None

Check the cache version file; clear everything on mismatch.

Called automatically when a Bench is instantiated. If the version file is missing or doesn’t match CACHE_VERSION, the entire cache tree is deleted so stale data from incompatible layouts doesn’t linger.
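
A version check of this kind can be sketched as follows. The marker filename, the CACHE_VERSION value, and the step that rewrites the marker after clearing are all assumptions made for illustration; only the rule that a missing or mismatched version clears the tree comes from the description above.

```python
import shutil
import tempfile
from pathlib import Path

CACHE_VERSION = "2"  # hypothetical value for this sketch


def ensure_version(cachedir: Path, version: str = CACHE_VERSION) -> bool:
    """Clear cachedir when the version marker is missing or stale.

    Returns True if the cache was cleared.
    """
    marker = cachedir / "cache_version"  # hypothetical filename
    if marker.is_file() and marker.read_text().strip() == version:
        return False
    shutil.rmtree(cachedir, ignore_errors=True)
    cachedir.mkdir(parents=True, exist_ok=True)
    marker.write_text(version)  # assumption: record the current version
    return True


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cachedir"
    cache.mkdir()
    (cache / "stale.bin").write_bytes(b"old")
    first = ensure_version(cache)   # no marker yet, so the tree is cleared
    stale_survived = (cache / "stale.bin").exists()
    second = ensure_version(cache)  # versions match, nothing is touched
```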

class bencher.BenchResult(bench_cfg)

Contains the results of the benchmark and provides methods to cast them to various data types and graphical representations.

timings = None

classmethod from_existing(original: BenchResult) → BenchResult

to(result_type: BenchResult, result_var: param.Parameter | None = None, override: bool = True, reduce: bencher.results.bench_result_base.ReduceType | None = None, aggregate: bool | int | list[str] | None = None, agg_fn: Literal['mean', 'sum', 'max', 'min', 'median'] = 'mean', **kwargs: Any) → BenchResult

Return the current instance of BenchResult.

Returns:: The current instance of the benchmark result
Return type:: BenchResult

static default_plot_callbacks() → list[callable]

Get the default list of plot callback functions.

These callbacks are used by default in the to_auto method if no specific plot list is provided.

Returns:: A list of plotting callback functions
Return type:: list[callable]

static plotly_callbacks() → list[callable]

Get the list of Plotly-specific callback functions.

Returns:: A list of Plotly-based visualization callback functions
Return type:: list[callable]

plot() → panel.panel

Plots the benchmark result using the plot callbacks defined by the bench run.

This method uses the plot_callbacks defined in the bench_cfg to generate plots for the benchmark results.

Returns:: A panel representation of the results, or None if no plot_callbacks defined
Return type:: pn.panel

to_bench_data(render_kwargs: dict | None = None) → bencher.plugins.bench_data.BenchData

Snapshot this result as the frozen plugin data contract.

The transitional legacy_result/render_kwargs fields carry the live result object and the plot kwargs for the wrapped built-in renderers; they disappear once renderers consume BenchData directly.

Returns:: The frozen data handle plot plugins receive.
Return type:: BenchData

Automatically generate plots by dispatching through the plot plugin registry.

Every registered plugin whose match rule fits this sweep renders, in priority order — the built-in chart types (registered in bencher.plugins.builtins) plus any user plugins registered with bencher.register_plugin / @bencher.plot_plugin or discovered via the bencher.plot_plugins entry-point group.

Parameters:

plot_list (list[callable | str], optional) – Restrict to these plots. Entries are plugin names (“line”, “heatmap”, …) or, for backward compatibility, legacy plot callbacks (e.g. LineResult.to_plot); unrecognized callables are invoked directly as before. Defaults to None (all matching plugins).
remove_plots (list[callable | str], optional) – Plots to exclude, same entry forms as plot_list. Defaults to None.
default_container (type, optional) – Default container type for the plots. Defaults to pn.Column.
override (bool, optional) – Whether to override unsupported plots. Defaults to False.
numeric_only (bool, optional) – When True, skip the pane-type result plugins (images, videos, rerun, etc.) that cannot be numerically aggregated. Defaults to False.
backend (str, optional) – Preferred rendering backend. Chart types the preferred backend implements render through it; the rest keep their best other implementation. Defaults to None (highest priority wins).
**kwargs – Additional keyword arguments for plot configuration.

Returns:

A list of panel objects containing the generated plots.

Return type:

list[pn.panel]

Why each registered plugin would or wouldn’t render for this result.

Runs the same selection to_auto uses (same plot_list/remove_plots normalization) and renders the full decision table — chosen plugins first, each rejected one with the first gate that dropped it (named-only, missing capability, shape-filter mismatch, superseded backend, …).

Returns:: A text table, one row per registered plugin.
Return type:: str

static _normalize_plot_list(plot_list: list[callable | str] | None) → tuple[list[str] | None, list[callable]]

Split a to_auto plot_list into registry names and legacy callables.

Known callbacks translate to their plugin names; unknown callables keep working through the legacy direct-call path. None means “no restriction” (all registered plugins participate).
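
The split described above can be sketched in a few lines. The callbacks and the KNOWN_CALLBACKS mapping below are hypothetical stand-ins for the real registry; the sketch only shows the three outcomes (string name, known callback translated to a name, unknown callable kept for the legacy path).

```python
def line_plot(result):  # hypothetical legacy callback known to the registry
    return "line"


def custom_plot(result):  # hypothetical callback the registry does not know
    return "custom"


# Hypothetical mapping from known legacy callbacks to plugin names.
KNOWN_CALLBACKS = {line_plot: "line"}


def normalize_plot_list(plot_list):
    """Split plot_list into (plugin names or None, legacy callables)."""
    if plot_list is None:
        return None, []  # no restriction: every registered plugin participates
    names, legacy = [], []
    for entry in plot_list:
        if isinstance(entry, str):
            names.append(entry)
        elif entry in KNOWN_CALLBACKS:
            names.append(KNOWN_CALLBACKS[entry])
        else:
            legacy.append(entry)  # unknown callable: invoked directly later
    return names, legacy


names, legacy = normalize_plot_list(["heatmap", line_plot, custom_plot])
```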

static _plot_exclusions(remove_plots: list[callable | str] | None, extra_callbacks: list[callable], numeric_only: bool) → tuple[set[str], list[callable]]: Compute the plugin names to exclude and drop removed legacy callables.

to_auto_plots(extra_panels: collections.abc.Sequence[collections.abc.Callable[[BenchResult], panel.viewable.Viewable] | panel.viewable.Viewable] | None = None, **kwargs) → panel.panel

Given the dataset result of a benchmark run, automatically deduce how to plot the data based on the types of variables that were sampled.

Parameters:

extra_panels – Extra panel callables or static panels to inject after the sweep summary and before aggregate/auto plots. Each item is either a callable(BenchResult) -> panel, or a static panel object.
**kwargs – Additional keyword arguments for plot configuration.

Returns:

A panel containing plot results.

Return type:

pn.panel

_scalar_aggregate_summary() → panel.pane.Markdown: Render a Markdown table for a fully-aggregated (scalar) result.

class bencher.OptimizeResult

Wraps an optuna.Study with bencher-friendly accessors.

study: The underlying optuna study.

n_warm_start_trials: Number of trials seeded from cache / prior results.

n_new_trials: Number of new trials evaluated during optimization.

target_names: Names of the optimization target variables.

bench_cfg: Optional BenchCfg for rich report generation.

study: optuna.Study

n_warm_start_trials: int = 0

n_new_trials: int = 0

target_names: list[str] = []

bench_cfg: bencher.bench_cfg.BenchCfg | None = None

_ensure_single_objective() → None: Raise if study is multi-objective.

property best_params: dict[str, Any]: Best parameters found (single-objective only).

property best_value: float: Best objective value (single-objective only).

property best_trials: list[optuna.trial.FrozenTrial]: Pareto-optimal trials (multi-objective).

summary() → str: Return a human-readable summary of the optimization.

class bencher.PaneResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.bench_result_base.BenchResultBase

to_video(result_var: param.Parameter | None = None, **kwargs)

to_panes(result_var: param.Parameter | None = None, hv_dataset=None, target_dimension: int = 0, container=None, subsampling_divisions: int | None = None, **kwargs) → panel.pane.panel | None

bencher.VideoResult

class bencher.ReduceType

Bases: enum.Enum

AUTO

SQUEEZE

REDUCE

MINMAX

NONE

class bencher.HoloviewResult(bench_cfg: bencher.bench_cfg.BenchCfg)

Bases: bencher.results.pane_result.PaneResult

DEFAULT_SIZED_ELEMENTS

static set_default_opts(width: int = 600, height: int = 600) → dict

Set default options for HoloViews visualizations.

Parameters:

width (int, optional) – Default width for visualizations. Defaults to 600.
height (int, optional) – Default height for visualizations. Defaults to 600.

Returns:

Dictionary containing width, height, and tools settings.

Return type:

dict

to_hv_type(hv_type: type, reduce: bencher.results.bench_result_base.ReduceType = ReduceType.AUTO, **kwargs) → holoviews.Chart

Convert the dataset to a specific HoloViews visualization type.

Parameters:

hv_type (type) – The HoloViews chart type to convert to (e.g., hv.Points, hv.Curve).
reduce (ReduceType, optional) – How to reduce dataset dimensions. Defaults to ReduceType.AUTO.
**kwargs – Additional parameters to pass to the chart constructor.

Returns:

A HoloViews chart of the specified type.

Return type:

hv.Chart

overlay_plots(plot_callback: callable) → holoviews.Overlay | panel.Row | None

Create an overlay of plots by applying a callback to each result variable.

Parameters:: plot_callback (callable) – Function to apply to each result variable to create a plot.
Returns:: An overlay of plots or Row of plots, or None if no results.
Return type:: hv.Overlay | pn.Row | None

layout_plots(plot_callback: callable) → holoviews.Layout | None

Create a layout of plots by applying a callback to each result variable.

Parameters:: plot_callback (callable) – Function to apply to each result variable to create a plot.
Returns:: A layout of plots or None if no results.
Return type:: hv.Layout | None

time_widget(title: str) → dict

Create widget configuration for time-based visualizations.

Parameters:: title (str) – Title for the widget.
Returns:: Widget configuration dictionary with title.
Return type:: dict

_use_holomap_for_time(dataset: xarray.Dataset) → bool

Check whether over_time should be rendered via an hv.HoloMap slider.

Returns True when over_time is active and the dataset has >1 time points.

static _apply_opts(plot, **opts_kwargs)

Apply .opts() to a plot, handling panel wrappers and layout containers.

hvplot may return any of:

a bare HoloViews element/DynamicMap/Overlay (has .opts),
a pn.pane.HoloViews wrapper whose underlying .object is the actual holoviews element, or
a panel layout container (Row/Column/WidgetBox) — this happens when widget_location splits the plot from its widgets, e.g. an over_time time-series line with a categorical by widget. The HoloViews pane is then nested inside .objects.

Without the container case, options such as xrotation, title and ylabel were silently dropped for those split plots (the over_time x-axis kept its default horizontal labels). Recurse into containers so the options reach the nested pane.
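
The three cases can be illustrated with plain stand-in classes. Element, Pane and Container below are hypothetical substitutes for a HoloViews element, a pn.pane.HoloViews wrapper and a Panel layout; the sketch shows only the recursion, not the real method.

```python
class Element:
    """Stand-in for a HoloViews element: has .opts()."""

    def __init__(self):
        self.applied = {}

    def opts(self, **kwargs):
        self.applied.update(kwargs)
        return self


class Pane:
    """Stand-in for a pane wrapper: the element lives in .object."""

    def __init__(self, obj):
        self.object = obj


class Container:
    """Stand-in for a Row/Column: children live in .objects."""

    def __init__(self, *objs):
        self.objects = list(objs)


def apply_opts(plot, **opts):
    if hasattr(plot, "opts"):          # bare element
        return plot.opts(**opts)
    if hasattr(plot, "object"):        # pane wrapper
        plot.object = apply_opts(plot.object, **opts)
        return plot
    if hasattr(plot, "objects"):       # layout container: recurse into children
        for child in plot.objects:
            apply_opts(child, **opts)
    return plot                        # widgets and others pass through unchanged


elem, widget = Element(), object()
layout = Container(Pane(elem), widget)
apply_opts(layout, xrotation=45)
```

After the call, the options have reached the nested element while the widget is left alone.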

static _over_time_kdims() → list: Return the kdim list for over_time HoloMaps.

static _holomap_with_slider_bottom(hvobj, widgets=None)

Wrap a HoloViews object so any scrubber/slider appears below the plot.

pn.pane.HoloViews(holomap, widget_location="bottom") does not embed correctly in static HTML (the widget is lost). Instead we let Panel auto-create the widget via pn.panel(hvobj) (which produces a Row(plot, widget_box)), then rearrange into a Column(plot, widget_box) so the slider sits underneath.

Force DiscreteSlider for the over_time dimension so that string-based TimeEvent coordinates get a slider instead of the default dropdown Select widget.

Safe to call on any HoloViews object; if no widgets are produced the original object is returned unchanged.

The slider defaults to the most recent (last) time point by setting the widget value in Python. Panel’s embed system computes JSON patches relative to this default, so every other position gets a valid patch and the last position is the initial state.

_build_curve_overlay(dataset: xarray.Dataset, result_var: param.Parameter, **kwargs) → holoviews.Overlay

Build a Curve (+ optional Spread) overlay for a single time slice or aggregated data.

When _std exists in the dataset the spread band is rendered automatically. This is used by both the curve renderer and the line renderer (for aggregated data that gained _std from _mean_over_time).

Performance: avoids to_dataframe() when there are no categorical groupby dimensions by constructing hv.Dataset directly from the xarray Dataset. The heavier DataFrame path is only used when manual groupby is required.

static _mean_over_time(dataset, result_var_name)

Average a dataset across all time points.

Always produces a _std variable so that downstream renderers (e.g. curve spread, error bars) can visualise the aggregation uncertainty. When a per-time-point _std already exists the pooled standard deviation is computed via the law of total variance; otherwise the standard deviation of the means across time points is used.
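
As a worked illustration of the law of total variance, the pooled variance is the mean of the per-time-point variances plus the variance of the per-time-point means. The sketch below uses only the standard library and assumes equal sample counts per time point and population (not sample) standard deviations; whether bencher makes the same choices is not stated here, and pooled_mean_std is a hypothetical name.

```python
import math
import statistics


def pooled_mean_std(means, stds=None):
    """Combine per-time-point means (and optional stds) into one mean and std."""
    overall_mean = statistics.fmean(means)
    between = statistics.pvariance(means)  # variance of the means
    if stds is None:
        return overall_mean, math.sqrt(between)
    within = statistics.fmean([s * s for s in stds])  # mean of the variances
    return overall_mean, math.sqrt(within + between)


# Two time points with three repeats each.
t0, t1 = [1.0, 2.0, 3.0], [4.0, 6.0, 8.0]
mean, std = pooled_mean_std(
    [statistics.fmean(t0), statistics.fmean(t1)],
    [statistics.pstdev(t0), statistics.pstdev(t1)],
)
```

Under those assumptions the pooled value matches the standard deviation of all six samples taken together.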

static subsample_indices(n, max_points)

Return evenly-spaced indices into a length-n array.

Always includes the first and last index. When max_points is None or >= n, returns range(n) (no subsampling).
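
One way to produce such indices is shown below. This is a pure-Python sketch, not the library's implementation: subsample_indices_sketch is a hypothetical name, it returns a list rather than a range, and its handling of max_points below 2 is an assumption.

```python
def subsample_indices_sketch(n, max_points):
    """Evenly spaced indices into a length-n sequence, keeping both endpoints."""
    if max_points is None or max_points >= n:
        return list(range(n))  # no subsampling needed
    if max_points < 2:
        return [0]  # assumption: a degenerate request keeps only the first index
    step = (n - 1) / (max_points - 1)
    return [round(i * step) for i in range(max_points)]


picked = subsample_indices_sketch(10, 4)
everything = subsample_indices_sketch(5, None)
```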

_build_time_holomap(dataset, result_var_name, make_plot_fn)

Build per-time-point HoloMap + optional aggregated plot.

make_plot_fn receives a Dataset without the over_time dimension. The aggregated dataset produced by _mean_over_time always contains a _std variable; callbacks that are _std-aware (e.g. delegating to _build_curve_overlay) will automatically render spread bands on the aggregated tab.

When bench_cfg.max_slider_points is set, only that many evenly-spaced time points are rendered for the slider (first and last always included). The aggregated tab still uses all data.

When bench_cfg.show_aggregated_time_tab is False, the aggregation is skipped entirely for faster rendering.

_build_time_holomap_raw(da, make_plot_fn)

Build per-time-point HoloMap + optional aggregated plot for distributions.

make_plot_fn receives a DataArray that retains the over_time dimension (a single-element slice for per-time-point entries, or the full array for the aggregated tab). Callers should flatten via .to_dataframe().reset_index() or equivalent.

Respects bench_cfg.max_slider_points and bench_cfg.show_aggregated_time_tab.

_build_tap_plot(plot: holoviews.Element, dataset: xarray.Dataset, result_var_plots: list[param.Parameter], container: type | list[type] | None = None, tap_container_direction: type | None = None) → panel.Row

Wrap a plot element with interactive PointerXY tap functionality.

Sets up hv.streams.PointerXY and hv.streams.MouseLeave on the given plot, updating the supplied containers with the nearest data point values as the user hovers.

Parameters:

plot – The base HoloViews element to attach tap streams to.
dataset – The full xarray Dataset for value look-ups.
result_var_plots – Result variables whose values are shown on tap.
container – Panel container type(s) for displaying tapped values.
tap_container_direction – Layout class (pn.Row or pn.Column) for the tap containers. Defaults to pn.Column.

Returns:

A pn.Row containing the interactive plot and tap info panel.

hv_container_ds(dataset: xarray.Dataset, result_var: param.Parameter, container: holoviews.Chart | None = None, **kwargs) → holoviews.Chart

Convert an xarray Dataset to a HoloViews container for a specific result variable.

Parameters:

dataset (xr.Dataset) – The xarray Dataset containing the data.
result_var (Parameter) – The result variable to visualize.
container (hv.Chart, optional) – The HoloViews container type to use. Defaults to None.
**kwargs – Additional options to pass to the chart’s opts() method.

Returns:

A HoloViews chart containing the visualization.

Return type:

hv.Chart

to_hv_container(container: panel.pane.panel, reduce_type: bencher.results.bench_result_base.ReduceType = ReduceType.AUTO, target_dimension: int = 2, result_var: param.Parameter | None = None, result_types: tuple | None = (ResultFloat,), **kwargs) → panel.pane.panel | None

Convert the data to a HoloViews container with specified dimensions and options.

Parameters:

container (pn.pane.panel) – The panel container type to use.
reduce_type (ReduceType, optional) – How to reduce the dataset dimensions. Defaults to ReduceType.AUTO.
target_dimension (int, optional) – Target dimension for the visualization. Defaults to 2.
result_var (Parameter, optional) – Specific result variable to visualize. Defaults to None.
result_types (tuple, optional) – Types of result variables to include. Defaults to (ResultFloat,).
**kwargs – Additional visualization options.

Returns:

A panel containing the visualization, or None if no valid results.

Return type:

pn.pane.panel | None

result_var_to_container(result_var: param.Parameter) → type

Determine the appropriate container type for a given result variable.

Parameters:: result_var (Parameter) – The result variable to find a container for.
Returns:: The appropriate panel container type (PNG, Video, or Column).
Return type:: type

setup_results_and_containers(result_var_plots: param.Parameter | list[param.Parameter], container: type | list[type] | None = None, **kwargs) → tuple[list[param.Parameter], list[panel.pane.panel]]

Set up appropriate containers for result variables.

Parameters:

result_var_plots (Parameter | list[Parameter]) – Result variables to visualize.
container (type | list[type], optional) – Container types to use. Defaults to None.
**kwargs – Additional options to pass to the container constructors.

Returns:

Tuple containing:

List of result variables
List of initialized container instances

Return type:

tuple[list[Parameter], list[pn.pane.panel]]

to_error_bar(result_var: param.Parameter | str | None = None, **kwargs) → holoviews.Bars

Convert the dataset to an ErrorBars visualization for a specific result variable.

Parameters:

result_var (Parameter | str, optional) – Result variable to visualize. Defaults to None.
**kwargs – Additional options for dataset reduction.

Returns:

A HoloViews ErrorBars object showing error ranges.

Return type:

hv.Bars

to_points(reduce: bencher.results.bench_result_base.ReduceType = ReduceType.AUTO) → holoviews.Points

Convert the dataset to a Points visualization with optional error bars.

Parameters:: reduce (ReduceType, optional) – How to reduce the dataset dimensions. Defaults to ReduceType.AUTO.
Returns:: A HoloViews Points object, potentially with ErrorBars if reduction is applied.
Return type:: hv.Points

to_nd_layout(hmap_name: str) → holoviews.NdLayout

Convert a HoloMap to an NdLayout for multi-dimensional visualization.

Parameters:: hmap_name (str) – Name of the HoloMap to convert.
Returns:: A HoloViews NdLayout object with the visualization.
Return type:: hv.NdLayout

to_holomap(name: str | None = None) → holoviews.HoloMap

Convert an NdLayout to a HoloMap for animated/interactive visualization.

Parameters:: name (str, optional) – Name of the HoloMap to use. Defaults to None.
Returns:: A HoloViews HoloMap object with the visualization.
Return type:: hv.HoloMap

to_holomap_list(hmap_names: list[str] | None = None) → panel.Column

Create a column of HoloMaps from multiple named maps.

Parameters:: hmap_names (list[str], optional) – list of HoloMap names to include. If None, uses all result_hmaps. Defaults to None.
Returns:: A panel Column containing multiple HoloMaps.
Return type:: pn.Column

get_nearest_holomap(name: str | None = None, **kwargs) → holoviews.HoloMap

Get the HoloMap element closest to the specified coordinates.

Parameters:

name (str, optional) – Name of the HoloMap to use. Defaults to None.
**kwargs – Coordinate values to find nearest match for.

Returns:

The nearest matching HoloMap element.

Return type:

hv.HoloMap

to_dynamic_map(name: str | None = None) → holoviews.DynamicMap

Create a DynamicMap from the HoloMap dictionary.

Uses the values stored in the holomap dictionary to populate a dynamic map. This is much faster than building a HoloMap object up front, because the values are calculated on the fly.

Parameters:: name (str, optional) – Name of the HoloMap to use. Defaults to None.
Returns:: A HoloViews DynamicMap for interactive visualization.
Return type:: hv.DynamicMap

to_grid(inputs: list[str] | None = None) → holoviews.GridSpace

Create a grid visualization from a HoloMap.

Parameters:: inputs (list[str], optional) – Input variable names to use for the grid dimensions. If None, uses bench_cfg.inputs_as_str(). Defaults to None.
Returns:: A HoloViews GridSpace object showing the data as a grid.
Return type:: hv.GridSpace

class bencher.BenchReport(bench_name: str | None = None)

Bases: bencher.bench_plot_server.BenchPlotServer

A server for displaying plots of benchmark results.

bench_name = None

pane

last_save_ms: float = 0.0

bench_results: list[bencher.results.bench_result.BenchResult] = []

clear() → None

Remove all tabs and results so the report can be reused between runs.

Not safe to call while the report is being served to a live Panel session.

append_title(title: str, new_tab: bool = True)

append_markdown(markdown: str, name: str | None = None, width: int = 800, **kwargs) → panel.pane.Markdown

append(pane: panel.panel, name: str | None = None) → None

append_col(pane: panel.panel, name: str | None = None) → None

static _time_event_label(bench_res: bencher.results.bench_result.BenchResult) → str | None: Extract a human-readable label for the latest time event from a result.

append_result(bench_res: bencher.results.bench_result.BenchResult, render_from: bencher.results.bench_result.BenchResult | None = None) → None

append_to_result(bench_res: bencher.results.bench_result.BenchResult, pane: panel.panel) → None: Append pane to the tab that belongs to bench_res.

prepend_to_result(bench_res: bencher.results.bench_result.BenchResult, pane: panel.panel) → None: Insert pane at the beginning of the tab that belongs to bench_res.

append_tab(pane: panel.panel, name: str | None = None) → None

save_index(directory: str = '', filename: str = 'index.html') → pathlib.Path

Saves the result to index.html in the root folder so that it can be displayed by GitHub Pages.

Returns:: save path
Return type:: Path

save(directory: str | pathlib.Path = 'cachedir', filename: str | None = None, in_html_folder: bool = True, portable: bool = False, emit_json: bool | str = False, **kwargs) → pathlib.Path

Save the result to an HTML file.

When the report contains multiple tabs, each tab is saved to its own embedded HTML file and the index page uses iframes to display them. This prevents HoloMap slider widgets from colliding across tabs.

Parameters:

directory (str | Path, optional) – base folder to save to. Defaults to “cachedir” which should be ignored by git.
filename (str, optional) – The name of the HTML file. Defaults to the name of the benchmark.
in_html_folder (bool, optional) – Put the saved files in a html subfolder to help keep the results separate from source code. Defaults to True.
emit_json (bool | str, optional) – When truthy, also write a machine-readable result.json (see bencher.report_export.result_to_dict()) next to the HTML for each contained result. A string sets the filename when the report holds a single result. Defaults to False (no JSON).
portable (bool, optional) – When True, base64-encode .rrd data directly into the viewer HTML so the report works from file:// without any server. When False (default), .rrd files are copied as sidecar files and loaded via relative URLs — the report must be served over HTTP.

Returns:

the save path

Return type:

Path

_emit_json(base_path: pathlib.Path, emit_json: bool | str) → None

Write a machine-readable result.json for each contained result.

A string emit_json sets the filename when there is exactly one result; with multiple results each is named <bench_name>.result.json so they do not collide.
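
The naming rule can be sketched as a small pure function. json_filenames is a hypothetical helper written for illustration; the default name result.json for a single result follows the save() documentation, and the rest mirrors the rule stated above.

```python
def json_filenames(bench_names, emit_json):
    """Map each result's bench name to the JSON filename it would get."""
    if not emit_json:
        return {}  # JSON export disabled
    if len(bench_names) == 1:
        name = emit_json if isinstance(emit_json, str) else "result.json"
        return {bench_names[0]: name}
    # Several results: per-benchmark names so the files do not collide.
    return {name: f"{name}.result.json" for name in bench_names}


single = json_filenames(["latency"], "latency_v2.json")
several = json_filenames(["latency", "throughput"], True)
```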

static _write_iframe_index(index_path: pathlib.Path, tab_files: list) → None: Write a lightweight HTML index with tab buttons and an iframe.

show(run_cfg: bencher.bench_cfg.BenchRunCfg | None = None) → threading.Thread

Launches a webserver with plots of the benchmark results, blocking

Parameters:: run_cfg (BenchRunCfg, optional) – Options for the web server, such as the port. Defaults to None.

publish_gh_pages(github_user: str, repo_name: str, folder_name: str = 'report', branch_name: str = 'gh-pages') → str

publish(remote_callback: Callable, branch_name: str | None = None, debug: bool = False) → str

Publish the results as an html file by committing it to the bench_results branch in the current repo. If you have set up your repo with github pages or equivalent then the html file will be served as a viewable webpage. This is an example of a callable to publish on github pages:

def publish_args(branch_name) -> tuple[str, str]:
    return (
        "https://github.com/blooop/bencher.git",
        f"https://github.com/blooop/bencher/blob/{branch_name}")

Parameters:: remote_callback (Callable) – A function that returns a tuple of the publishing URLs. It must follow the signature def publish_args(branch_name) -> tuple[str, str]. The first URL is the git repo name; the second URL needs to match the format for viewable HTML pages on your git provider. The second URL can use the argument branch_name to point to the report on a specified branch.
Returns:: the url of the published report
Return type:: str

class bencher.GithubPagesCfg

github_user: str

repo_name: str

folder_name: str = 'report'

branch_name: str = 'gh-pages'

class bencher.Publisher

Bases: Protocol

Generic publisher protocol for benchmark reports.

Any object with a publish(report) method satisfies this protocol. Downstream projects implement their own publishers (GCS, S3, etc.) without modifying bencher.

publish(report: BenchReport) → str | None: Publish a report. Returns the published URL, or None.
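
Because the protocol is structural, a downstream publisher needs no bencher base class. The sketch below is self-contained and therefore redefines the protocol locally; PublisherSketch, FakeReport and LocalDirPublisher are hypothetical names, and FakeReport models only a simplified save() rather than the real BenchReport.

```python
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class PublisherSketch(Protocol):
    def publish(self, report) -> Optional[str]: ...


class FakeReport:
    """Stand-in for BenchReport; only a simplified save() is modelled."""

    def save(self, directory, filename="report.html"):
        path = Path(directory) / filename
        path.write_text("<html></html>")
        return path


class LocalDirPublisher:
    """Hypothetical publisher that 'publishes' by saving into a local folder."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def publish(self, report) -> Optional[str]:
        path = report.save(self.directory)
        return path.resolve().as_uri()  # the 'published' URL


def run_publish(publisher: PublisherSketch, report) -> Optional[str]:
    # Any object with a matching publish() method is accepted.
    return publisher.publish(report)


with tempfile.TemporaryDirectory() as tmp:
    url = run_publish(LocalDirPublisher(tmp), FakeReport())
```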

class bencher.Executors

Bases: strenum.StrEnum

Enumeration of available execution strategies for benchmark jobs.

This enum defines the execution modes for running benchmark jobs and provides a factory method to create appropriate executors.

SERIAL

MULTIPROCESSING

SCOOP

static factory(provider: Executors) → concurrent.futures.Future | None

Create an executor instance based on the specified execution strategy.

Parameters:: provider (Executors) – The type of executor to create
Returns:: The executor instance, or None for serial execution
Return type:: Future | None

class bencher.SweepTimings

Timing data for a single run_sweep() call.

Each field records the wall-clock duration in milliseconds of a major phase inside Bench.run_sweep() or Bench.calculate_benchmark_results().

cache_check_ms: float = 0.0

sample_cache_init_ms: float = 0.0

dataset_setup_ms: float = 0.0

job_submission_ms: float = 0.0

job_execution_ms: float = 0.0

history_merge_ms: float = 0.0

post_setup_ms: float = 0.0

render_ms: float = 0.0

report_save_ms: float = 0.0

total_ms: float = 0.0

compute_total() → float: Compute total_ms as the sum of all phase timing fields.

summary() → dict[str, float]: Return all phase timings as a dict.
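
The relationship between the phase fields and total_ms can be sketched with a small dataclass. TimingsSketch is a hypothetical stand-in with only three phases; excluding total_ms from its own sum and including it in summary() are assumptions, not confirmed behaviour of SweepTimings.

```python
from dataclasses import dataclass, fields


@dataclass
class TimingsSketch:
    cache_check_ms: float = 0.0
    job_execution_ms: float = 0.0
    render_ms: float = 0.0
    total_ms: float = 0.0

    def compute_total(self) -> float:
        """Set total_ms to the sum of every phase field (total_ms excluded)."""
        self.total_ms = sum(
            getattr(self, f.name) for f in fields(self) if f.name != "total_ms"
        )
        return self.total_ms

    def summary(self) -> dict:
        return {f.name: getattr(self, f.name) for f in fields(self)}


timings = TimingsSketch(cache_check_ms=1.5, job_execution_ms=20.0, render_ms=3.5)
total = timings.compute_total()
```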

class bencher.VideoWriter(filename: str = 'vid')

images = []

image_files = []

video_files = []

filename = '/vid.mp4'

append(img)

write() → str

static create_label(label, width=None, height=16, color=(255, 255, 255))

static label_image(path: pathlib.Path, label, padding=20, color=(255, 255, 255)) → pathlib.Path

static convert_to_compatible_format(video_path: str) → str

write_video_raw(video_clip: moviepy.video.VideoClip, fps: int = 30) → str

static extract_frame(video_path: str, time: float | None = None, output_path: str | None = None) → str

Extract a frame from a video at a specific time.

Parameters:

video_path – Path to the video file
time – Time in seconds to extract frame. If None, uses last frame
output_path – Optional path where to save the image. If None, uses video name with _frame.png

Returns:

Path to the saved PNG image

Return type:

str

bencher.add_image(np_array: numpy.ndarray, name: str = 'img') → str: Creates a file on disk from a numpy array and returns the created image path

class bencher.ClassEnum

Bases: strenum.StrEnum

A string-based enum class that maps enum values to corresponding class instances.

ClassEnum is a pattern to make it easier to create a factory method that converts from an enum value to a corresponding class instance. Subclasses should implement the to_class() method which takes an enum value and returns an instance of the corresponding class.

This pattern is useful for configuration-driven class instantiation, allowing classes to be selected via string configuration values that match enum names.

classmethod to_class_generic(module_import: str, class_name: str) → Any

Create an instance of a class from its module path and class name.

This utility method dynamically imports a module and instantiates a class from it.

Parameters:

module_import (str) – The module path to import (e.g., “bencher.class_enum”)
class_name (str) – The name of the class to instantiate

Returns:

A new instance of the specified class

Return type:

Any
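
Dynamic import followed by instantiation is typically done with importlib. The sketch below shows the general technique on a standard-library class; instantiate is a hypothetical helper and is not claimed to match the body of to_class_generic.

```python
import importlib


def instantiate(module_import: str, class_name: str):
    """Import module_import, look up class_name, and return a new instance."""
    module = importlib.import_module(module_import)
    return getattr(module, class_name)()


counter = instantiate("collections", "Counter")
```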

classmethod to_class(enum_val: ClassEnum) → Any

Abstractmethod:

Convert an enum value to its corresponding class instance.

Subclasses must override this method to implement the mapping from enum values to class instances.

Parameters:: enum_val (ClassEnum) – The enum value to convert to a class instance
Returns:: An instance of the class corresponding to the enum value
Return type:: Any
Raises:: NotImplementedError – If this method is not overridden by a subclass

class bencher.ExampleEnum

Bases: ClassEnum

An example implementation of ClassEnum.

This enum demonstrates how to use ClassEnum to map enum values to class instances. Each enum value corresponds to a class name that can be instantiated.

Class1

Class2

classmethod to_class(enum_val: ExampleEnum) → BaseClass

Convert an ExampleEnum value to its corresponding class instance.

Parameters:: enum_val (ExampleEnum) – The enum value to convert
Returns:: An instance of either Class1 or Class2, depending on the enum value
Return type:: BaseClass
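
The enum-to-instance pattern can be sketched with the standard library alone. ShapeEnum, Circle and Square are hypothetical, and a (str, Enum) mix-in is used in place of strenum.StrEnum so the example has no third-party dependency.

```python
from enum import Enum


class Circle:
    sides = 0


class Square:
    sides = 4


class ShapeEnum(str, Enum):
    CIRCLE = "Circle"
    SQUARE = "Square"

    @classmethod
    def to_class(cls, enum_val):
        """Return a new instance of the class named by enum_val."""
        return {"Circle": Circle, "Square": Square}[enum_val.value]()


# A string from configuration selects the class to build.
shape = ShapeEnum.to_class(ShapeEnum("Square"))
```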

bencher.create_bench(sweep: bencher.variables.parametrised_sweep.ParametrizedSweep, run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, report: bencher.bench_report.BenchReport | None = None, name: str | None = None) → bencher.bencher.Bench

Create a Bench instance from a ParametrizedSweep.

Parameters:

sweep – The ParametrizedSweep instance to benchmark.
run_cfg – Optional benchmark run configuration.
report – Optional existing report to append results to.
name – Optional name for the benchmark. If None, derived from sweep’s class name.

Returns:

A configured Bench instance.

bencher.create_bench_runner(sweep: bencher.variables.parametrised_sweep.ParametrizedSweep, run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, name: str | None = None) → bencher.bench_runner.BenchRunner

Create a BenchRunner instance from a ParametrizedSweep.

Enables fluent chaining like:

    MyConfig().to_bench_runner().add(func).run(subsampling_divisions=2, max_subsampling_divisions=4)

Parameters:

sweep – The ParametrizedSweep instance to use as the benchmark class.
run_cfg – Optional benchmark run configuration. Created if not provided.
name – Optional name for the runner. If None, auto-generated.

Returns:

A configured BenchRunner instance.

bencher.run(target: Callable | type | bencher.variables.parametrised_sweep.ParametrizedSweep, *, subsampling_divisions=UNSET, repeats: int = 1, max_subsampling_divisions: int | None = None, max_repeats: int | None = None, run_cfg: bencher.bench_cfg.BenchRunCfg | None = None, show: bool | str | bencher.bench_cfg.ShowMode = True, save: bool = False, publish: bool = False, publisher: bencher.bench_report.Publisher | bencher.bench_report.GithubPagesCfg | Callable | None = None, grouped: bool = False, cache_samples: bool | None = None, over_time: bool | None = None, backend: str | None = None, optimise: int | bool = 0, sampling_context: contextlib.AbstractContextManager[Any] | None = None, **kwargs) → list[bencher.bench_cfg.BenchCfg]

Run a benchmark target with sensible defaults.

Handles three cases:

1. Callable (e.g. bn.run(example_fn)): wraps in BenchRunner.
2. ParametrizedSweep subclass (e.g. bn.run(SimpleFloat)): instantiates, then calls to_bench() + plot_sweep().
3. ParametrizedSweep instance (e.g. bn.run(SimpleFloat())): same as above without instantiation.

Parameters:

target – A benchmark function, ParametrizedSweep class, or ParametrizedSweep instance.
subsampling_divisions – Sampling resolution of the benchmark. Defaults to 2.
repeats – Number of repeats. Defaults to 1.
max_subsampling_divisions – Maximum subsampling_divisions for progressive runs. Defaults to None (a single subsampling_divisions value).
max_repeats – Maximum repeats for progressive runs. Defaults to None (single repeat count).
run_cfg – Optional explicit BenchRunCfg. Defaults to None.
show – Where to view the report. Accepts True/ShowMode.LIVE (default — start a Panel server and block on input() until the user presses Enter), ShowMode.HTML (save an embedded HTML file and open it in the browser, then return), ShowMode.PUBLISHED (open the URL returned by publish — requires publish=True), or False/ShowMode.NONE (display nothing).
save – Save results to disk. Defaults to False.
publish – Publish results. Defaults to False.
publisher – An object conforming to the Publisher protocol (i.e. has a publish(report) method). Passed to BenchRunner and called after each progressive iteration when publish is True.
grouped – Produce a single HTML page with all benchmarks. Defaults to False.
cache_samples – Use sample cache for previous results. None (default) auto-enables for progressive runs. Pass False to disable even for progressive runs.
over_time – Enable time-series benchmarking. None preserves run_cfg value.
backend – Visualization backend (‘panel’ or ‘rerun’). None preserves run_cfg value.
optimise – When > 0, appends optuna analysis plots (parameter importance, with/without repeats comparison, best parameters) from the sweep results to the report. Defaults to 0 (no optimisation analysis).
sampling_context – An optional context manager that wraps only the sampling phase (br.run(...)). Its __exit__ is guaranteed to run before the Panel/Bokeh server starts, so resources held by the context (DB pools, GPU handles, simulators, etc.) are released while nothing blocks. save and publish still execute inside the context (they happen during br.run(show=False, ...)). Defaults to None (no wrapper, fully backward-compatible).

Anti-pattern — wrapping the whole call keeps resources held during the interactive viewing session:
```
with gpu_context():          # held for the entire viewing session!
    bn.run(target, show=True)
```
Recommended — use sampling_context so the context exits before the server starts:
```
bn.run(target, show=True, sampling_context=gpu_context())
```

Returns:

A list of benchmark configuration objects with results.

Return type:

list[BenchCfg]
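The ordering guarantee described for sampling_context (the context exits before the blocking viewer starts) can be illustrated with a small standard-library sketch. run_sketch, sample and serve are hypothetical stand-ins for the sampling and viewing phases; this is not bencher's code, only the sequence of events the documentation describes.

```python
from contextlib import contextmanager, nullcontext

events = []


@contextmanager
def gpu_context():
    """Hypothetical resource that records when it is acquired and released."""
    events.append("acquire")
    try:
        yield
    finally:
        events.append("release")


def run_sketch(sample, serve, sampling_context=None):
    """Stand-in for the documented ordering, not bencher's implementation."""
    # Only the sampling phase runs inside the context.
    with sampling_context if sampling_context is not None else nullcontext():
        results = sample()
    # The context has already exited when the (blocking) viewer starts.
    serve(results)
    return results


run_sketch(
    sample=lambda: events.append("sample"),
    serve=lambda _results: events.append("serve"),
    sampling_context=gpu_context(),
)
print(events)
```

In this sketch "release" is recorded before "serve", which is the property that makes passing the context manager preferable to wrapping the whole call in a with block.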

class bencher.BenchData

Frozen value type handed to plot plugins. The stable public contract surface for plugin authors — internal bencher refactors must preserve this shape.

dataset: xarray.Dataset

input_vars: tuple = ()

result_vars: tuple = ()

plt_cnt_cfg: bencher.plotting.plt_cnt_cfg.PltCntCfg | None = None

run_meta: RunMeta

optimizer_study: Any | None = None

baseline_runs: tuple[BenchData, Ellipsis] = ()

cache: CacheHandle | None = None

legacy_result: Any | None = None

render_kwargs: dict

has(capability: str) → bool

True when an optional context field is populated.

Used by PlotFilter.requires to gate plugins that need fields beyond dataset+vars.

with_changes(**kwargs) → BenchData

classmethod fake(*, dataset: xarray.Dataset | None = None, input_vars: tuple = (), result_vars: tuple = (), plt_cnt_cfg: bencher.plotting.plt_cnt_cfg.PltCntCfg | None = None, **overrides) → BenchData

Construct a minimal BenchData for plugin unit tests.

Defaults dataset to an empty xr.Dataset and plt_cnt_cfg to a zero-counted config so plugin authors can construct a usable handle in one line.
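For readers unfamiliar with frozen value types, the sketch below shows the general pattern using a standard-library dataclass. BenchDataSketch is a hypothetical, trimmed-down stand-in: the behaviour of has() (a field counts as present when it is not None) and of with_changes() (returns a modified copy) are assumptions inferred from the names and one-line descriptions above, not confirmed bencher behaviour.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class BenchDataSketch:
    """Hypothetical stand-in for a frozen plugin input."""

    input_vars: tuple = ()
    result_vars: tuple = ()
    optimizer_study: Optional[Any] = None

    def has(self, capability: str) -> bool:
        # Assumption: a capability is present when the named field is not None.
        return getattr(self, capability, None) is not None

    def with_changes(self, **kwargs) -> "BenchDataSketch":
        # Assumption: returns a modified copy and leaves the original untouched.
        return replace(self, **kwargs)


base = BenchDataSketch(input_vars=("x",))
extended = base.with_changes(optimizer_study=object())
print(base.has("optimizer_study"), extended.has("optimizer_study"))
```

A frozen type means a plugin cannot mutate the handle it receives; any variation has to go through a copy, which keeps one plugin's changes from leaking into another.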

class bencher.CacheHandle

Bases: Protocol

Plugin-accessible memoization surface. Bencher core supplies a concrete handle; plugins treat it as opaque key/value storage.

get(key: str) → Any | None

set(key: str, value: Any) → None
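As a rough illustration of how a plugin might use such a handle, the sketch below defines a structurally matching protocol, a dictionary-backed implementation, and a memoizing helper. CacheHandleSketch, DictCache and memoized are hypothetical names; bencher supplies its own concrete handle, and treating None as a cache miss is an assumption based on the get() return type.

```python
from typing import Any, Callable, Optional, Protocol


class CacheHandleSketch(Protocol):
    """Mirrors the two documented methods; the name is hypothetical."""

    def get(self, key: str) -> Optional[Any]: ...

    def set(self, key: str, value: Any) -> None: ...


class DictCache:
    """Hypothetical in-memory handle used only for this illustration."""

    def __init__(self) -> None:
        self._store: dict = {}

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value


def memoized(cache: CacheHandleSketch, key: str, compute: Callable[[], Any]) -> Any:
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:  # assumption: None means the key is not cached
        value = compute()
        cache.set(key, value)
    return value


calls = []
cache = DictCache()
first = memoized(cache, "layout", lambda: calls.append("compute") or [1, 2, 3])
second = memoized(cache, "layout", lambda: calls.append("compute") or [1, 2, 3])
print(first, len(calls))
```

The second call is served from the cache, so the compute function runs only once.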

class bencher.PlotPlugin

Bases: Protocol

Stable public contract for plot plugins.

A plugin renders a BenchData handle into a Panel-embeddable view. The plugin owns internal composition (linked hv.Layout, plotly.subplots, full Rerun blueprints, …); bencher only does outer Panel-level composition over plugin outputs.

name: str

backend: str

match: bencher.plotting.plot_filter.PlotFilter

priority: int

requires: frozenset[str]

render(data: bencher.plugins.bench_data.BenchData) → panel.viewable.Viewable

class bencher.PluginRegistry

In-process registry of plot plugins, keyed by (name, backend).

name is the chart type (“line”, “heatmap”, …); backend is the rendering library namespace (“holoviews”, “rerun”, …). Several backends may implement the same chart type; selection resolves each chart type to one implementation — the preferred backend when given, otherwise the highest-priority one. Registering an existing (name, backend) pair replaces it, which is the documented override mechanism (a user plugin replaces a built-in by sharing its name and backend, or outranks it from a different backend via priority/preference).

_plugins: dict[tuple[str, str], bencher.plugins.plugin.PlotPlugin]

_entry_points_loaded = False

register(plugin: bencher.plugins.plugin.PlotPlugin) → None

unregister(name: str, backend: str | None = None) → None

Remove a plugin. With no backend, removes every backend’s implementation of that chart type.

clear() → None

mark_entry_points_loaded() → None

Skip the entry-point scan on next lookup. Test-only helper.

get(name: str, backend: str | None = None) → bencher.plugins.plugin.PlotPlugin | None

Resolve a chart type to one implementation.

With a backend, exact lookup. Without, the preferred implementation: highest priority among all backends providing name (ties broken by backend string for determinism).
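The resolution rule can be sketched as follows. RegistrySketch and PluginSketch are hypothetical stand-ins, not the PluginRegistry implementation; in particular, breaking priority ties by the alphabetically first backend is an assumption, since the text only says ties are broken by backend string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PluginSketch:
    """Hypothetical plugin record with only the fields needed for lookup."""

    name: str
    backend: str
    priority: int = 0


class RegistrySketch:
    def __init__(self) -> None:
        self._plugins = {}  # keyed by (name, backend)

    def register(self, plugin: PluginSketch) -> None:
        # Registering an existing (name, backend) pair replaces the old entry.
        self._plugins[(plugin.name, plugin.backend)] = plugin

    def get(self, name: str, backend: Optional[str] = None) -> Optional[PluginSketch]:
        if backend is not None:
            return self._plugins.get((name, backend))  # exact lookup
        candidates = [p for p in self._plugins.values() if p.name == name]
        if not candidates:
            return None
        # Highest priority wins; assumption: ties go to the smallest backend string.
        return min(candidates, key=lambda p: (-p.priority, p.backend))


registry = RegistrySketch()
registry.register(PluginSketch("line", "holoviews", priority=0))
registry.register(PluginSketch("line", "plotly", priority=5))
registry.register(PluginSketch("line", "rerun", priority=5))
print(registry.get("line").backend)
```

Sorting on the pair (negated priority, backend) makes the result deterministic regardless of registration order.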

implementations(name: str) → tuple[bencher.plugins.plugin.PlotPlugin, Ellipsis]

Every backend’s implementation of a chart type, highest priority first.

all() → tuple[bencher.plugins.plugin.PlotPlugin, Ellipsis]

_ensure_entry_points_loaded() → None

_register_loaded(ep_name: str, obj) → None

select(data: bencher.plugins.bench_data.BenchData, *, include: Iterable[str] | None = None, exclude: Iterable[str] | None = None, backend: str | None = None, only: str | None = None) → tuple[bencher.plugins.plugin.PlotPlugin, Ellipsis]

Return one matching implementation per chart type, by descending priority.

only short-circuits to a single named chart type (no match-filter check; explicit opt-in by name implies the user knows what they want).
include / exclude filter the candidate set by chart-type name.
Named-only plugins (auto=False) are skipped during automatic selection (include is None); naming them via include or only selects them.
backend states the preferred backend: where a chart type is implemented by several backends, the preferred one is chosen when it matches; chart types the preferred backend does not provide still render through their best other implementation. This is what lets a config flag swap the rendering library under the same set of plotters.

explain(data: bencher.plugins.bench_data.BenchData, *, include: Iterable[str] | None = None, exclude: Iterable[str] | None = None, backend: str | None = None, only: str | None = None) → tuple[PluginDecision, Ellipsis]

The full selection decision table: one entry per registered plugin, chosen entries first (in select() order: descending priority, then name), with each rejected entry carrying the first gate that dropped it. select() is exactly the chosen subset, so this is the authoritative record of why a plot did or did not appear (A2 Phase S2).

render(data: bencher.plugins.bench_data.BenchData, *, include: Iterable[str] | None = None, exclude: Iterable[str] | None = None, backend: str | None = None, only: str | None = None, strict: bool = False) → tuple[tuple[str, panel.viewable.Viewable], Ellipsis]

Run every selected plugin, returning (name, pane) pairs in priority order.

With strict=False (default) a render exception is caught and replaced with a visible error pane so one broken plugin doesn’t kill the report. strict=True re-raises the first failure — intended for development.
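The difference between the two modes can be sketched with plain functions. render_all is a hypothetical stand-in that uses strings in place of Panel panes; it only illustrates the catch-and-replace versus re-raise behaviour described above, not how bencher builds its error pane.

```python
def render_all(plugins, data, strict=False):
    """Render each (name, render_fn) pair; isolate failures unless strict."""
    panes = []
    for name, render in plugins:
        try:
            pane = render(data)
        except Exception as exc:
            if strict:
                raise  # development mode: surface the first failure
            # Default mode: substitute a visible placeholder and carry on.
            pane = f"[{name} failed: {exc}]"
        panes.append((name, pane))
    return tuple(panes)


def good(data):
    return f"plot of {data}"


def broken(data):
    raise ValueError("bad data")


plugins = [("line", good), ("heatmap", broken)]
panes = render_all(plugins, "ds")
print(panes)
```

With strict=True the same call raises the ValueError from the broken plugin instead of returning a placeholder.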

class bencher.RunMeta

name: str = ''

timestamp: datetime.datetime

sweep_hash: str = ''

bencher.get_registry() → PluginRegistry

bencher.plot_plugin(*, name: str, backend: str = 'user', match: bencher.plotting.plot_filter.PlotFilter | None = None, priority: int = 0, requires: frozenset[str] | set[str] | tuple[str, Ellipsis] | None = None, register: bool = True, auto: bool = True) → Callable[[Callable[[bencher.plugins.bench_data.BenchData], panel.viewable.Viewable]], _FunctionPlugin]

Wrap a function as a plot plugin and (by default) register it with the global registry. Returns the plugin object so callers can also register manually with register=False.

auto=False makes the plugin named-only: it never appears in automatic selection (a default to_auto report) but is selected when requested by name via plot_list/include/only.

bencher.register_plugin(plugin: bencher.plugins.plugin.PlotPlugin) → bencher.plugins.plugin.PlotPlugin

bencher.unregister_plugin(name: str, backend: str | None = None) → None

bencher._DEPRECATED_ALIASES

bencher.__getattr__(name: str)
