bencher.utils
=============

.. py:module:: bencher.utils


Attributes
----------

.. autoapisummary::

   bencher.utils._current_job_key
   bencher.utils._gen_path_counter
   bencher.utils.AGG_FN_MAP
   bencher.utils.UNSET


Classes
-------

.. autoapisummary::

   bencher.utils._Unset


Functions
---------

.. autoapisummary::

   bencher.utils.hmap_canonical_input
   bencher.utils.make_namedtuple
   bencher.utils.get_nearest_coords
   bencher.utils.get_nearest_coords1D
   bencher.utils.hash_sha1
   bencher.utils.capitalise_words
   bencher.utils.un_camel
   bencher.utils.mult_tuple
   bencher.utils.tabs_in_markdown
   bencher.utils.int_to_col
   bencher.utils.lerp
   bencher.utils.color_tuple_to_css
   bencher.utils.color_tuple_to_255
   bencher.utils.gen_path
   bencher.utils.gen_video_path
   bencher.utils.gen_image_path
   bencher.utils.gen_rerun_data_path
   bencher.utils.resolve_aggregate
   bencher.utils.callable_name
   bencher.utils.listify
   bencher.utils.get_name
   bencher.utils.params_to_str
   bencher.utils.publish_file
   bencher.utils.normalize_subsampling_divisions_kwargs
   bencher.utils.github_content


Module Contents
---------------

.. py:function:: hmap_canonical_input(dic: dict) -> tuple

   From a dictionary of kwargs, return a hashable representation (tuple) that is always the same for the same inputs, independent of the order in which the keyword arguments were supplied, e.g. ``{x=1, y=2}`` -> ``(1, 2)`` and ``{y=2, x=1}`` -> ``(1, 2)``.  This is used so that keyword arguments can be hashed and converted to the tuple keys that are used for holomaps.

   :param dic: dictionary with keyword arguments and values in any order
   :type dic: dict

   :returns: values of the dictionary always in the same order and hashable
   :rtype: tuple
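
   A minimal sketch of the idea, assuming the canonical order is obtained by sorting
   on the keyword names (which matches the ``(1, 2)`` example above). The name
   ``canonical_input_sketch`` is hypothetical, and this is not necessarily how
   ``hmap_canonical_input`` is implemented.

   .. code-block:: python

      def canonical_input_sketch(dic: dict) -> tuple:
          # Sort by key (an assumption) so that the order in which the
          # keyword arguments were supplied does not affect the result.
          return tuple(dic[key] for key in sorted(dic))


      a = canonical_input_sketch({"x": 1, "y": 2})
      b = canonical_input_sketch({"y": 2, "x": 1})
      # Both calls give the same tuple, so both hash to the same value.
      assert a == b == (1, 2)
      assert hash(a) == hash(b)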


.. py:function:: make_namedtuple(class_name: str, **fields) -> collections.namedtuple

   Convenience function for making a named tuple from keyword arguments.

   :param class_name: name of the named tuple
   :type class_name: str

   :returns: a named tuple with the fields as values
   :rtype: namedtuple
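
   One plausible way to build such a tuple with the standard library is sketched
   below. ``make_namedtuple_sketch`` is a hypothetical stand-in, assuming the
   keyword names become the field names and the keyword values the field values.

   .. code-block:: python

      from collections import namedtuple


      def make_namedtuple_sketch(class_name: str, **fields):
          # Create the tuple type from the keyword names, then instantiate
          # it with the matching values.
          tuple_type = namedtuple(class_name, list(fields))
          return tuple_type(**fields)


      point = make_namedtuple_sketch("Point", x=1.0, y=2.0)
      assert point.x == 1.0 and point.y == 2.0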


.. py:function:: get_nearest_coords(dataset: xarray.Dataset, collapse_list: bool = False, **kwargs) -> dict

   Find the nearest coordinates in an xarray dataset based on provided coordinate values.

   Given an xarray dataset and kwargs of key-value pairs of coordinate values, return a dictionary
   of the nearest coordinate name-value pair that was found in the dataset.

   :param dataset: The xarray dataset to search in
   :type dataset: xr.Dataset
   :param collapse_list: If True, when a coordinate value is a list, only the first
                         item is returned. Defaults to False.
   :type collapse_list: bool, optional
   :param \*\*kwargs: Key-value pairs where keys are coordinate names and values are points to find
                      the nearest match for

   :returns: Dictionary of coordinate name-value pairs with the nearest values found in the dataset
   :rtype: dict


.. py:function:: get_nearest_coords1D(val: Any, coords: list[Any]) -> Any

   Find the closest coordinate to a given value in a list of coordinates.

   For numeric values, finds the value in coords that is closest to val.
   For non-numeric values, returns the exact match if found, otherwise returns val.

   :param val: The value to find the closest coordinate for
   :type val: Any
   :param coords: The list of coordinates to search in
   :type coords: list[Any]

   :returns: The closest coordinate value from the list
   :rtype: Any
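
   The rule described above can be sketched as follows. This is an illustration
   under stated assumptions, not the library's code: ``nearest_coord_1d_sketch``
   is a hypothetical name, and "numeric" is taken to mean ``int`` or ``float``.

   .. code-block:: python

      from typing import Any


      def nearest_coord_1d_sketch(val: Any, coords: list) -> Any:
          if isinstance(val, (int, float)):
              # Numeric: pick the coordinate with the smallest absolute distance.
              return min(coords, key=lambda c: abs(c - val))
          # Non-numeric: return the exact match if present, otherwise val itself.
          for coord in coords:
              if coord == val:
                  return coord
          return val


      assert nearest_coord_1d_sketch(0.4, [0.0, 0.5, 1.0]) == 0.5
      assert nearest_coord_1d_sketch("b", ["a", "b"]) == "b"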


.. py:function:: hash_sha1(var: Any) -> str

   A hash function that is stable across runs. Python's built-in ``hash()`` salts string hashes per process (see ``PYTHONHASHSEED``), so it can return a different value for the same input each time the program is run.

   Converts input to a consistent SHA1 hash string.

   :param var: The variable to hash
   :type var: Any

   :returns: A hexadecimal SHA1 hash of the string representation of the variable
   :rtype: str
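
   A minimal sketch of a run-to-run stable hash, assuming the string
   representation is encoded as UTF-8 before hashing (the encoding and the name
   ``hash_sha1_sketch`` are assumptions):

   .. code-block:: python

      import hashlib
      from typing import Any


      def hash_sha1_sketch(var: Any) -> str:
          # Hash the string representation so the digest depends only on
          # the value, not on the per-process hash seed.
          return hashlib.sha1(str(var).encode("utf-8")).hexdigest()


      digest = hash_sha1_sketch(("x", 1))
      assert digest == hash_sha1_sketch(("x", 1))  # deterministic
      assert len(digest) == 40  # SHA1 hex digests are 40 characters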


.. py:function:: capitalise_words(message: str) -> str

   Given a string of lowercase words, capitalise them.

   :param message: lower case string
   :type message: str

   :returns: capitalised string where each word starts with an uppercase letter
   :rtype: str


.. py:function:: un_camel(camel: str) -> str

   Given a CamelCase string, return an un-camel-cased string.

   :param camel: camelcase string
   :type camel: str

   :returns: uncamelcased string
   :rtype: str


.. py:function:: mult_tuple(inp: tuple[float, Ellipsis], val: float) -> tuple[float, Ellipsis]

   Multiply each element in a tuple by a scalar value.

   :param inp: The input tuple of floats to multiply
   :type inp: tuple[float, ...]
   :param val: The scalar value to multiply each element by
   :type val: float

   :returns: A new tuple with each element multiplied by val
   :rtype: tuple[float, ...]


.. py:function:: tabs_in_markdown(regular_str: str, spaces: int = 2) -> str

   Given a string containing tab characters (``\t``), convert each tab to a run of ``&nbsp;`` entities so that the indentation is preserved when rendered as markdown.

   :param regular_str: A string with tabs in it
   :type regular_str: str
   :param spaces: the number of spaces per tab
   :type spaces: int

   :returns: A string with sets of &nbsp; to represent the tabs in markdown
   :rtype: str
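
   A sketch of the conversion, assuming each tab becomes ``spaces`` copies of the
   ``&nbsp;`` entity as the return description states. ``tabs_in_markdown_sketch``
   is a hypothetical name.

   .. code-block:: python

      def tabs_in_markdown_sketch(regular_str: str, spaces: int = 2) -> str:
          # Replace every tab with a fixed number of non-breaking spaces.
          return regular_str.replace("\t", "&nbsp;" * spaces)


      assert tabs_in_markdown_sketch("a\tb") == "a&nbsp;&nbsp;b"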


.. py:function:: int_to_col(int_val: int, sat: float = 0.5, val: float = 0.95, alpha: float = -1) -> tuple[float, float, float] | tuple[float, float, float, float]

   Uses the golden angle to generate colors programmatically with minimum overlap between colors.
   https://martin.ankerl.com/2009/12/09/how-to-create-random-colors-programmatically/

   :param int_val: index of an object you want to color, this is mapped to hue in HSV
   :type int_val: int
   :param sat: saturation in HSV. Defaults to 0.5.
   :type sat: float, optional
   :param val: value in HSV. Defaults to 0.95.
   :type val: float, optional
   :param alpha: transparency.  If -1, only RGB is returned; if 0 or greater, RGBA is returned. Defaults to -1.
   :type alpha: float, optional

   :returns: either RGB or RGBA vector
   :rtype: tuple[float, float, float] | tuple[float, float, float, float]
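
   The linked article steps the hue by the golden ratio conjugate so that
   successive indices land far apart on the color wheel. The sketch below follows
   that article; the constant, the wrap-around with ``% 1.0``, and the name
   ``int_to_col_sketch`` are assumptions rather than a copy of the library code.

   .. code-block:: python

      import colorsys

      GOLDEN_RATIO_CONJUGATE = 0.618033988749895  # value used in the article


      def int_to_col_sketch(int_val: int, sat: float = 0.5, val: float = 0.95, alpha: float = -1):
          # Map the index to a hue in [0, 1) by stepping around the color wheel.
          hue = (int_val * GOLDEN_RATIO_CONJUGATE) % 1.0
          rgb = colorsys.hsv_to_rgb(hue, sat, val)
          if alpha >= 0:
              return (*rgb, alpha)  # RGBA
          return rgb  # RGB only


      assert len(int_to_col_sketch(0)) == 3
      assert len(int_to_col_sketch(0, alpha=0.5)) == 4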


.. py:function:: lerp(value: float, input_low: float, input_high: float, output_low: float, output_high: float) -> float

   Linear interpolation between two ranges.

   Maps a value from one range [input_low, input_high] to another range [output_low, output_high].

   :param value: The input value to interpolate
   :type value: float
   :param input_low: The lower bound of the input range
   :type input_low: float
   :param input_high: The upper bound of the input range
   :type input_high: float
   :param output_low: The lower bound of the output range
   :type output_low: float
   :param output_high: The upper bound of the output range
   :type output_high: float

   :returns: The interpolated value in the output range
   :rtype: float
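
   The mapping is the standard linear interpolation formula, sketched here with
   the hypothetical name ``lerp_sketch``. Whether the library clamps values
   outside the input range is not stated, so this sketch does not clamp.

   .. code-block:: python

      def lerp_sketch(value: float, input_low: float, input_high: float,
                      output_low: float, output_high: float) -> float:
          # Fraction of the way through the input range (0 at low, 1 at high).
          fraction = (value - input_low) / (input_high - input_low)
          # Apply the same fraction to the output range.
          return output_low + fraction * (output_high - output_low)


      assert lerp_sketch(5, 0, 10, 0, 100) == 50.0
      assert lerp_sketch(0.5, 0, 1, 10, 20) == 15.0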


.. py:function:: color_tuple_to_css(color: tuple[float, float, float]) -> str

   Convert a RGB color tuple to CSS rgb format string.

   :param color: RGB color tuple with values in range [0.0, 1.0]
   :type color: tuple[float, float, float]

   :returns: CSS color string in format 'rgb(r, g, b)' with values in range [0, 255]
   :rtype: str


.. py:function:: color_tuple_to_255(color: tuple[float, float, float]) -> tuple[int, int, int]

   Convert a RGB color tuple with values in range [0.0, 1.0] to values in range [0, 255].

   :param color: RGB color tuple with values in range [0.0, 1.0]
   :type color: tuple[float, float, float]

   :returns: RGB color tuple with values clamped to range [0, 255]
   :rtype: tuple[int, int, int]
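
   A sketch of both color conversions, assuming channels are scaled by 255,
   truncated to integers, and clamped. The rounding mode and the names
   ``to_255_sketch`` and ``to_css_sketch`` are assumptions.

   .. code-block:: python

      def to_255_sketch(color: tuple[float, float, float]) -> tuple[int, int, int]:
          # Scale each channel to 0..255 and clamp out-of-range values.
          return tuple(min(max(int(c * 255), 0), 255) for c in color)


      def to_css_sketch(color: tuple[float, float, float]) -> str:
          r, g, b = to_255_sketch(color)
          return f"rgb({r}, {g}, {b})"


      assert to_255_sketch((0.0, 0.5, 1.0)) == (0, 127, 255)
      assert to_css_sketch((0.0, 0.5, 1.0)) == "rgb(0, 127, 255)"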


.. py:data:: _current_job_key
   :type:  contextvars.ContextVar[str | None]

.. py:data:: _gen_path_counter
   :type:  contextvars.ContextVar[dict | None]

.. py:function:: gen_path(filename: str, folder: str = 'generic', suffix: str = '.dat') -> str

   Generate a path for a file in the cache directory.

   When called inside a benchmark sweep, files are placed in a per-job-key
   subdirectory so that cache overwrites can cleanly delete old media.
   Outside a sweep, falls back to UUID-based naming.

   :param filename: Base name for the file
   :type filename: str
   :param folder: Subfolder within cachedir. Defaults to "generic".
   :type folder: str, optional
   :param suffix: File extension. Defaults to ".dat".
   :type suffix: str, optional

   :returns: Absolute path to a file location
   :rtype: str
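
   The two naming modes described above can be illustrated with a context
   variable. Everything in this sketch is an assumption made for illustration:
   the directory layout, the ``cachedir`` argument, and the names
   ``job_key_sketch`` and ``gen_path_sketch``. It only builds the path and does
   not create any directories.

   .. code-block:: python

      import contextvars
      import os
      import uuid

      # Hypothetical stand-in for the job key set while a sweep is running.
      job_key_sketch = contextvars.ContextVar("job_key_sketch", default=None)


      def gen_path_sketch(filename: str, folder: str = "generic",
                          suffix: str = ".dat", cachedir: str = "cachedir") -> str:
          job_key = job_key_sketch.get()
          if job_key is not None:
              # Inside a sweep: group files under a per-job-key subdirectory.
              parts = (cachedir, folder, job_key, f"{filename}{suffix}")
          else:
              # Outside a sweep: make the name unique with a UUID.
              parts = (cachedir, folder, f"{filename}_{uuid.uuid4().hex}{suffix}")
          return os.path.abspath(os.path.join(*parts))


      token = job_key_sketch.set("job123")
      inside = gen_path_sketch("plot")
      job_key_sketch.reset(token)
      assert os.path.basename(os.path.dirname(inside)) == "job123"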


.. py:function:: gen_video_path(video_name: str = 'vid', extension: str = '.mp4') -> str

   Generate a unique path for a video file in the cache directory.

   :param video_name: Base name for the video file. Defaults to "vid".
   :type video_name: str, optional
   :param extension: Video file extension. Defaults to ".mp4".
   :type extension: str, optional

   :returns: Absolute path to a unique video file location
   :rtype: str


.. py:function:: gen_image_path(image_name: str = 'img', filetype: str = '.png') -> str

   Generate a unique path for an image file in the cache directory.

   :param image_name: Base name for the image file. Defaults to "img".
   :type image_name: str, optional
   :param filetype: Image file extension. Defaults to ".png".
   :type filetype: str, optional

   :returns: Absolute path to a unique image file location
   :rtype: str


.. py:function:: gen_rerun_data_path(rrd_name: str = 'rrd', filetype: str = '.rrd') -> str

   Generate a unique path for a rerun data file in the cache directory.

   :param rrd_name: Base name for the rerun data file. Defaults to "rrd".
   :type rrd_name: str, optional
   :param filetype: File extension. Defaults to ".rrd".
   :type filetype: str, optional

   :returns: Absolute path to a unique rerun data file location
   :rtype: str


.. py:function:: resolve_aggregate(aggregate: bool | int | list[str] | None, input_var_names: list[str] | None = None) -> list[str] | None

   Resolve the ``aggregate`` convenience parameter into a list of dimension names.

   :param aggregate: Aggregation specification.

                     - None / False: no aggregation
                     - True: aggregate all but the first input dim, collapsing data
                       to 1-D (the minimum for meaningful plots). With 0 or 1 input
                       vars there is nothing to aggregate and None is returned.
                     - int N: aggregate last N input dims (requires input_var_names)
                     - list[str]: aggregate exactly these dims (validated only when
                       input_var_names is provided)

   :param input_var_names: Names of input variables (in order).  When None,
                           list[str] values pass through without validation and True/int
                           raise ValueError (no context to resolve against).

   :returns: List of dimension names to aggregate over, or None.

   :raises ValueError: If int is out of range, list contains unknown names, or
       True/int used without input_var_names.
   :raises TypeError: If aggregate is an unsupported type.
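
   The resolution rules above can be sketched as a plain function. This is an
   illustration of the documented behavior, not the library source: the name
   ``resolve_aggregate_sketch``, the error messages, and the accepted integer
   range (``1`` to ``len(input_var_names)``) are assumptions.

   .. code-block:: python

      def resolve_aggregate_sketch(aggregate, input_var_names=None):
          # bool is a subclass of int, so handle None/False/True before int.
          if aggregate is None or aggregate is False:
              return None
          if aggregate is True:
              if input_var_names is None:
                  raise ValueError("aggregate=True requires input_var_names")
              # All but the first dim; an empty result means nothing to aggregate.
              return list(input_var_names[1:]) or None
          if isinstance(aggregate, int):
              if input_var_names is None:
                  raise ValueError("an int aggregate requires input_var_names")
              if not 1 <= aggregate <= len(input_var_names):
                  raise ValueError("aggregate is out of range")
              return list(input_var_names[-aggregate:])
          if isinstance(aggregate, list):
              if input_var_names is not None:
                  unknown = [name for name in aggregate if name not in input_var_names]
                  if unknown:
                      raise ValueError(f"unknown dims: {unknown}")
              return list(aggregate)
          raise TypeError(f"unsupported aggregate type: {type(aggregate).__name__}")


      assert resolve_aggregate_sketch(True, ["a", "b", "c"]) == ["b", "c"]
      assert resolve_aggregate_sketch(2, ["a", "b", "c"]) == ["b", "c"]
      assert resolve_aggregate_sketch(["x"]) == ["x"]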


.. py:data:: AGG_FN_MAP
   :type:  dict[str, Callable]

.. py:function:: callable_name(any_callable: Callable[Ellipsis, Any]) -> str

   Extract the name of a callable object, handling various callable types.

   This function attempts to extract the name of a callable object, including
   regular functions, partial functions, and other callables.

   :param any_callable: The callable object to get the name from
   :type any_callable: Callable[..., Any]

   :returns: The name of the callable
   :rtype: str
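
   A sketch of how the different callable types might be handled. The fallback to
   the class name for objects without ``__name__`` is an assumption, as is the
   name ``callable_name_sketch``.

   .. code-block:: python

      import functools
      from typing import Any, Callable


      def callable_name_sketch(any_callable: Callable[..., Any]) -> str:
          # Unwrap partials to reach the underlying function.
          while isinstance(any_callable, functools.partial):
              any_callable = any_callable.func
          name = getattr(any_callable, "__name__", None)
          if name is not None:
              return name
          # Callable instances have no __name__; fall back to the class name.
          return type(any_callable).__name__


      assert callable_name_sketch(len) == "len"
      assert callable_name_sketch(functools.partial(pow, 2)) == "pow"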


.. py:function:: listify(obj: Any) -> list[Any] | None

   Convert an object to a list if it's not already a list.

   This function handles conversion of various object types to lists, with special
   handling for None values and existing list/tuple types.

   :param obj: The object to convert to a list
   :type obj: Any

   :returns: A list containing the object, the object itself if it was
             already a list, a list from the tuple if it was a tuple, or None if the
             input was None
   :rtype: list[Any] | None
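
   The four cases in the return description can be sketched as follows
   (``listify_sketch`` is a hypothetical name, not the library function):

   .. code-block:: python

      from typing import Any


      def listify_sketch(obj: Any) -> list | None:
          if obj is None:
              return None  # None passes through unchanged
          if isinstance(obj, list):
              return obj  # already a list: returned as-is
          if isinstance(obj, tuple):
              return list(obj)  # tuple: converted element by element
          return [obj]  # anything else: wrapped in a one-element list


      assert listify_sketch("abc") == ["abc"]
      assert listify_sketch((1, 2)) == [1, 2]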


.. py:function:: get_name(var: Any) -> str

   Extract the name from a variable, handling param.Parameter objects.

   :param var: The variable to extract the name from
   :type var: Any

   :returns: The name of the variable
   :rtype: str


.. py:function:: params_to_str(param_list: list[param.Parameter]) -> list[str]

   Convert a list of param.Parameter objects to a list of their names.

   :param param_list: List of parameter objects
   :type param_list: list[param.Parameter]

   :returns: List of parameter names
   :rtype: list[str]


.. py:function:: publish_file(filepath: str, remote: str, branch_name: str) -> str

   Publish a file to an orphan git branch:

   .. code-block:: python

       def publish_args(branch_name) -> tuple[str, str]:
           return (
               "https://github.com/blooop/bencher.git",
               f"https://github.com/blooop/bencher/blob/{branch_name}")


   :param remote: A function that returns a tuple of the publishing URLs. It must follow the signature ``def publish_args(branch_name) -> tuple[str, str]``.  The first URL is the git repo, the second URL needs to match the format for viewable HTML pages on your git provider.  The second URL can use the argument ``branch_name`` to point to the file on a specified branch.
   :type remote: Callable

   :returns: the url of the published file
   :rtype: str


.. py:class:: _Unset

   Sentinel for distinguishing 'not provided' from 'explicitly passed the default'.


   .. py:attribute:: _instance
      :value: None


   .. py:method:: __repr__()


   .. py:method:: __bool__()
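
   A sketch of the sentinel pattern this class describes. The singleton behavior,
   the falsy ``__bool__``, and the ``"UNSET"`` repr are assumptions inferred from
   the members listed above; ``_UnsetSketch``, ``UNSET_SKETCH`` and
   ``run_sketch`` are hypothetical names.

   .. code-block:: python

      class _UnsetSketch:
          _instance = None

          def __new__(cls):
              # Always hand back the same object so identity checks work.
              if cls._instance is None:
                  cls._instance = super().__new__(cls)
              return cls._instance

          def __repr__(self):
              return "UNSET"

          def __bool__(self):
              return False


      UNSET_SKETCH = _UnsetSketch()


      def run_sketch(divisions=UNSET_SKETCH):
          # An identity check tells "not provided" apart from any real value,
          # including one that happens to equal the default.
          if divisions is UNSET_SKETCH:
              return "not provided"
          return f"explicit {divisions}"


      assert run_sketch() == "not provided"
      assert run_sketch(2) == "explicit 2"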


.. py:data:: UNSET

.. py:function:: normalize_subsampling_divisions_kwargs(*, subsampling_divisions: int | _Unset, max_subsampling_divisions: int | None, kwargs: dict[str, Any], default_subsampling_divisions: int = 2, stacklevel: int = 2) -> tuple[int, int | None, bool]

   Translate deprecated ``level``/``max_level`` kwargs to ``subsampling_divisions``/``max_subsampling_divisions``.

   *subsampling_divisions* should be passed as ``UNSET`` when the caller did not provide it,
   so that ``run(subsampling_divisions=2, level=3)`` correctly raises ``TypeError`` instead of
   silently preferring ``level``.

   Returns ``(subsampling_divisions, max_subsampling_divisions, subsampling_divisions_was_set)`` where *subsampling_divisions_was_set*
   is ``True`` when the caller explicitly provided *subsampling_divisions* or *level*.

   Raises ``TypeError`` when old and new names are both provided.
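
   The translation can be sketched roughly as below. This is an illustration under
   assumptions, not the library source: the sentinel ``UNSET_SKETCH``, the name
   ``normalize_sketch``, the use of ``DeprecationWarning``, the messages, and the
   removal of the old keys from ``kwargs`` are all hypothetical.

   .. code-block:: python

      import warnings

      UNSET_SKETCH = object()  # hypothetical stand-in for UNSET


      def normalize_sketch(*, subsampling_divisions, max_subsampling_divisions,
                           kwargs, default_subsampling_divisions=2, stacklevel=2):
          was_set = subsampling_divisions is not UNSET_SKETCH
          if "level" in kwargs:
              if was_set:
                  raise TypeError("pass 'subsampling_divisions' or 'level', not both")
              warnings.warn("'level' is deprecated", DeprecationWarning,
                            stacklevel=stacklevel)
              subsampling_divisions = kwargs.pop("level")
              was_set = True
          if "max_level" in kwargs:
              if max_subsampling_divisions is not None:
                  raise TypeError("pass 'max_subsampling_divisions' or 'max_level', not both")
              warnings.warn("'max_level' is deprecated", DeprecationWarning,
                            stacklevel=stacklevel)
              max_subsampling_divisions = kwargs.pop("max_level")
          if not was_set:
              # Nothing supplied under either name: fall back to the default.
              subsampling_divisions = default_subsampling_divisions
          return subsampling_divisions, max_subsampling_divisions, was_set

   With nothing supplied, the sketch returns the default and reports
   ``subsampling_divisions_was_set`` as ``False``; with only ``level`` supplied, it
   returns that value and reports ``True``.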


.. py:function:: github_content(remote: str, branch_name: str, filename: str)