bencher.job
===========

.. py:module:: bencher.job


Attributes
----------

.. autoapisummary::

   bencher.job.scoop_future_executor
   bencher.job._MISSING


Classes
-------

.. autoapisummary::

   bencher.job.Job
   bencher.job.JobFuture
   bencher.job.Executors
   bencher.job.FutureCache
   bencher.job.JobFunctionCache


Functions
---------

.. autoapisummary::

   bencher.job.run_job


Module Contents
---------------

.. py:data:: scoop_future_executor
   :value: None


.. py:data:: _MISSING

.. py:class:: Job(job_id: str, function: Callable, job_args: dict, job_key: str | None = None, tag: str = '')

   Represents a benchmarking job to be executed or retrieved from cache.

   A Job encapsulates a function, its arguments, and metadata for caching
   and tracking purposes.

   .. attribute:: job_id

      A unique identifier for the job, used for logging

      :type: str

   .. attribute:: function

      The function to be executed

      :type: Callable

   .. attribute:: job_args

      Arguments to pass to the function

      :type: dict

   .. attribute:: job_key

      A hash key for caching, derived from job_args if not provided

      :type: str

   .. attribute:: tag

      Optional tag for grouping related jobs

      :type: str


   .. py:attribute:: job_id


   .. py:attribute:: function


   .. py:attribute:: job_args


   .. py:attribute:: tag
      :value: ''



.. py:class:: JobFuture(job: Job, res: dict | None = None, future: concurrent.futures.Future | None = None, cache: diskcache.Cache | None = None)

   A wrapper for a job result or future that handles caching.

   This class provides a unified interface for handling both immediate results
   and futures (for asynchronous execution). It also handles caching results
   when they become available.

   .. attribute:: job

      The job this future corresponds to

      :type: Job

   .. attribute:: res

      The result, if available immediately

      :type: dict

   .. attribute:: future

      The future representing the pending job, if executed asynchronously

      :type: Future

   .. attribute:: cache

      The cache to store results in when they become available


   .. py:attribute:: job


   .. py:attribute:: res
      :value: None



   .. py:attribute:: future
      :value: None



   .. py:attribute:: cache
      :value: None



   .. py:method:: result() -> dict

      Get the job result, waiting for completion if necessary.

      If the result is not immediately available (i.e., it's a future), this
      method will wait for the future to complete. Once the result is
      available, it will be cached if a cache is provided.

      :returns: The job result
      :rtype: dict



.. py:function:: run_job(job: Job) -> dict

   Execute a job by calling its function with the provided arguments.

   Sets the ``_current_job_key`` context variable so that ``gen_path()``
   places media files into a per-job-key directory for clean lifecycle
   management.

   :param job: The job to execute
   :type job: Job

   :returns: The result of the job execution
   :rtype: dict


.. py:class:: Executors

   Bases: :py:obj:`strenum.StrEnum`


   Enumeration of available execution strategies for benchmark jobs.

   This enum defines the execution modes for running benchmark jobs
   and provides a factory method to create appropriate executors.


   .. py:attribute:: SERIAL


   .. py:attribute:: MULTIPROCESSING


   .. py:attribute:: SCOOP


   .. py:method:: factory(provider: Executors) -> concurrent.futures.Future | None
      :staticmethod:


      Create an executor instance based on the specified execution strategy.

      :param provider: The type of executor to create
      :type provider: Executors

      :returns: The executor instance, or None for serial execution
      :rtype: Future | None



.. py:class:: FutureCache(executor=Executors.SERIAL, overwrite: bool = True, cache_name: str = 'fcache', tag_index: bool = True, size_limit: int = int(20000000000.0), cache_samples: bool = True)

   A cache system for benchmark job results with executor support.

   This class provides a unified interface for running benchmark jobs either
   serially or in parallel, with optional caching of results. It manages the
   execution strategy, caching policy, and tracks statistics about job
   execution.

   .. attribute:: executor_type

      The execution strategy to use

      :type: Executors

   .. attribute:: executor

      The executor instance, created on demand

   .. attribute:: cache

      Cache for storing job results

      :type: Cache

   .. attribute:: overwrite

      Whether to overwrite existing cached results

      :type: bool

   .. attribute:: call_count

      Counter for job calls

      :type: int

   .. attribute:: size_limit

      Maximum size of the cache in bytes

      :type: int

   .. attribute:: worker_wrapper_call_count

      Number of job submissions

      :type: int

   .. attribute:: worker_fn_call_count

      Number of actual function executions

      :type: int

   .. attribute:: worker_cache_call_count

      Number of cache hits

      :type: int


   .. py:attribute:: executor_type


   .. py:attribute:: executor
      :value: None



   .. py:attribute:: overwrite
      :value: True



   .. py:attribute:: call_count
      :value: 0



   .. py:attribute:: size_limit
      :value: 0



   .. py:attribute:: worker_wrapper_call_count
      :value: 0



   .. py:attribute:: worker_fn_call_count
      :value: 0



   .. py:attribute:: worker_cache_call_count
      :value: 0



   .. py:method:: prefetch(keys: list[str]) -> dict

      Pre-load cached values for a batch of keys in one pass.

      Returns a dict mapping key -> cached value for all cache hits.
      This avoids per-job cache round-trips in the submit loop.



   .. py:method:: submit(job: Job, prefetched: dict | None = None) -> JobFuture

      Submit a job for execution, with caching if enabled.

      This method first checks the prefetched dict (if provided), then falls
      back to a single cache.get() query. If not found, it executes the job
      either serially or using the configured executor.

      :param job: The job to submit
      :type job: Job
      :param prefetched: Pre-fetched cache results from prefetch(). Defaults to None.
      :type prefetched: dict, optional

      :returns: A future representing the job execution
      :rtype: JobFuture



   .. py:method:: overwrite_msg(job: Job, suffix: str) -> None

      Log a message about overwriting or using cache.

      :param job: The job being executed
      :type job: Job
      :param suffix: Additional text to add to the log message
      :type suffix: str



   .. py:method:: clear_call_counts() -> None

      Clear the worker and cache call counts, to help debug and assert caching is happening properly.



   .. py:method:: clear_cache() -> None

      Clear all entries from the cache.



   .. py:method:: clear_tag(tag: str) -> None

      Remove all cache entries with the specified tag.

      Note: diskcache.evict() does not return the evicted values, so media
      files referenced by evicted entries may become orphans. Use
      ``clean_orphaned_media()`` periodically to reclaim them.

      :param tag: The tag identifying entries to remove from the cache
      :type tag: str



   .. py:method:: close() -> None

      Close the cache and shutdown the executor if they exist.



   .. py:method:: stats() -> str

      Get statistics about cache usage.

      :returns: A string with cache size information
      :rtype: str



.. py:class:: JobFunctionCache(function: Callable, overwrite: bool = False, executor: Executors = Executors.SERIAL, cache_name: str = 'fcache', tag_index: bool = True, size_limit: int = int(10000000000.0))

   Bases: :py:obj:`FutureCache`


   A specialized cache for a specific function with various input parameters.

   This class simplifies caching results for a specific function called with
   different sets of parameters. It wraps the general FutureCache with a focus
   on a single function.

   .. attribute:: function

      The function to cache results for

      :type: Callable


   .. py:attribute:: function


   .. py:method:: call(**kwargs) -> JobFuture

      Call the wrapped function with the provided arguments.

      This method creates a Job for the function call and submits it through
      the cache.

      :param \*\*kwargs: Arguments to pass to the function

      :returns: A future representing the function call
      :rtype: JobFuture
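
The submit-then-result flow described for ``FutureCache.submit`` and ``JobFuture.result`` can be illustrated with a small standard-library sketch. This is not the ``bencher`` implementation: ``SketchJob``, ``SketchFuture`` and ``SketchCache`` are hypothetical stand-ins, a plain ``dict`` replaces ``diskcache.Cache``, a thread pool replaces the multiprocessing and SCOOP executors, and both the key hashing scheme and the exact meaning of ``overwrite`` (assumed here to skip cache lookups when ``True``) are assumptions.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional

_MISSING = object()  # sentinel so that a cached None still counts as a hit


@dataclass
class SketchJob:
    """Hypothetical stand-in for Job: a function plus its keyword arguments."""
    job_id: str
    function: Callable
    job_args: dict
    job_key: Optional[str] = None

    def __post_init__(self):
        if self.job_key is None:
            # Assumed key derivation: hash of the sorted, serialized arguments.
            payload = json.dumps(self.job_args, sort_keys=True, default=str)
            self.job_key = hashlib.sha1(payload.encode()).hexdigest()


class SketchFuture:
    """Hypothetical stand-in for JobFuture: an immediate result or a pending future."""

    def __init__(self, job, res=None, future=None, cache=None):
        self.job, self.res, self.future, self.cache = job, res, future, cache

    def result(self):
        if self.future is not None:
            self.res = self.future.result()  # block until the worker finishes
        if self.cache is not None:
            self.cache[self.job.job_key] = self.res  # store once available
        return self.res


class SketchCache:
    """Hypothetical stand-in for FutureCache, backed by an in-memory dict."""

    def __init__(self, executor=None, overwrite=True):
        self.executor = executor
        self.overwrite = overwrite
        self.cache = {}
        self.worker_wrapper_call_count = 0  # submissions
        self.worker_fn_call_count = 0       # actual function executions
        self.worker_cache_call_count = 0    # cache hits

    def submit(self, job, prefetched=None):
        self.worker_wrapper_call_count += 1
        if not self.overwrite:
            # Check the prefetched batch first, then fall back to one lookup.
            hit = _MISSING
            if prefetched is not None:
                hit = prefetched.get(job.job_key, _MISSING)
            if hit is _MISSING:
                hit = self.cache.get(job.job_key, _MISSING)
            if hit is not _MISSING:
                self.worker_cache_call_count += 1
                return SketchFuture(job, res=hit)
        self.worker_fn_call_count += 1
        if self.executor is not None:
            pending = self.executor.submit(job.function, **job.job_args)
            return SketchFuture(job, future=pending, cache=self.cache)
        res = job.function(**job.job_args)  # serial execution
        return SketchFuture(job, res=res, cache=self.cache)


def square(x):
    return {"result": x * x}


serial = SketchCache(overwrite=False)
first = serial.submit(SketchJob("j1", square, {"x": 3})).result()
second = serial.submit(SketchJob("j2", square, {"x": 3})).result()

with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = SketchCache(executor=pool, overwrite=False)
    pooled = parallel.submit(SketchJob("j3", square, {"x": 4})).result()
```

In this sketch the second submission with identical arguments is served from the dict rather than re-running ``square``, which is the behaviour the three ``worker_*_call_count`` counters are meant to make checkable.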