API Reference

UtilsX - a collection of generic Python utility functions and types.

`collections`

Utilities for working with collections.

`check_equal_length(*collections)`

Given an arbitrary number of collections, check if they all have equal length.

Parameters:

Name	Type	Description	Default
`*collections`	`Sized`	Sized objects whose lengths are compared for equality.	`()`

Returns:

Type	Description
`bool`	Whether all provided collections have equal length.

Raises:

Type	Description
`ValueError`	If no collections provided.

Source code in src/utilsx/collections.py

def check_equal_length(*collections: Sized) -> bool:
    """Given an arbitrary number of collections, check if they all have equal length.

    Args:
        *collections: Sized objects whose lengths are compared for equality.

    Returns:
        Whether all provided collections have equal length.

    Raises:
        ValueError: If no collections provided.
    """
    if not collections:
        raise ValueError("No collections provided to check for length equality.")
    benchmark_length = len(collections[0])
    return all(len(collection) == benchmark_length for collection in collections)
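
A minimal usage sketch. The function is restated from the listing above so that the snippet runs on its own, without the package installed; the sample inputs are made up for illustration.

```python
from collections.abc import Sized


def check_equal_length(*collections: Sized) -> bool:
    # Restated from the listing above.
    if not collections:
        raise ValueError("No collections provided to check for length equality.")
    benchmark_length = len(collections[0])
    return all(len(collection) == benchmark_length for collection in collections)


# Any mix of sized objects works: a list, a string and a tuple, all of length 3.
same = check_equal_length([1, 2, 3], "abc", (7, 8, 9))

# Lengths 2 and 1 differ.
different = check_equal_length([1, 2], {"a": 1})

# Calling with no arguments raises instead of returning a value.
try:
    check_equal_length()
    raised = False
except ValueError:
    raised = True
```
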

`get_duplicates(iterable)`

Get a set of all values in a collection that are duplicates, i.e., present more than once.

Parameters:

Name	Type	Description	Default
`iterable`	`Iterable[T]`	A collection to check.	required

Returns:

Type	Description
`frozenset[T]`	A set of values that are present more than once.

Source code in src/utilsx/collections.py

def get_duplicates(iterable: Iterable[T]) -> frozenset[T]:
    """Get a set of all values in a collection that are duplicates, i.e., present more than once.

    Args:
        iterable: A collection to check.

    Returns:
        A set of values that are present more than once.
    """
    return frozenset(key for key, value in Counter(iterable).items() if value > 1)
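
A short usage sketch, with the function restated from the listing above so the snippet is self-contained. Because the implementation counts items with `Counter`, the elements need to be hashable; the sample data is illustrative only.

```python
from collections import Counter
from collections.abc import Iterable
from typing import TypeVar

T = TypeVar("T")


def get_duplicates(iterable: Iterable[T]) -> frozenset[T]:
    # Restated from the listing above.
    return frozenset(key for key, value in Counter(iterable).items() if value > 1)


# "a" occurs three times and "b" twice; "c" occurs once and is left out.
duplicates = get_duplicates(["a", "b", "a", "c", "b", "a"])

# Any iterable is accepted; a range has no repeated values.
no_duplicates = get_duplicates(range(5))
```
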

`is_collection_of_equal_elements(collection)`

Check whether all elements in a collection are equal to each other.

Parameters:

Name	Type	Description	Default
`collection`	`Collection[Any]`	A collection to check that all elements are equal.	required

Returns:

Type	Description
`bool`	Whether all elements are equal to each other.

Source code in src/utilsx/collections.py

def is_collection_of_equal_elements(collection: Collection[Any]) -> bool:
    """Check whether all elements in a collection are equal to each other.

    Args:
        collection: A collection to check that all elements are equal.

    Returns:
        Whether all elements are equal to each other.
    """
    collection = list(collection)
    return all(element == collection[0] for element in collection)
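
A brief sketch of the edge cases, with the function restated from the listing above. Elements are compared with `==`, so values of different types that compare equal count as equal, and an empty collection yields `True` because there is nothing to contradict the check.

```python
from collections.abc import Collection
from typing import Any


def is_collection_of_equal_elements(collection: Collection[Any]) -> bool:
    # Restated from the listing above.
    collection = list(collection)
    return all(element == collection[0] for element in collection)


all_equal = is_collection_of_equal_elements(["x", "x", "x"])
mixed_types = is_collection_of_equal_elements([1, 1.0, True])  # 1 == 1.0 == True
not_equal = is_collection_of_equal_elements([1, 2])
empty = is_collection_of_equal_elements([])  # vacuously true
```
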

`constants`

Common constant values from math, physics, etc.

`decorators`

Profiling pipeline nodes.

`narrow_return(index)`

Makes a function returning a tuple return only the element at the given index.

Implemented as a decorator factory.

Parameters:

Name	Type	Description	Default
`index`	`int`	The index of the tuple element to return.	required

Returns:

Type	Description
`Callable[[Callable[..., tuple[Any, ...]]], Callable[..., Any]]`	A decorator.

Source code in src/utilsx/decorators.py

def narrow_return(
    index: int,
) -> Callable[[Callable[..., tuple[Any, ...]]], Callable[..., Any]]:
    """Makes a function returning a tuple return only the element at the given index.

    Implemented as a decorator factory.

    Args:
        index: The index of the tuple element to return.

    Returns:
        A decorator.
    """

    def decorator(func: Callable[..., tuple[Any, ...]]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:  # noqa: ANN401
            result = func(*args, **kwargs)
            return result[index]

        return wrapper

    return decorator
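
A usage sketch for the decorator factory. `narrow_return` is restated (slightly condensed) from the listing above; `divide_with_remainder` is a hypothetical function invented for this example.

```python
from collections.abc import Callable
from functools import wraps
from typing import Any


def narrow_return(
    index: int,
) -> Callable[[Callable[..., tuple[Any, ...]]], Callable[..., Any]]:
    # Condensed restatement of the listing above.
    def decorator(func: Callable[..., tuple[Any, ...]]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            return func(*args, **kwargs)[index]

        return wrapper

    return decorator


@narrow_return(1)
def divide_with_remainder(a: int, b: int) -> tuple[int, int]:
    """Hypothetical helper that returns (quotient, remainder)."""
    return divmod(a, b)


# divmod(17, 5) is (3, 2); the decorator keeps only the element at index 1.
remainder = divide_with_remainder(17, 5)
```

Since the wrapper is built with `functools.wraps`, the decorated function keeps its original name and docstring.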

`dicts`

Utilities for working with dictionaries.

`multiply_dict_values(dictionary, multiplier)`

Get a copy of the dictionary with values multiplied by scalar, preserving keys.

Parameters:

Name	Type	Description	Default
`dictionary`	`Mapping[T, float]`	A dictionary to multiply values of.	required
`multiplier`	`float`	A scalar multiplier.	required

Returns:

Type	Description
`dict[T, float]`	A copy of the original dictionary with values multiplied by scalar.

Source code in src/utilsx/dicts/_modification.py

def multiply_dict_values(dictionary: Mapping[T, float], multiplier: float) -> dict[T, float]:
    """Get a copy of the dictionary with values multiplied by scalar, preserving keys.

    Args:
        dictionary: A dictionary to multiply values of.
        multiplier: A scalar multiplier.

    Returns:
        A copy of the original dictionary with values multiplied by scalar.
    """
    return {key: value * multiplier for key, value in dictionary.items()}

`remove_items_with_zero_values(dictionary)`

Drop key-value pairs from a dictionary whose values are zero.

Parameters:

Name	Type	Description	Default
`dictionary`	`dict[T, float]`	To be filtered to exclude key-value pairs with zero values.	required

Returns:

Type	Description
`dict[T, float]`	A subset of the original dictionary items, only pairs with non-zero values.

Source code in src/utilsx/dicts/_filtering.py

def remove_items_with_zero_values(dictionary: dict[T, float]) -> dict[T, float]:
    """Drop key-value pairs from a dictionary whose values are zero.

    Args:
        dictionary: To be filtered to exclude key-value pairs with zero values.

    Returns:
        A subset of the original dictionary items, only pairs with non-zero values.
    """
    return {key: value for key, value in dictionary.items() if value}
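
A small usage sketch, with the function restated from the listing above. The filter relies on truthiness, so both integer `0` and float `0.0` are dropped while negative values are kept; the sample dictionary is made up.

```python
def remove_items_with_zero_values(dictionary: dict) -> dict:
    # Restated from the listing above.
    return {key: value for key, value in dictionary.items() if value}


balances = {"a": 0, "b": 1.5, "c": 0.0, "d": -2}
non_zero = remove_items_with_zero_values(balances)
```
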

`rename_keys_in_nested_dict(dictionary, renaming)`

Replace all specified keys by other specified names in an arbitrarily deep dictionary.

Parameters:

Name	Type	Description	Default
`dictionary`	`dict[str, Any]`	A dictionary of arbitrary depth.	required
`renaming`	`dict[str, str]`	A mapping from old to new key names.	required

Returns:

Type	Description
`dict[str, Any]`	Copy of the original dictionary with `old_key` renamed to `new_key` at all levels of key depth.

Source code in src/utilsx/dicts/_modification.py

def rename_keys_in_nested_dict(
    dictionary: dict[str, Any], renaming: dict[str, str]
) -> dict[str, Any]:
    """Replace all specified keys by other specified names in an arbitrarily deep dictionary.

    Args:
        dictionary: A dictionary of arbitrary depth.
        renaming: A mapping from old to new key names.

    Returns:
        Copy of the original dictionary with ``old_key`` renamed to ``new_key``
        at all levels of key depth.
    """
    # This ``isinstance`` check is required to leave non-dict structures as-is.
    if isinstance(dictionary, dict):
        return {
            (renaming.get(key, key)): rename_keys_in_nested_dict(value, renaming)
            for key, value in dictionary.items()
        }
    return dictionary
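
A usage sketch with the function restated from the listing above; the `config` dictionary is invented for illustration. It also shows one consequence of the `isinstance` check: recursion only descends through dictionary values, so dictionaries stored inside a list are returned unchanged.

```python
from typing import Any


def rename_keys_in_nested_dict(
    dictionary: dict[str, Any], renaming: dict[str, str]
) -> dict[str, Any]:
    # Restated from the listing above.
    if isinstance(dictionary, dict):
        return {
            renaming.get(key, key): rename_keys_in_nested_dict(value, renaming)
            for key, value in dictionary.items()
        }
    return dictionary


config = {
    "name": "job",
    "opts": {"name": "inner", "tags": [{"name": "inside a list"}]},
}
renamed = rename_keys_in_nested_dict(config, {"name": "title"})
```
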

`sort_by_value(dictionary, reverse=False)`

Sort a dictionary with numeric values by those values.

Parameters:

Name	Type	Description	Default
`dictionary`	`dict[T, NumberT]`	A dictionary to sort by value.	required
`reverse`	`bool`	False for ascending order, True for descending. Exactly matches the `reverse` argument of Python's built-in `sorted` function.	`False`

Returns:

Type	Description
`dict[T, NumberT]`	Same dictionary in terms of content, just sorted by value.

Source code in src/utilsx/dicts/_sorting.py

def sort_by_value(dictionary: dict[T, NumberT], reverse: bool = False) -> dict[T, NumberT]:
    """Sort a dictionary with numeric values by those values.

    Args:
        dictionary: A dictionary to sort by value.
        reverse: False for ascending order, True for descending. Exactly matches the ``reverse``
            argument of Python's built-in ``sorted`` function.

    Returns:
        Same dictionary in terms of content, just sorted by value.
    """
    return dict(sorted(dictionary.items(), key=lambda item: item[1], reverse=reverse))
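
A usage sketch with the function restated from the listing above and made-up scores. The result is a new `dict`, and the ordering is visible because dictionaries preserve insertion order.

```python
def sort_by_value(dictionary: dict, reverse: bool = False) -> dict:
    # Restated from the listing above.
    return dict(sorted(dictionary.items(), key=lambda item: item[1], reverse=reverse))


scores = {"b": 3, "a": 1, "c": 2}
ascending = sort_by_value(scores)
descending = sort_by_value(scores, reverse=True)
```
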

`sum_dicts(*dicts)`

Given dictionaries, return their summation: a union of keys and totals of values.

Parameters:

Name	Type	Description	Default
`*dicts`	`dict[T, float]`	Any number of dictionaries to add together.	`()`

Returns:

Type	Description
`dict[T, float]`	A combined dictionary with a union of keys and totals of values.

Source code in src/utilsx/dicts/_combination.py

def sum_dicts(*dicts: dict[T, float]) -> dict[T, float]:
    """Given dictionaries, return their summation: a union of keys and totals of values.

    Args:
        *dicts: Any number of dictionaries to add together.

    Returns:
        A combined dictionary with a union of keys and totals of values.
    """
    output: dict[T, float] = defaultdict(float)
    for dictionary in dicts:
        for key, value in dictionary.items():
            output[key] += value
    return dict(output)
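
A usage sketch with the function restated from the listing above; the input dictionaries are illustrative. Because the accumulator is a `defaultdict(float)`, every total comes back as a `float`, even when all inputs are integers.

```python
from collections import defaultdict


def sum_dicts(*dicts: dict) -> dict:
    # Restated from the listing above.
    output = defaultdict(float)
    for dictionary in dicts:
        for key, value in dictionary.items():
            output[key] += value
    return dict(output)


# "b" appears in both inputs, so its values are added; "a" and "c" pass through.
total = sum_dicts({"a": 1, "b": 2}, {"b": 3, "c": 4})

# With no inputs the result is an empty dictionary.
nothing = sum_dicts()
```
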

`exceptions`

Utilities for raising exceptions.

`hint_if_extra_uninstalled(required_modules, extra_name, package_name)`

Check if an optional dependency group is installed, and hint if not, via ImportError.

Parameters:

Name	Type	Description	Default
`required_modules`	`Iterable[str]`	Modules which need to be installed in venv for that dependency group.	required
`extra_name`	`str`	Name of an optional dependency group.	required
`package_name`	`str`	Name of the package which provides a given optional dependency group.	required

Raises:

Type	Description
`ImportError`	If any of the required modules are not installed.

Source code in src/utilsx/exceptions.py

def hint_if_extra_uninstalled(
    required_modules: Iterable[str],
    extra_name: str,
    package_name: str,
) -> None:
    """Check if an optional dependency group is installed, and hint if not, via ``ImportError``.

    Args:
        required_modules: Modules which need to be installed in venv for that dependency group.
        extra_name: Name of an optional dependency group.
        package_name: Name of the package which provides a given optional dependency group.

    Raises:
        ImportError: If any of the required modules are not installed.
    """
    for module in required_modules:
        try:
            import_module(module)
        except ImportError as e:
            raise ImportError(
                f"Optional dependency group '{extra_name}' is required for this feature.\n"
                f"Add '{package_name}[{extra_name}]' to your requirements list"
                " and install to virtual environment."
            ) from e
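
A usage sketch with the function restated from the listing above. The package name `mypackage`, the extra names, and the missing module name are all hypothetical, chosen only to trigger both code paths.

```python
from collections.abc import Iterable
from importlib import import_module


def hint_if_extra_uninstalled(
    required_modules: Iterable[str], extra_name: str, package_name: str
) -> None:
    # Restated from the listing above.
    for module in required_modules:
        try:
            import_module(module)
        except ImportError as e:
            raise ImportError(
                f"Optional dependency group '{extra_name}' is required for this feature.\n"
                f"Add '{package_name}[{extra_name}]' to your requirements list"
                " and install to virtual environment."
            ) from e


# "json" ships with Python, so this call passes silently.
hint_if_extra_uninstalled(["json"], extra_name="serialization", package_name="mypackage")

# A module that cannot be imported produces the hint.
hint = ""
try:
    hint_if_extra_uninstalled(
        ["module_that_does_not_exist_xyz"], extra_name="plots", package_name="mypackage"
    )
except ImportError as error:
    hint = str(error)
```
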

`prohibit_negative_values(values, exception_class=ValueError, exception_msg='Negative values are prohibited')`

Raise an exception if an iterable of numbers has negative values.

Parameters:

Name	Type	Description	Default
`values`	`Iterable[float]`	To check for any negative member.	required
`exception_class`	`type[Exception]`	Exception class to raise if any member is negative, defaults to `ValueError`.	`ValueError`
`exception_msg`	`str`	A message to add to the raised exception.	`'Negative values are prohibited'`

Returns:

Type	Description
`None`	None.

Source code in src/utilsx/exceptions.py

def prohibit_negative_values(
    values: Iterable[float],
    exception_class: type[Exception] = ValueError,
    exception_msg: str = "Negative values are prohibited",
) -> None:
    """Raise an exception if an iterable of numbers has negative values.

    Args:
        values: To check for any negative member.
        exception_class: Exception class to raise if any member is negative,
            defaults to ``ValueError``.
        exception_msg: A message to add to the raised exception.

    Returns:
        None.
    """
    if any(value < 0 for value in values):
        raise exception_class(exception_msg)
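
A usage sketch with the function restated from the listing above; the values and the custom message are made up. It shows the silent pass-through case and a call that swaps in a different exception class.

```python
from collections.abc import Iterable


def prohibit_negative_values(
    values: Iterable[float],
    exception_class: type[Exception] = ValueError,
    exception_msg: str = "Negative values are prohibited",
) -> None:
    # Restated from the listing above.
    if any(value < 0 for value in values):
        raise exception_class(exception_msg)


# Zero is not negative, so this returns None without raising.
result = prohibit_negative_values([0, 1.5, 3])

message = ""
try:
    prohibit_negative_values(
        [1, -2],
        exception_class=ArithmeticError,
        exception_msg="Weights must be non-negative",
    )
except ArithmeticError as error:
    message = str(error)
```
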

`raise_key_error_with_suggestions(attempted_key, existing_keys, object_name='object', attribute_name='key')`

Raise a key error complemented with suggestions based on closest matches.

Parameters:

Name	Type	Description	Default
`attempted_key`	`str`	A key that was attempted to be found.	required
`existing_keys`	`Collection[str]`	Existing keys, among which an attempted key was not found.	required
`object_name`	`str`	Archetype of an object that was searched by key.	`'object'`
`attribute_name`	`str`	Name of the attribute that the key represents, used in the error message.	`'key'`

Returns:

Type	Description
`NoReturn`	Never returns anything.

Raises:

Type	Description
`KeyError`	Complemented with close matches, if any.

Notes

Inspired by dataset name hint implemented in Kedro: https://github.com/kedro-org/kedro

Source code in src/utilsx/exceptions.py

def raise_key_error_with_suggestions(
    attempted_key: str,
    existing_keys: Collection[str],
    object_name: str = "object",
    attribute_name: str = "key",
) -> NoReturn:
    """Raise a key error complemented with suggestions based on closest matches.

    Args:
        attempted_key: A key that was attempted to be found.
        existing_keys: Existing keys, among which an attempted key was not found.
        object_name: Archetype of an object that was searched by key.
        attribute_name: Name of the attribute that the key represents, used in the error message.

    Returns:
        Never returns anything.

    Raises:
        KeyError: Complemented with close matches, if any.

    Notes:
        Inspired by dataset name hint implemented in Kedro: https://github.com/kedro-org/kedro
    """
    error_msg = f"{object_name.capitalize()} with {attribute_name} {attempted_key} not found."
    close_matches = get_close_matches(attempted_key, existing_keys)
    if close_matches:
        suggestions = ", ".join(close_matches)
        error_msg += f" Did you mean one of these instead: {suggestions}?"
    raise KeyError(error_msg)
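
A usage sketch with the function restated from the listing above. The keys and the `object_name` and `attribute_name` values are invented; suggestions come from `difflib.get_close_matches` with its default settings.

```python
from collections.abc import Collection
from difflib import get_close_matches
from typing import NoReturn


def raise_key_error_with_suggestions(
    attempted_key: str,
    existing_keys: Collection[str],
    object_name: str = "object",
    attribute_name: str = "key",
) -> NoReturn:
    # Restated from the listing above.
    error_msg = f"{object_name.capitalize()} with {attribute_name} {attempted_key} not found."
    close_matches = get_close_matches(attempted_key, existing_keys)
    if close_matches:
        suggestions = ", ".join(close_matches)
        error_msg += f" Did you mean one of these instead: {suggestions}?"
    raise KeyError(error_msg)


message = ""
try:
    raise_key_error_with_suggestions(
        "colour",
        ["color", "size", "weight"],
        object_name="dataset",
        attribute_name="name",
    )
except KeyError as error:
    # str(KeyError) wraps the message in quotes, so read the raw argument instead.
    message = error.args[0]
```
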

`functional`

Utilities for functional programming.

`identity(x)`

An identity function: returns a single input unchanged.

Source code in src/utilsx/functional.py

def identity(x: T) -> T:
    """An identity function: returns a single input unchanged."""
    return x

`math`

Utilities for mathematical operations.

`ceil_to_multiple(x, multiple)`

Ceil a number to the next multiple of another value.

Parameters:

Name	Type	Description	Default
`x`	`float`	Number to ceil.	required
`multiple`	`int`	The value that the output must be a multiple of.	required

Returns:

Type	Description
`int`	`x` rounded up to the nearest multiple of `multiple`.

Source code in src/utilsx/math/_rounding.py

def ceil_to_multiple(x: float, multiple: int) -> int:
    """Ceil a number to the next multiple of another value.

    Args:
        x: Number to ceil.
        multiple: The value that the output must be a multiple of.

    Returns:
        ``x`` rounded up to the nearest multiple of ``multiple``.
    """
    return math.ceil(x / multiple) * multiple
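
A few worked cases, with the function restated from the listing above. Inputs that are already a multiple are returned unchanged, and negative inputs round toward zero because `math.ceil` rounds up.

```python
import math


def ceil_to_multiple(x: float, multiple: int) -> int:
    # Restated from the listing above.
    return math.ceil(x / multiple) * multiple


rounded_up = ceil_to_multiple(17, 5)        # ceil(3.4) = 4, so 4 * 5
already_multiple = ceil_to_multiple(20, 5)  # ceil(4.0) = 4, so unchanged
small_fraction = ceil_to_multiple(0.1, 8)   # ceil(0.0125) = 1, so 1 * 8
negative = ceil_to_multiple(-3, 5)          # ceil(-0.6) = 0, so 0 * 5
```
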

`check_values_add_up_to_one(values, mode='either')`

Check if values in a collection add up to 1 or 100.

Parameters:

Name	Type	Description	Default
`values`	`Collection[float]`	Values to check.	required
`mode`	`Literal['fractions', 'percentages', 'either']`	"fractions" if they should add up to 1, "percentages" if they should add up to 100, "either" if either total is acceptable.	`'either'`

Returns:

Type	Description
`bool`	Boolean outcome of the check.

Source code in src/utilsx/math/_collections.py

def check_values_add_up_to_one(
    values: Collection[float],
    mode: Literal["fractions", "percentages", "either"] = "either",
) -> bool:
    """Check if values in a collection add up to 1 or 100.

    Args:
        values: Values to check.
        mode: "fractions" if they should add up to 1, "percentages" if they should add up to 100,
            "either" if either total is acceptable.

    Returns:
        Boolean outcome of the check.
    """
    match mode:
        case "fractions":
            valid_totals = frozenset((1,))
        case "percentages":
            valid_totals = frozenset((100,))
        case "either":
            valid_totals = frozenset((1, 100))
        case _:
            raise ValueError(f"Unrecognized mode: {mode}")

    sum_of_values = sum(values)
    return any(
        math.isclose(sum_of_values, valid_total, rel_tol=0.001) for valid_total in valid_totals
    )
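
A sketch of how the modes and the relative tolerance of 0.001 behave. This is a condensed restatement rather than the listing itself: a dictionary lookup (the hypothetical `_VALID_TOTALS`) stands in for the `match` statement so the snippet also runs on Python versions before 3.10. The sample values are illustrative.

```python
import math
from collections.abc import Collection

# Hypothetical lookup table replacing the ``match`` statement in the listing.
_VALID_TOTALS = {"fractions": (1,), "percentages": (100,), "either": (1, 100)}


def check_values_add_up_to_one(values: Collection[float], mode: str = "either") -> bool:
    if mode not in _VALID_TOTALS:
        raise ValueError(f"Unrecognized mode: {mode}")
    sum_of_values = sum(values)
    return any(
        math.isclose(sum_of_values, valid_total, rel_tol=0.001)
        for valid_total in _VALID_TOTALS[mode]
    )


fractions_ok = check_values_add_up_to_one([0.25, 0.25, 0.5], mode="fractions")
percentages_ok = check_values_add_up_to_one([60, 40], mode="percentages")
wrong_mode = check_values_add_up_to_one([60, 40], mode="fractions")  # 100 is not 1

# A total of about 0.9999 is within the 0.1% relative tolerance of 1.
within_tolerance = check_values_add_up_to_one([0.3333, 0.3333, 0.3333])

# A total of about 0.9 is too far from both 1 and 100.
too_far = check_values_add_up_to_one([0.3, 0.3, 0.3])
```
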

`convert_number_to_units(number, units)`

Convert a number to thousands or millions.

Parameters:

Name	Type	Description	Default
`number`	`float`	Number to convert.	required
`units`	`_TUnits`	Units to convert to.	required

Returns:

Type	Description
`float`	A number converted to specified units.

Source code in src/utilsx/math/_downscaling.py

def convert_number_to_units(number: float, units: _TUnits) -> float:
    """Convert a number to thousands or millions.

    Args:
        number: Number to convert.
        units: Units to convert to.

    Returns:
        A number converted to specified units.
    """
    match units:
        case "thousand":
            denominator = THOUSAND
        case "million":
            denominator = MILLION
        case _:
            raise ValueError(f"Unrecognized units: {units}")

    return number / denominator

`double(x)`

Multiply a number by two: in other words, double it.

Parameters:

Name	Type	Description	Default
`x`	`float`	The number to multiply by two.	required

Returns:

Type	Description
`float`	Result of multiplying this number by literal integer two.

Source code in src/utilsx/math/_scalar_ops.py

def double(x: float) -> float:
    """Multiply a number by two: in other words, double it.

    Args:
        x: The number to multiply by two.

    Returns:
        Result of multiplying this number by literal integer two.
    """
    return x * 2

`halve(x)`

Divide a number by two: in other words, get its half.

Parameters:

Name	Type	Description	Default
`x`	`float`	The number to divide by two.	required

Returns:

Type	Description
`float`	Result of dividing this number by literal integer two.

Source code in src/utilsx/math/_scalar_ops.py

def halve(x: float) -> float:
    """Divide a number by two: in other words, get its half.

    Args:
        x: The number to divide by two.

    Returns:
        Result of dividing this number by literal integer two.
    """
    return x / 2

`is_monotonically_growing(time_series, multiplier)`

Check whether a time series can be considered monotonically growing.

To be called so, each element must be strictly greater than `multiplier` times the previous one. Series of fewer than two elements are considered non-growing.

Parameters:

Name	Type	Description	Default
`time_series`	`Sequence[float]`	A sequence of numbers.	required
`multiplier`	`float`	How many times bigger each next element should be.	required

Returns:

Type	Description
`bool`	True if monotonically growing, False otherwise.

Source code in src/utilsx/math/_collections.py

def is_monotonically_growing(time_series: Sequence[float], multiplier: float) -> bool:
    """Check whether a time series can be considered monotonically growing.

    To be called so, each element must be strictly greater than ``multiplier`` times
    the previous one. Series of fewer than two elements are considered non-growing.

    Args:
        time_series: A sequence of numbers.
        multiplier: How many times bigger each next element should be.

    Returns:
        True if monotonically growing, False otherwise.
    """
    if len(time_series) < 2:  # noqa: PLR2004
        return False
    return all(
        time_series[i + 1] > time_series[i] * multiplier for i in range(len(time_series) - 1)
    )
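
A sketch of the boundary behavior, with the function restated from the listing above and made-up series. The comparison in the code is a strict `>`, so a series that grows by exactly `multiplier` at some step does not qualify.

```python
from collections.abc import Sequence


def is_monotonically_growing(time_series: Sequence[float], multiplier: float) -> bool:
    # Restated from the listing above.
    if len(time_series) < 2:
        return False
    return all(
        time_series[i + 1] > time_series[i] * multiplier for i in range(len(time_series) - 1)
    )


doubling = [1, 2, 4, 8]

# Every step exceeds 1.5 times the previous element: 2 > 1.5, 4 > 3, 8 > 6.
grows_by_half = is_monotonically_growing(doubling, 1.5)

# Each step is exactly 2 times the previous one, which fails the strict comparison.
grows_by_two = is_monotonically_growing(doubling, 2)

# Fewer than two elements is always reported as non-growing.
single = is_monotonically_growing([5], 1)
```
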

`normalize(values)`

Normalize a sequence of numbers to make them add up to one.

Source code in src/utilsx/math/_collections.py

def normalize(values: Sequence[float]) -> list[float]:
    """Normalize a sequence of numbers to make them add up to one."""
    total = sum(values)  # Compute the total once instead of once per element.
    return [value / total for value in values]
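
A usage sketch of the same normalization, written so that it runs on its own; the input values are made up. Note that a sequence whose total is zero cannot be normalized this way and raises `ZeroDivisionError`.

```python
from collections.abc import Sequence


def normalize(values: Sequence[float]) -> list[float]:
    # Same computation as the listing above, with the total held in a local variable.
    total = sum(values)
    return [value / total for value in values]


shares = normalize([1, 1, 2])

# A zero total makes the division undefined.
try:
    normalize([0, 0])
    zero_total_raised = False
except ZeroDivisionError:
    zero_total_raised = True
```
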

`safe_divide(numerator, denominator, fallback=0)`

Divide one number by another, returning a fallback value if the denominator is zero.

Parameters:

Name	Type	Description	Default
`numerator`	`float`	Number to divide.	required
`denominator`	`float`	Denominator to use.	required
`fallback`	`float`	Fallback value to use in case the denominator is zero, defaults to zero.	`0`

Source code in src/utilsx/math/_division.py

def safe_divide(numerator: float, denominator: float, fallback: float = 0) -> float:
    """Divide one number by another, returning a fallback value if the denominator is zero.

    Args:
        numerator: Number to divide.
        denominator: Denominator to use.
        fallback: Fallback value to use in case the denominator is zero, defaults to zero.
    """
    return numerator / denominator if denominator != 0 else fallback
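
A usage sketch with the function restated from the listing above. The implementation checks the denominator up front rather than catching an exception, so both integer `0` and float `0.0` trigger the fallback.

```python
def safe_divide(numerator: float, denominator: float, fallback: float = 0) -> float:
    # Restated from the listing above.
    return numerator / denominator if denominator != 0 else fallback


regular = safe_divide(10, 4)
default_fallback = safe_divide(1, 0)
float_zero = safe_divide(1, 0.0)
custom_fallback = safe_divide(1, 0, fallback=float("inf"))
```
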

`pandas`

Utilities for enhancing your Pandas workflows.

`count_na(df)`

Get the total number of missing values in a DataFrame.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	To count missing values in.	required

Returns:

Type	Description
`int`	The number of missing values.

Source code in src/utilsx/pandas/_missing.py

def count_na(df: pd.DataFrame) -> int:
    """Get the total number of missing values in a DataFrame.

    Args:
        df: To count missing values in.

    Returns:
        The number of missing values.
    """
    return int(df.isna().sum().sum())

`text`

Utilities for text transformations.

`add_suffix(base, suffix, separator='_')`

Add suffix to a base string using a separator.

Parameters:

Name	Type	Description	Default
`base`	`str`	Base string to add a suffix to.	required
`suffix`	`str`	A suffix to add.	required
`separator`	`str`	A separator to insert between the base and suffix.	`'_'`

Returns:

Type	Description
`str`	A string with suffix concatenated to the base on the right, with a separator in between.

If the suffix is empty, returns just the base string, omitting the separator.

Source code in src/utilsx/text.py

def add_suffix(base: str, suffix: str, separator: str = "_") -> str:
    """Add suffix to a base string using a separator.

    Args:
        base: Base string to add a suffix to.
        suffix: A suffix to add.
        separator: A separator to insert between the base and suffix.

    Returns:
        A string with suffix concatenated to the base on the right, with a separator in between.

    If the suffix is empty, returns just the base string, omitting the separator.
    """
    return f"{base}{separator}{suffix}" if suffix else base
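
A usage sketch with the function restated from the listing above; the strings are made up for illustration.

```python
def add_suffix(base: str, suffix: str, separator: str = "_") -> str:
    # Restated from the listing above.
    return f"{base}{separator}{suffix}" if suffix else base


with_default_separator = add_suffix("revenue", "2024")
with_custom_separator = add_suffix("model", "v2", separator="-")

# An empty suffix returns the base unchanged, with no trailing separator.
without_suffix = add_suffix("revenue", "")
```
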

`typevars`

A collection of type variables.