API Reference

UtilsX - a collection of generic Python utility functions and types.

collections

Utilities for working with collections.

check_equal_length(*collections)

Given an arbitrary number of collections, check if they all have equal length.

Parameters:

Name Type Description Default
*collections Sized

Sized objects whose lengths are compared for equality.

()

Returns:

Type Description
bool

Whether all provided collections have equal length.

Raises:

Type Description
ValueError

If no collections provided.

Source code in src/utilsx/collections.py
def check_equal_length(*collections: Sized) -> bool:
    """Given an arbitrary number of collections, check if they all have equal length.

    Args:
        *collections: Sized objects whose lengths are compared for equality.

    Returns:
        Whether all provided collections have equal length.

    Raises:
        ValueError: If no collections provided.
    """
    if not collections:
        raise ValueError("No collections provided to check for length equality.")
    benchmark_length = len(collections[0])
    return all(len(collection) == benchmark_length for collection in collections)
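
A minimal usage sketch. It reproduces the function body locally so that it runs without utilsx installed; the sample inputs are illustrative only.

```python
from collections.abc import Sized


def check_equal_length(*collections: Sized) -> bool:
    # Local copy of the function shown above.
    if not collections:
        raise ValueError("No collections provided to check for length equality.")
    benchmark_length = len(collections[0])
    return all(len(collection) == benchmark_length for collection in collections)


# Only len() is compared, so different container types can be mixed.
same = check_equal_length([1, 2, 3], "abc", (7, 8, 9))
different = check_equal_length([1, 2], {"a": 1})

# Calling with no arguments raises instead of returning a vacuous True.
try:
    check_equal_length()
    raised = False
except ValueError:
    raised = True
```

A single argument always yields True, because it is only compared with itself.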

get_duplicates(iterable)

Get a set of all values in a collection that are duplicates, i.e., present more than once.

Parameters:

Name Type Description Default
iterable Iterable[T]

A collection to check.

required

Returns:

Type Description
frozenset[T]

A set of values that are present more than once.

Source code in src/utilsx/collections.py
def get_duplicates(iterable: Iterable[T]) -> frozenset[T]:
    """Get a set of all values in a collection that are duplicates, i.e., present more than once.

    Args:
        iterable: A collection to check.

    Returns:
        A set of values that are present more than once.
    """
    return frozenset(key for key, value in Counter(iterable).items() if value > 1)
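
A short sketch of the behaviour, using a local copy of the function so that it is self-contained. Because the implementation relies on `Counter`, the elements are assumed to be hashable.

```python
from collections import Counter
from collections.abc import Iterable
from typing import TypeVar

T = TypeVar("T")


def get_duplicates(iterable: Iterable[T]) -> frozenset[T]:
    # Local copy of the function shown above.
    return frozenset(key for key, value in Counter(iterable).items() if value > 1)


# "a" appears three times and "b" twice; each is reported once.
duplicates = get_duplicates(["a", "b", "a", "c", "b", "a"])

# No repeated values gives an empty frozenset, not None.
no_duplicates = get_duplicates(range(3))
```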

is_collection_of_equal_elements(collection)

Check whether all elements in a collection are equal to each other.

Parameters:

Name Type Description Default
collection Collection[Any]

A collection to check that all elements are equal.

required

Returns:

Type Description
bool

Whether all elements are equal to each other.

Source code in src/utilsx/collections.py
def is_collection_of_equal_elements(collection: Collection[Any]) -> bool:
    """Check whether all elements in a collection are equal to each other.

    Args:
        collection: A collection to check that all elements are equal.

    Returns:
        Whether all elements are equal to each other.
    """
    collection = list(collection)
    return all(element == collection[0] for element in collection)
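
The edge cases are easy to miss, so here is a small sketch using a local copy of the function. Note that an empty collection counts as "all equal", and that comparison uses `==`, not identity or type.

```python
from collections.abc import Collection
from typing import Any


def is_collection_of_equal_elements(collection: Collection[Any]) -> bool:
    # Local copy of the function shown above.
    collection = list(collection)
    return all(element == collection[0] for element in collection)


# all() over an empty generator is True, so collection[0] is never evaluated.
empty_result = is_collection_of_equal_elements([])

# 1, 1.0 and True compare equal in Python, so this is also True.
mixed_types_result = is_collection_of_equal_elements([1, 1.0, True])

different_result = is_collection_of_equal_elements("aab")
```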

constants

Common constant values from math, physics, etc.

decorators

Profiling pipeline nodes.

narrow_return(index)

Makes a function returning a tuple return only the element at the given index.

Implemented as a decorator factory.

Parameters:

Name Type Description Default
index int

The index of the tuple element to return.

required

Returns:

Type Description
Callable[[Callable[..., tuple[Any, ...]]], Callable[..., Any]]

A decorator.

Source code in src/utilsx/decorators.py
def narrow_return(
    index: int,
) -> Callable[[Callable[..., tuple[Any, ...]]], Callable[..., Any]]:
    """Makes a function returning a tuple return only the element at the given index.

    Implemented as a decorator factory.

    Args:
        index: The index of the tuple element to return.

    Returns:
        A decorator.
    """

    def decorator(func: Callable[..., tuple[Any, ...]]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:  # noqa: ANN401
            result = func(*args, **kwargs)
            return result[index]

        return wrapper

    return decorator
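
A usage sketch of the decorator factory, with a local copy of `narrow_return` so that the example runs on its own. The decorated function `min_and_max` is hypothetical.

```python
from collections.abc import Callable
from functools import wraps
from typing import Any


def narrow_return(
    index: int,
) -> Callable[[Callable[..., tuple[Any, ...]]], Callable[..., Any]]:
    # Local copy of the decorator factory shown above.
    def decorator(func: Callable[..., tuple[Any, ...]]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            result = func(*args, **kwargs)
            return result[index]

        return wrapper

    return decorator


@narrow_return(1)
def min_and_max(values: list[int]) -> tuple[int, int]:
    """Hypothetical function returning a (minimum, maximum) pair."""
    return min(values), max(values)


# Only the element at index 1 (the maximum) is returned.
largest = min_and_max([3, 1, 4])
```

Because the wrapper uses `functools.wraps`, the decorated function keeps its original name and docstring.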

dicts

Utilities for working with dictionaries.

multiply_dict_values(dictionary, multiplier)

Get a copy of the dictionary with values multiplied by scalar, preserving keys.

Parameters:

Name Type Description Default
dictionary Mapping[T, float]

A dictionary to multiply values of.

required
multiplier float

A scalar multiplier.

required

Returns:

Type Description
dict[T, float]

A copy of the original dictionary with values multiplied by scalar.

Source code in src/utilsx/dicts/_modification.py
def multiply_dict_values(dictionary: Mapping[T, float], multiplier: float) -> dict[T, float]:
    """Get a copy of the dictionary with values multiplied by scalar, preserving keys.

    Args:
        dictionary: A dictionary to multiply values of.
        multiplier: A scalar multiplier.

    Returns:
        A copy of the original dictionary with values multiplied by scalar.
    """
    return {key: value * multiplier for key, value in dictionary.items()}

remove_items_with_zero_values(dictionary)

Drop key-value pairs from a dictionary whose values are zero.

Parameters:

Name Type Description Default
dictionary dict[T, float]

To be filtered to exclude key-value pairs with zero values.

required

Returns:

Type Description
dict[T, float]

A subset of the original dictionary items, only pairs with non-zero values.

Source code in src/utilsx/dicts/_filtering.py
def remove_items_with_zero_values(dictionary: dict[T, float]) -> dict[T, float]:
    """Drop key-value pairs from a dictionary whose values are zero.

    Args:
        dictionary: To be filtered to exclude key-value pairs with zero values.

    Returns:
        A subset of the original dictionary items, only pairs with non-zero values.
    """
    return {key: value for key, value in dictionary.items() if value}
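
The filter relies on truthiness rather than an explicit comparison with zero. The sketch below uses a local copy of the function and illustrative data to show what that means in practice.

```python
from typing import TypeVar

T = TypeVar("T")


def remove_items_with_zero_values(dictionary: dict[T, float]) -> dict[T, float]:
    # Local copy of the function shown above.
    return {key: value for key, value in dictionary.items() if value}


# Integer and float zeros are both falsy and get dropped.
# NaN is truthy, so it is kept.
raw = {"a": 0, "b": 0.0, "c": 1.5, "d": float("nan")}
cleaned = remove_items_with_zero_values(raw)
remaining_keys = list(cleaned)
```

The input dictionary is left untouched; a new dictionary is returned.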

rename_keys_in_nested_dict(dictionary, renaming)

Replace all specified keys by other specified names in an arbitrarily deep dictionary.

Parameters:

Name Type Description Default
dictionary dict[str, Any]

A dictionary of arbitrary depth.

required
renaming dict[str, str]

A mapping from old to new key names.

required

Returns:

Type Description
dict[str, Any]

Copy of the original dictionary with old_key renamed to new_key at all levels of key depth.

Source code in src/utilsx/dicts/_modification.py
def rename_keys_in_nested_dict(
    dictionary: dict[str, Any], renaming: dict[str, str]
) -> dict[str, Any]:
    """Replace all specified keys by other specified names in an arbitrarily deep dictionary.

    Args:
        dictionary: A dictionary of arbitrary depth.
        renaming: A mapping from old to new key names.

    Returns:
        Copy of the original dictionary with ``old_key`` renamed to ``new_key``
        at all levels of key depth.
    """
    # This ``isinstance`` check is required to leave non-dict structures as-is.
    if isinstance(dictionary, dict):
        return {
            (renaming.get(key, key)): rename_keys_in_nested_dict(value, renaming)
            for key, value in dictionary.items()
        }
    return dictionary
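
A sketch of the recursion on illustrative data, using a local copy of the function. As written, only nested dictionaries are traversed. Other containers, such as lists, are returned as-is, so dictionaries inside a list keep their original keys.

```python
from typing import Any


def rename_keys_in_nested_dict(
    dictionary: dict[str, Any], renaming: dict[str, str]
) -> dict[str, Any]:
    # Local copy of the function shown above.
    if isinstance(dictionary, dict):
        return {
            (renaming.get(key, key)): rename_keys_in_nested_dict(value, renaming)
            for key, value in dictionary.items()
        }
    return dictionary


config = {"name": "a", "child": {"name": "b", "items": [{"name": "c"}]}}
renamed = rename_keys_in_nested_dict(config, {"name": "title"})

# The top level and the nested dict are renamed; the dict inside the list is not.
```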

sort_by_value(dictionary, reverse=False)

Sort a dictionary with numeric values by those values.

Parameters:

Name Type Description Default
dictionary dict[T, NumberT]

A dictionary to sort by value.

required
reverse bool

False for ascending order, True for descending. Exactly matches the reverse argument of Python's built-in sorted function.

False

Returns:

Type Description
dict[T, NumberT]

Same dictionary in terms of content, just sorted by value.

Source code in src/utilsx/dicts/_sorting.py
def sort_by_value(dictionary: dict[T, NumberT], reverse: bool = False) -> dict[T, NumberT]:
    """Sort a dictionary with numeric values by those values.

    Args:
        dictionary: A dictionary to sort by value.
        reverse: False for ascending order, True for descending. Exactly matches the ``reverse``
            argument of ``sorted`` Python function.

    Returns:
        Same dictionary in terms of content, just sorted by value.
    """
    return dict(sorted(dictionary.items(), key=lambda item: item[1], reverse=reverse))
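
A brief usage sketch with a local copy of the function. The `NumberT` type variable from the source is replaced here by plain `float` annotations, which is an assumption made to keep the example self-contained.

```python
from typing import TypeVar

T = TypeVar("T")


def sort_by_value(dictionary: dict[T, float], reverse: bool = False) -> dict[T, float]:
    # Local copy of the function shown above, with simplified annotations.
    return dict(sorted(dictionary.items(), key=lambda item: item[1], reverse=reverse))


scores = {"b": 3, "a": 1, "c": 2}

# Dictionaries preserve insertion order, so the result iterates in sorted order.
ascending_keys = list(sort_by_value(scores))
descending_keys = list(sort_by_value(scores, reverse=True))
```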

sum_dicts(*dicts)

Given dictionaries, return their summation: a union of keys and totals of values.

Parameters:

Name Type Description Default
*dicts dict[T, float]

Any number of dictionaries to add together.

()

Returns:

Type Description
dict[T, float]

A combined dictionary with a union of keys and totals of values.

Source code in src/utilsx/dicts/_combination.py
def sum_dicts(*dicts: dict[T, float]) -> dict[T, float]:
    """Given dictionaries, return their summation: a union of keys and totals of values.

    Args:
        *dicts: Any number of dictionaries to add together.

    Returns:
        A combined dictionary with a union of keys and totals of values.
    """
    output: dict[T, float] = defaultdict(float)
    for dictionary in dicts:
        for key, value in dictionary.items():
            output[key] += value
    return dict(output)
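
A usage sketch with a local copy of the function and illustrative inputs. Because the accumulator is a `defaultdict(float)`, integer inputs come back as floats.

```python
from collections import defaultdict
from typing import TypeVar

T = TypeVar("T")


def sum_dicts(*dicts: dict[T, float]) -> dict[T, float]:
    # Local copy of the function shown above.
    output: dict[T, float] = defaultdict(float)
    for dictionary in dicts:
        for key, value in dictionary.items():
            output[key] += value
    return dict(output)


# "b" is present in both inputs, so its values are added together.
total = sum_dicts({"a": 1, "b": 2}, {"b": 3, "c": 4})

# With no arguments the result is an empty plain dict.
empty = sum_dicts()
```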

exceptions

Utilities for raising exceptions.

hint_if_extra_uninstalled(required_modules, extra_name, package_name)

Check if an optional dependency group is installed, and hint if not, via ImportError.

Parameters:

Name Type Description Default
required_modules Iterable[str]

Modules which need to be installed in venv for that dependency group.

required
extra_name str

Name of an optional dependency group.

required
package_name str

Name of the package which provides a given optional dependency group.

required

Raises:

Type Description
ImportError

If any of the required modules are not installed.

Source code in src/utilsx/exceptions.py
def hint_if_extra_uninstalled(
    required_modules: Iterable[str],
    extra_name: str,
    package_name: str,
) -> None:
    """Check if an optional dependency group is installed, and hint if not, via ``ImportError``.

    Args:
        required_modules: Modules which need to be installed in venv for that dependency group.
        extra_name: Name of an optional dependency group.
        package_name: Name of the package which provides a given optional dependency group.

    Raises:
        ImportError: If any of the required modules are not installed.
    """
    for module in required_modules:
        try:
            import_module(module)
        except ImportError as e:
            raise ImportError(
                f"Optional dependency group '{extra_name}' is required for this feature.\n"
                f"Add '{package_name}[{extra_name}]' to your requirements list"
                " and install to virtual environment."
            ) from e
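
A sketch of how the hint surfaces, using a local copy of the function. The package name `mypackage`, the extra names, and the deliberately missing module name are all hypothetical.

```python
from collections.abc import Iterable
from importlib import import_module


def hint_if_extra_uninstalled(
    required_modules: Iterable[str],
    extra_name: str,
    package_name: str,
) -> None:
    # Local copy of the function shown above.
    for module in required_modules:
        try:
            import_module(module)
        except ImportError as e:
            raise ImportError(
                f"Optional dependency group '{extra_name}' is required for this feature.\n"
                f"Add '{package_name}[{extra_name}]' to your requirements list"
                " and install to virtual environment."
            ) from e


# Standard-library modules are importable, so this call returns None silently.
hint_if_extra_uninstalled(["json", "math"], extra_name="io", package_name="mypackage")

# A module that cannot be imported triggers the hint.
try:
    hint_if_extra_uninstalled(
        ["json", "hypothetical_missing_module_for_demo"],
        extra_name="plots",
        package_name="mypackage",
    )
    hint = None
except ImportError as error:
    hint = str(error)
    original_error = error.__cause__
```

The original import failure stays available as `__cause__`, since the new error is raised with `from e`.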

prohibit_negative_values(values, exception_class=ValueError, exception_msg='Negative values are prohibited')

Raise an exception if an iterable of numbers has negative values.

Parameters:

Name Type Description Default
values Iterable[float]

To check for any negative member.

required
exception_class type[Exception]

Exception class to raise if any member is negative, defaults to ValueError.

ValueError
exception_msg str

A message to add to the raised exception.

'Negative values are prohibited'

Returns:

Type Description
None

None.

Source code in src/utilsx/exceptions.py
def prohibit_negative_values(
    values: Iterable[float],
    exception_class: type[Exception] = ValueError,
    exception_msg: str = "Negative values are prohibited",
) -> None:
    """Raise an exception if an iterable of numbers has negative values.

    Args:
        values: To check for any negative member.
        exception_class: Exception class to raise if any member is negative,
            defaults to ``ValueError``.
        exception_msg: A message to add to the raised exception.

    Returns:
        None.
    """
    if any(value < 0 for value in values):
        raise exception_class(exception_msg)

raise_key_error_with_suggestions(attempted_key, existing_keys, object_name='object', attribute_name='key')

Raise a key error complemented with suggestions based on closest matches.

Parameters:

Name Type Description Default
attempted_key str

A key that was attempted to be found.

required
existing_keys Collection[str]

Existing keys, among which an attempted key was not found.

required
object_name str

Archetype of an object that was searched by key.

'object'
attribute_name str

If this key represents an attribute with explicit name.

'key'

Returns:

Type Description
NoReturn

Never returns anything.

Raises:

Type Description
KeyError

Complemented with close matches, if any.

Notes

Inspired by dataset name hint implemented in Kedro: https://github.com/kedro-org/kedro

Source code in src/utilsx/exceptions.py
def raise_key_error_with_suggestions(
    attempted_key: str,
    existing_keys: Collection[str],
    object_name: str = "object",
    attribute_name: str = "key",
) -> NoReturn:
    """Raise a key error complemented with suggestions based on closest matches.

    Args:
        attempted_key: A key that was attempted to be found.
        existing_keys: Existing keys, among which an attempted key was not found.
        object_name: Archetype of an object that was searched by key.
        attribute_name: If this key represents an attribute with explicit name.

    Returns:
        Never returns anything.

    Raises:
        KeyError: Complemented with close matches, if any.

    Notes:
        Inspired by dataset name hint implemented in Kedro: https://github.com/kedro-org/kedro
    """
    error_msg = f"{object_name.capitalize()} with {attribute_name} {attempted_key} not found."
    close_matches = get_close_matches(attempted_key, existing_keys)
    if close_matches:
        suggestions = ", ".join(close_matches)
        error_msg += f" Did you mean one of these instead: {suggestions}?"
    raise KeyError(error_msg)
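
A sketch of the resulting message, using a local copy of the function. The key names and the `column`/`name` labels are illustrative.

```python
from collections.abc import Collection
from difflib import get_close_matches
from typing import NoReturn


def raise_key_error_with_suggestions(
    attempted_key: str,
    existing_keys: Collection[str],
    object_name: str = "object",
    attribute_name: str = "key",
) -> NoReturn:
    # Local copy of the function shown above.
    error_msg = f"{object_name.capitalize()} with {attribute_name} {attempted_key} not found."
    close_matches = get_close_matches(attempted_key, existing_keys)
    if close_matches:
        suggestions = ", ".join(close_matches)
        error_msg += f" Did you mean one of these instead: {suggestions}?"
    raise KeyError(error_msg)


try:
    raise_key_error_with_suggestions(
        "colour", ["color", "size", "shape"], object_name="column", attribute_name="name"
    )
except KeyError as error:
    # str(KeyError) wraps the message in quotes, so read it from args instead.
    message = error.args[0]
```

When nothing is similar enough to the attempted key, the message ends after "not found." with no suggestions.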

functional

Utilities for functional programming.

identity(x)

An identity function: returns a single input unchanged.

Source code in src/utilsx/functional.py
def identity(x: T) -> T:
    """An identity function: returns a single input unchanged."""
    return x

math

Utilities for mathematical operations.

ceil_to_multiple(x, multiple)

Ceil a number to the next multiple of another value.

Parameters:

Name Type Description Default
x float

Number to ceil.

required
multiple int

The value that the output must be a multiple of.

required

Returns:

Type Description
int

The input x, ceiled to the next multiple of the given value.

Source code in src/utilsx/math/_rounding.py
def ceil_to_multiple(x: float, multiple: int) -> int:
    """Ceil a number to the next multiple of another value.

    Args:
        x: Number to ceil.
        multiple: The value that the output must be a multiple of.

    Returns:
        The input x, ceiled to the next multiple of the given value.
    """
    return math.ceil(x / multiple) * multiple
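
A few worked values, computed with a local copy of the function, show how "next multiple" behaves at the boundaries.

```python
import math


def ceil_to_multiple(x: float, multiple: int) -> int:
    # Local copy of the function shown above.
    return math.ceil(x / multiple) * multiple


# 7 / 5 = 1.4, which ceils to 2, giving 10.
rounded_up = ceil_to_multiple(7, 5)

# A value that is already a multiple is returned unchanged.
already_multiple = ceil_to_multiple(10, 5)

# Ceiling moves towards positive infinity, so -7 goes to -5, not -10.
negative = ceil_to_multiple(-7, 5)
```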

check_values_add_up_to_one(values, mode='either')

Check if values in a collection add up to 1 or 100.

Parameters:

Name Type Description Default
values Collection[float]

Values to check.

required
mode Literal['fractions', 'percentages', 'either']

"fractions" if they should add up to 1, "percentages" if they should add up to 100, "either" if either total is acceptable.

'either'

Returns:

Type Description
bool

Boolean outcome of the check.

Source code in src/utilsx/math/_collections.py
def check_values_add_up_to_one(
    values: Collection[float],
    mode: Literal["fractions", "percentages", "either"] = "either",
) -> bool:
    """Check if values in a collection add up to 1 or 100.

    Args:
        values: Values to check.
        mode: "fractions" if they should add up to 1, "percentages" if they should add up to 100,
            "either" if either total is acceptable.

    Returns:
        Boolean outcome of the check.
    """
    match mode:
        case "fractions":
            valid_totals = frozenset((1,))
        case "percentages":
            valid_totals = frozenset((100,))
        case "either":
            valid_totals = frozenset((1, 100))
        case _:
            raise ValueError(f"Unrecognized mode: {mode}")

    sum_of_values = sum(values)
    return any(
        math.isclose(sum_of_values, valid_total, rel_tol=0.001) for valid_total in valid_totals
    )
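
The comparison is tolerant rather than exact: the source uses `math.isclose` with `rel_tol=0.001`. The sketch below isolates that step; `close_to` is a hypothetical helper, not part of the library. Because the tolerance is relative, it allows roughly ±0.001 around 1 and ±0.1 around 100.

```python
import math

REL_TOL = 0.001  # same relative tolerance as in the source above


def close_to(total: float, target: float) -> bool:
    """Hypothetical helper mirroring the isclose call in the function above."""
    return math.isclose(total, target, rel_tol=REL_TOL)


# Rounded thirds sum to about 0.9999, which is within tolerance of 1.
fractions_ok = close_to(sum([0.3333, 0.3333, 0.3333]), 1)

# About 99.95 is within 0.1 of 100, so it passes as percentages.
percentages_ok = close_to(sum([33.3, 33.3, 33.35]), 100)

# About 99.8 is off by 0.2, which exceeds the tolerance.
percentages_off = close_to(sum([33.0, 33.0, 33.8]), 100)
```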

convert_number_to_units(number, units)

Convert a number to thousands or millions.

Parameters:

Name Type Description Default
number float

Number to convert.

required
units _TUnits

Units to convert to.

required

Returns:

Type Description
float

A number converted to specified units.

Source code in src/utilsx/math/_downscaling.py
def convert_number_to_units(number: float, units: _TUnits) -> float:
    """Convert a number to thousands or millions.

    Args:
        number: Number to convert.
        units: Units to convert to.

    Returns:
        A number converted to specified units.
    """
    match units:
        case "thousand":
            denominator = THOUSAND
        case "million":
            denominator = MILLION
        case _:
            raise ValueError(f"Unrecognized units: {units}")

    return number / denominator

double(x)

Multiply a number by two: in other words, double it.

Parameters:

Name Type Description Default
x float

The number to multiply by two.

required

Returns:

Type Description
float

Result of multiplying this number by literal integer two.

Source code in src/utilsx/math/_scalar_ops.py
def double(x: float) -> float:
    """Multiply a number by two: in other words, double it.

    Args:
        x: The number to multiply by two.

    Returns:
        Result of multiplying this number by literal integer two.
    """
    return x * 2

halve(x)

Divide a number by two: in other words, get its half.

Parameters:

Name Type Description Default
x float

The number to divide by two.

required

Returns:

Type Description
float

Result of dividing this number by literal integer two.

Source code in src/utilsx/math/_scalar_ops.py
def halve(x: float) -> float:
    """Divide a number by two: in other words, get its half.

    Args:
        x: The number to divide by two.

    Returns:
        Result of dividing this number by literal integer two.
    """
    return x / 2

is_monotonically_growing(time_series, multiplier)

Check whether a time series can be considered monotonically growing.

To qualify, each element must be more than multiplier times the previous one (the comparison is strict). Series of fewer than two elements are considered non-growing.

Parameters:

Name Type Description Default
time_series Sequence[float]

A sequence of numbers.

required
multiplier float

How many times bigger each next element should be.

required

Returns:

Type Description
bool

True if monotonically growing, False otherwise.

Source code in src/utilsx/math/_collections.py
def is_monotonically_growing(time_series: Sequence[float], multiplier: float) -> bool:
    """Check whether a time series can be considered monotonically growing.

    To qualify, each element must be more than ``multiplier`` times the previous one
    (the comparison is strict). Series of fewer than two elements are considered non-growing.

    Args:
        time_series: A sequence of numbers.
        multiplier: How many times bigger each next element should be.

    Returns:
        True if monotonically growing, False otherwise.
    """
    if len(time_series) < 2:  # noqa: PLR2004
        return False
    return all(
        time_series[i + 1] > time_series[i] * multiplier for i in range(len(time_series) - 1)
    )
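
A sketch with illustrative series, using a local copy of the function. The code compares with `>`, so a series that grows by exactly `multiplier` at some step does not pass.

```python
from collections.abc import Sequence


def is_monotonically_growing(time_series: Sequence[float], multiplier: float) -> bool:
    # Local copy of the function shown above.
    if len(time_series) < 2:
        return False
    return all(
        time_series[i + 1] > time_series[i] * multiplier for i in range(len(time_series) - 1)
    )


# 3 > 1 * 2 and 7 > 3 * 2, so every step grows by more than a factor of two.
growing = is_monotonically_growing([1, 3, 7], multiplier=2)

# 2 > 1 * 2 is False: exact doubling fails the strict comparison.
exact_doubling = is_monotonically_growing([1, 2, 4], multiplier=2)

# A single element cannot be growing.
too_short = is_monotonically_growing([5], multiplier=1)
```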

normalize(values)

Normalize a sequence of numbers to make them add up to one.

Source code in src/utilsx/math/_collections.py
def normalize(values: Sequence[float]) -> list[float]:
    """Normalize a sequence of numbers to make them add up to one."""
    total = sum(values)
    return [value / total for value in values]
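
A small worked example, using a local stand-in with the same behaviour as the function above. A non-empty input that sums to zero is not handled specially, so it raises `ZeroDivisionError`.

```python
from collections.abc import Sequence


def normalize(values: Sequence[float]) -> list[float]:
    # Local stand-in with the same behaviour as the function above.
    total = sum(values)
    return [value / total for value in values]


# 1 + 1 + 2 = 4, so the shares are a quarter, a quarter and a half.
shares = normalize([1, 1, 2])

# Values that cancel out to zero cannot be normalized.
try:
    normalize([1, -1])
    zero_sum_raised = False
except ZeroDivisionError:
    zero_sum_raised = True
```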

safe_divide(numerator, denominator, fallback=0)

Divide one number by another, falling back on something in case of ZeroDivisionError.

Parameters:

Name Type Description Default
numerator float

Number to divide.

required
denominator float

Denominator to use.

required
fallback float

Fallback value to use in case the denominator is zero, defaults to zero.

0

Source code in src/utilsx/math/_division.py
def safe_divide(numerator: float, denominator: float, fallback: float = 0) -> float:
    """Divide one number by another, falling back on something in case of ``ZeroDivisionError``.

    Args:
        numerator: Number to divide.
        denominator: Denominator to use.
        fallback: Fallback value to use in case the denominator is zero, defaults to zero.
    """
    return numerator / denominator if denominator != 0 else fallback
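
A usage sketch with a local copy of the function. The implementation checks the denominator up front instead of catching an exception, so both integer and float zeros trigger the fallback.

```python
def safe_divide(numerator: float, denominator: float, fallback: float = 0) -> float:
    # Local copy of the function shown above.
    return numerator / denominator if denominator != 0 else fallback


regular = safe_divide(6, 3)

# The default fallback is 0.
default_fallback = safe_divide(1, 0)

# A custom fallback is returned as-is when the denominator is zero.
custom_fallback = safe_divide(1, 0.0, fallback=-1)
```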

pandas

Utilities for enhancing your Pandas workflows.

count_na(df)

Get the total number of missing values in a DataFrame.

Parameters:

Name Type Description Default
df DataFrame

To count missing values in.

required

Returns:

Type Description
int

The number of missing values.

Source code in src/utilsx/pandas/_missing.py
def count_na(df: pd.DataFrame) -> int:
    """Get the total number of missing values in a DataFrame.

    Args:
        df: To count missing values in.

    Returns:
        The number of missing values.
    """
    return int(df.isna().sum().sum())

text

Utilities for text transformations.

add_suffix(base, suffix, separator='_')

Add suffix to a base string using a separator.

Parameters:

Name Type Description Default
base str

Base string to add a suffix to.

required
suffix str

A suffix to add.

required
separator str

A separator to insert between the base and suffix.

'_'

Returns:

Type Description
str

A string with suffix concatenated to the base on the right, with a separator in between.

If the suffix is empty, returns just the base string, omitting the separator.

Source code in src/utilsx/text.py
def add_suffix(base: str, suffix: str, separator: str = "_") -> str:
    """Add suffix to a base string using a separator.

    Args:
        base: Base string to add a suffix to.
        suffix: A suffix to add.
        separator: A separator to insert between the base and suffix.

    Returns:
        A string with suffix concatenated to the base on the right, with a separator in between.

    If the suffix is empty, returns just the base string, omitting the separator.
    """
    return f"{base}{separator}{suffix}" if suffix else base
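
A usage sketch with a local copy of the function; the strings are illustrative.

```python
def add_suffix(base: str, suffix: str, separator: str = "_") -> str:
    # Local copy of the function shown above.
    return f"{base}{separator}{suffix}" if suffix else base


with_default_separator = add_suffix("train", "v2")
with_custom_separator = add_suffix("train", "v2", separator="-")

# An empty suffix returns the base alone, with no trailing separator.
without_suffix = add_suffix("train", "")
```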

typevars

A collection of type variables.