ubelt.util_list module

Utility functions for manipulating iterables, lists, and sequences.

class ubelt.util_list.chunks(items, chunksize=None, nchunks=None, total=None, bordermode='none')[source]

Bases: object

Generates successive n-sized chunks from items.

If the last chunk has fewer than n elements (where n is chunksize), bordermode determines how it is filled.

Parameters:
  • items (Iterable) – input to iterate over
  • chunksize (int) – size of each sublist yielded
  • nchunks (int) – number of chunks to create (cannot be specified together with chunksize)
  • bordermode (str) – determines how to handle the last chunk if the length of the input is not divisible by chunksize. Valid values are: {'none', 'cycle', 'replicate'}
  • total (int) – hints about the length of the input

Todo

should this handle the case when sequence is a string?

References

http://stackoverflow.com/questions/434287/iterate-over-a-list-in-chunks

CommandLine:
python -m ubelt.util_list chunks

Example

>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5, 6, 7]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='none')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='cycle')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 1, 2]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='replicate')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 7, 7]]
Doctest:
>>> import ubelt as ub
>>> assert len(list(ub.chunks(range(2), nchunks=2))) == 2
>>> assert len(list(ub.chunks(range(3), nchunks=2))) == 2
>>> # Note: ub.chunks will not do the 2,1,1 split
>>> assert len(list(ub.chunks(range(4), nchunks=3))) == 2
>>> assert len(list(ub.chunks([], 2, None, 'none'))) == 0
>>> assert len(list(ub.chunks([], 2, None, 'cycle'))) == 0
>>> assert len(list(ub.chunks([], 2, None, 'replicate'))) == 0
Doctest:
>>> def _check_len(self):
...     assert len(self) == len(list(self))
>>> _check_len(chunks(list(range(3)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=3))
Doctest:
>>> import pytest
>>> assert pytest.raises(ValueError, chunks, range(9))
>>> assert pytest.raises(ValueError, chunks, range(9), chunksize=2, nchunks=2)
>>> assert pytest.raises(TypeError, len, chunks((_ for _ in range(2)), 2))
static noborder(items, chunksize)[source]
static cycle(items, chunksize)[source]
static replicate(items, chunksize)[source]
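
The three border modes can be illustrated with a minimal pure-Python sketch. This is not ubelt's implementation: the name chunk_sketch is hypothetical, and the sketch materializes the input into a list for simplicity, whereas the real class also accepts generators.

```python
import itertools


def chunk_sketch(items, chunksize, bordermode='none'):
    """Yield lists of ``chunksize`` items; fill the last one per ``bordermode``."""
    items = list(items)  # simplification: the sketch needs random access
    for start in range(0, len(items), chunksize):
        chunk = items[start:start + chunksize]
        missing = chunksize - len(chunk)
        if missing and bordermode == 'cycle':
            # Wrap around to the beginning of the input.
            chunk.extend(itertools.islice(itertools.cycle(items), missing))
        elif missing and bordermode == 'replicate':
            # Repeat the final item until the chunk is full.
            chunk.extend([chunk[-1]] * missing)
        yield chunk


items = [1, 2, 3, 4, 5, 6, 7]
print(list(chunk_sketch(items, 3, 'none')))       # [[1, 2, 3], [4, 5, 6], [7]]
print(list(chunk_sketch(items, 3, 'cycle')))      # [[1, 2, 3], [4, 5, 6], [7, 1, 2]]
print(list(chunk_sketch(items, 3, 'replicate')))  # [[1, 2, 3], [4, 5, 6], [7, 7, 7]]
```
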
ubelt.util_list.iterable(obj, strok=False)[source]

Checks if the input implements the iterator interface. An exception is made for strings, which return False unless strok is True.

Parameters:
  • obj (object) – a scalar or iterable input
  • strok (bool) – if True, allow strings to be interpreted as iterable
Returns:

True if the input is iterable

Return type:

bool

Example

>>> obj_list = [3, [3], '3', (3,), [3, 4, 5], {}]
>>> result = [iterable(obj) for obj in obj_list]
>>> assert result == [False, True, False, True, True, True]
>>> result = [iterable(obj, strok=True) for obj in obj_list]
>>> assert result == [False, True, True, True, True, True]
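
One plausible way to express this check is to attempt iter(obj) and special-case strings. The helper name iterable_sketch is hypothetical and the body is only an approximation of the documented behaviour, not the library's source.

```python
def iterable_sketch(obj, strok=False):
    """Return True if ``iter(obj)`` succeeds; strings count only when ``strok``."""
    if isinstance(obj, str) and not strok:
        return False
    try:
        iter(obj)
    except TypeError:
        # Scalars such as ints do not support iteration.
        return False
    return True


obj_list = [3, [3], '3', (3,), [3, 4, 5], {}]
print([iterable_sketch(obj) for obj in obj_list])
# [False, True, False, True, True, True]
```
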
ubelt.util_list.take(items, indices)[source]

Selects a subset of a list based on a list of indices. This is similar to np.take, but pure python.

Parameters:
  • items (Sequence) – an indexable object to select items from
  • indices (Iterable) – sequence of indexing objects
Returns:

subset of the list

Return type:

Iterable or scalar

SeeAlso:
ub.dict_subset

Example

>>> import ubelt as ub
>>> items = [0, 1, 2, 3]
>>> indices = [2, 0]
>>> list(ub.take(items, indices))
[2, 0]
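
Because the selection only relies on indexing, the same idea applies to mappings when the indices are keys. A minimal sketch under that assumption (take_sketch is a hypothetical stand-in, not the library function):

```python
def take_sketch(items, indices):
    """Lazily yield ``items[index]`` for each entry of ``indices``."""
    return (items[index] for index in indices)


# Positional indices into a list.
print(list(take_sketch([0, 1, 2, 3], [2, 0])))          # [2, 0]
# Keys into a dict work the same way.
print(list(take_sketch({'a': 3, 'b': 2}, ['b', 'a'])))  # [2, 3]
```
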
ubelt.util_list.compress(items, flags)[source]

Selects items where the corresponding value in flags is True. This is similar to np.compress and itertools.compress.

Parameters:
  • items (Iterable) – a sequence to select items from
  • flags (Iterable) – corresponding sequence of bools
Returns:

a subset of masked items

Return type:

Iterable

Example

>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5]
>>> flags = [False, True, True, False, True]
>>> list(ub.compress(items, flags))
[2, 3, 5]
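
For comparison, the same masking can be written with the standard library alone. The snippet below uses itertools.compress and an equivalent comprehension; the variable names are illustrative only.

```python
import itertools

items = [1, 2, 3, 4, 5]
flags = [False, True, True, False, True]

# Standard-library equivalent of the masking operation.
via_itertools = list(itertools.compress(items, flags))

# The same selection spelled out with zip.
via_comprehension = [item for item, flag in zip(items, flags) if flag]

print(via_itertools)      # [2, 3, 5]
print(via_comprehension)  # [2, 3, 5]
```
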
ubelt.util_list.flatten(nested_list)[source]

Transforms a nested iterable into a flat iterable.

This is simply an alias for itertools.chain.from_iterable

Parameters:nested_list (Iterable[Iterable]) – list of lists
Returns:flattened items
Return type:Iterable

Example

>>> import ubelt as ub
>>> nested_list = [['a', 'b'], ['c', 'd']]
>>> list(ub.flatten(nested_list))
['a', 'b', 'c', 'd']
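
Since this is an alias for itertools.chain.from_iterable, only one level of nesting is removed. The following standard-library snippet shows that a deeper list survives intact:

```python
import itertools

nested = [['a', 'b'], ['c', ['d', 'e']]]

# Only the outermost level is flattened; the inner list is left as an item.
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # ['a', 'b', 'c', ['d', 'e']]
```
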
ubelt.util_list.unique(items, key=None)[source]

Generates unique items in the order they appear.

Parameters:
  • items (Iterable) – list of items
  • key (Callable, optional) – custom normalization function. If specified, yields items where key(item) is unique.
Yields:

object – a unique item from the input sequence

CommandLine:
python -m utool.util_list --exec-unique_ordered

Example

>>> import ubelt as ub
>>> items = [4, 6, 6, 0, 6, 1, 0, 2, 2, 1]
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == [4, 6, 0, 1, 2]

Example

>>> import ubelt as ub
>>> import six
>>> items = ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'D', 'E']
>>> unique_items = list(ub.unique(items, key=six.text_type.lower))
>>> assert unique_items == ['A', 'b', 'C', 'D', 'e']
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'E']
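
Order-preserving uniqueness is commonly implemented with a set of already-seen markers. The sketch below follows that approach; unique_sketch is a hypothetical name, the items (or their keys) are assumed to be hashable, and str.lower stands in for six.text_type.lower.

```python
def unique_sketch(items, key=None):
    """Yield each item whose (optionally normalized) value is new."""
    seen = set()
    for item in items:
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield item


items = ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'D', 'E']
print(list(unique_sketch(items, key=str.lower)))
# ['A', 'b', 'C', 'D', 'e']
```
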
ubelt.util_list.argunique(items, key=None)[source]

Returns indices corresponding to the first instance of each unique item.

Parameters:
  • items (Sequence) – indexable collection of items
  • key (Callable, optional) – custom normalization function. If specified, yields the indices of items where key(item) is unique.
Yields:

int – indices of the unique items

Example

>>> items = [0, 2, 5, 1, 1, 0, 2, 4]
>>> indices = list(argunique(items))
>>> assert indices == [0, 1, 2, 3, 7]
>>> indices = list(argunique(items, key=lambda x: x % 2 == 0))
>>> assert indices == [0, 2]
ubelt.util_list.unique_flags(items, key=None)[source]

Returns a list of booleans corresponding to the first instance of each unique item.

Parameters:
  • items (Sequence) – indexable collection of items
  • key (Callable, optional) – custom normalization function. If specified, flags items where key(item) is unique.
Returns:

flags the items that are unique

Return type:

List[bool]

Example

>>> import ubelt as ub
>>> items = [0, 2, 1, 1, 0, 9, 2]
>>> flags = ub.unique_flags(items)
>>> assert flags == [True, True, True, False, False, True, False]
>>> flags = ub.unique_flags(items, key=lambda x: x % 2 == 0)
>>> assert flags == [True, False, True, False, False, False, False]
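
The relationship between argunique and unique_flags can be sketched as follows: the flags are simply a boolean mask over the first-occurrence indices. Both helper names are hypothetical and the code is an illustration under that assumption, not the library source.

```python
def argunique_sketch(items, key=None):
    """Yield the index of the first occurrence of each distinct value."""
    seen = set()
    for index, item in enumerate(items):
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield index


def unique_flags_sketch(items, key=None):
    """Return a list of bools that is True at first-occurrence positions."""
    flags = [False] * len(items)
    for index in argunique_sketch(items, key=key):
        flags[index] = True
    return flags


items = [0, 2, 1, 1, 0, 9, 2]
print(list(argunique_sketch(items)))  # [0, 1, 2, 5]
print(unique_flags_sketch(items))
# [True, True, True, False, False, True, False]
```
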
ubelt.util_list.boolmask(indices, maxval=None)[source]

Constructs a list of booleans where an item is True if its position is in indices; otherwise it is False.

Parameters:
  • indices (list) – list of integer indices
  • maxval (int) – length of the returned list. If not specified this is inferred from indices

Note

In the future the arg maxval may change its name to shape

Returns:mask: list of booleans. mask[idx] is True if idx in indices
Return type:list

Example

>>> import ubelt as ub
>>> indices = [0, 1, 4]
>>> mask = ub.boolmask(indices, maxval=6)
>>> assert mask == [True, True, False, False, True, False]
>>> mask = ub.boolmask(indices)
>>> assert mask == [True, True, False, False, True]
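
A minimal sketch of the mask construction, assuming that an unspecified maxval is inferred as the largest index plus one (consistent with the example above). The name boolmask_sketch is hypothetical, and the sketch does not handle empty indices without an explicit maxval.

```python
def boolmask_sketch(indices, maxval=None):
    """Return a list of bools with True at each position in ``indices``."""
    indices = list(indices)
    if maxval is None:
        # Assumption: the mask is just long enough to hold the largest index.
        maxval = max(indices) + 1
    mask = [False] * maxval
    for index in indices:
        mask[index] = True
    return mask


print(boolmask_sketch([0, 1, 4], maxval=6))
# [True, True, False, False, True, False]
print(boolmask_sketch([0, 1, 4]))
# [True, True, False, False, True]
```
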
ubelt.util_list.iter_window(iterable, size=2, step=1, wrap=False)[source]

Iterates through iterable with a window size. This is essentially a 1D sliding window.

Parameters:
  • iterable (Iterable) – an iterable sequence
  • size (int) – sliding window size (default = 2)
  • step (int) – sliding step size (default = 1)
  • wrap (bool) – wraparound (default = False)
Returns:

an iterator over the windows, each a tuple of size items

Return type:

iter

Example

>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 1, True
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 1), (6, 1, 2)]

Example

>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, True
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (3, 4, 5), (5, 6, 1)]

Example

>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, False
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (3, 4, 5)]

Example

>>> iterable = []
>>> size, step, wrap = 3, 2, False
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = []
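
The four examples above are reproduced by the following index-based sketch. It is an assumption-laden illustration rather than the library's algorithm: iter_window_sketch is a hypothetical name, and the input is copied into a list so that wraparound can use modular indexing.

```python
def iter_window_sketch(iterable, size=2, step=1, wrap=False):
    """Yield tuples of ``size`` consecutive items, advancing by ``step``."""
    items = list(iterable)  # simplification: the sketch needs random access
    n = len(items)
    if wrap:
        # Every step-th position starts a window; indices wrap modulo n.
        starts = range(0, n, step)
    else:
        # Stop once a full window no longer fits.
        starts = range(0, n - size + 1, step)
    for start in starts:
        yield tuple(items[(start + offset) % n] for offset in range(size))


data = [1, 2, 3, 4, 5, 6]
print(list(iter_window_sketch(data, 3, 2, True)))
# [(1, 2, 3), (3, 4, 5), (5, 6, 1)]
print(list(iter_window_sketch(data, 3, 2, False)))
# [(1, 2, 3), (3, 4, 5)]
```
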
ubelt.util_list.allsame(iterable, eq=<built-in function eq>)[source]

Determines if all items in a sequence are the same.

Parameters:
  • iterable (Iterable) – items to determine if they are all the same
  • eq (Callable, optional) – function to determine equality (default: operator.eq)

Example

>>> allsame([1, 1, 1, 1])
True
>>> allsame([])
True
>>> allsame([0, 1])
False
>>> iterable = iter([0, 1, 1, 1])
>>> next(iterable)
0
>>> allsame(iterable)
True
>>> allsame(range(10))
False
>>> allsame(range(10), lambda a, b: True)
True
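
One way to obtain this behaviour is to compare every remaining item against the first one. The sketch below assumes that strategy (the documentation does not state which pairs are compared); allsame_sketch is a hypothetical name.

```python
import operator


def allsame_sketch(iterable, eq=operator.eq):
    """Return True if every item compares equal to the first item."""
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        # An empty input is vacuously "all the same".
        return True
    return all(eq(first, item) for item in iterator)


print(allsame_sketch([1, 1, 1, 1]))                   # True
print(allsame_sketch([0, 1]))                         # False
print(allsame_sketch(range(10), lambda a, b: True))   # True
```
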
ubelt.util_list.argsort(indexable, key=None, reverse=False)[source]

Returns the indices that would sort an indexable object.

This is similar to numpy.argsort, but it is written in pure python and works on both lists and dictionaries.

Parameters:
  • indexable (Iterable or Mapping) – indexable to sort by
  • key (Callable, optional) – customizes the ordering of the indexable
  • reverse (bool, optional) – if True returns in descending order
Returns:

indices: list of indices that sort the indexable

Return type:

list

Example

>>> import ubelt as ub
>>> # argsort works on dicts by returning keys
>>> dict_ = {'a': 3, 'b': 2, 'c': 100}
>>> indices = ub.argsort(dict_)
>>> assert list(ub.take(dict_, indices)) == sorted(dict_.values())
>>> # argsort works on lists by returning indices
>>> indexable = [100, 2, 432, 10]
>>> indices = ub.argsort(indexable)
>>> assert list(ub.take(indexable, indices)) == sorted(indexable)
>>> # Can use iterators, but be careful. It exhausts them.
>>> indexable = reversed(range(100))
>>> indices = ub.argsort(indexable)
>>> assert indices[0] == 99
>>> # Can use key just like sorted
>>> indexable = [[0, 1, 2], [3, 4], [5]]
>>> indices = ub.argsort(indexable, key=len)
>>> assert indices == [2, 1, 0]
>>> # Can use reverse just like sorted
>>> indexable = [0, 2, 1]
>>> indices = ub.argsort(indexable, reverse=True)
>>> assert indices == [1, 2, 0]
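
The list and dictionary cases can be unified by pairing each value with its index or key and sorting on the value. The sketch below takes that approach; argsort_sketch is a hypothetical helper and makes no claim about how the library does it.

```python
from collections.abc import Mapping


def argsort_sketch(indexable, key=None, reverse=False):
    """Return the indices (or keys) ordered by their associated values."""
    if isinstance(indexable, Mapping):
        pairs = list(indexable.items())
    else:
        # Works for any iterable, but consumes iterators.
        pairs = list(enumerate(indexable))

    def sort_key(pair):
        value = pair[1]
        return value if key is None else key(value)

    return [index for index, _ in sorted(pairs, key=sort_key, reverse=reverse)]


print(argsort_sketch({'a': 3, 'b': 2, 'c': 100}))  # ['b', 'a', 'c']
print(argsort_sketch([100, 2, 432, 10]))           # [1, 3, 0, 2]
```
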
ubelt.util_list.argmax(indexable, key=None)[source]

Returns index / key of the item with the largest value.

This is similar to numpy.argmax, but it is written in pure python and works on both lists and dictionaries.

Parameters:
  • indexable (Iterable or Mapping) – indexable to sort by
  • key (Callable, optional) – customizes the ordering of the indexable
CommandLine:
python -m ubelt.util_list argmax

Example

>>> assert argmax({'a': 3, 'b': 2, 'c': 100}) == 'c'
>>> assert argmax(['a', 'c', 'b', 'z', 'f']) == 3
>>> assert argmax([[0, 1], [2, 3, 4], [5]], key=len) == 1
>>> assert argmax({'a': 3, 'b': 2, 3: 100, 4: 4}) == 3
>>> assert argmax(iter(['a', 'c', 'b', 'z', 'f'])) == 3
ubelt.util_list.argmin(indexable, key=None)[source]

Returns index / key of the item with the smallest value.

This is similar to numpy.argmin, but it is written in pure python and works on both lists and dictionaries.

Parameters:
  • indexable (Iterable or Mapping) – indexable to sort by
  • key (Callable, optional) – customizes the ordering of the indexable

Example

>>> assert argmin({'a': 3, 'b': 2, 'c': 100}) == 'b'
>>> assert argmin(['a', 'c', 'b', 'z', 'f']) == 0
>>> assert argmin([[0, 1], [2, 3, 4], [5]], key=len) == 2
>>> assert argmin({'a': 3, 'b': 2, 3: 100, 4: 4}) == 'b'
>>> assert argmin(iter(['a', 'c', 'A', 'z', 'f'])) == 2
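
Both argmax and argmin can be sketched with the built-in max and min applied to (index, value) pairs. The helper names below are hypothetical; only the values are compared, which is why a dictionary with mixed key types poses no problem.

```python
from collections.abc import Mapping


def _pairs(indexable):
    """Return (index_or_key, value) pairs for a mapping or any iterable."""
    if isinstance(indexable, Mapping):
        return indexable.items()
    return enumerate(indexable)


def _value_key(key):
    """Build a sort key that looks only at the value of each pair."""
    if key is None:
        return lambda pair: pair[1]
    return lambda pair: key(pair[1])


def argmax_sketch(indexable, key=None):
    return max(_pairs(indexable), key=_value_key(key))[0]


def argmin_sketch(indexable, key=None):
    return min(_pairs(indexable), key=_value_key(key))[0]


print(argmax_sketch({'a': 3, 'b': 2, 3: 100, 4: 4}))         # 3
print(argmin_sketch([[0, 1], [2, 3, 4], [5]], key=len))      # 2
```
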
ubelt.util_list.peek(iterable)[source]

Look at the first item of an iterable. If the input is an iterator, then the next element is consumed (i.e. a pop operation).

Parameters:iterable (List[T]) – an iterable
Returns:
item: the first item of an ordered sequence, a popped item from an
iterator, or an arbitrary item from an unordered collection.
Return type:T

Example

>>> import ubelt as ub
>>> data = [0, 1, 2]
>>> ub.peek(data)
0
>>> iterator = iter(data)
>>> print(ub.peek(iterator))
0
>>> print(ub.peek(iterator))
1
>>> print(ub.peek(iterator))
2
>>> ub.peek(range(3))
0
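
The difference between containers and iterators follows from how iter behaves: a container hands out a fresh iterator each time, while an iterator returns itself. A minimal sketch, assuming the function is equivalent to next(iter(...)) (peek_sketch is a hypothetical name):

```python
def peek_sketch(iterable):
    """Return the first item; on an iterator this consumes that item."""
    return next(iter(iterable))


data = [0, 1, 2]
# A list yields a fresh iterator on every call, so the answer never changes.
print(peek_sketch(data), peek_sketch(data))          # 0 0

iterator = iter(data)
# An iterator is its own iterator, so each call advances it.
print(peek_sketch(iterator), peek_sketch(iterator))  # 0 1
```

In this sketch an empty input raises StopIteration.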