ubelt.util_list module
Utility functions for manipulating iterables, lists, and sequences.
The chunks function splits a list into smaller parts. It can be driven by a chunk size or by a number of chunks, and bordermode selects how a short final chunk is filled.
The flatten function takes a list of lists and removes the inner lists. This only removes one level of nesting.
The iterable function checks whether an object is iterable, similar to the builtin callable function.
The argmax, argmin, and argsort functions work similarly to the analogous numpy functions, except they operate on dictionaries and other Python builtin types.
The take and compress functions are generators, and are similar to their lesser-known but very useful numpy equivalents.
There are also other numpy inspired functions: unique, argunique, unique_flags, and boolmask.
-
class ubelt.util_list.chunks(items, chunksize=None, nchunks=None, total=None, bordermode='none')
Bases: object
Generates successive chunks of chunksize items from items.
If the last chunk has fewer than chunksize elements, bordermode determines the fill values.
Parameters: - items (Iterable) – input to iterate over
- chunksize (int) – size of each sublist yielded
- nchunks (int) – number of chunks to create (cannot be specified if chunksize is specified)
- bordermode (str) – determines how to handle the last chunk if the length of the input is not divisible by chunksize. Valid values are: {'none', 'cycle', 'replicate'}
- total (int) – hints about the length of the input
Todo
should this handle the case when sequence is a string?
References
http://stackoverflow.com/questions/434287/iterate-over-a-list-in-chunks
- CommandLine:
- python -m ubelt.util_list chunks
Example
>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5, 6, 7]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='none')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='cycle')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 1, 2]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='replicate')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 7, 7]]
- Doctest:
>>> import ubelt as ub
>>> assert len(list(ub.chunks(range(2), nchunks=2))) == 2
>>> assert len(list(ub.chunks(range(3), nchunks=2))) == 2
>>> # Note: ub.chunks will not do the 2,1,1 split
>>> assert len(list(ub.chunks(range(4), nchunks=3))) == 2
>>> # bordermode is passed by keyword so it cannot be mistaken for total
>>> assert len(list(ub.chunks([], chunksize=2, bordermode='none'))) == 0
>>> assert len(list(ub.chunks([], chunksize=2, bordermode='cycle'))) == 0
>>> assert len(list(ub.chunks([], chunksize=2, bordermode='replicate'))) == 0
- Doctest:
>>> from ubelt.util_list import chunks
>>> def _check_len(self):
...     assert len(self) == len(list(self))
>>> _check_len(chunks(list(range(3)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=3))
- Doctest:
>>> import pytest
>>> from ubelt.util_list import chunks
>>> assert pytest.raises(ValueError, chunks, range(9))
>>> assert pytest.raises(ValueError, chunks, range(9), chunksize=2, nchunks=2)
>>> assert pytest.raises(TypeError, len, chunks((_ for _ in range(2)), 2))
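The three bordermode values are easiest to compare side by side. The following is a minimal pure-Python sketch, not ubelt's implementation: chunks_sketch is a hypothetical name, it buffers the input in a list, and the rounding rule for nchunks (chunk size rounded up) is an assumption inferred from the doctests above.

```python
import itertools
import math


def chunks_sketch(items, chunksize=None, nchunks=None, bordermode='none'):
    """Hypothetical stand-in for chunks; buffers ``items`` in a list."""
    if bordermode not in {'none', 'cycle', 'replicate'}:
        raise ValueError('unknown bordermode: %r' % (bordermode,))
    if (chunksize is None) == (nchunks is None):
        raise ValueError('specify exactly one of chunksize or nchunks')
    items = list(items)
    if chunksize is None:
        # Assumption inferred from the doctests: round the chunk size up.
        chunksize = max(1, math.ceil(len(items) / nchunks))
    for start in range(0, len(items), chunksize):
        chunk = items[start:start + chunksize]
        missing = chunksize - len(chunk)
        if missing and bordermode == 'cycle':
            # Fill the short chunk by wrapping around to the first items.
            chunk.extend(itertools.islice(itertools.cycle(items), missing))
        elif missing and bordermode == 'replicate':
            # Fill the short chunk by repeating its last real item.
            chunk.extend([chunk[-1]] * missing)
        yield chunk


items = [1, 2, 3, 4, 5, 6, 7]
plain = list(chunks_sketch(items, chunksize=3))
cycled = list(chunks_sketch(items, chunksize=3, bordermode='cycle'))
replicated = list(chunks_sketch(items, chunksize=3, bordermode='replicate'))
```

Because the sketch is a generator function, its argument checks only run once iteration starts, whereas the doctest above shows the real class raising ValueError at construction.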
-
ubelt.util_list.iterable(obj, strok=False)
Checks if the input can be iterated over. An exception is made for strings, which return False unless strok is True.
Parameters: - obj (object) – a scalar or iterable input
- strok (bool) – if True allow strings to be interpreted as iterable
Returns: True if the input is iterable
Return type: bool
Example
>>> from ubelt.util_list import iterable
>>> obj_list = [3, [3], '3', (3,), [3, 4, 5], {}]
>>> result = [iterable(obj) for obj in obj_list]
>>> assert result == [False, True, False, True, True, True]
>>> result = [iterable(obj, strok=True) for obj in obj_list]
>>> assert result == [False, True, True, True, True, True]
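One way to picture the check is a call to the builtin iter plus a special case for strings. This is a hedged sketch with a hypothetical name (is_iterable_sketch), and ubelt's own implementation may differ.

```python
def is_iterable_sketch(obj, strok=False):
    """Hypothetical stand-in for iterable: True if ``iter(obj)`` succeeds."""
    try:
        iter(obj)
    except TypeError:
        return False
    # Strings are iterable in Python but are treated as scalars by default.
    return strok or not isinstance(obj, str)


obj_list = [3, [3], '3', (3,), [3, 4, 5], {}]
default_result = [is_iterable_sketch(obj) for obj in obj_list]
strok_result = [is_iterable_sketch(obj, strok=True) for obj in obj_list]
```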
-
ubelt.util_list.take(items, indices)
Selects a subset of a list based on a list of indices. This is similar to np.take, but pure python.
Parameters: - items (Sequence) – an indexable object to select items from
- indices (Iterable) – sequence of indexing objects
Returns: subset of the list
Return type: Iterable or scalar
- SeeAlso:
- ub.dict_subset
Example
>>> import ubelt as ub
>>> items = [0, 1, 2, 3]
>>> indices = [2, 0]
>>> list(ub.take(items, indices))
[2, 0]
-
ubelt.util_list.compress(items, flags)
Selects items where the corresponding value in flags is True. This is similar to np.compress and itertools.compress.
Parameters: - items (Iterable) – a sequence to select items from
- flags (Iterable) – corresponding sequence of bools
Returns: a subset of masked items
Return type: Iterable
Example
>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5]
>>> flags = [False, True, True, False, True]
>>> list(ub.compress(items, flags))
[2, 3, 5]
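Both take and compress have short standard-library analogues, sketched below under the assumption that take only needs items[index] to work. take_sketch is a hypothetical name; itertools.compress is the stdlib function mentioned above.

```python
import itertools


def take_sketch(items, indices):
    """Hypothetical stand-in for take: lazily look up each index or key."""
    return (items[index] for index in indices)


# Positional lookup on a list.
taken = list(take_sketch([0, 1, 2, 3], [2, 0]))

# Key lookup on a dict works too, since only ``items[index]`` is required.
taken_values = list(take_sketch({'a': 3, 'b': 2}, ['b', 'a']))

# itertools.compress keeps the items whose flag is truthy.
kept = list(itertools.compress([1, 2, 3, 4, 5],
                               [False, True, True, False, True]))
```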
-
ubelt.util_list.flatten(nested_list)
Transforms a nested iterable into a flat iterable.
This is simply an alias for itertools.chain.from_iterable
Parameters: nested_list (Iterable[Iterable]) – list of lists
Returns: flattened items
Return type: Iterable
Example
>>> import ubelt as ub
>>> nested_list = [['a', 'b'], ['c', 'd']]
>>> list(ub.flatten(nested_list))
['a', 'b', 'c', 'd']
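Since flatten is described as an alias for itertools.chain.from_iterable, the one-level behaviour can be shown with the standard library alone. The nested input below is an illustrative value, not taken from the ubelt docs.

```python
import itertools

# The inner list [2, 3] sits two levels deep, so it survives flattening.
nested = [[1, [2, 3]], [4], []]
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, [2, 3], 4]
```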
-
ubelt.util_list.unique(items, key=None)
Generates unique items in the order they appear.
Parameters: - items (Iterable) – list of items
- key (Callable, optional) – custom normalization function. If specified returns items where key(item) is unique.
Yields: object – a unique item from the input sequence
- CommandLine:
- python -m utool.util_list --exec-unique_ordered
Example
>>> import ubelt as ub
>>> items = [4, 6, 6, 0, 6, 1, 0, 2, 2, 1]
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == [4, 6, 0, 1, 2]
Example
>>> import six
>>> import ubelt as ub
>>> items = ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'D', 'E']
>>> unique_items = list(ub.unique(items, key=six.text_type.lower))
>>> assert unique_items == ['A', 'b', 'C', 'D', 'e']
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'E']
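The effect of key can be sketched with a set of already-seen normalized values. This is an illustration under the assumption that the normalized values are hashable; unique_sketch is a hypothetical name and not ubelt's code.

```python
def unique_sketch(items, key=None):
    """Hypothetical stand-in for unique: yield first occurrences in order."""
    seen = set()
    for item in items:
        # Compare normalized values, but yield the original item.
        norm = item if key is None else key(item)
        if norm not in seen:
            seen.add(norm)
            yield item


items = ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'D', 'E']
case_insensitive = list(unique_sketch(items, key=str.lower))
case_sensitive = list(unique_sketch(items))
```

The first item seen for each normalized value wins, which is why 'A' is kept and the later 'a' is dropped.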
-
ubelt.util_list.argunique(items, key=None)
Returns indices corresponding to the first instance of each unique item.
Parameters: - items (Sequence) – indexable collection of items
- key (Callable, optional) – custom normalization function. If specified returns items where key(item) is unique.
Yields: int – indices of the unique items
Example
>>> from ubelt.util_list import argunique
>>> items = [0, 2, 5, 1, 1, 0, 2, 4]
>>> indices = list(argunique(items))
>>> assert indices == [0, 1, 2, 3, 7]
>>> indices = list(argunique(items, key=lambda x: x % 2 == 0))
>>> assert indices == [0, 2]
-
ubelt.util_list.unique_flags(items, key=None)
Returns a list of booleans corresponding to the first instance of each unique item.
Parameters: - items (Sequence) – indexable collection of items
- key (Callable, optional) – custom normalization function. If specified returns items where key(item) is unique.
Returns: flags marking the first instance of each unique item
Return type: List[bool]
Example
>>> import ubelt as ub
>>> items = [0, 2, 1, 1, 0, 9, 2]
>>> flags = ub.unique_flags(items)
>>> assert flags == [True, True, True, False, False, True, False]
>>> flags = ub.unique_flags(items, key=lambda x: x % 2 == 0)
>>> assert flags == [True, False, True, False, False, False, False]
-
ubelt.util_list.boolmask(indices, maxval=None)
Constructs a list of booleans where an item is True if its position is in indices otherwise it is False.
Parameters: - indices (list) – list of integer indices
- maxval (int) – length of the returned list. If not specified this is inferred from indices
Note
In the future the arg maxval may change its name to shape
Returns: mask: list of booleans. mask[idx] is True if idx in indices
Return type: list
Example
>>> import ubelt as ub
>>> indices = [0, 1, 4]
>>> mask = ub.boolmask(indices, maxval=6)
>>> assert mask == [True, True, False, False, True, False]
>>> mask = ub.boolmask(indices)
>>> assert mask == [True, True, False, False, True]
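The inference of maxval and the link between index lists and boolean flags can be sketched as follows. boolmask_sketch is a hypothetical name, and returning an empty mask for empty indices is an assumption of this sketch rather than documented behaviour.

```python
def boolmask_sketch(indices, maxval=None):
    """Hypothetical stand-in for boolmask."""
    indices = list(indices)
    if maxval is None:
        # Inferred length: one past the largest index (assumed 0 if empty).
        maxval = max(indices) + 1 if indices else 0
    mask = [False] * maxval
    for idx in indices:
        mask[idx] = True
    return mask


explicit = boolmask_sketch([0, 1, 4], maxval=6)
inferred = boolmask_sketch([0, 1, 4])

# First-occurrence indices of [0, 2, 1, 1, 0, 9, 2] are [0, 1, 2, 5];
# masking them reproduces the flags from the unique_flags example.
first_flags = boolmask_sketch([0, 1, 2, 5], maxval=7)
```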
-
ubelt.util_list.iter_window(iterable, size=2, step=1, wrap=False)
Iterates through iterable with a window size. This is essentially a 1D sliding window.
Parameters: - iterable (Iterable) – an iterable sequence
- size (int) – sliding window size (default = 2)
- step (int) – sliding step size (default = 1)
- wrap (bool) – wraparound (default = False)
Returns: returns windows in a sequence
Return type: iter
Example
>>> from ubelt.util_list import iter_window
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 1, True
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 1), (6, 1, 2)]
Example
>>> from ubelt.util_list import iter_window
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, True
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (3, 4, 5), (5, 6, 1)]
Example
>>> from ubelt.util_list import iter_window
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, False
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (3, 4, 5)]
Example
>>> from ubelt.util_list import iter_window
>>> iterable = []
>>> size, step, wrap = 3, 2, False
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = []
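The interaction of size, step, and wrap in the examples above can be reproduced with index arithmetic. The sketch below uses a hypothetical name (window_sketch) and buffers the input in a list, which the real function may well avoid.

```python
def window_sketch(iterable, size=2, step=1, wrap=False):
    """Hypothetical stand-in for iter_window; buffers the input in a list."""
    items = list(iterable)
    n = len(items)
    # Without wrapping, the last window must fit entirely inside the input.
    stop = n if wrap else n - size + 1
    for start in range(0, stop, step):
        # The modulo only matters when wrap=True and the window runs past
        # the end of the input.
        yield tuple(items[(start + offset) % n] for offset in range(size))


data = [1, 2, 3, 4, 5, 6]
wrapped = list(window_sketch(data, size=3, step=2, wrap=True))
unwrapped = list(window_sketch(data, size=3, step=2, wrap=False))
```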
-
ubelt.util_list.allsame(iterable, eq=<built-in function eq>)
Determine if all items in a sequence are the same.
Parameters: - iterable (Iterable) – items to determine if they are all the same
- eq (Callable, optional) – function to determine equality (default: operator.eq)
Example
>>> from ubelt.util_list import allsame
>>> allsame([1, 1, 1, 1])
True
>>> allsame([])
True
>>> allsame([0, 1])
False
>>> iterable = iter([0, 1, 1, 1])
>>> next(iterable)
0
>>> allsame(iterable)
True
>>> allsame(range(10))
False
>>> allsame(range(10), lambda a, b: True)
True
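The results above, including True for an empty input, are consistent with comparing every item against the first one. This is a sketch under that assumption, with a hypothetical name (allsame_sketch).

```python
import operator


def allsame_sketch(iterable, eq=operator.eq):
    """Hypothetical stand-in for allsame: compare each item to the first."""
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        # Nothing to compare, so the items are trivially all the same.
        return True
    return all(eq(first, item) for item in iterator)


partly_consumed = iter([0, 1, 1, 1])
next(partly_consumed)  # drop the leading 0
tail_is_same = allsame_sketch(partly_consumed)
always_equal = allsame_sketch(range(10), lambda a, b: True)
```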
-
ubelt.util_list.argsort(indexable, key=None, reverse=False)
Returns the indices that would sort an indexable object.
This is similar to numpy.argsort, but it is written in pure python and works on both lists and dictionaries.
Parameters: - indexable (Iterable or Mapping) – indexable to sort by
- key (Callable, optional) – customizes the ordering of the indexable
- reverse (bool, optional) – if True returns in descending order
Returns: indices: list of indices (or keys) that sorts the indexable
Return type: list
Example
>>> import ubelt as ub
>>> # argsort works on dicts by returning keys
>>> dict_ = {'a': 3, 'b': 2, 'c': 100}
>>> indices = ub.argsort(dict_)
>>> assert list(ub.take(dict_, indices)) == sorted(dict_.values())
>>> # argsort works on lists by returning indices
>>> indexable = [100, 2, 432, 10]
>>> indices = ub.argsort(indexable)
>>> assert list(ub.take(indexable, indices)) == sorted(indexable)
>>> # Can use iterators, but be careful. It exhausts them.
>>> indexable = reversed(range(100))
>>> indices = ub.argsort(indexable)
>>> assert indices[0] == 99
>>> # Can use key just like sorted
>>> indexable = [[0, 1, 2], [3, 4], [5]]
>>> indices = ub.argsort(indexable, key=len)
>>> assert indices == [2, 1, 0]
>>> # Can use reverse just like sorted
>>> indexable = [0, 2, 1]
>>> indices = ub.argsort(indexable, reverse=True)
>>> assert indices == [1, 2, 0]
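A pure-Python way to get this behaviour is to pair each value with its index or key, sort the pairs by value, and keep the index half. The following is a sketch with a hypothetical name (argsort_sketch), not ubelt's source.

```python
from collections.abc import Mapping


def argsort_sketch(indexable, key=None, reverse=False):
    """Hypothetical stand-in for argsort on lists, iterators, and dicts."""
    if isinstance(indexable, Mapping):
        pairs = indexable.items()      # (key, value) pairs
    else:
        pairs = enumerate(indexable)   # (index, value) pairs
    if key is None:
        def sort_key(pair):
            return pair[1]
    else:
        def sort_key(pair):
            return key(pair[1])
    ordered = sorted(pairs, key=sort_key, reverse=reverse)
    return [index for index, _ in ordered]


dict_order = argsort_sketch({'a': 3, 'b': 2, 'c': 100})
by_length = argsort_sketch([[0, 1, 2], [3, 4], [5]], key=len)
descending = argsort_sketch([0, 2, 1], reverse=True)
```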
-
ubelt.util_list.argmax(indexable, key=None)
Returns index / key of the item with the largest value.
This is similar to numpy.argmax, but it is written in pure python and works on both lists and dictionaries.
Parameters: - indexable (Iterable or Mapping) – indexable to sort by
- key (Callable, optional) – customizes the ordering of the indexable
- CommandLine:
- python -m ubelt.util_list argmax
Example
>>> from ubelt.util_list import argmax
>>> assert argmax({'a': 3, 'b': 2, 'c': 100}) == 'c'
>>> assert argmax(['a', 'c', 'b', 'z', 'f']) == 3
>>> assert argmax([[0, 1], [2, 3, 4], [5]], key=len) == 1
>>> assert argmax({'a': 3, 'b': 2, 3: 100, 4: 4}) == 3
>>> assert argmax(iter(['a', 'c', 'b', 'z', 'f'])) == 3
-
ubelt.util_list.argmin(indexable, key=None)
Returns index / key of the item with the smallest value.
This is similar to numpy.argmin, but it is written in pure python and works on both lists and dictionaries.
Parameters: - indexable (Iterable or Mapping) – indexable to sort by
- key (Callable, optional) – customizes the ordering of the indexable
Example
>>> from ubelt.util_list import argmin
>>> assert argmin({'a': 3, 'b': 2, 'c': 100}) == 'b'
>>> assert argmin(['a', 'c', 'b', 'z', 'f']) == 0
>>> assert argmin([[0, 1], [2, 3, 4], [5]], key=len) == 2
>>> assert argmin({'a': 3, 'b': 2, 3: 100, 4: 4}) == 'b'
>>> assert argmin(iter(['a', 'c', 'A', 'z', 'f'])) == 2
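Both argmax and argmin can be pictured as the builtin max and min applied to (index, value) pairs and compared by value only. The names below (_pairs, argmax_sketch, argmin_sketch) are hypothetical, and the sketch makes no claim about how ubelt breaks ties.

```python
from collections.abc import Mapping


def _pairs(indexable):
    """Hypothetical helper: (key, value) for dicts, (index, value) otherwise."""
    if isinstance(indexable, Mapping):
        return indexable.items()
    return enumerate(indexable)


def argmax_sketch(indexable, key=None):
    """Index or key of the largest value, comparing values only."""
    rank = (lambda pair: pair[1]) if key is None else (lambda pair: key(pair[1]))
    return max(_pairs(indexable), key=rank)[0]


def argmin_sketch(indexable, key=None):
    """Index or key of the smallest value, comparing values only."""
    rank = (lambda pair: pair[1]) if key is None else (lambda pair: key(pair[1]))
    return min(_pairs(indexable), key=rank)[0]


largest = argmax_sketch({'a': 3, 'b': 2, 3: 100, 4: 4})
smallest = argmin_sketch(iter(['a', 'c', 'A', 'z', 'f']))
```

Comparing by value only is what lets a dict with mixed key types, such as 'a' and 3, work without the keys ever being compared to each other.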
-
ubelt.util_list.peek(iterable)
Look at the first item of an iterable. If the input is an iterator, then the next element is consumed (i.e. a pop operation).
Parameters: iterable (List[T]) – an iterable
Returns: item: the first item of ordered sequence, a popped item from an iterator, or an arbitrary item from an unordered collection.
Return type: T
Example
>>> import ubelt as ub
>>> data = [0, 1, 2]
>>> ub.peek(data)
0
>>> iterator = iter(data)
>>> print(ub.peek(iterator))
0
>>> print(ub.peek(iterator))
1
>>> print(ub.peek(iterator))
2
>>> ub.peek(range(3))
0
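The difference between peeking at a list and peeking at an iterator follows from how the builtins iter and next behave. This is a minimal sketch with a hypothetical name (peek_sketch), assuming the function amounts to next(iter(...)).

```python
def peek_sketch(iterable):
    """Hypothetical stand-in for peek."""
    # iter() on a list builds a fresh iterator each time, while iter() on
    # an iterator returns the iterator itself, so next() advances it.
    return next(iter(iterable))


data = [0, 1, 2]
from_list = [peek_sketch(data), peek_sketch(data)]

iterator = iter(data)
from_iterator = [peek_sketch(iterator), peek_sketch(iterator)]
remaining = list(iterator)
```

Here the list is left untouched by repeated peeks, while the iterator loses one item per call.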