ubelt.util_dict module

ubelt.util_dict.odict: alias of OrderedDict

ubelt.util_dict.ddict: alias of defaultdict

class ubelt.util_dict.AutoDict

Bases: dict

An infinitely nested default dict of dicts.

Implementation of Perl’s autovivification feature.

SeeAlso:: ub.AutoOrderedDict - the ordered version

References

http://stackoverflow.com/questions/651794/init-dict-of-dicts

Example

>>> import ubelt as ub
>>> auto = ub.AutoDict()
>>> auto[0][10][100] = None
>>> assert str(auto) == '{0: {10: {100: None}}}'

to_dict()

Recursively casts an AutoDict into a regular dictionary. All nested AutoDict values are also converted.

Returns:	a copy of this dict without autovivification
Return type:	dict

Example

>>> from ubelt.util_dict import AutoDict
>>> auto = AutoDict()
>>> auto[1] = 1
>>> auto['n1'] = AutoDict()
>>> static = auto.to_dict()
>>> assert not isinstance(static, AutoDict)
>>> assert not isinstance(static['n1'], AutoDict)
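
The behaviour described above can be reproduced with Python's `__missing__` hook. The following is a minimal sketch, not ubelt's actual source; the class name `SketchAutoDict` is hypothetical and exists only for illustration.

```python
class SketchAutoDict(dict):
    """Hypothetical stand-in for AutoDict; not ubelt's actual source."""

    def __missing__(self, key):
        # dict.__getitem__ calls __missing__ when a key is absent, so
        # create a nested instance, store it, and return it.
        value = self[key] = type(self)()
        return value

    def to_dict(self):
        # Recursively rebuild the structure out of plain dicts.
        return {
            key: value.to_dict() if isinstance(value, SketchAutoDict) else value
            for key, value in self.items()
        }


auto = SketchAutoDict()
auto[0][10][100] = None  # intermediate levels are created on access
static = auto.to_dict()  # plain dicts all the way down
```

Note that merely reading a missing key (for example `auto['typo']`) also creates an entry in this sketch, which is one reason to convert to a regular dict once construction is finished.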

class ubelt.util_dict.AutoOrderedDict

Bases: collections.OrderedDict, ubelt.util_dict.AutoDict

An infinitely nested default dict of dicts that maintains the ordering of items.

SeeAlso:

ub.AutoDict - the unordered version

Example

>>> import ubelt as ub
>>> auto = ub.AutoOrderedDict()
>>> auto[0][3] = 3
>>> auto[0][2] = 2
>>> auto[0][1] = 1
>>> assert list(auto[0].values()) == [3, 2, 1]

ubelt.util_dict.dzip(items1, items2)

Zips elementwise pairs between items1 and items2 into a dictionary. Values from items2 can be broadcast onto items1.

Parameters:

items1 (Sequence) – full sequence
items2 (Sequence) – can either be a sequence of one item or a sequence of equal length to items1

Returns:	similar to dict(zip(items1, items2))
Return type:	dict

Example

>>> from ubelt.util_dict import dzip
>>> assert dzip([1, 2, 3], [4]) == {1: 4, 2: 4, 3: 4}
>>> assert dzip([1, 2, 3], [4, 4, 4]) == {1: 4, 2: 4, 3: 4}
>>> assert dzip([], [4]) == {}
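
The broadcasting rule can be sketched in a few lines. This is an illustrative re-implementation, not ubelt's source; the name `dzip_sketch` is hypothetical, and raising `ValueError` on a length mismatch is an assumption of this sketch.

```python
def dzip_sketch(items1, items2):
    """Hypothetical illustration of dzip's broadcasting behaviour."""
    items1 = list(items1)
    items2 = list(items2)
    if len(items2) == 1:
        # Broadcast the single value across every key in items1.
        items2 = items2 * len(items1)
    if len(items1) != len(items2):
        # Assumption of this sketch: mismatched lengths are an error.
        raise ValueError('items1 and items2 must have compatible lengths')
    return dict(zip(items1, items2))


broadcast = dzip_sketch([1, 2, 3], [4])
paired = dzip_sketch(['a', 'b'], [1, 2])
empty = dzip_sketch([], [4])
```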

ubelt.util_dict.group_items(item_list, groupid_list, sorted_=True)

Groups a list of items by group id.

Parameters:

item_list (list) – a list of items to group
groupid_list (list) – a corresponding list of item groupids
sorted_ (bool) – if True, preserves the ordering of items within groups (default = True)

Todo

[ ] change names from item_list->values and groupid_list->keys
[ ] allow keys to be an iterable or a function so this can work similar to itertools.groupby

Returns:	groupid_to_items: maps a groupid to a list of items
Return type:	dict

CommandLine:: python -m ubelt.util_dict group_items

Example

>>> import ubelt as ub
>>> item_list    = ['ham',     'jam',   'spam',     'eggs',    'cheese', 'banana']
>>> groupid_list = ['protein', 'fruit', 'protein',  'protein', 'dairy',  'fruit']
>>> groupid_to_items = ub.group_items(item_list, groupid_list)
>>> print(ub.repr2(groupid_to_items, nl=0))
{'dairy': ['cheese'], 'fruit': ['jam', 'banana'], 'protein': ['ham', 'spam', 'eggs']}
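
The grouping itself can be sketched with `collections.defaultdict`. This is a simplified illustration under the assumption that items are appended in input order; it ignores the `sorted_` flag, and the name `group_items_sketch` is hypothetical.

```python
from collections import defaultdict


def group_items_sketch(item_list, groupid_list):
    """Hypothetical illustration: map each group id to its items."""
    groups = defaultdict(list)
    for item, groupid in zip(item_list, groupid_list):
        # Appending in input order keeps items ordered within a group.
        groups[groupid].append(item)
    return dict(groups)


items = ['ham', 'jam', 'spam', 'eggs', 'cheese', 'banana']
groupids = ['protein', 'fruit', 'protein', 'protein', 'dairy', 'fruit']
grouped = group_items_sketch(items, groupids)
```

Unlike `itertools.groupby`, this approach does not require the input to be sorted by group id first.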

ubelt.util_dict.dict_hist(item_list, weight_list=None, ordered=False, labels=None)

Builds a histogram of items

Parameters:

item_list (list) – list with hashable items (usually containing duplicates)
weight_list (list) – list of weights, one for each item
ordered (bool) – if True the result is ordered by frequency
labels (list) – expected labels (default None). If specified, the frequency of each label is initialized to zero and item_list can only contain items specified in labels.

Returns:

dictionary where the keys are items in item_list, and the values are the number of times the item appears in item_list (or the sum of its weights when weight_list is given).

Return type:

dict

CommandLine:: python -m ubelt.util_dict dict_hist

Example

>>> import ubelt as ub
>>> item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist = ub.dict_hist(item_list)
>>> print(ub.repr2(hist, nl=0))
{1: 1, 2: 4, 39: 1, 900: 3, 1232: 2}

Example

>>> import ubelt as ub
>>> item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist1 = ub.dict_hist(item_list)
>>> hist2 = ub.dict_hist(item_list, ordered=True)
>>> try:
>>>     hist3 = ub.dict_hist(item_list, labels=[])
>>> except KeyError:
>>>     pass
>>> else:
>>>     raise AssertionError('expected key error')
>>> #result = ub.repr2(hist_)
>>> weight_list = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
>>> hist4 = ub.dict_hist(item_list, weight_list=weight_list)
>>> print(ub.repr2(hist1, nl=0))
{1: 1, 2: 4, 39: 1, 900: 3, 1232: 2}
>>> print(ub.repr2(hist4, nl=0))
{1: 1, 2: 4, 39: 1, 900: 1, 1232: 0}
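
To make the interaction between `weight_list`, `ordered`, and `labels` concrete, here is a minimal sketch of a weighted histogram. It is not ubelt's implementation; the name `dict_hist_sketch` is hypothetical, and sorting in ascending order of frequency when `ordered=True` is an assumption of this sketch.

```python
from collections import OrderedDict
from itertools import repeat


def dict_hist_sketch(item_list, weight_list=None, ordered=False, labels=None):
    """Hypothetical illustration of a (weighted) histogram of items."""
    # When labels are given, every expected label starts at zero.
    hist = {} if labels is None else {label: 0 for label in labels}
    # Without explicit weights, every occurrence counts once.
    weights = repeat(1) if weight_list is None else weight_list
    for item, weight in zip(item_list, weights):
        if labels is not None and item not in hist:
            # Items outside the expected labels are rejected.
            raise KeyError(item)
        hist[item] = hist.get(item, 0) + weight
    if ordered:
        # Assumption: ascending order of frequency.
        hist = OrderedDict(sorted(hist.items(), key=lambda kv: kv[1]))
    return hist


item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
weight_list = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
plain = dict_hist_sketch(item_list)
weighted = dict_hist_sketch(item_list, weight_list=weight_list)
by_freq = dict_hist_sketch(item_list, ordered=True)
```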

ubelt.util_dict.find_duplicates(items, k=2)

Find all duplicate items in a list.

Search for all items that appear at least k times and return a mapping from each duplicate item to the positions at which it appears.

Parameters:

items (list) – a list of hashable items possibly containing duplicates
k (int) – only return items that appear at least k times (default=2)

Returns:	maps each duplicate item to the indices at which it appears
Return type:	dict

CommandLine:: python -m ubelt.util_dict find_duplicates

Example

>>> import ubelt as ub
>>> items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
>>> duplicates = ub.find_duplicates(items)
>>> print('items = %r' % (items,))
>>> print('duplicates = %r' % (duplicates,))
>>> assert duplicates == {0: [0, 1, 6], 2: [3, 8], 3: [4, 5]}
>>> assert ub.find_duplicates(items, 3) == {0: [0, 1, 6]}

Example

>>> import ubelt as ub
>>> items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
>>> # note: k can be 0
>>> duplicates = ub.find_duplicates(items, k=0)
>>> print(ub.repr2(duplicates, nl=0))
{0: [0, 1, 6], 1: [2], 2: [3, 8], 3: [4, 5], 9: [9], 12: [7]}
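
The search amounts to recording the positions of every item and then filtering by count. The sketch below illustrates that idea; it is not ubelt's source, and the name `find_duplicates_sketch` is hypothetical.

```python
from collections import defaultdict


def find_duplicates_sketch(items, k=2):
    """Hypothetical illustration: positions of items seen at least k times."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    # Keep only the items that occur at least k times.
    return {item: idxs for item, idxs in positions.items() if len(idxs) >= k}


items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
pairs_or_more = find_duplicates_sketch(items)
triples_or_more = find_duplicates_sketch(items, k=3)
```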

ubelt.util_dict.dict_subset(dict_, keys, default=NoParam)

Get a subset of a dictionary

Parameters:

dict_ (dict) – superset dictionary
keys (list) – keys to take from dict_

Returns:	subset dictionary
Return type:	dict

Example

>>> import ubelt as ub
>>> dict_ = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
>>> keys = ['K', 'dcvs_clip_max']
>>> subdict_ = ub.dict_subset(dict_, keys)
>>> print(ub.repr2(subdict_, nl=0))
{'K': 3, 'dcvs_clip_max': 0.2}
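
For the simple case where every requested key is present, the operation is equivalent to a dict comprehension. This sketch deliberately leaves out the `default` argument, and the name `dict_subset_sketch` is hypothetical.

```python
def dict_subset_sketch(dict_, keys):
    """Hypothetical illustration: keep only the requested keys."""
    # A missing key raises KeyError, as plain indexing would.
    return {key: dict_[key] for key in keys}


params = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
subset = dict_subset_sketch(params, ['K', 'dcvs_clip_max'])
```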

ubelt.util_dict.dict_take(dict_, keys, default=NoParam)

Generates values from a dictionary

Parameters:

dict_ (dict)
keys (list)
default (Optional) – if specified, used as the value for any missing key; otherwise a missing key raises KeyError

CommandLine:: python -m ubelt.util_dict dict_take

Example

>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> keys = [1, 2, 3, 4, 5]
>>> result = list(ub.dict_take(dict_, keys, None))
>>> assert result == ['a', 'b', 'c', None, None]

Example

>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> keys = [1, 2, 3, 4, 5]
>>> try:
>>>     print(list(ub.dict_take(dict_, keys)))
>>>     raise AssertionError('did not get key error')
>>> except KeyError:
>>>     print('correctly got key error')
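
The difference between passing and omitting `default` can be sketched with a private sentinel object, which plays the role that `NoParam` plays in the signature above. This is an illustration only; `dict_take_sketch` and `_MISSING` are hypothetical names.

```python
_MISSING = object()  # sentinel meaning "no default was supplied"


def dict_take_sketch(dict_, keys, default=_MISSING):
    """Hypothetical illustration: lazily yield the value for each key."""
    for key in keys:
        if default is _MISSING:
            # No default: a missing key raises KeyError.
            yield dict_[key]
        else:
            yield dict_.get(key, default)


letters = {1: 'a', 2: 'b', 3: 'c'}
taken = list(dict_take_sketch(letters, [1, 2, 3, 4, 5], None))
```

A sentinel is used instead of `None` so that `None` itself remains a valid default value. Because the function is a generator, the `KeyError` only surfaces once the result is consumed, for example by `list(...)`.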

ubelt.util_dict.dict_union(*args)

Combines the keys of multiple dictionaries. For keys that appear in more than one dictionary, dictionaries towards the end of the sequence are given precedence.

Parameters:	*args – a sequence of dictionaries
Returns:	OrderedDict if the first argument is an OrderedDict, otherwise dict

Example

>>> from ubelt.util_dict import dict_union, odict
>>> result = dict_union({'a': 1, 'b': 1}, {'b': 2, 'c': 2})
>>> assert result == {'a': 1, 'b': 2, 'c': 2}
>>> dict_union(odict([('a', 1), ('b', 2)]), odict([('c', 3), ('d', 4)]))
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
>>> dict_union()
{}
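
The precedence rule and the choice of return type can be sketched as successive `update` calls. This is an illustrative re-implementation, not ubelt's source; the name `dict_union_sketch` is hypothetical.

```python
from collections import OrderedDict


def dict_union_sketch(*dicts):
    """Hypothetical illustration: later dictionaries override earlier ones."""
    if not dicts:
        return {}
    # The container type follows the first argument.
    result = OrderedDict() if isinstance(dicts[0], OrderedDict) else {}
    for dict_ in dicts:
        # update() overwrites existing keys, giving later inputs precedence.
        result.update(dict_)
    return result


merged = dict_union_sketch({'a': 1, 'b': 1}, {'b': 2, 'c': 2})
ordered = dict_union_sketch(OrderedDict([('a', 1)]), {'b': 2})
```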

ubelt.util_dict.map_vals(func, dict_)

Applies a function to each of the values in a dictionary.

Parameters:

func (callable) – a function or indexable object
dict_ (dict) – a dictionary

Returns:	newdict: transformed dictionary
Return type:	dict

CommandLine:: python -m ubelt.util_dict map_vals

Example

>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> func = len
>>> newdict = ub.map_vals(func, dict_)
>>> assert newdict ==  {'a': 3, 'b': 0}
>>> print(newdict)
>>> # Can also use indexables as `func`
>>> dict_ = {'a': 0, 'b': 1}
>>> func = [42, 21]
>>> newdict = ub.map_vals(func, dict_)
>>> assert newdict ==  {'a': 42, 'b': 21}
>>> print(newdict)
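
Accepting either a callable or an indexable object can be sketched by falling back to the object's `__getitem__`. This is an illustration, not ubelt's source; the name `map_vals_sketch` is hypothetical.

```python
def map_vals_sketch(func, dict_):
    """Hypothetical illustration: transform values, keep keys."""
    # An indexable such as a list or dict is used as a lookup table.
    transform = func if callable(func) else func.__getitem__
    return {key: transform(value) for key, value in dict_.items()}


lengths = map_vals_sketch(len, {'a': [1, 2, 3], 'b': []})
looked_up = map_vals_sketch([42, 21], {'a': 0, 'b': 1})
```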

ubelt.util_dict.map_keys(func, dict_)

Applies a function to each of the keys in a dictionary.

Parameters:

func (callable) – a function or indexable object
dict_ (dict) – a dictionary

Returns:	newdict: transformed dictionary
Return type:	dict

CommandLine:: python -m ubelt.util_dict map_keys

Example

>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> func = ord
>>> newdict = ub.map_keys(func, dict_)
>>> print(newdict)
>>> assert newdict == {97: [1, 2, 3], 98: []}
>>> #ut.assert_raises(AssertionError, map_keys, len, dict_)
>>> dict_ = {0: [1, 2, 3], 1: []}
>>> func = ['a', 'b']
>>> newdict = ub.map_keys(func, dict_)
>>> print(newdict)
>>> assert newdict == {'a': [1, 2, 3], 'b': []}
>>> #ut.assert_raises(AssertionError, map_keys, len, dict_)
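
Transforming keys carries a risk that transforming values does not: two input keys may map to the same output key. The sketch below shows one way to guard against that. Raising `ValueError` on a collision is a choice of this sketch, not documented behaviour, and the name `map_keys_sketch` is hypothetical.

```python
def map_keys_sketch(func, dict_):
    """Hypothetical illustration: transform keys, keep values."""
    # An indexable such as a list or dict is used as a lookup table.
    transform = func if callable(func) else func.__getitem__
    newdict = {transform(key): value for key, value in dict_.items()}
    if len(newdict) != len(dict_):
        # Assumption of this sketch: silently dropping entries is an error.
        raise ValueError('multiple input keys mapped to the same output key')
    return newdict


by_ord = map_keys_sketch(ord, {'a': [1, 2, 3], 'b': []})
by_index = map_keys_sketch(['a', 'b'], {0: [1, 2, 3], 1: []})
```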

ubelt.util_dict.invert_dict(dict_, unique_vals=True)

Swaps the keys and values in a dictionary.

Parameters:

dict_ (dict) – dictionary to invert
unique_vals (bool) – if False, inverted keys are returned in a set. The default is True.

Returns:	inverted_dict
Return type:	dict

Notes

The values must be hashable.

If the original dictionary contains duplicate values, then only one of the corresponding keys will be returned and the others will be discarded. This can be prevented by setting unique_vals=False, causing the inverted keys to be returned in a set.

CommandLine:: python -m ubelt.util_dict invert_dict

Example

>>> import ubelt as ub
>>> dict_ = {'a': 1, 'b': 2}
>>> inverted_dict = ub.invert_dict(dict_)
>>> assert inverted_dict == {1: 'a', 2: 'b'}

Example

>>> import ubelt as ub
>>> dict_ = ub.odict([(2, 'a'), (1, 'b'), (0, 'c'), (None, 'd')])
>>> inverted_dict = ub.invert_dict(dict_)
>>> assert list(inverted_dict.keys())[0] == 'a'

Example

>>> import ubelt as ub
>>> dict_ = {'a': 1, 'b': 0, 'c': 0, 'd': 0, 'f': 2}
>>> inverted_dict = ub.invert_dict(dict_, unique_vals=False)
>>> assert inverted_dict == {0: {'b', 'c', 'd'}, 1: {'a'}, 2: {'f'}}
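
The two modes can be sketched side by side to show why duplicate values lose keys in one mode and not the other. This is an illustrative re-implementation, not ubelt's source; the name `invert_dict_sketch` is hypothetical.

```python
def invert_dict_sketch(dict_, unique_vals=True):
    """Hypothetical illustration: swap keys and values."""
    if unique_vals:
        # With duplicate values, later keys overwrite earlier ones.
        return {value: key for key, value in dict_.items()}
    inverted = {}
    for key, value in dict_.items():
        # Collect every key that shares the same value.
        inverted.setdefault(value, set()).add(key)
    return inverted


scores = {'a': 1, 'b': 0, 'c': 0, 'd': 0, 'f': 2}
grouped = invert_dict_sketch(scores, unique_vals=False)
lossy = invert_dict_sketch(scores)
```

With `unique_vals=True`, `lossy` holds exactly one key for the value `0` even though three keys shared it.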