ubelt package

Example

>>> import ubelt as ub
>>> base = ub.Path('base')
>>> assert base.endswith('se')
>>> assert not base.endswith('be')
>>> # test start / stop cases
>>> assert ub.Path('aabbccdd').endswith('cdd', 5)
>>> assert not ub.Path('aabbccdd').endswith('cdd', 6)
>>> assert ub.Path('aabbccdd').endswith('cdd', 5, 10)
>>> assert not ub.Path('aabbccdd').endswith('cdd', 5, 7)
>>> # test tuple case
>>> assert ub.Path('aabbccdd').endswith(('foo', 'cdd'))
>>> assert ub.Path('foo').endswith(('foo', 'cdd'))
>>> assert not ub.Path('bar').endswith(('foo', 'cdd'))

startswith(prefix, *args)[source]¶

Test if the fspath representation starts with a particular string.

Allows ubelt.Path to be a better drop-in replacement when working with string-based paths.

Parameters

prefix (str | Tuple[str, …]) – One or more prefixes to test for
*args – start (int): if specified begin testing at this position. end (int): if specified stop testing at this position.

Returns

True if any of the prefixes are matched.

Return type

bool

Example

>>> import ubelt as ub
>>> base = ub.Path('base')
>>> assert base.startswith('base')
>>> assert not base.startswith('all your')
>>> # test start / stop cases
>>> assert ub.Path('aabbccdd').startswith('aab', 0)
>>> assert ub.Path('aabbccdd').startswith('aab', 0, 5)
>>> assert not ub.Path('aabbccdd').startswith('aab', 1, 5)
>>> assert not ub.Path('aabbccdd').startswith('aab', 0, 2)
>>> # test tuple case
>>> assert ub.Path('aabbccdd').startswith(('foo', 'aab'))
>>> assert ub.Path('foo').startswith(('foo', 'aab'))
>>> assert not ub.Path('bar').startswith(('foo', 'aab'))
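
The string-style behaviour shown above can be approximated with the standard library alone. The following is a minimal sketch, not ubelt's implementation: the helper names path_startswith and path_endswith are hypothetical, and the only assumption is that the test is applied to the os.fspath string of the path.

```python
import os
from pathlib import PurePosixPath


def path_startswith(path, prefix, *args):
    """Hypothetical helper: test the fspath string of ``path`` for a prefix."""
    # str.startswith already accepts a tuple of prefixes and optional
    # start / end positions, so the arguments are forwarded unchanged.
    return os.fspath(path).startswith(prefix, *args)


def path_endswith(path, suffix, *args):
    """Hypothetical helper: test the fspath string of ``path`` for a suffix."""
    return os.fspath(path).endswith(suffix, *args)


p = PurePosixPath('aabbccdd')
print(path_startswith(p, 'aab', 0, 5))     # True
print(path_startswith(p, ('foo', 'aab')))  # True
print(path_endswith(p, 'cdd', 5, 7))       # False
```

Forwarding *args keeps the start / end semantics identical to str.startswith and str.endswith.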

class ubelt.ProgIter(iterable=None, desc=None, total=None, freq=1, initial=0, eta_window=64, clearline=True, adjust=True, time_thresh=2.0, show_times=True, show_wall=False, enabled=True, verbose=None, stream=None, chunksize=None, rel_adjust_limit=4.0, **kwargs)[source]¶

Bases: _TQDMCompat, _BackwardsCompat

Prints progress as an iterator progresses

ProgIter is an alternative to tqdm and implements much of the tqdm API. The main difference between the two is that ProgIter does not use threading, whereas tqdm does.

Variables

iterable (List | Iterable) – A list or iterable to loop over
desc (str) – description label to show with progress
total (int) – Maximum length of the process. If not specified, we estimate it from the iterable, if possible.
freq (int) – How many iterations to wait between messages. Defaults to 1.
adjust (bool) – if True, freq is adjusted based on time_thresh. This may be overridden depending on the setting of verbose. default=True
eta_window (int) – number of previous measurements to use in eta calculation, default=64
clearline (bool) – if True, messages are printed on the same line; otherwise each new progress message is printed on a new line. default=True
time_thresh (float) – desired amount of time to wait between messages if adjust is True otherwise does nothing, default=2.0
show_times (bool) – shows rate and eta, default=True
show_wall (bool) – show wall time, default=False
initial (int) – starting index offset, default=0
stream (IO) – stream where progress information is written to, default=sys.stdout
enabled (bool) – if False nothing happens. default=True
chunksize (int | None) – indicates that each iteration processes a batch of this size. Iteration rate is displayed in terms of single-items.
rel_adjust_limit (float) – Maximum factor update frequency can be adjusted by in a single step. default=4.0
verbose (int) – verbosity mode, which controls clearline, adjust, and enabled. The following maps the value of verbose to its effect. 0: enabled=False, 1: enabled=True with clearline=True and adjust=True, 2: enabled=True with clearline=False and adjust=True, 3: enabled=True with clearline=False and adjust=False
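
The verbose table above can be written out as a small lookup. This is only a sketch of the documented mapping: resolve_verbose is a hypothetical helper, not a ProgIter method, and the source does not say what happens for values outside 0 to 3, so the sketch simply raises KeyError for them.

```python
def resolve_verbose(verbose):
    """Hypothetical helper: map ``verbose`` to the settings it controls."""
    table = {
        0: dict(enabled=False),
        1: dict(enabled=True, clearline=True, adjust=True),
        2: dict(enabled=True, clearline=False, adjust=True),
        3: dict(enabled=True, clearline=False, adjust=False),
    }
    # Values not listed in the documentation raise KeyError (an assumption).
    return table[verbose]


print(resolve_verbose(2))
```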

Note

Either use ProgIter in a with statement or call prog.end() at the end of the computation if there is a possibility that the entire iterable may not be exhausted.

Note

ProgIter is an alternative to tqdm. The main difference between ProgIter and tqdm is that ProgIter does not use threading, whereas tqdm does. ProgIter is simpler than tqdm and thus more stable in certain circumstances.

See also: tqdm - https://pypi.python.org/pypi/tqdm

References

http://datagenetics.com/blog/february12017/index.html

Example

>>> from ubelt import ProgIter
>>> def is_prime(n):
...     return n >= 2 and not any(n % i == 0 for i in range(2, n))
>>> for n in ProgIter(range(100), verbose=1, show_wall=True):
>>>     # do some work
>>>     is_prime(n)
100/100... rate=... Hz, total=..., wall=...

set_extra(extra)[source]¶

Specify custom info to append to the end of the next message.

Todo

[ ] extra is a bad name; come up with something better and rename

Example

>>> prog = ProgIter(range(100, 300, 100), show_times=False, verbose=3)
>>> for n in prog:
>>>     prog.set_extra('processing num {}'.format(n))
0/2...
1/2...processing num 100
2/2...processing num 200

step(inc=1, force=False)[source]¶

Manually step progress update, either directly or by an increment.

Parameters

inc (int, default=1) – number of steps to increment
force (bool, default=False) – if True forces progress display

Example

>>> n = 3
>>> prog = ProgIter(desc='manual', total=n, verbose=3)
>>> # Need to manually begin and end in this mode
>>> prog.begin()
>>> for _ in range(n):
...     prog.step()
>>> prog.end()

Example

>>> n = 3
>>> # can be used as a context manager in manual mode
>>> with ProgIter(desc='manual', total=n, verbose=3) as prog:
...     for _ in range(n):
...         prog.step()

start()[source]¶: Alias of ubelt.progiter.ProgIter.begin()

begin()[source]¶

Initializes information used to measure progress

This only needs to be used if this ProgIter is not wrapping an iterable. Does nothing if this ProgIter is disabled.

Returns: a chainable self-reference
Return type: ProgIter

end()[source]¶

Signals that iteration has ended and displays the final message.

This only needs to be used if this ProgIter is not wrapping an iterable. Does nothing if this ProgIter object is disabled or has already finished.

format_message()[source]¶

Builds a formatted progress message with the current values. The message contains the special characters needed to clear lines.

Example

>>> self = ProgIter(clearline=False, show_times=False)
>>> print(repr(self.format_message()))
'    0/?... \n'
>>> self.begin()
>>> self.step()
>>> print(repr(self.format_message()))
' 1/?... \n'

Example

>>> self = ProgIter(chunksize=10, total=100, clearline=False,
>>>                 show_times=False, microseconds=True)
>>> # hack, microseconds=True for coverage, needs real test
>>> print(repr(self.format_message()))
' 0.00% of 10x100... \n'
>>> self.begin()
>>> self.update()  # tqdm alternative to step
>>> print(repr(self.format_message()))
' 1.00% of 10x100... \n'

ensure_newline()[source]¶

Use this before any custom printing while the progress iterator is active, to ensure your print statement starts on a new line instead of at the end of a progress line.

Example

>>> # Unsafe version may write your message on the wrong line
>>> prog = ProgIter(range(3), show_times=False, freq=2, adjust=False)
>>> for n in prog:
...     print('unsafe message')
 0/3... unsafe message
unsafe message
 2/3... unsafe message
 3/3...
>>> # apparently the safe version does this too.
>>> print('---')
---
>>> prog = ProgIter(range(3), show_times=False, freq=2, adjust=False)
>>> for n in prog:
...     prog.ensure_newline()
...     print('safe message')
 0/3...
safe message
safe message
 2/3...
safe message
 3/3...

display_message()[source]¶: Writes current progress to the output stream

class ubelt.SetDict[source]¶

Bases: dict

A dictionary subclass where all set operations are defined.

All of the set operations are defined in a key-wise fashion; that is, each one behaves like the corresponding operation on the sets of keys.

Note

The SetDict class only defines key-wise set operations. Value-wise or item-wise operations are not supported because values are, in general, not hashable. A heavier extension would be needed for that.

Example

>>> import ubelt as ub
>>> primes = ub.sdict({v: f'prime_{v}' for v in [2, 3, 5, 7, 11]})
>>> evens = ub.sdict({v: f'even_{v}' for v in [0, 2, 4, 6, 8, 10]})
>>> odds = ub.sdict({v: f'odd_{v}' for v in [1, 3, 5, 7, 9, 11]})
>>> squares = ub.sdict({v: f'square_{v}' for v in [0, 1, 4, 9]})
>>> div3 = ub.sdict({v: f'div3_{v}' for v in [0, 3, 6, 9]})
>>> # All of the set methods are defined
>>> results1 = {}
>>> results1['ints'] = ints = odds.union(evens)
>>> results1['composites'] = ints.difference(primes)
>>> results1['even_primes'] = evens.intersection(primes)
>>> results1['odd_nonprimes_and_two'] = odds.symmetric_difference(primes)
>>> print('results1 = {}'.format(ub.repr2(results1, nl=2, sort=True)))
results1 = {
    'composites': {
        0: 'even_0',
        1: 'odd_1',
        4: 'even_4',
        6: 'even_6',
        8: 'even_8',
        9: 'odd_9',
        10: 'even_10',
    },
    'even_primes': {
        2: 'even_2',
    },
    'ints': {
        0: 'even_0',
        1: 'odd_1',
        2: 'even_2',
        3: 'odd_3',
        4: 'even_4',
        5: 'odd_5',
        6: 'even_6',
        7: 'odd_7',
        8: 'even_8',
        9: 'odd_9',
        10: 'even_10',
        11: 'odd_11',
    },
    'odd_nonprimes_and_two': {
        1: 'odd_1',
        2: 'prime_2',
        9: 'odd_9',
    },
}
>>> # As well as their corresponding binary operators
>>> assert results1['ints'] == odds | evens
>>> assert results1['composites'] == ints - primes
>>> assert results1['even_primes'] == evens & primes
>>> assert results1['odd_nonprimes_and_two'] == odds ^ primes
>>> # These can also be used as classmethods
>>> assert results1['ints'] == ub.sdict.union(odds, evens)
>>> assert results1['composites'] == ub.sdict.difference(ints, primes)
>>> assert results1['even_primes'] == ub.sdict.intersection(evens, primes)
>>> assert results1['odd_nonprimes_and_two'] == ub.sdict.symmetric_difference(odds, primes)
>>> # The n-ary variants are also implemented
>>> results2 = {}
>>> results2['nary_union'] = ub.sdict.union(primes, div3, odds)
>>> results2['nary_difference'] = ub.sdict.difference(primes, div3, odds)
>>> results2['nary_intersection'] = ub.sdict.intersection(primes, div3, odds)
>>> # Note that the definition of symmetric difference might not be what you think in the nary case.
>>> results2['nary_symmetric_difference'] = ub.sdict.symmetric_difference(primes, div3, odds)
>>> print('results2 = {}'.format(ub.repr2(results2, nl=2, sort=True)))
results2 = {
    'nary_difference': {
        2: 'prime_2',
    },
    'nary_intersection': {
        3: 'prime_3',
    },
    'nary_symmetric_difference': {
        0: 'div3_0',
        1: 'odd_1',
        2: 'prime_2',
        3: 'odd_3',
        6: 'div3_6',
    },
    'nary_union': {
        0: 'div3_0',
        1: 'odd_1',
        2: 'prime_2',
        3: 'odd_3',
        5: 'odd_5',
        6: 'div3_6',
        7: 'odd_7',
        9: 'odd_9',
        11: 'odd_11',
    },
}

Example

>>> # A neat thing about our implementation is that often the right
>>> # hand side is not required to be a dictionary, just something
>>> # that can be cast to a set.
>>> import ubelt as ub
>>> primes = ub.sdict({2: 'a', 3: 'b', 5: 'c', 7: 'd', 11: 'e'})
>>> assert primes - {2, 3} == {5: 'c', 7: 'd', 11: 'e'}
>>> assert primes & {2, 3} == {2: 'a', 3: 'b'}
>>> # Union does need to have a second dictionary
>>> import pytest
>>> with pytest.raises(AttributeError):
>>>     primes | {2, 3}

copy()[source]¶

Example

>>> import ubelt as ub
>>> a = ub.sdict({1: 1, 2: 2, 3: 3})
>>> b = ub.udict({1: 1, 2: 2, 3: 3})
>>> c = a.copy()
>>> d = b.copy()
>>> assert c is not a
>>> assert d is not b
>>> assert d == b
>>> assert c == a
>>> list(map(type, [a, b, c, d]))
>>> assert isinstance(c, ub.sdict)
>>> assert isinstance(d, ub.udict)

union(*others, cls=None)[source]¶

Return the key-wise union of two or more dictionaries.

For items with intersecting keys, dictionaries towards the end of the sequence are given precedence.

Parameters

self (SetDict | dict) – if called as a static method this must be provided.
*others – other dictionary like objects that have an items method. (i.e. it must return an iterable of 2-tuples where the first item is hashable.)
cls (type) – the desired return dictionary type.

Returns

whatever the dictionary type of the first argument is

Return type

dict

Example

>>> import ubelt as ub
>>> a = ub.SetDict({k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]})
>>> b = ub.SetDict({k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]})
>>> c = ub.SetDict({k: 'C_' + chr(97 + k) for k in [2, 8, 3]})
>>> d = ub.SetDict({k: 'D_' + chr(97 + k) for k in [9, 10, 11]})
>>> e = ub.SetDict({k: 'E_' + chr(97 + k) for k in []})
>>> assert a | b == {2: 'B_c', 3: 'A_d', 5: 'A_f', 7: 'B_h', 4: 'B_e', 0: 'B_a'}
>>> a.union(b)
>>> a | b | c
>>> res = ub.SetDict.union(a, b, c, d, e)
>>> print(ub.repr2(res, sort=1, nl=0, si=1))
{0: B_a, 2: C_c, 3: C_d, 4: B_e, 5: A_f, 7: B_h, 8: C_i, 9: D_j, 10: D_k, 11: D_l}
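
The precedence rule described above (later dictionaries win on shared keys) can be sketched on plain dicts. keywise_union is a hypothetical stand-in written for illustration; it is not the SetDict implementation and ignores the cls argument.

```python
def keywise_union(*dicts):
    """Hypothetical sketch of a key-wise union with last-wins precedence."""
    result = {}
    for d in dicts:
        # update() overwrites existing keys, so later dictionaries
        # take precedence for keys that appear more than once.
        result.update(d)
    return result


a = {k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]}
b = {k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]}
print(keywise_union(a, b))
```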

intersection(*others, cls=None)[source]¶

Return the key-wise intersection of two or more dictionaries.

All items returned will be from the first dictionary for keys that exist in all other dictionaries / sets provided.

Parameters

self (SetDict | dict) – if called as a static method this must be provided.
*others – other dictionary or set like objects that can be coerced into a set of keys.
cls (type) – the desired return dictionary type.

Returns

whatever the dictionary type of the first argument is

Return type

dict

Example

>>> import ubelt as ub
>>> a = ub.SetDict({k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]})
>>> b = ub.SetDict({k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]})
>>> c = ub.SetDict({k: 'C_' + chr(97 + k) for k in [2, 8, 3]})
>>> d = ub.SetDict({k: 'D_' + chr(97 + k) for k in [9, 10, 11]})
>>> e = ub.SetDict({k: 'E_' + chr(97 + k) for k in []})
>>> assert a & b == {2: 'A_c', 7: 'A_h'}
>>> a.intersection(b)
>>> a & b & c
>>> res = ub.SetDict.intersection(a, b, c, d, e)
>>> print(ub.repr2(res, sort=1, nl=0, si=1))
{}
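
A plain-dict sketch of the key-wise intersection described above: items come from the first dictionary, and the other arguments only need to be coercible to a set of keys. keywise_intersection is a hypothetical name, and this is not the SetDict implementation.

```python
def keywise_intersection(first, *others):
    """Hypothetical sketch: keep items of ``first`` whose keys are in all others."""
    common = set(first)
    for other in others:
        # set(other) yields the keys of a dict, or the elements of a set
        common &= set(other)
    return {k: v for k, v in first.items() if k in common}


a = {k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]}
b = {k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]}
print(keywise_intersection(a, b))       # {2: 'A_c', 7: 'A_h'}
print(keywise_intersection(a, {2, 3}))  # {2: 'A_c', 3: 'A_d'}
```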

difference(*others, cls=None)[source]¶

Return the key-wise difference between this dictionary and one or more other dictionary / keys.

The returned items will be from the first dictionary, and will only contain keys that do not appear in any of the other dictionaries / sets.

Parameters

self (SetDict | dict) – if called as a static method this must be provided.
*others – other dictionary or set like objects that can be coerced into a set of keys.
cls (type) – the desired return dictionary type.

Returns

whatever the dictionary type of the first argument is

Return type

dict

Example

>>> import ubelt as ub
>>> a = ub.SetDict({k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]})
>>> b = ub.SetDict({k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]})
>>> c = ub.SetDict({k: 'C_' + chr(97 + k) for k in [2, 8, 3]})
>>> d = ub.SetDict({k: 'D_' + chr(97 + k) for k in [9, 10, 11]})
>>> e = ub.SetDict({k: 'E_' + chr(97 + k) for k in []})
>>> assert a - b == {3: 'A_d', 5: 'A_f'}
>>> a.difference(b)
>>> a - b - c
>>> res = ub.SetDict.difference(a, b, c, d, e)
>>> print(ub.repr2(res, sort=1, nl=0, si=1))
{5: A_f}
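
The key-wise difference can be sketched the same way on plain dicts. keywise_difference is a hypothetical helper for illustration only, not the SetDict code.

```python
def keywise_difference(first, *others):
    """Hypothetical sketch: drop keys of ``first`` that appear in any other."""
    remove = set()
    for other in others:
        remove |= set(other)
    return {k: v for k, v in first.items() if k not in remove}


a = {k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]}
b = {k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]}
c = {k: 'C_' + chr(97 + k) for k in [2, 8, 3]}
print(keywise_difference(a, b))     # {3: 'A_d', 5: 'A_f'}
print(keywise_difference(a, b, c))  # {5: 'A_f'}
```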

symmetric_difference(*others, cls=None)[source]¶

Return the key-wise symmetric difference between this dictionary and one or more other dictionaries.

Returns items that are (key-wise) in an odd number of the given dictionaries. This is consistent with the standard n-ary definition of symmetric difference [WikiSymDiff] and corresponds with the xor operation.

Parameters

self (SetDict | dict) – if called as a static method this must be provided.
*others – other dictionary or set like objects that can be coerced into a set of keys.
cls (type) – the desired return dictionary type.

Returns

whatever the dictionary type of the first argument is

Return type

dict

References

WikiSymDiff: https://en.wikipedia.org/wiki/Symmetric_difference

Example

>>> import ubelt as ub
>>> a = ub.SetDict({k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]})
>>> b = ub.SetDict({k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]})
>>> c = ub.SetDict({k: 'C_' + chr(97 + k) for k in [2, 8, 3]})
>>> d = ub.SetDict({k: 'D_' + chr(97 + k) for k in [9, 10, 11]})
>>> e = ub.SetDict({k: 'E_' + chr(97 + k) for k in []})
>>> a ^ b
{3: 'A_d', 5: 'A_f', 4: 'B_e', 0: 'B_a'}
>>> a.symmetric_difference(b)
>>> a ^ b ^ c
>>> res = ub.SetDict.symmetric_difference(a, b, c, d, e)
>>> print(ub.repr2(res, sort=1, nl=0, si=1))
{0: B_a, 2: C_c, 4: B_e, 5: A_f, 8: C_i, 9: D_j, 10: D_k, 11: D_l}
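
The "odd number of dictionaries" rule can be made concrete by counting how often each key occurs. The sketch below is a hypothetical plain-dict version, not the SetDict implementation. Taking the value from the last dictionary that contains the key matches the example output above, but that is an assumption about the general behaviour.

```python
from collections import Counter


def keywise_symmetric_difference(*dicts):
    """Hypothetical sketch: keep keys that occur in an odd number of inputs."""
    counts = Counter()
    latest = {}
    for d in dicts:
        for key, value in d.items():
            counts[key] += 1
            latest[key] = value  # assumption: the last occurrence wins
    return {k: v for k, v in latest.items() if counts[k] % 2 == 1}


a = {k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]}
b = {k: 'B_' + chr(97 + k) for k in [2, 4, 0, 7]}
c = {k: 'C_' + chr(97 + k) for k in [2, 8, 3]}
print(sorted(keywise_symmetric_difference(a, b, c)))  # [0, 2, 4, 5, 8]
```

Key 2 appears in all three inputs (an odd count) and is therefore kept, which is the part of the n-ary definition that tends to surprise.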

class ubelt.TeeStringIO(redirect=None)[source]¶

Bases: StringIO

An IO object that writes to itself and another IO stream.

Variables: redirect (io.IOBase) – The other stream to write to.

Example

>>> import io
>>> import ubelt as ub
>>> redirect = io.StringIO()
>>> self = ub.TeeStringIO(redirect)
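
To illustrate what "writes to itself and another IO stream" means, here is a minimal standard-library sketch. SimpleTee is a hypothetical stand-in, not the TeeStringIO implementation; it omits isatty, fileno, and encoding handling.

```python
import io


class SimpleTee(io.StringIO):
    """Hypothetical sketch of a StringIO that mirrors writes to ``redirect``."""

    def __init__(self, redirect=None):
        super().__init__()
        self.redirect = redirect

    def write(self, msg):
        # Forward to the other stream first, then record the text locally.
        if self.redirect is not None:
            self.redirect.write(msg)
        return super().write(msg)

    def flush(self):
        if self.redirect is not None:
            self.redirect.flush()
        return super().flush()


redirect = io.StringIO()
tee = SimpleTee(redirect)
tee.write('hello')
tee.flush()
print(tee.getvalue(), redirect.getvalue())  # hello hello
```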

isatty()[source]¶: Returns True if the redirect is a terminal.

Note

Needed for IPython.embed to work properly when this class is used to override stdout / stderr.

fileno()[source]¶

Returns underlying file descriptor of the redirected IOBase object if one exists.

Example

>>> # Not sure the best way to test, this func is important for
>>> # capturing stdout when ipython embedding
>>> import io
>>> import pytest
>>> from ubelt import TeeStringIO
>>> with pytest.raises(io.UnsupportedOperation):
>>>     TeeStringIO(redirect=io.StringIO()).fileno()
>>> with pytest.raises(io.UnsupportedOperation):
>>>     TeeStringIO(None).fileno()

property encoding¶

Gets the encoding of the redirect IO object

Example

>>> import io
>>> import sys
>>> import ubelt as ub
>>> redirect = io.StringIO()
>>> assert ub.TeeStringIO(redirect).encoding is None
>>> assert ub.TeeStringIO(None).encoding is None
>>> assert ub.TeeStringIO(sys.stdout).encoding is sys.stdout.encoding
>>> redirect = io.TextIOWrapper(io.StringIO())
>>> assert ub.TeeStringIO(redirect).encoding is redirect.encoding

write(msg)[source]¶: Write to this and the redirected stream

flush()[source]¶: Flush to this and the redirected stream

class ubelt.TempDir[source]¶

Bases: object

Context for creating and cleaning up temporary directories.

Note

This class will be DEPRECATED. The exact deprecation version and mitigation plan have not yet been determined.

Note

This exists because tempfile.TemporaryDirectory was only introduced in Python 3.2. Once ubelt no longer supports Python 2.7, this class will be deprecated.

Example

>>> from ubelt.util_path import *  # NOQA
>>> with TempDir() as self:
>>>     dpath = self.dpath
>>>     assert exists(dpath)
>>> assert not exists(dpath)

Example

>>> from ubelt.util_path import *  # NOQA
>>> self = TempDir()
>>> dpath = self.ensure()
>>> assert exists(dpath)
>>> self.cleanup()
>>> assert not exists(dpath)
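
Since the notes above point to tempfile.TemporaryDirectory as the reason this class will be deprecated, the following sketch shows the equivalent pattern using only the standard library. The variable names are illustrative.

```python
import os
import tempfile

# The directory exists inside the with-block and is removed on exit.
with tempfile.TemporaryDirectory() as dpath:
    existed_inside = os.path.exists(dpath)
exists_after = os.path.exists(dpath)

print(existed_inside, exists_after)  # True False
```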

ensure()[source]¶

cleanup()[source]¶

start()[source]¶

class ubelt.Timer(label='', verbose=None, newline=True)[source]¶

Bases: object

Measures time elapsed between a start and end point. Can be used as a with-statement context manager, or using the tic/toc api.

Parameters

label (str, default=’’) – identifier for printing
verbose (int, default=None) – verbosity flag, defaults to True if label is given
newline (bool, default=True) – if False and verbose, print tic and toc on the same line

Variables

elapsed (float) – number of seconds measured by the context manager
tstart (float) – time of last tic reported by self._time()

Example

>>> # Create and start the timer using the context manager
>>> import math
>>> from ubelt import Timer
>>> timer = Timer('Timer test!', verbose=1)
>>> with timer:
>>>     math.factorial(10)
>>> assert timer.elapsed > 0
tic('Timer test!')
...toc('Timer test!')=...

Example

>>> # Create and start the timer using the tic/toc interface
>>> from ubelt import Timer
>>> timer = Timer().tic()
>>> elapsed1 = timer.toc()
>>> elapsed2 = timer.toc()
>>> elapsed3 = timer.toc()
>>> assert elapsed1 <= elapsed2
>>> assert elapsed2 <= elapsed3
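
The tic/toc pattern is small enough to sketch with the standard library. MiniTimer below is a hypothetical illustration, not the ubelt Timer: it uses time.perf_counter as its clock (an assumption) and leaves out labels and verbose printing.

```python
import time


class MiniTimer:
    """Hypothetical sketch of a tic/toc timer usable as a context manager."""

    def __init__(self):
        self.tstart = None
        self.elapsed = None

    def tic(self):
        # Record the start time and return self so calls can be chained.
        self.tstart = time.perf_counter()
        return self

    def toc(self):
        # Report seconds since the last tic; can be called repeatedly.
        self.elapsed = time.perf_counter() - self.tstart
        return self.elapsed

    def __enter__(self):
        return self.tic()

    def __exit__(self, exc_type, exc_value, traceback):
        self.toc()
        return False  # never suppress exceptions


with MiniTimer() as timer:
    sum(range(1000))
print(timer.elapsed >= 0)  # True
```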

tic()[source]¶: starts the timer

toc()[source]¶: measures the time elapsed since the last tic and returns it in seconds

class ubelt.UDict[source]¶

Bases: SetDict

A subclass of dict with ubelt enhancements

UDict builds on top of SetDict, which is itself a simple dict extension that provides only the set operations. The extra invert, map, sorted, and peek functions are less fundamental, and reasonable workarounds exist when they are not available.

The UDict class is a simple subclass of dict that provides the following upgrades:

set operations - inherited from SetDict

    intersection - find items in common
    union - merge dicts
    difference - find items in one but not the other
    symmetric_difference - find items that appear an odd number of times

subdict - take a subset with optional default values (similar to intersection, but the latter ignores non-common values)

inversion

    invert - swaps a dictionary's keys and values (with options for dealing with duplicates)

mapping

    map_keys - applies a function over each key and keeps the values the same
    map_values - applies a function over each value and keeps the keys the same

sorting

    sorted_keys - returns a dictionary ordered by the keys
    sorted_values - returns a dictionary ordered by the values

IMO key-wise set operations on dictionaries are fundamental and sorely missing from the stdlib; mapping is super convenient; sorting and inversion are less common, but still useful to have.

Todo

[ ] UbeltDict, UltraDict, not sure what the name is. We may just rename this to Dict.

Example

>>> import ubelt as ub
>>> a = ub.udict({1: 20, 2: 20, 3: 30, 4: 40})
>>> b = ub.udict({0: 0, 2: 20, 4: 42})
>>> c = ub.udict({3: -1, 5: -1})
>>> # Demo key-wise set operations
>>> assert a & b == {2: 20, 4: 40}
>>> assert a - b == {1: 20, 3: 30}
>>> assert a ^ b == {1: 20, 3: 30, 0: 0}
>>> assert a | b == {1: 20, 2: 20, 3: 30, 4: 42, 0: 0}
>>> # Demo new n-ary set methods
>>> assert a.union(b, c) == {1: 20, 2: 20, 3: -1, 4: 42, 0: 0, 5: -1}
>>> assert a.intersection(b, c) == {}
>>> assert a.difference(b, c) == {1: 20}
>>> assert a.symmetric_difference(b, c) == {1: 20, 0: 0, 5: -1}
>>> # Demo new quality of life methods
>>> assert a.subdict({2, 4, 6, 8}, default=None) == {8: None, 2: 20, 4: 40, 6: None}
>>> assert a.invert() == {20: 2, 30: 3, 40: 4}
>>> assert a.invert(unique_vals=0) == {20: {1, 2}, 30: {3}, 40: {4}}
>>> assert a.peek_key() == ub.peek(a.keys())
>>> assert a.peek_value() == ub.peek(a.values())
>>> assert a.map_keys(lambda x: x * 10) == {10: 20, 20: 20, 30: 30, 40: 40}
>>> assert a.map_values(lambda x: x * 10) == {1: 200, 2: 200, 3: 300, 4: 400}

subdict(keys, default=NoParam)[source]¶

Get a subset of a dictionary

Parameters

self (Dict[KT, VT]) – dictionary or the implicit instance
keys (Iterable[KT]) – keys to take from self
default (Optional[object] | NoParamType) – if specified uses default if keys are missing.

Raises

KeyError – if a key does not exist and default is not specified

See also: ubelt.util_dict.dict_subset(), ubelt.UDict.take()

Example

>>> import ubelt as ub
>>> a = ub.udict({k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]})
>>> s = a.subdict({2, 5})
>>> print('s = {}'.format(ub.repr2(s, nl=0, sort=1)))
s = {2: 'A_c', 5: 'A_f'}
>>> import pytest
>>> with pytest.raises(KeyError):
>>>     s = a.subdict({2, 5, 100})
>>> s = a.subdict({2, 5, 100}, default='DEF')
>>> print('s = {}'.format(ub.repr2(s, nl=0, sort=1)))
s = {2: 'A_c', 5: 'A_f', 100: 'DEF'}

take(keys, default=NoParam)[source]¶

Get values of an iterable of keys.

Parameters

self (Dict[KT, VT]) – dictionary or the implicit instance
keys (Iterable[KT]) – keys to take from self
default (Optional[object] | NoParamType) – if specified uses default if keys are missing.

Yields

VT – a selected value within the dictionary

Raises

KeyError – if a key does not exist and default is not specified

See also: ubelt.util_list.take(), ubelt.UDict.subdict()

Example

>>> import ubelt as ub
>>> a = ub.udict({k: 'A_' + chr(97 + k) for k in [2, 3, 5, 7]})
>>> s = list(a.take({2, 5}))
>>> print('s = {}'.format(ub.repr2(s, nl=0, sort=1)))
s = ['A_c', 'A_f']
>>> import pytest
>>> with pytest.raises(KeyError):
>>>     s = list(a.take({2, 5, 100}))
>>> s = list(a.take({2, 5, 100}, default='DEF'))
>>> print('s = {}'.format(ub.repr2(s, nl=0, sort=1)))
s = ['A_c', 'A_f', 'DEF']

invert(unique_vals=True)[source]¶

Swaps the keys and values in a dictionary.

Parameters

self (Dict[KT, VT]) – dictionary or the implicit instance to invert
unique_vals (bool, default=True) – if False, the values of the new dictionary are sets of the original keys.
cls (type | None) – specifies the dict subclass of the result. If unspecified, the result will be dict or OrderedDict. This behavior may change.

Returns

the inverted dictionary

Return type

Dict[VT, KT] | Dict[VT, Set[KT]]

Note

The values must be hashable.

If the original dictionary contains duplicate values, then only one of the corresponding keys will be returned and the others will be discarded. This can be prevented by setting unique_vals=False, causing the inverted keys to be returned in a set.

Example

>>> import ubelt as ub
>>> inverted = ub.udict({'a': 1, 'b': 2}).invert()
>>> assert inverted == {1: 'a', 2: 'b'}
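
The effect of unique_vals on duplicate values can be sketched with plain dicts. invert_sketch is a hypothetical helper for illustration, not the ubelt implementation. With unique_vals=True it lets the last key win for a duplicated value, which agrees with the UDict example earlier in this document.

```python
def invert_sketch(dict_, unique_vals=True):
    """Hypothetical sketch: swap keys and values of ``dict_``."""
    if unique_vals:
        # Duplicate values collapse; the last key seen is the one kept.
        return {value: key for key, value in dict_.items()}
    inverted = {}
    for key, value in dict_.items():
        # Collect every original key that maps to this value.
        inverted.setdefault(value, set()).add(key)
    return inverted


data = {1: 20, 2: 20, 3: 30, 4: 40}
print(invert_sketch(data))                     # {20: 2, 30: 3, 40: 4}
print(invert_sketch(data, unique_vals=False))  # {20: {1, 2}, 30: {3}, 40: {4}}
```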

map_keys(func)[source]¶

Apply a function to every key in a dictionary.

Creates a new dictionary with the same values and modified keys.

Parameters

self (Dict[KT, VT]) – a dictionary or the implicit instance.
func (Callable[[KT], T] | Mapping[KT, T]) – a function or indexable object

Returns

transformed dictionary

Return type

Dict[T, VT]

Example

>>> import ubelt as ub
>>> new = ub.udict({'a': [1, 2, 3], 'b': []}).map_keys(ord)
>>> assert new == {97: [1, 2, 3], 98: []}

map_values(func)[source]¶

Apply a function to every value in a dictionary.

Creates a new dictionary with the same keys and modified values.

Parameters

self (Dict[KT, VT]) – a dictionary or the implicit instance.
func (Callable[[VT], T] | Mapping[VT, T]) – a function or indexable object

Returns

transformed dictionary

Return type

Dict[KT, T]

Example

>>> import ubelt as ub
>>> newdict = ub.udict({'a': [1, 2, 3], 'b': []}).map_values(len)
>>> assert newdict ==  {'a': 3, 'b': 0}
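
Both mapping methods accept "a function or indexable object". The sketch below shows one way that could work on plain dicts. The helper names are hypothetical, and treating a non-callable func as a lookup table is an assumption based on the parameter description. It does not handle the case where two keys map to the same new key.

```python
def _as_callable(func):
    # Assumption: an indexable object is applied via item lookup.
    return func if callable(func) else func.__getitem__


def map_keys_sketch(dict_, func):
    """Hypothetical sketch: transform keys, keep values."""
    func = _as_callable(func)
    return {func(key): value for key, value in dict_.items()}


def map_values_sketch(dict_, func):
    """Hypothetical sketch: transform values, keep keys."""
    func = _as_callable(func)
    return {key: func(value) for key, value in dict_.items()}


data = {'a': [1, 2, 3], 'b': []}
print(map_keys_sketch(data, ord))    # {97: [1, 2, 3], 98: []}
print(map_values_sketch(data, len))  # {'a': 3, 'b': 0}
```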

sorted_keys(key=None, reverse=False)[source]¶

Return an ordered dictionary sorted by its keys

Parameters

self (Dict[KT, VT]) – dictionary to sort or the implicit instance. The keys must be of comparable types.
key (Callable[[KT], Any] | None) – If given as a callable, customizes the sorting by ordering using transformed keys.
reverse (bool, default=False) – if True returns in descending order

Returns

new dictionary where the keys are ordered

Return type

OrderedDict[KT, VT]

Example

>>> import ubelt as ub
>>> new = ub.udict({'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}).sorted_keys()
>>> assert new == ub.odict([('eggs', 1.2), ('jam', 2.92), ('spam', 2.62)])

sorted_values(key=None, reverse=False)[source]¶

Return an ordered dictionary sorted by its values

Parameters

self (Dict[KT, VT]) – dictionary to sort or the implicit instance. The values must be of comparable types.
key (Callable[[VT], Any] | None) – If given as a callable, customizes the sorting by ordering using transformed values.
reverse (bool, default=False) – if True returns in descending order

Returns

new dictionary where the values are ordered

Return type

OrderedDict[KT, VT]

Example

>>> import ubelt as ub
>>> new = ub.udict({'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}).sorted_values()
>>> assert new == ub.odict([('eggs', 1.2), ('spam', 2.62), ('jam', 2.92)])
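
Both sorting methods reduce to sorting the items and rebuilding an ordered mapping. The helpers below are hypothetical plain-dict sketches, not the ubelt implementation.

```python
from collections import OrderedDict


def sorted_keys_sketch(dict_, key=None, reverse=False):
    """Hypothetical sketch: order items by (optionally transformed) key."""
    keyfunc = (lambda kv: kv[0]) if key is None else (lambda kv: key(kv[0]))
    return OrderedDict(sorted(dict_.items(), key=keyfunc, reverse=reverse))


def sorted_values_sketch(dict_, key=None, reverse=False):
    """Hypothetical sketch: order items by (optionally transformed) value."""
    keyfunc = (lambda kv: kv[1]) if key is None else (lambda kv: key(kv[1]))
    return OrderedDict(sorted(dict_.items(), key=keyfunc, reverse=reverse))


prices = {'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}
print(list(sorted_keys_sketch(prices)))    # ['eggs', 'jam', 'spam']
print(list(sorted_values_sketch(prices)))  # ['eggs', 'spam', 'jam']
```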

peek_key(default=NoParam)[source]¶

Get the first key in the dictionary

Parameters

self (Dict) – a dictionary or the implicit instance
default (T | NoParamType) – item to return if the dictionary is empty. If unspecified, a StopIteration error is raised instead.

Returns

the first key or the default

Return type

KT

Example

>>> import ubelt as ub
>>> assert ub.udict({1: 2}).peek_key() == 1

peek_value(default=NoParam)[source]¶

Get the first value in the dictionary

Parameters

self (Dict[KT, VT]) – a dictionary or the implicit instance
default (T | NoParamType) – item to return if the dictionary is empty. If unspecified, a StopIteration error is raised instead.

Returns

the first value or the default

Return type

VT

Example

>>> import ubelt as ub
>>> assert ub.udict({1: 2}).peek_value() == 2
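
The default / StopIteration behaviour described for the peek methods can be sketched with a sentinel object. peek_sketch and _NO_PARAM are hypothetical names; the sentinel stands in for NoParam and is not the ubelt object.

```python
_NO_PARAM = object()  # hypothetical sentinel standing in for NoParam


def peek_sketch(iterable, default=_NO_PARAM):
    """Hypothetical sketch: return the first item without consuming a dict."""
    try:
        return next(iter(iterable))
    except StopIteration:
        if default is _NO_PARAM:
            raise
        return default


data = {1: 2}
print(peek_sketch(data))               # 1  (first key)
print(peek_sketch(data.values()))      # 2  (first value)
print(peek_sketch({}, default=None))   # None
```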

ubelt.allsame(iterable, eq=operator.eq)[source]¶

Determine if all items in a sequence are the same

Parameters

iterable (Iterable[T]) – items to determine if they are all the same
eq (Callable[[T, T], bool], default=operator.eq) – function used to test for equality

Returns

True if all items are equal, otherwise False

Return type

bool

Notes

Similar to more_itertools.all_equal()

Example

>>> import ubelt as ub
>>> ub.allsame([1, 1, 1, 1])
True
>>> ub.allsame([])
True
>>> ub.allsame([0, 1])
False
>>> iterable = iter([0, 1, 1, 1])
>>> next(iterable)
0
>>> ub.allsame(iterable)
True
>>> ub.allsame(range(10))
False
>>> ub.allsame(range(10), lambda a, b: True)
True
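
The behaviour in the example, including the empty case and the custom eq, follows from comparing every item against the first one. allsame_sketch is a hypothetical reimplementation for illustration and may differ from the ubelt source.

```python
import operator


def allsame_sketch(iterable, eq=operator.eq):
    """Hypothetical sketch: True if every item equals the first item."""
    iter_ = iter(iterable)
    try:
        first = next(iter_)
    except StopIteration:
        return True  # an empty iterable is trivially "all the same"
    return all(eq(first, item) for item in iter_)


print(allsame_sketch([1, 1, 1, 1]))                     # True
print(allsame_sketch(range(10)))                        # False
print(allsame_sketch(range(10), lambda a, b: True))     # True
```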

ubelt.argflag(key, argv=None)[source]¶

Determines if a key is specified on the command line.

This is a functional alternative to key in sys.argv, but it also allows for multiple aliases of the same flag to be specified.

Parameters

key (str | Tuple[str, …]) – string or tuple of strings. Each key should be prefixed with two hyphens (i.e. --).
argv (List[str], default=None) – overrides sys.argv if specified

Returns

flag - True if the key (or any of the keys) was specified

Return type

bool

CommandLine

xdoctest -m ubelt.util_arg argflag:0
xdoctest -m ubelt.util_arg argflag:0 --devflag
xdoctest -m ubelt.util_arg argflag:0 -df
xdoctest -m ubelt.util_arg argflag:0 --devflag2
xdoctest -m ubelt.util_arg argflag:0 -df2

Example

>>> # Everyday usage of this function might look like this
>>> import ubelt as ub
>>> # Check if either of these strings are in sys.argv
>>> flag = ub.argflag(('-df', '--devflag'))
>>> if flag:
>>>     print(ub.color_text(
>>>         'A hidden developer flag was given!', 'blue'))
>>> print('Pass the hidden CLI flag to see a secret message')

Example

>>> import ubelt as ub
>>> argv = ['--spam', '--eggs', 'foo']
>>> assert ub.argflag('--eggs', argv=argv) is True
>>> assert ub.argflag('--ans', argv=argv) is False
>>> assert ub.argflag('foo', argv=argv) is True
>>> assert ub.argflag(('bar', '--spam'), argv=argv) is True
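
The alias handling amounts to a membership test over one or more keys. argflag_sketch is a hypothetical stand-in written against the examples above, not the ubelt implementation.

```python
import sys


def argflag_sketch(key, argv=None):
    """Hypothetical sketch: True if any alias in ``key`` appears in ``argv``."""
    if argv is None:
        argv = sys.argv
    # Accept a single string or a tuple of aliases.
    keys = [key] if isinstance(key, str) else key
    return any(k in argv for k in keys)


argv = ['--spam', '--eggs', 'foo']
print(argflag_sketch('--eggs', argv=argv))           # True
print(argflag_sketch(('bar', '--spam'), argv=argv))  # True
print(argflag_sketch('--ans', argv=argv))            # False
```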

ubelt.argmax(indexable, key=None)[source]¶

Returns index / key of the item with the largest value.

This is similar to numpy.argmax(), but it is written in pure python and works on both lists and dictionaries.

Parameters

indexable (Iterable[VT] | Mapping[KT, VT]) – indexable to sort by
key (Callable[[VT], Any], default=None) – customizes the ordering of the indexable

Returns

the index of the item with the maximum value.

Return type

int | KT

Example

>>> import ubelt as ub
>>> assert ub.argmax({'a': 3, 'b': 2, 'c': 100}) == 'c'
>>> assert ub.argmax(['a', 'c', 'b', 'z', 'f']) == 3
>>> assert ub.argmax([[0, 1], [2, 3, 4], [5]], key=len) == 1
>>> assert ub.argmax({'a': 3, 'b': 2, 3: 100, 4: 4}) == 3
>>> assert ub.argmax(iter(['a', 'c', 'b', 'z', 'f'])) == 3
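
Handling both lists and dictionaries comes down to pairing each value with its index or key. The following is a hypothetical pure-Python sketch (argmax_sketch is not the ubelt function); on ties it returns the first maximum, which is how the built-in max behaves.

```python
from collections.abc import Mapping


def argmax_sketch(indexable, key=None):
    """Hypothetical sketch: index / key of the largest value."""
    if isinstance(indexable, Mapping):
        pairs = indexable.items()
    else:
        pairs = enumerate(indexable)
    if key is None:
        return max(pairs, key=lambda pair: pair[1])[0]
    return max(pairs, key=lambda pair: key(pair[1]))[0]


print(argmax_sketch({'a': 3, 'b': 2, 'c': 100}))          # c
print(argmax_sketch(['a', 'c', 'b', 'z', 'f']))           # 3
print(argmax_sketch([[0, 1], [2, 3, 4], [5]], key=len))   # 1
```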

ubelt.argmin(indexable, key=None)[source]¶

Returns index / key of the item with the smallest value.

This is similar to numpy.argmin(), but it is written in pure python and works on both lists and dictionaries.

Parameters

indexable (Iterable[VT] | Mapping[KT, VT]) – indexable to sort by
key (Callable[[VT], Any], default=None) – customizes the ordering of the indexable

Returns

the index of the item with the minimum value.

Return type

int | KT

Example

>>> import ubelt as ub
>>> assert ub.argmin({'a': 3, 'b': 2, 'c': 100}) == 'b'
>>> assert ub.argmin(['a', 'c', 'b', 'z', 'f']) == 0
>>> assert ub.argmin([[0, 1], [2, 3, 4], [5]], key=len) == 2
>>> assert ub.argmin({'a': 3, 'b': 2, 3: 100, 4: 4}) == 'b'
>>> assert ub.argmin(iter(['a', 'c', 'A', 'z', 'f'])) == 2

ubelt.argsort(indexable, key=None, reverse=False)[source]¶

Returns the indices that would sort an indexable object.

This is similar to numpy.argsort(), but it is written in pure python and works on both lists and dictionaries.

Parameters

indexable (Iterable[VT] | Mapping[KT, VT]) – indexable to sort by
key (Callable[[VT], VT] | None, default=None) – customizes the ordering of the indexable
reverse (bool, default=False) – if True returns in descending order

Returns

indices - list of indices that sorts the indexable

Return type

List[int] | List[KT]

Example

>>> import ubelt as ub
>>> # argsort works on dicts by returning keys
>>> dict_ = {'a': 3, 'b': 2, 'c': 100}
>>> indices = ub.argsort(dict_)
>>> assert list(ub.take(dict_, indices)) == sorted(dict_.values())
>>> # argsort works on lists by returning indices
>>> indexable = [100, 2, 432, 10]
>>> indices = ub.argsort(indexable)
>>> assert list(ub.take(indexable, indices)) == sorted(indexable)
>>> # Can use iterators, but be careful. It exhausts them.
>>> indexable = reversed(range(100))
>>> indices = ub.argsort(indexable)
>>> assert indices[0] == 99
>>> # Can use key just like sorted
>>> indexable = [[0, 1, 2], [3, 4], [5]]
>>> indices = ub.argsort(indexable, key=len)
>>> assert indices == [2, 1, 0]
>>> # Can use reverse just like sorted
>>> indexable = [0, 2, 1]
>>> indices = ub.argsort(indexable, reverse=True)
>>> assert indices == [1, 2, 0]
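As a rough illustration of the idea, the following sketch pairs each value with its index (or key) and sorts the pairs by value. It is an assumption about how such a function could be written, not ubelt's source; argsort_sketch is a hypothetical name.

```python
def argsort_sketch(indexable, key=None, reverse=False):
    """Hypothetical pure-Python argsort for lists and dictionaries."""
    if hasattr(indexable, 'items'):
        pairs = list(indexable.items())      # (key, value) pairs
    else:
        pairs = list(enumerate(indexable))   # (index, value) pairs
    if key is None:
        ordered = sorted(pairs, key=lambda kv: kv[1], reverse=reverse)
    else:
        ordered = sorted(pairs, key=lambda kv: key(kv[1]), reverse=reverse)
    # Keep only the indices / keys, in sorted order.
    return [index for index, _ in ordered]


print(argsort_sketch([100, 2, 432, 10]))
print(argsort_sketch({'a': 3, 'b': 2, 'c': 100}))
```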

ubelt.argunique(items, key=None)[source]¶

Returns indices corresponding to the first instance of each unique item.

Parameters

items (Sequence[VT]) – indexable collection of items
key (Callable[[VT], Any], default=None) – custom normalization function. If specified returns items where key(item) is unique.

Returns

indices of the unique items

Return type

Iterator[int]

Example

>>> import ubelt as ub
>>> items = [0, 2, 5, 1, 1, 0, 2, 4]
>>> indices = list(ub.argunique(items))
>>> assert indices == [0, 1, 2, 3, 7]
>>> indices = list(ub.argunique(items, key=lambda x: x % 2 == 0))
>>> assert indices == [0, 2]
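One way to picture this is a generator that remembers which (optionally normalized) items it has already seen. The sketch below is illustrative only and assumes the items, or their key(item) values, are hashable; argunique_sketch is a hypothetical name.

```python
def argunique_sketch(items, key=None):
    """Yield the index of the first occurrence of each unique item."""
    seen = set()
    for index, item in enumerate(items):
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield index


items = [0, 2, 5, 1, 1, 0, 2, 4]
print(list(argunique_sketch(items)))
print(list(argunique_sketch(items, key=lambda x: x % 2 == 0)))
```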

ubelt.argval(key, default=NoParam, argv=None)[source]¶

Get the value of a keyword argument specified on the command line.

Values can be specified as <key> <value> or <key>=<value>

The use-case for this function is to add a hidden command line feature where a developer can pass in a special value. This can be used to prototype a command line interface, provide an easter egg, or add some other command line parsing that won't be exposed in CLI help docs.

Parameters

key (str | Tuple[str, …]) – string or tuple of strings. Each key should be prefixed with two hyphens (i.e. --)
default (T | NoParamType, default=NoParam) – a value to return if not specified.
argv (Optional[List[str]], default=None) – uses sys.argv if unspecified

Returns

value - the value specified after the key. If the key is specified multiple times, then the first value is returned.

Return type

str | T

Todo

[x] Can we handle the case where the value is a list of long paths? - No
[ ] Should we default to the first or last specified instance of the flag?

CommandLine

xdoctest -m ubelt.util_arg argval:0
xdoctest -m ubelt.util_arg argval:0 --devval
xdoctest -m ubelt.util_arg argval:0 --devval=1
xdoctest -m ubelt.util_arg argval:0 --devval=2
xdoctest -m ubelt.util_arg argval:0 --devval 3
xdoctest -m ubelt.util_arg argval:0 --devval "4 5 6"

Example

>>> # Everyday usage of this function might look like this where
>>> import ubelt as ub
>>> # grab a key/value pair if it is given on the command line
>>> value = ub.argval('--devval', default='1')
>>> print('Checking if the hidden CLI key/value pair is given')
>>> if value != '1':
>>>     print(ub.color_text(
>>>         'A hidden developer secret: {!r}'.format(value), 'yellow'))
>>> print('Pass the hidden CLI key/value pair to see a secret message')

Example

>>> import ubelt as ub
>>> argv = ['--ans', '42', '--quest=the grail', '--ans=6', '--bad']
>>> assert ub.argval('--spam', argv=argv) == ub.NoParam
>>> assert ub.argval('--quest', argv=argv) == 'the grail'
>>> assert ub.argval('--ans', argv=argv) == '42'
>>> assert ub.argval('--bad', argv=argv) == ub.NoParam
>>> assert ub.argval(('--bad', '--bar'), argv=argv) == ub.NoParam

Example

>>> # Test fix for GH Issue #41
>>> import ubelt as ub
>>> argv = ['--path=/path/with/k=3']
>>> assert ub.argval('--path', argv=argv) == '/path/with/k=3'

ubelt.augpath(path, suffix='', prefix='', ext=None, tail='', base=None, dpath=None, relative=None, multidot=False)[source]¶

Create a new path with a different extension, basename, directory, prefix, and/or suffix.

A prefix is inserted before the basename. A suffix is inserted between the basename and the extension. The basename and extension can be replaced with a new one. Essentially a path is broken down into components (dpath, base, ext), and then recombined as (dpath, prefix, base, suffix, ext) after replacing any specified component.

Parameters

path (str | PathLike) – a path to augment
suffix (str) – placed between the basename and extension. Note: this is referred to as stemsuffix in ub.Path.augment().
prefix (str) – placed in front of the basename
ext (str | None) – if specified, replaces the extension
tail (str | None) – If specified, appends this text to the extension
base (str | None) – if specified, replaces the basename without extension. Note: this is referred to as stem in ub.Path.augment().
dpath (str | PathLike | None) – if specified, replaces the specified “relative” directory, which by default is the parent directory.
relative (str | PathLike | None) – Replaces relative with dpath in path. Has no effect if dpath is not specified. Defaults to the dirname of the input path. Experimental; not currently implemented.
multidot (bool) – Allows extensions to contain multiple dots. Specifically, if False, everything after the last dot in the basename is the extension. If True, everything after the first dot in the basename is the extension.

Returns

augmented path

Return type

str

Example

>>> import ubelt as ub
>>> path = 'foo.bar'
>>> suffix = '_suff'
>>> prefix = 'pref_'
>>> ext = '.baz'
>>> newpath = ub.augpath(path, suffix, prefix, ext=ext, base='bar')
>>> print('newpath = %s' % (newpath,))
newpath = pref_bar_suff.baz

Example

>>> from ubelt.util_path import *  # NOQA
>>> augpath('foo.bar')
'foo.bar'
>>> augpath('foo.bar', ext='.BAZ')
'foo.BAZ'
>>> augpath('foo.bar', suffix='_')
'foo_.bar'
>>> augpath('foo.bar', prefix='_')
'_foo.bar'
>>> augpath('foo.bar', base='baz')
'baz.bar'
>>> augpath('foo.tar.gz', ext='.zip', multidot=True)
'foo.zip'
>>> augpath('foo.tar.gz', ext='.zip', multidot=False)
'foo.tar.zip'
>>> augpath('foo.tar.gz', suffix='_new', multidot=True)
'foo_new.tar.gz'
>>> augpath('foo.tar.gz', suffix='_new', tail='.cache', multidot=True)
'foo_new.tar.gz.cache'
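The decomposition into (dpath, base, ext) and recombination as (dpath, prefix, base, suffix, ext) can be sketched with os.path. This is a simplified illustration that ignores the tail, dpath, and relative arguments; augpath_sketch is a hypothetical name and the code is not ubelt's implementation.

```python
import os.path


def augpath_sketch(path, suffix='', prefix='', ext=None, base=None,
                   multidot=False):
    """Split a path into (dpath, base, ext) and recombine the pieces."""
    dpath, fname = os.path.split(path)
    if multidot:
        # Everything after the first dot is treated as the extension.
        stem, dot, rest = fname.partition('.')
        orig_ext = dot + rest
    else:
        # Everything after the last dot is treated as the extension.
        stem, orig_ext = os.path.splitext(fname)
    new_base = stem if base is None else base
    new_ext = orig_ext if ext is None else ext
    return os.path.join(dpath, prefix + new_base + suffix + new_ext)


print(augpath_sketch('foo.tar.gz', suffix='_new', multidot=True))
print(augpath_sketch('foo.tar.gz', ext='.zip', multidot=False))
```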

ubelt.boolmask(indices, maxval=None)[source]¶

Constructs a list of booleans where an item is True if its position is in indices; otherwise it is False.

Parameters

indices (List[int]) – list of integer indices
maxval (int) – length of the returned list. If not specified, this is inferred as max(indices) + 1.

Returns

mask - a list of booleans. mask[idx] is True if idx in indices

Return type

List[bool]

Note

In the future the arg maxval may change its name to shape

Example

>>> import ubelt as ub
>>> indices = [0, 1, 4]
>>> mask = ub.boolmask(indices, maxval=6)
>>> assert mask == [True, True, False, False, True, False]
>>> mask = ub.boolmask(indices)
>>> assert mask == [True, True, False, False, True]
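For illustration, the same mask can be built with a list comprehension. This is a sketch rather than ubelt's code; boolmask_sketch is a hypothetical name, and the default length of max(indices) + 1 is inferred from the example above.

```python
def boolmask_sketch(indices, maxval=None):
    """Build a list where position idx is True iff idx is in indices."""
    if maxval is None:
        # Just long enough to hold the largest index.
        maxval = max(indices) + 1
    wanted = set(indices)
    return [idx in wanted for idx in range(maxval)]


print(boolmask_sketch([0, 1, 4], maxval=6))
print(boolmask_sketch([0, 1, 4]))
```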

class ubelt.chunks(items, chunksize=None, nchunks=None, total=None, bordermode='none', legacy=False)[source]¶

Bases: object

Generates successive n-sized chunks from items.

If the last chunk has fewer than n elements, bordermode is used to determine fill values.

Parameters

items (Iterable[T]) – input to iterate over
chunksize (int) – size of each sublist yielded
nchunks (int) – number of chunks to create (cannot be specified if chunksize is specified)
bordermode (str) – determines how to handle the last case if the length of the input is not divisible by chunksize. Valid values are: {‘none’, ‘cycle’, ‘replicate’}
total (int) – hints about the length of the input

Note

FIXME:: When nchunks is given, that’s how many chunks we should get, but the issue is that chunksize is not well defined in that instance. For instance, how do we turn a list with 4 elements into 3 chunks? Where does the extra item go?

In ubelt <= 0.10.3 there is a bug when specifying nchunks, where it chooses a chunksize that is too large. Specify legacy=True to get the old buggy behavior if needed.

Notes

This is similar to functionality provided by: more_itertools.chunked(), more_itertools.chunked_even(), more_itertools.sliced(), and more_itertools.divide().

Yields: List[T] – subsequent non-overlapping chunks of the input items

References

SO_434287: http://stackoverflow.com/questions/434287/iterate-over-a-list-in-chunks

Example

>>> import ubelt as ub
>>> items = '1234567'
>>> genresult = ub.chunks(items, chunksize=3)
>>> list(genresult)
[['1', '2', '3'], ['4', '5', '6'], ['7']]

Example

>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5, 6, 7]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='none')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='cycle')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 1, 2]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='replicate')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 7, 7]]
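The three border modes can be pictured with the following sketch, which only covers the chunksize case and materializes the input as a list. It is an illustration of the documented behavior, not ubelt's implementation; chunks_sketch is a hypothetical name.

```python
from itertools import cycle, islice


def chunks_sketch(items, chunksize, bordermode='none'):
    """Yield successive chunks, padding the last one per bordermode."""
    items = list(items)
    for start in range(0, len(items), chunksize):
        chunk = items[start:start + chunksize]
        missing = chunksize - len(chunk)
        if missing and bordermode == 'cycle':
            # Wrap around and reuse items from the start of the input.
            chunk += list(islice(cycle(items), missing))
        elif missing and bordermode == 'replicate':
            # Repeat the final item until the chunk is full.
            chunk += [chunk[-1]] * missing
        yield chunk


items = [1, 2, 3, 4, 5, 6, 7]
print(list(chunks_sketch(items, 3, bordermode='cycle')))
print(list(chunks_sketch(items, 3, bordermode='replicate')))
```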

Example

>>> import ubelt as ub
>>> assert len(list(ub.chunks(range(2), nchunks=2))) == 2
>>> assert len(list(ub.chunks(range(3), nchunks=2))) == 2
>>> # Note: ub.chunks will not do the 2,1,1 split
>>> assert len(list(ub.chunks(range(4), nchunks=3))) == 3
>>> assert len(list(ub.chunks([], 2, bordermode='none'))) == 0
>>> assert len(list(ub.chunks([], 2, bordermode='cycle'))) == 0
>>> assert len(list(ub.chunks([], 2, None, bordermode='replicate'))) == 0

Example

>>> from ubelt.util_list import *  # NOQA
>>> def _check_len(self):
...     assert len(self) == len(list(self))
>>> _check_len(chunks(list(range(3)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=3))

Example

>>> from ubelt.util_list import *  # NOQA
>>> import pytest
>>> assert pytest.raises(ValueError, chunks, range(9))
>>> assert pytest.raises(ValueError, chunks, range(9), chunksize=2, nchunks=2)
>>> assert pytest.raises(TypeError, len, chunks((_ for _ in range(2)), 2))

Example

>>> from ubelt.util_list import *  # NOQA
>>> import ubelt as ub
>>> basis = {
>>>     'legacy': [False, True],
>>>     'chunker': [{'nchunks': 3}, {'nchunks': 4}, {'nchunks': 5}, {'nchunks': 7}, {'chunksize': 3}],
>>>     'items': [range(2), range(4), range(5), range(7), range(9)],
>>>     'bordermode': ['none', 'cycle', 'replicate'],
>>> }
>>> grid_items = list(ub.named_product(basis))
>>> rows = []
>>> for grid_item in ub.ProgIter(grid_items):
>>>     chunker = grid_item.get('chunker')
>>>     grid_item.update(chunker)
>>>     kw = ub.dict_diff(grid_item, {'chunker'})
>>>     self = chunk_iter = ub.chunks(**kw)
>>>     chunked = list(chunk_iter)
>>>     chunk_lens = list(map(len, chunked))
>>>     row = ub.dict_union(grid_item, {'chunk_lens': chunk_lens, 'chunks': chunked})
>>>     row['chunker'] = str(row['chunker'])
>>>     if not row['legacy'] and 'nchunks' in kw:
>>>         assert kw['nchunks'] == row['nchunks']
>>>     row.update(chunk_iter.__dict__)
>>>     rows.append(row)
>>> # xdoctest: +SKIP
>>> import pandas as pd
>>> df = pd.DataFrame(rows)
>>> for _, subdf in df.groupby('chunker'):
>>>     print(subdf)

static noborder(items, chunksize)[source]¶

static cycle(items, chunksize)[source]¶

static replicate(items, chunksize)[source]¶

ubelt.cmd(command, shell=False, detach=False, verbose=0, tee=None, cwd=None, env=None, tee_backend='auto', check=False, system=False, timeout=None)[source]¶

Executes a command in a subprocess.

The advantages of this wrapper around subprocess are that (1) you control whether the subprocess prints to stdout, (2) the text written to stdout and stderr is returned for parsing, (3) it provides cross-platform behavior that lets you specify the command as a string or tuple regardless of whether or not shell=True, and (4) it can detach, return the process object, and allow the process to run in the background (eventually we may return a Future object instead).

Parameters

command (str | List[str]) – command string, tuple of executable and args, or shell command.
shell (bool, default=False) – if True, process is run in shell.
detach (bool, default=False) – if True, process is detached and run in background.
verbose (int, default=0) – verbosity mode. Can be 0, 1, 2, or 3.
tee (bool | None) – if True, simultaneously writes to stdout while capturing output from the command. If not specified, defaults to True if verbose > 0. If detach is True, then this argument is ignored.
cwd (str | PathLike | None) – Path to run command. Defaults to current working directory if unspecified.
env (Dict[str, str] | None) – environment passed to Popen
tee_backend (str, default=’auto’) – backend for tee output. Valid choices are: “auto”, “select” (POSIX only), and “thread”.
check (bool, default=False) – if True, check that the return code was zero before returning, otherwise raise a subprocess.CalledProcessError. Does nothing if detach is True.
system (bool, default=False) – if True, most other considerations are dropped, and os.system() is used to execute the command in a platform dependent way. Other arguments such as env, tee, timeout, and shell are all ignored. (new in version 1.1.0)
timeout (float) – If the process does not complete in timeout seconds, raises a subprocess.TimeoutExpired. (new in version 1.1.0) Currently unhandled when tee is True.
log (Callable | None) – If specified, verbose output is written using this function, otherwise the builtin print function is used.

Returns

info - information about command status. If detach is False, info contains the captured standard out, standard error, and the return code. If detach is True, info contains a reference to the process.

Return type

Raises

ValueError - on an invalid configuration
subprocess.TimeoutExpired - if the timeout limit is exceeded
subprocess.CalledProcessError - if check and the return value is non zero

Note

Inputs can either be text or tuple based. On UNIX we ensure conversion to text if shell=True, and to tuple if shell=False. On Windows, the input is always text based. See [SO_33560364] for a potential cross-platform shlex solution for Windows.

When using the tee output, the stdout and stderr may be shuffled from what they would be on the command line.

Related Work:

https://github.com/pycontribs/subprocess-tee
https://github.com/mortoray/shelljob
https://github.com/netinvent/command_runner
https://www.pyinvoke.org/prior-art.html

References

SO_11495783: https://stackoverflow.com/questions/11495783/redirect-subprocess-stderr-to-stdout
SO_7729336: https://stackoverflow.com/questions/7729336/how-can-i-print-and-display-subprocess-stdout-and-stderr-output-without-distorti
SO_33560364: https://stackoverflow.com/questions/33560364/python-windows-parsing-command-lines-with-shlex

CommandLine

xdoctest -m ubelt.util_cmd cmd:6
python -c "import ubelt as ub; ub.cmd('ping localhost -c 2', verbose=2)"
pytest "$(python -c 'import ubelt; print(ubelt.util_cmd.__file__)')" -sv --xdoctest-verbose 2

Example

>>> import ubelt as ub
>>> info = ub.cmd(('echo', 'simple cmdline interface'), verbose=1)
simple cmdline interface
>>> assert info['ret'] == 0
>>> assert info['out'].strip() == 'simple cmdline interface'
>>> assert info['err'].strip() == ''

Example

>>> import ubelt as ub
>>> info = ub.cmd('echo str noshell', verbose=0)
>>> assert info['out'].strip() == 'str noshell'

Example

>>> # windows echo will output extra single quotes
>>> import ubelt as ub
>>> info = ub.cmd(('echo', 'tuple noshell'), verbose=0)
>>> assert info['out'].strip().strip("'") == 'tuple noshell'

Example

>>> # Note this command is formatted to work on win32 and unix
>>> import ubelt as ub
>>> info = ub.cmd('echo str&&echo shell', verbose=0, shell=True)
>>> assert info['out'].strip() == 'str' + chr(10) + 'shell'

Example

>>> import ubelt as ub
>>> info = ub.cmd(('echo', 'tuple shell'), verbose=0, shell=True)
>>> assert info['out'].strip().strip("'") == 'tuple shell'

Example

>>> import pytest
>>> import ubelt as ub
>>> info = ub.cmd('echo hi', check=True)
>>> import subprocess
>>> with pytest.raises(subprocess.CalledProcessError):
>>>     ub.cmd('exit 1', check=True, shell=True)

Example

>>> import ubelt as ub
>>> from os.path import join, exists
>>> dpath = ub.Path.appdir('ubelt', 'test').ensuredir()
>>> fpath1 = (dpath / 'cmdout1.txt').delete()
>>> fpath2 = (dpath / 'cmdout2.txt').delete()
>>> # Start up two processes that run simultaneously in the background
>>> info1 = ub.cmd(('touch', str(fpath1)), detach=True)
>>> info2 = ub.cmd('echo writing2 > ' + str(fpath2), shell=True, detach=True)
>>> # Detached processes are running in the background
>>> # We can run other code while we wait for them.
>>> while not exists(fpath1):
...     pass
>>> while not exists(fpath2):
...     pass
>>> # communicate with the process before you finish
>>> # (otherwise you may leak a text wrapper)
>>> info1['proc'].communicate()
>>> info2['proc'].communicate()
>>> # Check that the process actually did finish
>>> assert (info1['proc'].wait()) == 0
>>> assert (info2['proc'].wait()) == 0
>>> # Check that the process did what we expect
>>> assert fpath1.read_text() == ''
>>> assert fpath2.read_text().strip() == 'writing2'

Example

>>> # Can also use ub.cmd to call os.system
>>> import pytest
>>> import ubelt as ub
>>> import subprocess
>>> info = ub.cmd('echo hi', check=True, system=True)
>>> with pytest.raises(subprocess.CalledProcessError):
>>>     ub.cmd('exit 1', check=True, shell=True)
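For comparison, the non-detached, non-tee case resembles what the standard library's subprocess.run provides. The sketch below is only an approximation under stated assumptions: cmd_sketch is a hypothetical name, the 'out', 'err', and 'ret' keys mirror those used in the examples above, and capture_output requires Python 3.7 or later.

```python
import subprocess
import sys


def cmd_sketch(command, check=False):
    """Run a command, capture its output, and return an info dict."""
    proc = subprocess.run(command, capture_output=True, text=True)
    info = {'out': proc.stdout, 'err': proc.stderr, 'ret': proc.returncode}
    if check and proc.returncode != 0:
        # Mirror the documented behavior of check=True.
        raise subprocess.CalledProcessError(
            proc.returncode, command, proc.stdout, proc.stderr)
    return info


# Use the running interpreter so the example does not depend on the shell.
info = cmd_sketch([sys.executable, '-c', 'print("hello")'])
print(info['ret'], info['out'].strip())
```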

ubelt.codeblock(text)[source]¶

Create a block of text that preserves all newlines and relative indentation

Wraps multiline string blocks and returns unindented code. Useful for templated code defined in indented parts of code.

Parameters: text (str) – typically a multiline string
Returns: the unindented string
Return type: str

Example

>>> import ubelt as ub
>>> # Simulate an indented part of code
>>> if True:
>>>     # notice the indentation on this will be normal
>>>     codeblock_version = ub.codeblock(
...             '''
...             def foo():
...                 return 'bar'
...             '''
...         )
>>>     # notice the indentation and newlines on this will be odd
>>>     normal_version = ('''
...         def foo():
...             return 'bar'
...     ''')
>>> assert normal_version != codeblock_version
>>> print('Without codeblock')
>>> print(normal_version)
>>> print('With codeblock')
>>> print(codeblock_version)
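A similar effect can be obtained with the standard library by dedenting the text and stripping the surrounding newlines. This is a sketch of the idea, assuming that behavior is what is wanted; codeblock_sketch is a hypothetical name and may differ from ubelt's implementation in edge cases.

```python
import textwrap


def codeblock_sketch(text):
    """Remove common leading indentation and the outer blank lines."""
    return textwrap.dedent(text).strip('\n')


indented = """
        def foo():
            return 'bar'
        """
print(codeblock_sketch(indented))
```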

ubelt.color_text(text, color)[source]¶

Colorizes text a single color using ansi tags.

Parameters

text (str) – text to colorize
color (str) – color code. Different systems may have different colors. Commonly available colors are: ‘red’, ‘brown’, ‘yellow’, ‘green’, ‘blue’, ‘black’, and ‘white’.

Returns

text - colorized text. If pygments is not installed plain text is returned.

Return type

str

Example

>>> text = 'raw text'
>>> import pytest
>>> import ubelt as ub
>>> if ub.modname_to_modpath('pygments'):
>>>     # Colors text only if pygments is installed
>>>     ansi_text = ub.color_text(text, 'red')
>>>     prefix = '\x1b[31'
>>>     print('prefix = {!r}'.format(prefix))
>>>     print('ansi_text = {!r}'.format(ansi_text))
>>>     assert ansi_text.startswith(prefix)
>>>     assert ub.color_text(text, None) == 'raw text'
>>> else:
>>>     # Otherwise text passes through unchanged
>>>     assert ub.color_text(text, 'red') == 'raw text'
>>>     assert ub.color_text(text, None) == 'raw text'

Example

>>> # xdoctest: +REQUIRES(module:pygments)
>>> import pygments.console
>>> import ubelt as ub
>>> known_colors = pygments.console.codes.keys()
>>> for color in known_colors:
...     print(ub.color_text(color, color))

ubelt.compatible(config, func, start=0, keywords=True)[source]¶

Take the “compatible” subset of a dictionary that a function will accept as keyword arguments.

A common pattern is to track the configuration of a program in a single dictionary. Often there will be functions that only require subsets of this dictionary, and they will be written such that those items are passed via keyword arguments. The ubelt.compatible() utility makes it easier to select only the relevant config variables. It does this by inspecting the signature of the function to determine what keyword arguments it accepts, and returns the dictionary intersection of the full config and the allowed keywords. The user can then call the function with the normal ** mechanism.

Parameters

config (Dict[str, Any]) – A dictionary that contains keyword arguments that might be passed to a function.
func (Callable) – A function or method to check the arguments of
start (int) – Only take args after this position. Set to 1 if calling with an unbound method to avoid the self argument. Defaults to 0.
keywords (bool | Iterable[str]) – If True (default), and **kwargs is in the signature, prevent any filtering of the config dictionary. If False, then ignore that **kwargs is in the signature and only return the subset of config that matches the explicit signature. Otherwise if specified as a non-string iterable of strings, assume these are the allowed keys that are compatible with the way kwargs is handled in the function.

Returns

A subset of config that only contains items compatible with the signature of func.

Return type

Dict[str, Any]

Example

>>> # An example use case is to select a subset of a config
>>> # that can be passed to some function as kwargs
>>> import ubelt as ub
>>> # Define a function with args that match some keys in a config.
>>> def func(a, e, f):
>>>     return a * e * f
>>> # Define a config that has a superset of items needed by the func
>>> config = {
...   'a': 2, 'b': 3, 'c': 7,
...   'd': 11, 'e': 13, 'f': 17,
... }
>>> # Call the function only with keys that are compatible
>>> func(**ub.compatible(config, func))
442

Example

>>> # Test case with kwargs
>>> import ubelt as ub
>>> def func(a, e, f, *args, **kwargs):
>>>     return a * e * f
>>> config = {
...   'a': 2, 'b': 3, 'c': 7,
...   'd': 11, 'e': 13, 'f': 17,
... }
>>> func(**ub.compatible(config, func))
442
>>> print(sorted(ub.compatible(config, func)))
['a', 'b', 'c', 'd', 'e', 'f']
>>> print(sorted(ub.compatible(config, func, keywords=False)))
['a', 'e', 'f']
>>> print(sorted(ub.compatible(config, func, keywords={'b'})))
['a', 'b', 'e', 'f']
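The signature inspection described above can be sketched with inspect.signature. This simplified version only covers the keywords=True behavior and ignores the start argument; compatible_sketch is a hypothetical name and the code is not ubelt's implementation.

```python
import inspect


def compatible_sketch(config, func):
    """Keep only the config items that func accepts as keyword arguments."""
    params = list(inspect.signature(func).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        # A **kwargs parameter accepts any key, so nothing is filtered.
        return dict(config)
    accepted = {
        p.name for p in params
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return {k: v for k, v in config.items() if k in accepted}


def func(a, e, f):
    return a * e * f


config = {'a': 2, 'b': 3, 'c': 7, 'd': 11, 'e': 13, 'f': 17}
print(func(**compatible_sketch(config, func)))
```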

ubelt.compress(items, flags)[source]¶

Selects from items where the corresponding value in flags is True.

Parameters

items (Iterable[Any]) – a sequence to select items from
flags (Iterable[bool]) – corresponding sequence of bools

Returns

a subset of masked items

Return type

Iterable[Any]

Notes

This function is based on numpy.compress(), but is pure Python and swaps the condition and array argument to be consistent with ubelt.take().

This is equivalent to itertools.compress().

Example

>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5]
>>> flags = [False, True, True, False, True]
>>> list(ub.compress(items, flags))
[2, 3, 5]
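Since the notes state that this is equivalent to itertools.compress(), the same selection can be reproduced with the standard library alone, or spelled out as a comprehension. The variable names below are illustrative.

```python
from itertools import compress

items = [1, 2, 3, 4, 5]
flags = [False, True, True, False, True]

# Standard-library equivalent.
selected = list(compress(items, flags))

# The same selection written by hand.
by_hand = [item for item, flag in zip(items, flags) if flag]

print(selected)
print(by_hand)
```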

ubelt.ddict¶: alias of defaultdict

ubelt.delete(path, verbose=False)[source]¶

Removes a file or recursively removes a directory. If a path does not exist, then this does nothing.

Parameters

path (str | PathLike) – file or directory to remove
verbose (bool) – if True prints what is being done

SeeAlso:

send2trash - A cross-platform Python package for sending files to the trash instead of irreversibly deleting them.

Notes

This can call os.unlink(), os.rmdir(), or shutil.rmtree(), depending on what path references on the filesystem. (On Windows, this may also call a custom ubelt._win32_links._win32_rmtree()).

Example

>>> import ubelt as ub
>>> from os.path import exists, join
>>> base = ub.Path.appdir('ubelt', 'delete_test').ensuredir()
>>> dpath1 = ub.ensuredir(join(base, 'dir'))
>>> ub.ensuredir(join(base, 'dir', 'subdir'))
>>> ub.touch(join(base, 'dir', 'to_remove1.txt'))
>>> fpath1 = join(base, 'dir', 'subdir', 'to_remove3.txt')
>>> fpath2 = join(base, 'dir', 'subdir', 'to_remove2.txt')
>>> ub.touch(fpath1)
>>> ub.touch(fpath2)
>>> assert all(map(exists, (dpath1, fpath1, fpath2)))
>>> ub.delete(fpath1)
>>> assert all(map(exists, (dpath1, fpath2)))
>>> assert not exists(fpath1)
>>> ub.delete(dpath1)
>>> assert not any(map(exists, (dpath1, fpath1, fpath2)))

Example

>>> import ubelt as ub
>>> from os.path import exists, join
>>> dpath = ub.Path.appdir('ubelt', 'delete_test2').ensuredir()
>>> dpath1 = ub.ensuredir(join(dpath, 'dir'))
>>> fpath1 = ub.touch(join(dpath1, 'to_remove.txt'))
>>> assert exists(fpath1)
>>> ub.delete(dpath)
>>> assert not exists(fpath1)
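The dispatch described in the notes can be sketched with the standard library. This simplified version ignores the Windows-specific handling and the verbose flag; delete_sketch is a hypothetical name and the code is not ubelt's implementation.

```python
import os
import shutil
import tempfile


def delete_sketch(path):
    """Remove a file, link, or directory tree; ignore missing paths."""
    if os.path.islink(path) or os.path.isfile(path):
        os.unlink(path)
    elif os.path.isdir(path):
        shutil.rmtree(path)
    # A path that does not exist falls through: nothing to do.


# Build a small temporary tree to exercise the function.
root = tempfile.mkdtemp()
fpath = os.path.join(root, 'sub', 'file.txt')
os.makedirs(os.path.dirname(fpath))
open(fpath, 'w').close()

delete_sketch(fpath)   # removes the file
delete_sketch(fpath)   # second call does nothing
delete_sketch(root)    # removes the directory recursively
print(os.path.exists(root))
```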

ubelt.dict_diff(*args)[source]¶

Dictionary set extension for set.difference()

Constructs a dictionary that contains any of the keys in the first arg, which are not in any of the following args.

Parameters: *args (List[Dict[KT, VT] | Iterable[KT]]) – A sequence of dictionaries (or sets of keys). The first argument should always be a dictionary, but the subsequent arguments can just be sets of keys.
Returns: OrderedDict if the first argument is an OrderedDict, otherwise dict
Return type: Dict[KT, VT] | OrderedDict[KT, VT]

Todo

[ ] Add inplace keyword argument, which modifies the first dictionary inplace.

Example

>>> import ubelt as ub
>>> ub.dict_diff({'a': 1, 'b': 1}, {'a'}, {'c'})
{'b': 1}
>>> ub.dict_diff(ub.odict([('a', 1), ('b', 2)]), ub.odict([('c', 3)]))
OrderedDict([('a', 1), ('b', 2)])
>>> ub.dict_diff()
{}
>>> ub.dict_diff({'a': 1, 'b': 2}, {'c'})

ubelt.dict_hist(items, weights=None, ordered=False, labels=None)[source]¶

Builds a histogram of items, counting the number of times each item appears in the input.

Parameters

items (Iterable[T]) – hashable items (usually containing duplicates)
weights (Iterable[float], default=None) – Corresponding weights for each item.
ordered (bool, default=False) – If True the result is ordered by frequency.
labels (Iterable[T], default=None) – Expected labels. Allows this function to pre-initialize the histogram. If specified the frequency of each label is initialized to zero and items can only contain items specified in labels.

Returns

dictionary where the keys are unique elements from items, and the values are the number of times the item appears in items.

Return type

dict[T, int]

Example

>>> import ubelt as ub
>>> items = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist = ub.dict_hist(items)
>>> print(ub.repr2(hist, nl=0))
{1: 1, 2: 4, 39: 1, 900: 3, 1232: 2}

Example

>>> import ubelt as ub
>>> items = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist1 = ub.dict_hist(items)
>>> hist2 = ub.dict_hist(items, ordered=True)
>>> try:
>>>     hist3 = ub.dict_hist(items, labels=[])
>>> except KeyError:
>>>     pass
>>> else:
>>>     raise AssertionError('expected key error')
>>> weights = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
>>> hist4 = ub.dict_hist(items, weights=weights)
>>> print(ub.repr2(hist1, nl=0))
{1: 1, 2: 4, 39: 1, 900: 3, 1232: 2}
>>> print(ub.repr2(hist4, nl=0))
{1: 1, 2: 4, 39: 1, 900: 1, 1232: 0}
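The unweighted and weighted cases above correspond to familiar standard-library patterns. The following sketch reproduces both results with collections.Counter and collections.defaultdict; it is an illustration, not ubelt's implementation.

```python
from collections import Counter, defaultdict

items = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
weights = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]

# Unweighted: count how often each item appears.
unweighted = dict(Counter(items))

# Weighted: add up the weight attached to each occurrence.
totals = defaultdict(int)
for item, weight in zip(items, weights):
    totals[item] += weight
weighted = dict(totals)

print(unweighted)
print(weighted)
```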

ubelt.dict_isect(*args)[source]¶

Dictionary set extension for set.intersection()

Constructs a dictionary that contains keys common between all inputs. The returned values will only belong to the first dictionary.

Parameters: *args (List[Dict[KT, VT] | Iterable[KT]]) – A sequence of dictionaries (or sets of keys). The first argument should always be a dictionary, but the subsequent arguments can just be sets of keys.
Returns: OrderedDict if the first argument is an OrderedDict, otherwise dict
Return type: Dict[KT, VT] | OrderedDict[KT, VT]

Note

This function can be used as an alternative to dict_subset() where any key not in the dictionary is ignored. See the following example:

>>> import ubelt as ub
>>> # xdoctest: +IGNORE_WANT
>>> ub.dict_isect({'a': 1, 'b': 2, 'c': 3}, ['a', 'c', 'd'])
{'a': 1, 'c': 3}

Example

>>> import ubelt as ub
>>> ub.dict_isect({'a': 1, 'b': 1}, {'b': 2, 'c': 2})
{'b': 1}
>>> ub.dict_isect(ub.odict([('a', 1), ('b', 2)]), ub.odict([('c', 3)]))
OrderedDict()
>>> ub.dict_isect()
{}
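The key-intersection logic can be sketched as follows. This illustration always returns a plain dict and requires the first argument, unlike the function documented here; dict_isect_sketch is a hypothetical name.

```python
def dict_isect_sketch(first, *others):
    """Keep items of first whose keys appear in every other argument."""
    common = set(first)
    for other in others:
        # Works for dicts (keys), sets, and lists of keys alike.
        common.intersection_update(other)
    # Values always come from the first dictionary.
    return {k: v for k, v in first.items() if k in common}


print(dict_isect_sketch({'a': 1, 'b': 1}, {'b': 2, 'c': 2}))
print(dict_isect_sketch({'a': 1, 'b': 2, 'c': 3}, ['a', 'c', 'd']))
```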

ubelt.dict_subset(dict_, keys, default=NoParam, cls=<class 'collections.OrderedDict'>)[source]¶

Get a subset of a dictionary

Parameters

dict_ (Dict[KT, VT]) – superset dictionary
keys (Iterable[KT]) – keys to take from dict_
default (Optional[object] | NoParamType) – if specified uses default if keys are missing.
cls (Type[Dict], default=OrderedDict) – type of the returned dictionary.

Returns

subset dictionary

Return type

Dict[KT, VT]

SeeAlso:: dict_isect() - similar functionality, but ignores missing keys

Example

>>> import ubelt as ub
>>> dict_ = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
>>> keys = ['K', 'dcvs_clip_max']
>>> subdict_ = ub.dict_subset(dict_, keys)
>>> print(ub.repr2(subdict_, nl=0))
{'K': 3, 'dcvs_clip_max': 0.2}
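A minimal sketch of the subset operation is shown below. It assumes that a missing key raises KeyError when no default is given, which is what plain indexing does; dict_subset_sketch and the _MISSING sentinel (standing in for NoParam) are hypothetical names.

```python
from collections import OrderedDict

_MISSING = object()  # hypothetical sentinel standing in for NoParam


def dict_subset_sketch(dict_, keys, default=_MISSING):
    """Take the listed keys from dict_, in the order given."""
    if default is _MISSING:
        # Plain indexing: a missing key raises KeyError.
        pairs = [(k, dict_[k]) for k in keys]
    else:
        pairs = [(k, dict_.get(k, default)) for k in keys]
    return OrderedDict(pairs)


dict_ = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
print(dict(dict_subset_sketch(dict_, ['K', 'dcvs_clip_max'])))
print(dict(dict_subset_sketch(dict_, ['K', 'absent'], default=None)))
```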

ubelt.dict_union(*args)[source]¶

Dictionary set extension for set.union()

Combines items from multiple dictionaries. For items with intersecting keys, dictionaries towards the end of the sequence are given precedence.

Parameters: *args (List[Dict]) – A sequence of dictionaries. Values are taken from the last dictionary in which a key appears.
Returns: OrderedDict if the first argument is an OrderedDict, otherwise dict
Return type: Dict | OrderedDict

Notes

In Python 3.9+, the bitwise or operator “|” performs a similar operation, but as of 2022-06-01 there is still no public method for dictionary union (or any other dictionary set operator).

References

https://stackoverflow.com/questions/38987/merge-two-dict

SeeAlso:: collections.ChainMap() - a standard python builtin data structure that provides a view that treats multiple dicts as a single dict. https://docs.python.org/3/library/collections.html#chainmap-objects

Example

>>> import ubelt as ub
>>> result = ub.dict_union({'a': 1, 'b': 1}, {'b': 2, 'c': 2})
>>> assert result == {'a': 1, 'b': 2, 'c': 2}
>>> ub.dict_union(
>>>     ub.odict([('a', 1), ('b', 2)]),
>>>     ub.odict([('c', 3), ('d', 4)]))
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
>>> ub.dict_union()
{}
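The precedence rule, where later dictionaries win on intersecting keys, can be sketched with repeated dict.update calls. This illustration always returns a plain dict; dict_union_sketch is a hypothetical name.

```python
def dict_union_sketch(*dicts):
    """Merge dictionaries; later ones override earlier ones."""
    merged = {}
    for d in dicts:
        merged.update(d)
    return merged


print(dict_union_sketch({'a': 1, 'b': 1}, {'b': 2, 'c': 2}))
```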

ubelt.download(url, fpath=None, dpath=None, fname=None, appname=None, hash_prefix=None, hasher='sha512', chunksize=8192, verbose=1, timeout=NoParam, progkw=None)[source]¶

Downloads a url to a file on disk.

If unspecified, the location and name of the file are chosen automatically. A hash_prefix can be specified to verify the integrity of the downloaded data. This function will download the data every time it is called. For cached downloading see grabdata.

Parameters

url (str) – The url to download.
fpath (Optional[str | PathLike | io.BytesIO]) – The path to download to. Defaults to basename of url and ubelt’s application cache. If this is an io.BytesIO object then information is directly written to this object (note this prevents the use of temporary files).
dpath (Optional[PathLike]) – where to download the file. If unspecified appname is used to determine this. Mutually exclusive with fpath.
fname (Optional[str]) – What to name the downloaded file. Defaults to the url basename. Mutually exclusive with fpath.
appname (str) – set dpath to ub.get_app_cache_dir(appname or 'ubelt') if dpath and fpath are not given.
hash_prefix (None | str) – If specified, download will retry / error if the file hash does not match this value. Defaults to None.
hasher (str | Hasher) – If hash_prefix is specified, this indicates the hashing algorithm to apply to the file. Defaults to sha512.
chunksize (int) – Download chunksize. Defaults to 2 ** 13.
verbose (int | bool) – Verbosity flag. Quiet is 0, higher is more verbose. Defaults to 1.
timeout (float | NoParamType) – Specify timeout in seconds for urllib.request.urlopen(). (if not specified, the global default timeout setting will be used) This only works for HTTP, HTTPS and FTP connections for blocking operations like the connection attempt.
progkw (Dict | NoParamType) – if specified provides extra arguments to the progress iterator. See ubelt.progiter.ProgIter for available options.

Returns

fpath - path to the downloaded file.

Return type

str | PathLike

Raises

URLError - if there is a problem downloading the url
RuntimeError - if the hash does not match the hash_prefix

Note

Based largely on code in pytorch [TorchDL] with modifications influenced by other resources [Shichao_2012] [SO_15644964] [SO_16694907].

References

Shichao_2012: https://blog.shichao.io/2012/10/04/progress_speed_indicator_for_urlretrieve_in_python.html
SO_15644964: http://stackoverflow.com/questions/15644964/python-progress-bar-and-downloads
SO_16694907: http://stackoverflow.com/questions/16694907/how-to-download-large-file-in-python-with-requests-py
TorchDL: https://github.com/pytorch/pytorch/blob/2787f1d8edbd4aadd4a8680d204341a1d7112e2d/torch/hub.py#L347

Example

>>> # xdoctest: +REQUIRES(--network)
>>> from ubelt.util_download import *  # NOQA
>>> url = 'http://i.imgur.com/rqwaDag.png'
>>> fpath = download(url)
>>> print(basename(fpath))
rqwaDag.png

Example

>>> # xdoctest: +REQUIRES(--network)
>>> import ubelt as ub
>>> import io
>>> url = 'http://i.imgur.com/rqwaDag.png'
>>> file = io.BytesIO()
>>> fpath = ub.download(url, file)
>>> file.seek(0)
>>> data = file.read()
>>> assert ub.hash_data(data, hasher='sha1').startswith('f79ea24571')

Example

>>> # xdoctest: +REQUIRES(--network)
>>> from ubelt.util_download import *  # NOQA
>>> url = 'http://i.imgur.com/rqwaDag.png'
>>> fpath = download(url, hasher='sha1', hash_prefix='f79ea24571da6ddd2ba12e3d57b515249ecb8a35')
Downloading url='http://i.imgur.com/rqwaDag.png' to fpath=...rqwaDag.png
...
...1233/1233... rate=... Hz, eta=..., total=...

Example

>>> # xdoctest: +REQUIRES(--network)
>>> import pytest
>>> import ubelt as ub
>>> url = 'http://i.imgur.com/rqwaDag.png'
>>> #fpath = download(url, hasher='sha1', hash_prefix='f79ea24571da6ddd2ba12e3d57b515249ecb8a35')
>>> # test download from girder
>>> #url = 'https://data.kitware.com/api/v1/item/5b4039308d777f2e6225994c/download'
>>> #ub.download(url, hasher='sha512', hash_prefix='c98a46cb31205cf')
>>> with pytest.raises(RuntimeError):
>>>     ub.download(url, hasher='sha512', hash_prefix='BAD_HASH')
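
The hash_prefix check described above can be illustrated without network access. The following is a minimal sketch using only hashlib. The helper name check_hash_prefix is hypothetical and not part of ubelt, and ubelt's own implementation may differ in detail.

```python
import hashlib


def check_hash_prefix(data, hash_prefix, hasher='sha512'):
    """Hypothetical helper (not a ubelt function).

    Raises RuntimeError if the hex digest of ``data`` does not start with
    ``hash_prefix``; otherwise returns the full hex digest.
    """
    digest = hashlib.new(hasher, data).hexdigest()
    if not digest.startswith(hash_prefix):
        raise RuntimeError(
            'hash mismatch: expected prefix {!r}, got {!r}'.format(
                hash_prefix, digest[:len(hash_prefix)]))
    return digest


payload = b'foobar'
# A prefix taken from the real digest is accepted.
good_prefix = hashlib.sha512(payload).hexdigest()[:16]
check_hash_prefix(payload, good_prefix)
# A prefix that cannot match is rejected.
try:
    check_hash_prefix(payload, 'BAD_HASH')
except RuntimeError as ex:
    print('rejected:', ex)
```

Only a leading portion of the digest needs to be supplied, which is why the examples above can pass either a full sha1 digest or a short sha512 prefix.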

ubelt.dzip(items1, items2, cls=<class 'dict'>)[source]¶

Zips elementwise pairs between items1 and items2 into a dictionary.

Values from items2 can be broadcast onto items1.

Parameters

items1 (Iterable[KT]) – full sequence
items2 (Iterable[VT]) – can either be a sequence of one item or a sequence of equal length to items1
cls (Type[dict], default=dict) – dictionary type to use.

Returns

similar to dict(zip(items1, items2)).

Return type

Dict[KT, VT]

Example

>>> import ubelt as ub
>>> assert ub.dzip([1, 2, 3], [4]) == {1: 4, 2: 4, 3: 4}
>>> assert ub.dzip([1, 2, 3], [4, 4, 4]) == {1: 4, 2: 4, 3: 4}
>>> assert ub.dzip([], [4]) == {}
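
The broadcasting rule can be sketched in plain Python. This is an illustrative re-implementation under the assumption that a length mismatch is an error; dzip_sketch is a hypothetical name and the code is not ubelt's source.

```python
def dzip_sketch(items1, items2, cls=dict):
    """Hypothetical sketch of zipping with broadcast of a single value."""
    items1 = list(items1)
    items2 = list(items2)
    if len(items2) == 1 and len(items1) != 1:
        # Broadcast the single value onto every key (or onto no keys at
        # all when items1 is empty).
        items2 = items2 * len(items1)
    if len(items1) != len(items2):
        # Assumption: mismatched lengths are treated as an error.
        raise ValueError('out of alignment: {} != {}'.format(
            len(items1), len(items2)))
    return cls(zip(items1, items2))


print(dzip_sketch([1, 2, 3], [4]))
print(dzip_sketch(['a', 'b'], [1, 2]))
```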

ubelt.ensure_app_cache_dir(appname, *args)[source]¶

Calls get_app_cache_dir() but ensures the directory exists.

Note

New applications should prefer ubelt.util_path.Path.appdir() i.e. ubelt.Path.appdir(appname, *args, type='cache').ensuredir().

Parameters

appname (str) – the name of the application
*args – any other subdirectories may be specified

Returns

the path to the ensured directory

Return type

str

SeeAlso:: get_app_cache_dir()

Example

>>> import ubelt as ub
>>> from os.path import exists
>>> dpath = ub.ensure_app_cache_dir('ubelt')
>>> assert exists(dpath)

ubelt.ensure_app_config_dir(appname, *args)[source]¶

Calls get_app_config_dir() but ensures the directory exists.

Note

New applications should prefer ubelt.util_path.Path.appdir() i.e. ubelt.Path.appdir(appname, *args, type='config').ensuredir().

Parameters

appname (str) – the name of the application
*args – any other subdirectories may be specified

Returns

the path to the ensured directory

Return type

str

SeeAlso:: get_app_config_dir()

Example

>>> import ubelt as ub
>>> from os.path import exists
>>> dpath = ub.ensure_app_config_dir('ubelt')
>>> assert exists(dpath)

ubelt.ensure_app_data_dir(appname, *args)[source]¶

Calls get_app_data_dir() but ensures the directory exists.

Note

New applications should prefer ubelt.util_path.Path.appdir() i.e. ubelt.Path.appdir(appname, *args, type='data').ensuredir().

Parameters

appname (str) – the name of the application
*args – any other subdirectories may be specified

Returns

the path to the ensured directory

Return type

str

SeeAlso:: get_app_data_dir()

Example

>>> import ubelt as ub
>>> from os.path import exists
>>> dpath = ub.ensure_app_data_dir('ubelt')
>>> assert exists(dpath)

ubelt.ensure_unicode(text)[source]¶

Casts bytes into utf8 (mostly for python2 compatibility)

Parameters: text (str | bytes) – text to ensure is decoded as unicode
Returns: str

References

SO_12561063: http://stackoverflow.com/questions/12561063/extract-data-from-file

Example

>>> from ubelt.util_str import *
>>> import codecs  # NOQA
>>> assert ensure_unicode('my ünicôdé strįng') == 'my ünicôdé strįng'
>>> assert ensure_unicode('text1') == 'text1'
>>> assert ensure_unicode('text1'.encode('utf8')) == 'text1'
>>> assert ensure_unicode('ï»¿text1'.encode('utf8')) == 'ï»¿text1'
>>> assert (codecs.BOM_UTF8 + 'text»¿'.encode('utf8')).decode('utf8')

ubelt.ensuredir(dpath, mode=1023, verbose=0, recreate=False)[source]¶

Ensures that the directory exists. Creates a new directory with the sticky bit set by default.

Parameters

dpath (str | PathLike | Tuple[str | PathLike]) – dir to ensure. Can also be a tuple to send to join
mode (int) – octal mode of directory. The default, 1023, is 0o1777 written in decimal.
verbose (int) – verbosity
recreate (bool) – if True removes the directory and all of its contents and creates a fresh new directory. USE CAREFULLY.

Returns

path - the ensured directory

Return type

str

SeeAlso:: ubelt.Path.ensuredir()

Note

This function is not thread-safe in Python2

Example

>>> from ubelt.util_path import *  # NOQA
>>> import ubelt as ub
>>> cache_dpath = ub.Path.appdir('ubelt').ensuredir()
>>> dpath = join(cache_dpath, 'ensuredir')
>>> if exists(dpath):
...     os.rmdir(dpath)
>>> assert not exists(dpath)
>>> ub.ensuredir(dpath)
>>> assert exists(dpath)
>>> os.rmdir(dpath)
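
The documented behavior, including the destructive recreate option, can be approximated with the standard library. This is a hedged sketch: ensuredir_sketch is a hypothetical name, and the tuple form of dpath and the verbose flag are not handled.

```python
import os
import shutil
import tempfile


def ensuredir_sketch(dpath, mode=0o1777, recreate=False):
    """Hypothetical stand-in: create ``dpath`` if needed and return it."""
    dpath = os.fspath(dpath)
    if recreate and os.path.isdir(dpath):
        # Removes the directory and everything inside it.
        shutil.rmtree(dpath)
    # exist_ok=True makes the call a no-op when the directory exists.
    os.makedirs(dpath, mode=mode, exist_ok=True)
    return dpath


root = tempfile.mkdtemp()
dpath = ensuredir_sketch(os.path.join(root, 'a', 'b'))
open(os.path.join(dpath, 'keep.txt'), 'w').close()
ensuredir_sketch(dpath)                 # existing contents are kept
print(os.listdir(dpath))
ensuredir_sketch(dpath, recreate=True)  # contents are discarded
print(os.listdir(dpath))
shutil.rmtree(root)
```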

ubelt.expandpath(path)[source]¶

Shell-like environment variable and tilde path expansion.

Parameters: path (str | PathLike) – string representation of a path
Returns: expanded path
Return type: str

Example

>>> from ubelt.util_path import *  # NOQA
>>> import ubelt as ub
>>> assert normpath(ub.expandpath('~/foo')) == join(ub.userhome(), 'foo')
>>> assert ub.expandpath('foo') == 'foo'
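
Tilde and environment-variable expansion are both available in os.path, so the behavior can be sketched as a composition of the two. expandpath_sketch and the UBELT_DEMO_DIR variable are hypothetical names used only for this illustration.

```python
import os


def expandpath_sketch(path):
    """Hypothetical sketch: expand ``~`` and then ``$VARS`` in a path."""
    path = os.path.expanduser(path)
    path = os.path.expandvars(path)
    return path


# UBELT_DEMO_DIR is a made-up variable set only for this demonstration.
os.environ['UBELT_DEMO_DIR'] = 'demo'
print(expandpath_sketch('$UBELT_DEMO_DIR/foo'))
print(expandpath_sketch('~/foo'))
print(expandpath_sketch('foo'))
```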

ubelt.find_duplicates(items, k=2, key=None)[source]¶

Find all duplicate items in a list.

Search for all items that appear at least k times and return a mapping from each (k)-duplicate item to the positions it appeared in.

Parameters

items (Iterable[T]) – hashable items possibly containing duplicates
k (int, default=2) – only return items that appear at least k times.
key (Callable[[T], Any], default=None) – Returns indices where key(items[i]) maps to a particular value at least k times.

Returns

maps each duplicate item to the indices at which it appears

Return type

dict[T, List[int]]

Notes

Similar to more_itertools.duplicates_everseen(), more_itertools.duplicates_justseen().

Example

>>> import ubelt as ub
>>> items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
>>> duplicates = ub.find_duplicates(items)
>>> # Duplicates are a mapping from each item that occurs 2 or more
>>> # times to the indices at which they occur.
>>> assert duplicates == {0: [0, 1, 6], 2: [3, 8], 3: [4, 5]}
>>> # You can set k=3 if you don't mind duplicates but
>>> # want to find triplicates, quadruplicates, etc.
>>> assert ub.find_duplicates(items, k=3) == {0: [0, 1, 6]}

Example

>>> import ubelt as ub
>>> items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
>>> # note: k can be less than 2
>>> duplicates = ub.find_duplicates(items, k=0)
>>> print(ub.repr2(duplicates, nl=0))
{0: [0, 1, 6], 1: [2], 2: [3, 8], 3: [4, 5], 9: [9], 12: [7]}

Example

>>> import ubelt as ub
>>> items = [10, 11, 12, 13, 14, 15, 16]
>>> duplicates = ub.find_duplicates(items, key=lambda x: x // 2)
>>> print(ub.repr2(duplicates, nl=0))
{5: [0, 1], 6: [2, 3], 7: [4, 5]}
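
The mapping from items to positions can be built in a single pass. The following is a minimal sketch of that idea; find_duplicates_sketch is a hypothetical name and the code is not taken from ubelt.

```python
from collections import defaultdict


def find_duplicates_sketch(items, k=2, key=None):
    """Hypothetical sketch: map each value seen at least k times to the
    indices at which it appears."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        # When a key function is given, group by key(item) instead.
        group = item if key is None else key(item)
        positions[group].append(index)
    return {g: idxs for g, idxs in positions.items() if len(idxs) >= k}


items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
print(find_duplicates_sketch(items))
print(find_duplicates_sketch(items, k=3))
```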

ubelt.find_exe(name, multi=False, path=None)[source]¶

Locate a command.

Search your local filesystem for an executable and return the first matching file with executable permission.

Parameters

name (str | PathLike) – globstr of matching filename
multi (bool, default=False) – if True return all matches instead of just the first.
path (str | PathLike | Iterable[str | PathLike] | None, default=None) – overrides the system PATH variable.

Returns

returns matching executable(s).

Return type

str | List[str] | None

SeeAlso:: shutil.which() - which is available in Python 3.3+.

Note

This is essentially the which UNIX command

References

SO_377017: https://stackoverflow.com/questions/377017/test-if-executable-exists-in-python/377028#377028
shutil_which: https://docs.python.org/dev/library/shutil.html#shutil.which

Example

>>> # The following are programs commonly exposed via the PATH variable.
>>> # Exact results may differ between machines.
>>> # xdoctest: +IGNORE_WANT
>>> import ubelt as ub
>>> print(ub.find_exe('ls'))
>>> print(ub.find_exe('ping'))
>>> print(ub.find_exe('which'))
>>> print(ub.find_exe('which', multi=True))
>>> print(ub.find_exe('ping', multi=True))
>>> print(ub.find_exe('noexist', multi=True))
/usr/bin/ls
/usr/bin/ping
/usr/bin/which
['/usr/bin/which', '/bin/which']
['/usr/bin/ping', '/bin/ping']
[]

Example

>>> import ubelt as ub
>>> assert not ub.find_exe('!noexist', multi=False)
>>> assert ub.find_exe('ping', multi=False) or ub.find_exe('ls', multi=False)
>>> assert not ub.find_exe('!noexist', multi=True)
>>> assert ub.find_exe('ping', multi=True) or ub.find_exe('ls', multi=True)

Benchmark:

>>> # xdoctest: +IGNORE_WANT
>>> import ubelt as ub
>>> import shutil
>>> from timerit import Timerit
>>> for timer in Timerit(1000, bestof=10, label='ub.find_exe'):
>>>     ub.find_exe('which')
>>> for timer in Timerit(1000, bestof=10, label='shutil.which'):
>>>     shutil.which('which')
Timed best=25.339 µs, mean=25.809 ± 0.3 µs for ub.find_exe
Timed best=28.600 µs, mean=28.986 ± 0.3 µs for shutil.which

ubelt.find_path(name, path=None, exact=False)[source]¶

Search for a file or directory on your local filesystem by name (the file must be in a directory specified in the PATH environment variable).

Parameters

name (str | PathLike) – file name to match. If exact is False this may be a glob pattern
path (str | Iterable[str | PathLike], default=None) – list of directories to search either specified as an os.pathsep separated string or a list of directories. Defaults to environment PATH.
exact (bool, default=False) – if True, only returns exact matches.

Yields

str – candidate - a path that matches name

Note

Running with name='' (i.e. ub.find_path('')) will simply yield all directories in your PATH.

Note

For recursive behavior set path=(d for d, _, _ in os.walk('.')), where ‘.’ might be replaced by the root directory of interest.

Example

>>> # xdoctest: +IGNORE_WANT
>>> import ubelt as ub
>>> print(list(ub.find_path('ping', exact=True)))
>>> print(list(ub.find_path('bin')))
>>> print(list(ub.find_path('gcc*')))
>>> print(list(ub.find_path('cmake*')))
['/usr/bin/ping', '/bin/ping']
[]
[... '/usr/bin/gcc-11', '/usr/bin/gcc-ranlib', ...]
[... '/usr/bin/cmake-gui', '/usr/bin/cmake', ...]

Example

>>> import ubelt as ub
>>> from os.path import dirname
>>> path = dirname(dirname(ub.util_platform.__file__))
>>> res = sorted(ub.find_path('ubelt/util_*.py', path=path))
>>> assert len(res) >= 10
>>> res = sorted(ub.find_path('ubelt/util_platform.py', path=path, exact=True))
>>> print(res)
>>> assert len(res) == 1
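
The search described above, including the recursive variant from the note, can be sketched with glob and os.walk. find_path_sketch is a hypothetical name, and the file names in the demonstration are made up.

```python
import glob
import os
import tempfile


def find_path_sketch(name, path=None, exact=False):
    """Hypothetical sketch: yield paths matching ``name`` in each directory."""
    if path is None:
        path = os.environ.get('PATH', '')
    # Accept either an os.pathsep separated string or an iterable.
    dpaths = path.split(os.pathsep) if isinstance(path, str) else path
    for dpath in dpaths:
        candidate = os.path.join(dpath, name)
        if exact:
            if os.path.exists(candidate):
                yield candidate
        else:
            # Treat ``name`` as a glob pattern.
            for match in sorted(glob.glob(candidate)):
                yield match


with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, 'sub')
    os.makedirs(sub)
    for fname in ['gcc-11', 'gcc-ranlib', 'cmake']:
        open(os.path.join(sub, fname), 'w').close()
    # Recursive search: pass every directory under root as the path.
    walk_dirs = (d for d, _, _ in os.walk(root))
    found = list(find_path_sketch('gcc*', path=walk_dirs))
    print([os.path.basename(p) for p in found])
```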

ubelt.flatten(nested)[source]¶

Transforms a nested iterable into a flat iterable.

Parameters: nested (Iterable[Iterable[Any]]) – list of lists
Returns: flattened items
Return type: Iterable[Any]

Notes

Equivalent to more_itertools.flatten() and itertools.chain.from_iterable().

Example

>>> import ubelt as ub
>>> nested = [['a', 'b'], ['c', 'd']]
>>> list(ub.flatten(nested))
['a', 'b', 'c', 'd']
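
The equivalence with itertools.chain.from_iterable() mentioned in the notes can be checked directly. Only one level of nesting is removed; deeper lists are left intact.

```python
import itertools

nested = [['a', 'b'], ['c', 'd']]
flat = list(itertools.chain.from_iterable(nested))
print(flat)

# Only the outermost level is flattened.
deeper = [[1, [2, 3]], [4]]
print(list(itertools.chain.from_iterable(deeper)))
```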

ubelt.get_app_cache_dir(appname, *args)[source]¶

Returns a writable directory for an application. This should be used for temporary deletable data.

Note

New applications should prefer ubelt.util_path.Path.appdir() i.e. ubelt.Path.appdir(appname, *args, type='cache').

Parameters

appname (str) – the name of the application
*args – any other subdirectories may be specified

Returns

dpath - writable cache directory for this application

Return type

str

SeeAlso:: ensure_app_cache_dir()

ubelt.get_app_config_dir(appname, *args)[source]¶

Returns a writable directory for an application. This should be used for persistent configuration files.

Note

New applications should prefer ubelt.util_path.Path.appdir() i.e. ubelt.Path.appdir(appname, *args, type='config').

Parameters

appname (str) – the name of the application
*args – any other subdirectories may be specified

Returns

dpath - writable config directory for this application

Return type

str

SeeAlso:: ensure_app_config_dir()

ubelt.get_app_data_dir(appname, *args)[source]¶

Returns a writable directory for an application. This should be used for persistent application data.

Note

New applications should prefer ubelt.util_path.Path.appdir() i.e. ubelt.Path.appdir(appname, *args, type='data').

Parameters

appname (str) – the name of the application
*args – any other subdirectories may be specified

Returns

dpath - writable data directory for this application

Return type

str

SeeAlso:: ensure_app_data_dir()

ubelt.grabdata(url, fpath=None, dpath=None, fname=None, redo=False, verbose=1, appname=None, hash_prefix=None, hasher='sha512', expires=None, **download_kw)[source]¶

Downloads a file, caches it, and returns its local path.

If unspecified the location and name of the file is chosen automatically. A hash_prefix can be specified to verify the integrity of the downloaded data.

Parameters

url (str) – url of the file to download
fpath (Optional[str | PathLike]) – The full path to download the file to. If unspecified, the arguments dpath and fname are used to determine this.
dpath (Optional[str | PathLike]) – where to download the file. If unspecified appname is used to determine this. Mutually exclusive with fpath.
fname (Optional[str]) – What to name the downloaded file. Defaults to the url basename. Mutually exclusive with fpath.
redo (bool, default=False) – if True forces redownload of the file
verbose (int) – Verbosity flag. Quiet is 0, higher is more verbose. Defaults to 1.
appname (str) – set dpath to ub.get_app_cache_dir(appname or 'ubelt') if dpath and fpath are not given.
hash_prefix (None | str) – If specified, grabdata verifies that this matches the hash of the file, and then saves the hash in an adjacent file to certify that the download was successful. Defaults to None.
hasher (str | Hasher) – If hash_prefix is specified, this indicates the hashing algorithm to apply to the file. Defaults to sha512. NOTE: Only pass hasher as a string. Passing as an instance is deprecated and can cause unexpected results.
expires (str | int | datetime.datetime) – when the cache should expire and redownload or the number of seconds to wait before the cache should expire.
**download_kw – additional kwargs to pass to ubelt.util_download.download()

Returns

fpath - path to downloaded or cached file.

Return type

str | PathLike

CommandLine

xdoctest -m ubelt.util_download grabdata --network

Example

>>> # xdoctest: +REQUIRES(--network)
>>> import ubelt as ub
>>> from os.path import basename
>>> url = 'http://i.imgur.com/rqwaDag.png'
>>> fpath = ub.grabdata(url, fname='mario.png')
>>> result = basename(fpath)
>>> print(result)
mario.png

Example

>>> # xdoctest: +REQUIRES(--network)
>>> import ubelt as ub
>>> import json
>>> fname = 'foo.bar'
>>> url = 'http://i.imgur.com/rqwaDag.png'
>>> prefix1 = '944389a39dfb8fa9'
>>> fpath = ub.grabdata(url, fname=fname, hash_prefix=prefix1, verbose=3)
>>> stamp_fpath = ub.Path(fpath + '.stamp_sha512.json')
>>> assert json.loads(stamp_fpath.read_text())['hash'][0].startswith(prefix1)
>>> # Check that the download doesn't happen again
>>> fpath = ub.grabdata(url, fname=fname, hash_prefix=prefix1)
>>> # todo: check file timestamps have not changed
>>> #
>>> # Check redo works with hash
>>> fpath = ub.grabdata(url, fname=fname, hash_prefix=prefix1, redo=True)
>>> # todo: check file timestamps have changed
>>> #
>>> # Check that a redownload occurs when the stamp is changed
>>> with open(stamp_fpath, 'w') as file:
>>>     file.write('corrupt-stamp')
>>> fpath = ub.grabdata(url, fname=fname, hash_prefix=prefix1)
>>> assert json.loads(stamp_fpath.read_text())['hash'][0].startswith(prefix1)
>>> #
>>> # Check that a redownload occurs when the stamp is removed
>>> ub.delete(stamp_fpath)
>>> with open(fpath, 'w') as file:
>>>     file.write('corrupt-data')
>>> assert not ub.hash_file(fpath, base='hex', hasher='sha512').startswith(prefix1)
>>> fpath = ub.grabdata(url, fname=fname, hash_prefix=prefix1)
>>> assert ub.hash_file(fpath, base='hex', hasher='sha512').startswith(prefix1)
>>> #
>>> # Check that requesting new data causes redownload
>>> #url2 = 'https://data.kitware.com/api/v1/item/5b4039308d777f2e6225994c/download'
>>> #prefix2 = 'c98a46cb31205cf'  # hack SSL
>>> url2 = 'http://i.imgur.com/rqwaDag.png'
>>> prefix2 = '944389a39dfb8fa9'
>>> fpath = ub.grabdata(url2, fname=fname, hash_prefix=prefix2)
>>> assert json.loads(stamp_fpath.read_text())['hash'][0].startswith(prefix2)

ubelt.group_items(items, key)[source]¶

Groups a list of items by group id.

Parameters

items (Iterable[VT]) – a list of items to group
key (Iterable[KT] | Callable[[VT], KT]) – either a corresponding list of group-ids for each item or a function used to map each item to a group-id.

Returns

a mapping from each group id to the list of corresponding items

Return type

dict[KT, List[VT]]

Example

>>> import ubelt as ub
>>> items    = ['ham',     'jam',   'spam',     'eggs',    'cheese', 'banana']
>>> groupids = ['protein', 'fruit', 'protein',  'protein', 'dairy',  'fruit']
>>> id_to_items = ub.group_items(items, groupids)
>>> print(ub.repr2(id_to_items, nl=0))
{'dairy': ['cheese'], 'fruit': ['jam', 'banana'], 'protein': ['ham', 'spam', 'eggs']}
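
Both forms of the key argument (a parallel list of group ids, or a function) reduce to the same accumulation loop. The following is a minimal sketch; group_items_sketch is a hypothetical name and not ubelt's code.

```python
from collections import defaultdict


def group_items_sketch(items, key):
    """Hypothetical sketch: group items by a parallel id list or a function."""
    if callable(key):
        pairs = ((key(item), item) for item in items)
    else:
        pairs = zip(key, items)
    groups = defaultdict(list)
    for groupid, item in pairs:
        groups[groupid].append(item)
    return dict(groups)


items = ['ham', 'jam', 'spam', 'eggs', 'cheese', 'banana']
groupids = ['protein', 'fruit', 'protein', 'protein', 'dairy', 'fruit']
print(group_items_sketch(items, groupids))
# The key may also be a function, here the length of each word.
print(group_items_sketch(items, len))
```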

ubelt.hash_data(data, hasher=NoParam, base=NoParam, types=False, convert=False, extensions=None)[source]¶

Get a unique hash depending on the state of the data.

Parameters

data (object) – Any sort of loosely organized data
hasher (str | Hasher | NoParamType) – string code or a hash algorithm from hashlib. Valid hashing algorithms are defined by hashlib.algorithms_guaranteed (e.g. ‘sha1’, ‘sha512’, ‘md5’) as well as ‘xxh32’ and ‘xxh64’ if xxhash is installed. Defaults to ‘sha512’.
base (List[str] | str | NoParamType) – list of symbols or shorthand key. Valid keys are ‘abc’, ‘hex’, and ‘dec’. Defaults to ‘hex’
types (bool) – If True data types are included in the hash, otherwise only the raw data is hashed. Defaults to False.
convert (bool, default=False) – if True, try to convert the data to json, and the json is hashed instead. This can improve runtime in some instances, however the hash may differ from the case where convert=False.
extensions (HashableExtensions) – a custom HashableExtensions instance that can overwrite or define how different types of objects are hashed.

Note

The types allowed are specified by the HashableExtensions object. By default ubelt will register:

OrderedDict, uuid.UUID, np.random.RandomState, np.int64, np.int32, np.int16, np.int8, np.uint64, np.uint32, np.uint16, np.uint8, np.float16, np.float32, np.float64, np.float128, np.ndarray, bytes, str, int, float, long (in python2), list, tuple, set, and dict

Returns: text representing the hashed data
Return type: str

Note

The alphabet26 base is a pretty nice base, I recommend it. However we default to base='hex' because it is standard. You can try the alphabet26 base by setting base='abc'.

Example

>>> import ubelt as ub
>>> print(ub.hash_data([1, 2, (3, '4')], convert=False))
60b758587f599663931057e6ebdf185a...
>>> print(ub.hash_data([1, 2, (3, '4')], base='abc',  hasher='sha512')[:32])
hsrgqvfiuxvvhcdnypivhhthmrolkzej

ubelt.hash_file(fpath, blocksize=1048576, stride=1, maxbytes=None, hasher=NoParam, base=NoParam)[source]¶

Hashes the data in a file on disk.

The results of this function agree with the standard UNIX commands (e.g. sha1sum, sha512sum, md5sum, etc…)

Parameters

fpath (PathLike) – location of the file to be hashed.
blocksize (int) – Amount of data to read and hash at a time. There is a trade off and the optimal number will depend on specific hardware. This number was chosen to be optimal on a developer system. See “dev/bench_hash_file” for methodology to choose this number for your use case. Defaults to 2 ** 20.
stride (int) – strides > 1 skip data to hash, useful for faster hashing, but less accurate, also makes hash dependent on blocksize. Defaults to 1.
maxbytes (int | None) – if specified, only hash the leading maxbytes of data in the file.
hasher (str | Hasher | NoParamType) – string code or a hash algorithm from hashlib. Valid hashing algorithms are defined by hashlib.algorithms_guaranteed (e.g. ‘sha1’, ‘sha512’, ‘md5’) as well as ‘xxh32’ and ‘xxh64’ if xxhash is installed. Defaults to ‘sha512’. TODO: add logic such that you can update an existing hasher
base (List[str] | str | NoParamType) – list of symbols or shorthand key. Valid keys are ‘abc’, ‘hex’, and ‘dec’. Defaults to ‘hex’.

Note

For better hashes keep stride = 1. For faster hashes set stride > 1. Blocksize matters when stride > 1.

References

SO_3431825: http://stackoverflow.com/questions/3431825/md5-checksum-of-a-file
SO_5001893: http://stackoverflow.com/questions/5001893/when-to-use-sha-1-vs-sha-2

Example

>>> import ubelt as ub
>>> from os.path import join
>>> dpath = ub.Path.appdir('ubelt/tests/test-hash').ensuredir()
>>> fpath = dpath / 'tmp1.txt'
>>> fpath.write_text('foobar')
>>> print(ub.hash_file(fpath, hasher='sha1', base='hex'))
8843d7f92416211de9ebb963ff4ce28125932878

Example

>>> import ubelt as ub
>>> dpath = ub.Path.appdir('ubelt/tests/test-hash').ensuredir()
>>> fpath = dpath / 'tmp2.txt'
>>> # We have the ability to only hash at most ``maxbytes`` in a file
>>> fpath.write_text('abcdefghijklmnop')
>>> h0 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=11, blocksize=3)
>>> h1 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=32, blocksize=3)
>>> h2 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=32, blocksize=32)
>>> h3 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=16, blocksize=1)
>>> h4 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=16, blocksize=18)
>>> assert h1 == h2 == h3 == h4
>>> assert h1 != h0

>>> # Using a stride makes the result dependent on the blocksize
>>> h0 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=11, blocksize=3, stride=2)
>>> h1 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=32, blocksize=3, stride=2)
>>> h2 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=32, blocksize=32, stride=2)
>>> h3 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=16, blocksize=1, stride=2)
>>> h4 = ub.hash_file(fpath, hasher='sha1', base='hex', maxbytes=16, blocksize=18, stride=2)
>>> assert h1 != h2 != h3
>>> assert h1 == h0
>>> assert h2 == h4

Example

>>> import ubelt as ub
>>> from os.path import join
>>> dpath = ub.Path.appdir('ubelt/tests/test-hash').ensuredir()
>>> fpath = ub.touch(join(dpath, 'empty_file'))
>>> # Test that the output is the same as sha1sum executable
>>> if ub.find_exe('sha1sum'):
>>>     want = ub.cmd(['sha1sum', fpath], verbose=2)['out'].split(' ')[0]
>>>     got = ub.hash_file(fpath, hasher='sha1')
>>>     print('want = {!r}'.format(want))
>>>     print('got = {!r}'.format(got))
>>>     assert want.endswith(got)
>>> # Do the same for sha512 sum and md5sum
>>> if ub.find_exe('sha512sum'):
>>>     want = ub.cmd(['sha512sum', fpath], verbose=2)['out'].split(' ')[0]
>>>     got = ub.hash_file(fpath, hasher='sha512')
>>>     print('want = {!r}'.format(want))
>>>     print('got = {!r}'.format(got))
>>>     assert want.endswith(got)
>>> if ub.find_exe('md5sum'):
>>>     want = ub.cmd(['md5sum', fpath], verbose=2)['out'].split(' ')[0]
>>>     got = ub.hash_file(fpath, hasher='md5')
>>>     print('want = {!r}'.format(want))
>>>     print('got = {!r}'.format(got))
>>>     assert want.endswith(got)
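
The interaction between blocksize and maxbytes can be sketched with hashlib. This is an illustrative approximation only: hash_file_sketch is a hypothetical name, stride is not implemented, and only hexadecimal output is produced.

```python
import hashlib
import os
import tempfile


def hash_file_sketch(fpath, blocksize=2 ** 20, maxbytes=None, hasher='sha512'):
    """Hypothetical sketch: hash a file in blocks, optionally stopping
    after the leading ``maxbytes`` bytes."""
    h = hashlib.new(hasher)
    remaining = maxbytes
    with open(fpath, 'rb') as file:
        while True:
            # Never read past the maxbytes limit.
            size = blocksize if remaining is None else min(blocksize, remaining)
            if size <= 0:
                break
            chunk = file.read(size)
            if not chunk:
                break
            h.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return h.hexdigest()


with tempfile.TemporaryDirectory() as dpath:
    fpath = os.path.join(dpath, 'tmp.txt')
    with open(fpath, 'wb') as file:
        file.write(b'abcdefghijklmnop')
    # Without a stride the result does not depend on the blocksize.
    h1 = hash_file_sketch(fpath, hasher='sha1', blocksize=3)
    h2 = hash_file_sketch(fpath, hasher='sha1', blocksize=32)
    h0 = hash_file_sketch(fpath, hasher='sha1', maxbytes=11, blocksize=3)
    print(h1 == h2, h1 != h0)
```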

ubelt.highlight_code(text, lexer_name='python', **kwargs)[source]¶

Highlights a block of text using ANSI tags based on language syntax.

Parameters

text (str) – plain text to highlight
lexer_name (str) – name of language. eg: python, docker, c++
**kwargs – passed to pygments.lexers.get_lexer_by_name

Returns

text - highlighted text. If pygments is not installed, the plain text is returned.

Return type

str

Example

>>> import ubelt as ub
>>> text = 'import ubelt as ub; print(ub)'
>>> new_text = ub.highlight_code(text)
>>> print(new_text)

ubelt.hzcat(args, sep='')[source]¶

Horizontally concatenates strings preserving indentation

Concatenates a list of objects ensuring that the next item in the list is all the way to the right of any previous items.

Parameters

args (List[str]) – strings to concatenate
sep (str, default=’’) – separator

Example1:

>>> import ubelt as ub
>>> B = ub.repr2([[1, 2], [3, 457]], nl=1, cbr=True, trailsep=False)
>>> C = ub.repr2([[5, 6], [7, 8]], nl=1, cbr=True, trailsep=False)
>>> args = ['A = ', B, ' * ', C]
>>> print(ub.hzcat(args))
A = [[1, 2],   * [[5, 6],
     [3, 457]]    [7, 8]]

Example2:

>>> import ubelt as ub
>>> import unicodedata
>>> aa = unicodedata.normalize('NFD', 'á')  # a unicode char with len2
>>> B = ub.repr2([['θ', aa], [aa, aa, aa]], nl=1, si=True, cbr=True, trailsep=False)
>>> C = ub.repr2([[5, 6], [7, 'θ']], nl=1, si=True, cbr=True, trailsep=False)
>>> args = ['A', '=', B, '*', C]
>>> print(ub.hzcat(args, sep='｜'))
A｜=｜[[θ, á],   ｜*｜[[5, 6],
 ｜ ｜ [á, á, á]]｜ ｜ [7, θ]]
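
The column layout can be sketched by padding each block to its widest line before joining rows. hzcat_sketch is a hypothetical name; it measures width with len(), so combining characters such as those in Example2 would not be aligned correctly by this simplified version.

```python
def hzcat_sketch(args, sep=''):
    """Hypothetical sketch: place multi-line strings side by side."""
    columns = [str(arg).split('\n') for arg in args]
    height = max((len(col) for col in columns), default=0)
    rows = [[] for _ in range(height)]
    for col in columns:
        # Pad every line of this column to the column's widest line.
        width = max(len(line) for line in col)
        padded = col + [''] * (height - len(col))
        for row, line in zip(rows, padded):
            row.append(line.ljust(width))
    # Trailing padding on each row is stripped.
    return '\n'.join(sep.join(row).rstrip() for row in rows)


B = '[[1, 2],\n [3, 457]]'
C = '[[5, 6],\n [7, 8]]'
print(hzcat_sketch(['A = ', B, ' * ', C]))
```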

ubelt.identity(arg=None, *args, **kwargs)[source]¶

Return the value of the first argument unchanged.

All other positional and keyword inputs are ignored. Defaults to None if called without any args.

The name identity is used in the mathematical sense [WikiIdentity]. This is slightly different than the pure identity function, which is defined strictly with a single argument. This implementation allows but ignores extra arguments, making it easier to use as a drop in replacement for functions that accept extra configuration arguments that change their behavior and aren’t true inputs.

The value of this utility is a cleaner way to write lambda x: x or more precisely lambda x=None, *a, **k: x or writing the function inline. Unlike the lambda variant, this does not trigger common linter errors when assigning it to a value.

Parameters

arg (Any, default=None) – The value to return unchanged.
*args – Ignored
**kwargs – Ignored

Returns

arg - The same value of the first positional argument.

Return type

Any

References

WikiIdentity: https://en.wikipedia.org/wiki/Identity_function

Example

>>> import ubelt as ub
>>> ub.identity(42)
42
>>> ub.identity(42, 43)
42
>>> ub.identity()
None

ubelt.import_module_from_name(modname)[source]¶

Imports a module from its string name (i.e. __name__)

This is a simple wrapper around importlib.import_module(), but is provided as a companion function to import_module_from_path(), which contains functionality not provided in the Python standard library.

Parameters: modname (str) – module name
Returns: module
Return type: ModuleType

SeeAlso:: import_module_from_path()

Example

>>> import sys
>>> from ubelt import import_module_from_name
>>> # test with modules that won't be imported in normal circumstances
>>> # todo write a test where we guarantee this
>>> modname_list = [
>>>     'pickletools',
>>>     'lib2to3.fixes.fix_apply',
>>> ]
>>> #assert not any(m in sys.modules for m in modname_list)
>>> modules = [import_module_from_name(modname) for modname in modname_list]
>>> assert [m.__name__ for m in modules] == modname_list
>>> assert all(m in sys.modules for m in modname_list)

ubelt.import_module_from_path(modpath, index=-1)[source]¶

Imports a module via a filesystem path.

This works by modifying sys.path, importing the module name, and then attempting to undo the change to sys.path. This function may produce unexpected results in the case where the imported module itself modifies sys.path or if there is another conflicting module with the same name.

Parameters

modpath (str | PathLike) – Path to the module on disk or within a zipfile. Paths within a zipfile can be given by <path-to>.zip/<path-inside-zip>.py.
index (int) – Location at which we modify PYTHONPATH if necessary. If your module name does not conflict, the safest value is -1. However, if there is a conflict, then use an index of 0. The default may change to 0 in the future.

Returns

the imported module

Return type

ModuleType

References

SO_67631: https://stackoverflow.com/questions/67631/import-module-given-path

Raises

IOError - when the path to the module does not exist
ImportError - when the module is unable to be imported

Note

If the module is part of a package, the package will be imported first. These modules may cause problems when reloading via IPython magic

This can import a module from within a zipfile. To do this modpath should specify the path to the zipfile and the path to the module within that zipfile separated by a colon or pathsep. E.g. “/path/to/archive.zip:mymodule.py”

Warning

It is best to use this with paths that will not conflict with previously existing modules.

If the modpath conflicts with a previously existing module name, and the target module does imports of its own relative to this conflicting path, then the module that was loaded first will win.

For example if you try to import ‘/foo/bar/pkg/mod.py’ from the folder structure:

- foo/
  +- bar/
     +- pkg/
        +  __init__.py
        |- mod.py
        |- helper.py

If there exists another module named pkg already in sys.modules and mod.py does something like from . import helper, Python will assume helper belongs to the pkg module already in sys.modules. This can cause a NameError or, worse, an incorrect helper module.

SeeAlso:: import_module_from_name()

Example

>>> import xdoctest
>>> from ubelt import import_module_from_path
>>> modpath = xdoctest.__file__
>>> module = import_module_from_path(modpath)
>>> assert module is xdoctest

Example

>>> # Test importing a module from within a zipfile
>>> import os
>>> import zipfile
>>> from ubelt import import_module_from_path
>>> from xdoctest import utils
>>> from os.path import join, expanduser, normpath
>>> dpath = expanduser('~/.cache/xdoctest')
>>> dpath = utils.ensuredir(dpath)
>>> #dpath = utils.TempDir().ensure()
>>> # Write to an external module named bar
>>> external_modpath = join(dpath, 'bar.py')
>>> # For pypy support we have to write this using with
>>> with open(external_modpath, 'w') as file:
>>>     file.write('testvar = 1')
>>> internal = 'folder/bar.py'
>>> # Move the external bar module into a zipfile
>>> zippath = join(dpath, 'myzip.zip')
>>> with zipfile.ZipFile(zippath, 'w') as myzip:
>>>     myzip.write(external_modpath, internal)
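>>> # Import the bar module from within the zipfile. The archive path and
>>> # the internal path can be joined with a colon (zippath + ':' + internal)
>>> # or with the path separator, as done here.
>>> modpath = zippath + os.path.sep + internal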
>>> module = import_module_from_path(modpath)
>>> assert normpath(module.__name__) == normpath('folder/bar')
>>> assert module.testvar == 1

Example

>>> import pytest
>>> from ubelt import import_module_from_path
>>> with pytest.raises(IOError):
>>>     import_module_from_path('does-not-exist')
>>> with pytest.raises(IOError):
>>>     import_module_from_path('does-not-exist.zip/')
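
The sys.path strategy described above can be sketched for the simplest case of a single top-level .py file. import_from_path_sketch and the module name ubelt_sketch_demo_mod are hypothetical; packages, zipfiles, and name conflicts are deliberately not handled.

```python
import importlib
import os
import sys
import tempfile


def import_from_path_sketch(modpath, index=-1):
    """Hypothetical sketch: import a top-level module given its file path."""
    if not os.path.exists(modpath):
        raise IOError('modpath={} does not exist'.format(modpath))
    dpath, fname = os.path.split(os.path.abspath(modpath))
    modname = os.path.splitext(fname)[0]
    # Interpret a negative index like a slice position, so -1 appends.
    position = index if index >= 0 else len(sys.path) + index + 1
    sys.path.insert(position, dpath)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(modname)
    finally:
        # Attempt to undo the change; skipped if sys.path was altered
        # by the imported module in the meantime.
        if position < len(sys.path) and sys.path[position] == dpath:
            del sys.path[position]


with tempfile.TemporaryDirectory() as dpath:
    modpath = os.path.join(dpath, 'ubelt_sketch_demo_mod.py')
    with open(modpath, 'w') as file:
        file.write('testvar = 1\n')
    module = import_from_path_sketch(modpath)
    print(module.__name__, module.testvar)
```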

ubelt.indent(text, prefix=' ')[source]¶

Indents a block of text

Parameters

text (str) – text to indent
prefix (str, default = ‘ ‘) – prefix to add to each line

Returns

indented text

Return type

str

Example

>>> import ubelt as ub
>>> NL = chr(10)  # newline character
>>> text = 'Lorem ipsum' + NL + 'dolor sit amet'
>>> prefix = '    '
>>> result = ub.indent(text, prefix)
>>> assert all(t.startswith(prefix) for t in result.split(NL))
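
As a rough illustration of the behaviour the doctest checks (every line starts with the prefix), here is a minimal stdlib-only sketch. The name indent_sketch is hypothetical and this is not presented as ubelt's implementation. It is shown next to textwrap.indent, which by default skips whitespace-only lines.

```python
import textwrap


def indent_sketch(text, prefix='    '):
    # Prefix the first line, then every line that follows a newline.
    return prefix + text.replace('\n', '\n' + prefix)


text = 'Lorem ipsum\n\ndolor sit amet'
plain = indent_sketch(text, '> ')
stdlib = textwrap.indent(text, '> ')
```

The two results differ only on the empty middle line: the sketch prefixes it, while textwrap.indent leaves it untouched.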

ubelt.indexable_allclose(dct1, dct2, rel_tol=1e-09, abs_tol=0.0, return_info=False)[source]¶

Walks through two nested data structures and ensures that everything is roughly the same.

Parameters

dct1 (dict) – a nested indexable item
dct2 (dict) – a nested indexable item
rel_tol (float) – maximum difference for being considered “close”, relative to the magnitude of the input values
abs_tol (float) – maximum difference for being considered “close”, regardless of the magnitude of the input values
return_info (bool, default=False) – if true, return extra info

Returns

A boolean result if return_info is false, otherwise a tuple of the boolean result and an “info” dict containing detailed results indicating what matched and what did not.

Return type

bool | Tuple[bool, Dict]

Example

>>> import ubelt as ub
>>> dct1 = {
>>>     'foo': [1.222222, 1.333],
>>>     'bar': 1,
>>>     'baz': [],
>>> }
>>> dct2 = {
>>>     'foo': [1.22222, 1.333],
>>>     'bar': 1,
>>>     'baz': [],
>>> }
>>> flag, return_info =  ub.indexable_allclose(dct1, dct2, return_info=True)
>>> print('return_info = {}'.format(ub.repr2(return_info, nl=1)))
>>> print('flag = {!r}'.format(flag))

>>> walker1 = return_info['walker1']
>>> for p1, v1, v2  in return_info['faillist']:
>>>     v1_ = walker1[p1]
>>>     print('*fail p1, v1, v2 = {}, {}, {}'.format(p1, v1, v2))
>>> for p1 in return_info['passlist']:
>>>     v1_ = walker1[p1]
>>>     print('*pass p1, v1_ = {}, {}'.format(p1, v1_))
>>> assert not flag

>>> import ubelt as ub
>>> dct1 = {
>>>     'foo': [1.0000000000000000000000001, 1.],
>>>     'bar': 1,
>>>     'baz': [],
>>> }
>>> dct2 = {
>>>     'foo': [0.9999999999999999, 1.],
>>>     'bar': 1,
>>>     'baz': [],
>>> }
>>> flag, return_info =  ub.indexable_allclose(dct1, dct2, return_info=True)
>>> print('return_info = {}'.format(ub.repr2(return_info, nl=1)))
>>> print('flag = {!r}'.format(flag))

Example

>>> import ubelt as ub
>>> flag, return_info =  ub.indexable_allclose([], [], return_info=True)
>>> print('return_info = {!r}'.format(return_info))
>>> print('flag = {!r}'.format(flag))

Example

>>> import ubelt as ub
>>> flag =  ub.indexable_allclose([], [], return_info=False)
>>> print('flag = {!r}'.format(flag))

Example

>>> import ubelt as ub
>>> flag, return_info =  ub.indexable_allclose([], [1], return_info=True)
>>> print('return_info = {!r}'.format(return_info))
>>> print('flag = {!r}'.format(flag))
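
The idea of walking two nested structures and comparing leaves with a tolerance can be sketched with the standard library alone. The helpers iter_leaves and allclose_sketch are hypothetical names, and the sketch makes simplifying assumptions (only dict, list, and tuple are treated as containers, and it does not build the "info" dict), so it should not be read as ubelt's implementation.

```python
import math


def iter_leaves(data, path=()):
    """Yield (path, value) pairs; empty containers count as leaves."""
    if isinstance(data, dict) and data:
        for key, value in data.items():
            yield from iter_leaves(value, path + (key,))
    elif isinstance(data, (list, tuple)) and data:
        for index, value in enumerate(data):
            yield from iter_leaves(value, path + (index,))
    else:
        yield path, data


def allclose_sketch(a, b, rel_tol=1e-09, abs_tol=0.0):
    leaves_a = dict(iter_leaves(a))
    leaves_b = dict(iter_leaves(b))
    if leaves_a.keys() != leaves_b.keys():
        return False  # the structures do not line up
    for path, v1 in leaves_a.items():
        v2 = leaves_b[path]
        if isinstance(v1, (int, float)) and isinstance(v2, (int, float)):
            if not math.isclose(v1, v2, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
        elif v1 != v2:
            return False
    return True


dct1 = {'foo': [1.222222, 1.333], 'bar': 1, 'baz': []}
dct2 = {'foo': [1.22222, 1.333], 'bar': 1, 'baz': []}
strict = allclose_sketch(dct1, dct2)
loose = allclose_sketch(dct1, dct2, rel_tol=1e-5)
```

With the default tolerance the two 'foo' entries differ too much, while loosening rel_tol lets them pass.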

ubelt.inject_method(self, func, name=None)[source]¶

Injects a function into an object instance as a bound method

The main use case of this function is for monkey patching. While monkey patching is sometimes necessary it should generally be avoided. Thus, we simply remind the developer that there might be a better way.

Parameters

self (T) – Instance to inject a function into.
func (Callable[…, Any]) – The function to inject (must contain an arg for self).
name (str, default=None) – Optional name of the method. If not specified, the name of the function is used.

Example

>>> import ubelt as ub
>>> class Foo(object):
>>>     def bar(self):
>>>         return 'bar'
>>> def baz(self):
>>>     return 'baz'
>>> self = Foo()
>>> assert self.bar() == 'bar'
>>> assert not hasattr(self, 'baz')
>>> ub.inject_method(self, baz)
>>> assert not hasattr(Foo, 'baz'), 'should only change one instance'
>>> assert self.baz() == 'baz'
>>> ub.inject_method(self, baz, 'bar')
>>> assert self.bar() == 'baz'
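
Binding a function to a single instance can be sketched with types.MethodType from the standard library. The helper inject_sketch is a hypothetical stand-in used for illustration; how ubelt performs the binding internally is not stated in this section.

```python
import types


class Foo:
    def bar(self):
        return 'bar'


def baz(self):
    return 'baz'


def inject_sketch(obj, func, name=None):
    # Bind func to this one instance and store it on the instance only.
    setattr(obj, name or func.__name__, types.MethodType(func, obj))


first, second = Foo(), Foo()
inject_sketch(first, baz)
inject_sketch(first, baz, 'bar')
```

Only first is affected; the class and any other instance keep their original behaviour.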

ubelt.invert_dict(dict_, unique_vals=True, cls=None)[source]¶

Swaps the keys and values in a dictionary.

Parameters

dict_ (Dict[KT, VT]) – dictionary to invert
unique_vals (bool, default=True) – if False, the values of the new dictionary are sets of the original keys.
cls (type | None) – specifies the dict subclass of the result. If unspecified, it will be dict or OrderedDict. This behavior may change.

Returns

the inverted dictionary

Return type

Dict[VT, KT] | Dict[VT, Set[KT]]

Note

The values must be hashable.

If the original dictionary contains duplicate values, then only one of the corresponding keys will be returned and the others will be discarded. This can be prevented by setting unique_vals=False, causing the inverted keys to be returned in a set.

Example

>>> import ubelt as ub
>>> dict_ = {'a': 1, 'b': 2}
>>> inverted = ub.invert_dict(dict_)
>>> assert inverted == {1: 'a', 2: 'b'}

Example

>>> import ubelt as ub
>>> dict_ = ub.odict([(2, 'a'), (1, 'b'), (0, 'c'), (None, 'd')])
>>> inverted = ub.invert_dict(dict_)
>>> assert list(inverted.keys())[0] == 'a'

Example

>>> import ubelt as ub
>>> dict_ = {'a': 1, 'b': 0, 'c': 0, 'd': 0, 'f': 2}
>>> inverted = ub.invert_dict(dict_, unique_vals=False)
>>> assert inverted == {0: {'b', 'c', 'd'}, 1: {'a'}, 2: {'f'}}
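
The two modes described in the note can be sketched in plain Python. The function invert_sketch is a hypothetical illustration that always returns a plain dict and ignores the cls argument.

```python
def invert_sketch(dict_, unique_vals=True):
    if unique_vals:
        # Later keys silently overwrite earlier ones on duplicate values.
        return {value: key for key, value in dict_.items()}
    inverted = {}
    for key, value in dict_.items():
        # Collect every original key that mapped to this value.
        inverted.setdefault(value, set()).add(key)
    return inverted


data = {'a': 1, 'b': 0, 'c': 0, 'd': 0, 'f': 2}
lossy = invert_sketch(data)
grouped = invert_sketch(data, unique_vals=False)
```

Here lossy keeps only one of 'b', 'c', 'd' for the value 0, while grouped keeps all three in a set.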

ubelt.iter_window(iterable, size=2, step=1, wrap=False)[source]¶

Iterates through iterable with a window size. This is essentially a 1D sliding window.

Parameters

iterable (Iterable[T]) – an iterable sequence
size (int, default=2) – sliding window size
step (int, default=1) – sliding step size
wrap (bool, default=False) – wraparound flag

Returns

possibly overlapping windows of the sequence

Return type

Iterable[T]

Notes

Similar to more_itertools.windowed(), more_itertools.pairwise(), more_itertools.triplewise(), and more_itertools.sliding_window().

Example

>>> import ubelt as ub
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 1, True
>>> window_iter = ub.iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 1), (6, 1, 2)]

Example

>>> import ubelt as ub
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, True
>>> window_iter = ub.iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = {!r}'.format(window_list))
window_list = [(1, 2, 3), (3, 4, 5), (5, 6, 1)]

Example

>>> import ubelt as ub
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, False
>>> window_iter = ub.iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = {!r}'.format(window_list))
window_list = [(1, 2, 3), (3, 4, 5)]

Example

>>> import ubelt as ub
>>> iterable = []
>>> size, step, wrap = 3, 2, False
>>> window_iter = ub.iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = {!r}'.format(window_list))
window_list = []
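
The interaction of size, step, and wrap in the examples above can be reproduced with a small index-based sketch. The function window_sketch is hypothetical; unlike an iterator-based implementation it materializes the input as a list, and it is only checked against the outputs shown in this section.

```python
def window_sketch(iterable, size=2, step=1, wrap=False):
    items = list(iterable)
    n = len(items)
    if wrap:
        # Every step-th position starts a window; indices wrap around.
        return [tuple(items[(start + i) % n] for i in range(size))
                for start in range(0, n, step)]
    # Without wrapping, a window must fit entirely inside the sequence.
    return [tuple(items[start:start + size])
            for start in range(0, n - size + 1, step)]


data = [1, 2, 3, 4, 5, 6]
wrapped = window_sketch(data, size=3, step=2, wrap=True)
unwrapped = window_sketch(data, size=3, step=2, wrap=False)
```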

ubelt.iterable(obj, strok=False)[source]¶

Checks if the input implements the iterator interface. An exception is made for strings, which return False unless strok is True

Parameters

obj (object) – a scalar or iterable input
strok (bool, default=False) – if True allow strings to be interpreted as iterable

Returns

True if the input is iterable

Return type

bool

Example

>>> import ubelt as ub
>>> obj_list = [3, [3], '3', (3,), [3, 4, 5], {}]
>>> result = [ub.iterable(obj) for obj in obj_list]
>>> assert result == [False, True, False, True, True, True]
>>> result = [ub.iterable(obj, strok=True) for obj in obj_list]
>>> assert result == [False, True, True, True, True, True]
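
A minimal sketch of the check, assuming that "iterable" means iter(obj) succeeds and that only str is given special treatment. The name iterable_sketch is hypothetical.

```python
def iterable_sketch(obj, strok=False):
    try:
        iter(obj)
    except TypeError:
        return False  # scalars such as ints cannot be iterated
    # Strings are iterable, but are rejected unless strok is set.
    return strok or not isinstance(obj, str)


obj_list = [3, [3], '3', (3,), [3, 4, 5], {}]
strict = [iterable_sketch(obj) for obj in obj_list]
lenient = [iterable_sketch(obj, strok=True) for obj in obj_list]
```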

ubelt.map_keys(func, dict_, cls=None)[source]¶

Apply a function to every key in a dictionary.

Creates a new dictionary with the same values and modified keys. An error is raised if the new keys are not unique.

Parameters

func (Callable[[KT], T] | Mapping[KT, T]) – a function or indexable object
dict_ (Dict[KT, VT]) – a dictionary
cls (type | None) – specifies the dict subclass of the result. If unspecified, it will be dict or OrderedDict. This behavior may change.

Returns

transformed dictionary

Return type

Dict[T, VT]

Raises

Exception – if multiple keys map to the same value

Example

>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> func = ord
>>> newdict = ub.map_keys(func, dict_)
>>> print(newdict)
>>> assert newdict == {97: [1, 2, 3], 98: []}
>>> dict_ = {0: [1, 2, 3], 1: []}
>>> func = ['a', 'b']
>>> newdict = ub.map_keys(func, dict_)
>>> print(newdict)
>>> assert newdict == {'a': [1, 2, 3], 'b': []}
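
The uniqueness requirement on the new keys can be illustrated with a short sketch. The function map_keys_sketch is hypothetical, and raising ValueError is an assumption made here for the illustration; the documentation above only says that an Exception is raised.

```python
def map_keys_sketch(func, dict_):
    # Accept either a callable or an indexable object such as a list.
    lookup = func if callable(func) else func.__getitem__
    newdict = {lookup(key): value for key, value in dict_.items()}
    if len(newdict) != len(dict_):
        raise ValueError('new keys are not unique')
    return newdict


by_ord = map_keys_sketch(ord, {'a': [1, 2, 3], 'b': []})
by_list = map_keys_sketch(['a', 'b'], {0: [1, 2, 3], 1: []})
try:
    map_keys_sketch(str.lower, {'A': 1, 'a': 2})
    collided = False
except ValueError:
    collided = True
```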

ubelt.map_vals(func, dict_, cls=None)¶

Apply a function to every value in a dictionary.

Creates a new dictionary with the same keys and modified values.

Parameters

func (Callable[[VT], T] | Mapping[VT, T]) – a function or indexable object
dict_ (Dict[KT, VT]) – a dictionary
cls (type | None) – specifies the dict subclass of the result. If unspecified, it will be dict or OrderedDict. This behavior may change.

Returns

transformed dictionary

Return type

Dict[KT, T]

Notes

Similar to dictmap.dict_map

Example

>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> newdict = ub.map_values(len, dict_)
>>> assert newdict ==  {'a': 3, 'b': 0}

Example

>>> # Can also use an indexable as ``func``
>>> import ubelt as ub
>>> dict_ = {'a': 0, 'b': 1}
>>> func = [42, 21]
>>> newdict = ub.map_values(func, dict_)
>>> assert newdict ==  {'a': 42, 'b': 21}
>>> print(newdict)

ubelt.map_values(func, dict_, cls=None)[source]¶

Apply a function to every value in a dictionary.

Creates a new dictionary with the same keys and modified values.

Parameters

func (Callable[[VT], T] | Mapping[VT, T]) – a function or indexable object
dict_ (Dict[KT, VT]) – a dictionary
cls (type | None) – specifies the dict subclass of the result. If unspecified, it will be dict or OrderedDict. This behavior may change.

Returns

transformed dictionary

Return type

Dict[KT, T]

Notes

Similar to dictmap.dict_map

Example

>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> newdict = ub.map_values(len, dict_)
>>> assert newdict ==  {'a': 3, 'b': 0}

Example

>>> # Can also use an indexable as ``func``
>>> import ubelt as ub
>>> dict_ = {'a': 0, 'b': 1}
>>> func = [42, 21]
>>> newdict = ub.map_values(func, dict_)
>>> assert newdict ==  {'a': 42, 'b': 21}
>>> print(newdict)

ubelt.memoize(func)[source]¶

memoization decorator that respects args and kwargs

Python 3.9 introduced functools.cache, which is currently faster than memoize for simple functions [FunctoolsCache]. However, memoize can handle more general non-natively hashable inputs.

Parameters: func (Callable) – live python function
Returns: memoized wrapper
Return type: Callable

References

WikiMemoize: https://wiki.python.org/moin/PythonDecoratorLibrary#Memoize
FunctoolsCache: https://docs.python.org/3/library/functools.html

Example

>>> import ubelt as ub
>>> closure = {'a': 'b', 'c': 'd'}
>>> incr = [0]
>>> def foo(key):
>>>     value = closure[key]
>>>     incr[0] += 1
>>>     return value
>>> foo_memo = ub.memoize(foo)
>>> assert foo('a') == 'b' and foo('c') == 'd'
>>> assert incr[0] == 2
>>> print('Call memoized version')
>>> assert foo_memo('a') == 'b' and foo_memo('c') == 'd'
>>> assert incr[0] == 4
>>> assert foo_memo('a') == 'b' and foo_memo('c') == 'd'
>>> print('Counter should no longer increase')
>>> assert incr[0] == 4
>>> print('Closure changes result without memoization')
>>> closure = {'a': 0, 'c': 1}
>>> assert foo('a') == 0 and foo('c') == 1
>>> assert incr[0] == 6
>>> assert foo_memo('a') == 'b' and foo_memo('c') == 'd'

class ubelt.memoize_method(func)[source]¶

Bases: object

memoization decorator for a method that respects args and kwargs

References

ActiveState_Miller_2010: http://code.activestate.com/recipes/577452-a-memoize-decorator-for-instance-methods

Example

>>> import ubelt as ub
>>> closure = {'a': 'b', 'c': 'd'}
>>> incr = [0]
>>> class Foo(object):
>>>     @ub.memoize_method
>>>     def foo_memo(self, key):
>>>         "Wrapped foo_memo docstr"
>>>         value = closure[key]
>>>         incr[0] += 1
>>>         return value
>>>     def foo(self, key):
>>>         value = closure[key]
>>>         incr[0] += 1
>>>         return value
>>> self = Foo()
>>> assert self.foo('a') == 'b' and self.foo('c') == 'd'
>>> assert incr[0] == 2
>>> print('Call memoized version')
>>> assert self.foo_memo('a') == 'b' and self.foo_memo('c') == 'd'
>>> assert incr[0] == 4
>>> assert self.foo_memo('a') == 'b' and self.foo_memo('c') == 'd'
>>> print('Counter should no longer increase')
>>> assert incr[0] == 4
>>> print('Closure changes result without memoization')
>>> closure = {'a': 0, 'c': 1}
>>> assert self.foo('a') == 0 and self.foo('c') == 1
>>> assert incr[0] == 6
>>> assert self.foo_memo('a') == 'b' and self.foo_memo('c') == 'd'
>>> print('Constructing a new object should get a new cache')
>>> self2 = Foo()
>>> self2.foo_memo('a')
>>> assert incr[0] == 7
>>> self2.foo_memo('a')
>>> assert incr[0] == 7
>>> assert self.foo_memo.__doc__ == 'Wrapped foo_memo docstr'
>>> assert self.foo_memo.__name__ == 'foo_memo'
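
The per-instance cache behaviour shown in the example (a new object gets a new cache) can be sketched by storing the cache on the instance itself. The decorator memoize_method_sketch and the attribute name it uses are hypothetical, and the sketch only supports hashable positional arguments.

```python
import functools


def memoize_method_sketch(func):
    attr = '_memo_' + func.__name__

    @functools.wraps(func)
    def wrapper(self, *args):
        # Each instance lazily receives its own cache dictionary.
        cache = self.__dict__.setdefault(attr, {})
        if args not in cache:
            cache[args] = func(self, *args)
        return cache[args]
    return wrapper


class Loader:
    def __init__(self):
        self.calls = 0

    @memoize_method_sketch
    def load(self, key):
        "Wrapped load docstr"
        self.calls += 1
        return key.upper()


a, b = Loader(), Loader()
results = [a.load('x'), a.load('x'), b.load('x')]
```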

ubelt.memoize_property(fget)[source]¶

Return a property attribute for new-style classes that only calls its getter on the first access. The result is stored and on subsequent accesses is returned, preventing the need to call the getter any more.

This decorator can either be used by itself or by decorating another property. In either case the method will always become a property.

Note

implementation is a modified version of [estebistec_memoize].

References

estebistec_memoize: https://github.com/estebistec/python-memoized-property

Example

>>> import ubelt as ub
>>> class C(object):
...     load_name_count = 0
...     @ub.memoize_property
...     def name(self):
...         "name's docstring"
...         self.load_name_count += 1
...         return "the name"
...     @ub.memoize_property
...     @property
...     def another_name(self):
...         "name's docstring"
...         self.load_name_count += 1
...         return "the name"
>>> c = C()
>>> c.load_name_count
0
>>> c.name
'the name'
>>> c.load_name_count
1
>>> c.name
'the name'
>>> c.load_name_count
1
>>> c.another_name

ubelt.modname_to_modpath(modname, hide_init=True, hide_main=False, sys_path=None)[source]¶

Finds the path to a python module from its name.

Determines the path to a python module without directly importing it.

Converts the name of a module (__name__) to the path (__file__) where it is located without importing the module. Returns None if the module does not exist.

Parameters

modname (str) – The name of a module in sys_path.
hide_init (bool) – if False, __init__.py will be returned for packages. Defaults to True.
hide_main (bool) – if False, and hide_init is True, __main__.py will be returned for packages, if it exists. Defaults to False.
sys_path (None | List[str | PathLike]) – The paths to search for the module. If unspecified, defaults to sys.path.

Returns

modpath - path to the module, or None if it doesn’t exist

Return type

str | None

Example

>>> modname = 'xdoctest.__main__'
>>> modpath = modname_to_modpath(modname, hide_main=False)
>>> assert modpath.endswith('__main__.py')
>>> modname = 'xdoctest'
>>> modpath = modname_to_modpath(modname, hide_init=False)
>>> assert modpath.endswith('__init__.py')
>>> # xdoctest: +REQUIRES(CPython)
>>> modpath = basename(modname_to_modpath('_ctypes'))
>>> assert 'ctypes' in modpath
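
For comparison, the standard library offers importlib.util.find_spec, which locates a top-level module without importing it. The wrapper modname_to_modpath_sketch is hypothetical and is not equivalent to the function documented above: for dotted names find_spec imports the parent package, and for built-in modules the reported origin is not a file path.

```python
import importlib.util
import os


def modname_to_modpath_sketch(modname, hide_init=True):
    spec = importlib.util.find_spec(modname)
    if spec is None or spec.origin is None:
        return None  # the module could not be located
    path = spec.origin
    if hide_init and os.path.basename(path) == '__init__.py':
        # Report the package directory instead of its __init__.py.
        path = os.path.dirname(path)
    return path


json_dir = modname_to_modpath_sketch('json')
json_init = modname_to_modpath_sketch('json', hide_init=False)
missing = modname_to_modpath_sketch('no_such_module_for_this_sketch')
```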

ubelt.modpath_to_modname(modpath, hide_init=True, hide_main=False, check=True, relativeto=None)[source]¶

Determines importable name from file path

Converts the path to a module (__file__) to the importable python name (__name__) without importing the module.

The filename is converted to a module name, and parent directories are recursively included until a directory without an __init__.py file is encountered.

Parameters

modpath (str) – module filepath
hide_init (bool, default=True) – removes the __init__ suffix
hide_main (bool, default=False) – removes the __main__ suffix
check (bool, default=True) – if False, does not raise an error if modpath is a dir and does not contain an __init__ file.
relativeto (str, default=None) – if specified, all checks are ignored and this is considered the path to the root module.

Todo

[ ] Does this need modification to support PEP 420?
https://www.python.org/dev/peps/pep-0420/

Returns: modname
Return type: str
Raises: ValueError – if check is True and the path does not exist

Example

>>> from xdoctest import static_analysis
>>> modpath = static_analysis.__file__.replace('.pyc', '.py')
>>> modname = modpath_to_modname(modpath)
>>> assert modname == 'xdoctest.static_analysis'

Example

>>> import xdoctest
>>> assert modpath_to_modname(xdoctest.__file__.replace('.pyc', '.py')) == 'xdoctest'
>>> assert modpath_to_modname(dirname(xdoctest.__file__.replace('.pyc', '.py'))) == 'xdoctest'

Example

>>> # xdoctest: +REQUIRES(CPython)
>>> modpath = modname_to_modpath('_ctypes')
>>> modname = modpath_to_modname(modpath)
>>> assert modname == '_ctypes'

Example

>>> modpath = '/foo/libfoobar.linux-x86_64-3.6.so'
>>> modname = modpath_to_modname(modpath, check=False)
>>> assert modname == 'libfoobar'
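
The rule described above (include parent directories until one without an __init__.py is reached) can be sketched as a short loop. The function modpath_to_modname_sketch is hypothetical; it ignores the check, relativeto, and hide_main options and does not strip platform tags from extension-module filenames.

```python
import os
import tempfile


def modpath_to_modname_sketch(modpath, hide_init=True):
    dpath, fname = os.path.split(os.path.abspath(modpath))
    parts = [os.path.splitext(fname)[0]]
    # Climb upward while the current directory is itself a package.
    while os.path.exists(os.path.join(dpath, '__init__.py')):
        dpath, pkgname = os.path.split(dpath)
        if not pkgname:
            break  # reached the filesystem root
        parts.append(pkgname)
    if hide_init and parts[0] == '__init__' and len(parts) > 1:
        parts = parts[1:]
    return '.'.join(reversed(parts))


# Build a throwaway package tree: pkg/__init__.py, pkg/sub/__init__.py, pkg/sub/mod.py
with tempfile.TemporaryDirectory() as root:
    subdir = os.path.join(root, 'pkg', 'sub')
    os.makedirs(subdir)
    for dpath in (os.path.join(root, 'pkg'), subdir):
        open(os.path.join(dpath, '__init__.py'), 'w').close()
    modfile = os.path.join(subdir, 'mod.py')
    open(modfile, 'w').close()
    mod_name = modpath_to_modname_sketch(modfile)
    pkg_name = modpath_to_modname_sketch(os.path.join(root, 'pkg', '__init__.py'))
```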

ubelt.named_product(_=None, **basis)[source]¶

Generates the Cartesian product of basis.values(), where each generated item is labeled by basis.keys().

In other words, given a dictionary that maps each “axis” (i.e. some variable) to its “basis” (i.e. the possible values that it can take), generate all possible points in that grid (i.e. unique assignments of variables to values).

Parameters

_ (Dict[str, List[VT]] | None, default=None) – Use of this positional argument is not recommended. Instead specify all arguments as keyword args.

If specified, this should be a dictionary that is unioned with the keyword args. This exists to support ordered dictionaries before Python 3.6, and may eventually be removed.
basis (Dict[str, List[VT]]) – A dictionary where the keys correspond to “columns” and the values are a list of possible values that “column” can take.

I.e. each key corresponds to an “axis”, and the values are the list of possible values for that “axis”.

Yields

Dict[str, VT] – a “row” in the “longform” data containing a point in the Cartesian product.

Note

This function is similar to itertools.product(). The only difference is that the generated items are dictionaries that retain the input keys instead of tuples.

This function used to be called “basis_product”, but “named_product” might be more appropriate. This function exists in other places ([minstrel271_namedproduct], [pytb_namedproduct], and [Hettinger_namedproduct]).

References

minstrel271_namedproduct: https://gist.github.com/minstrel271/d51654af3fa4e6411267
pytb_namedproduct: https://py-toolbox.readthedocs.io/en/latest/modules/itertools.html#
Hettinger_namedproduct: https://twitter.com/raymondh/status/970380630822305792

Example

>>> # An example use case is looping over all possible settings in a
>>> # configuration dictionary for a grid search over parameters.
>>> import ubelt as ub
>>> basis = {
>>>     'arg1': [1, 2, 3],
>>>     'arg2': ['A1', 'B1'],
>>>     'arg3': [9999, 'Z2'],
>>>     'arg4': ['always'],
>>> }
>>> import ubelt as ub
>>> # sort input data for older python versions
>>> basis = ub.odict(sorted(basis.items()))
>>> got = list(ub.named_product(basis))
>>> print(ub.repr2(got, nl=-1))
[
    {'arg1': 1, 'arg2': 'A1', 'arg3': 9999, 'arg4': 'always'},
    {'arg1': 1, 'arg2': 'A1', 'arg3': 'Z2', 'arg4': 'always'},
    {'arg1': 1, 'arg2': 'B1', 'arg3': 9999, 'arg4': 'always'},
    {'arg1': 1, 'arg2': 'B1', 'arg3': 'Z2', 'arg4': 'always'},
    {'arg1': 2, 'arg2': 'A1', 'arg3': 9999, 'arg4': 'always'},
    {'arg1': 2, 'arg2': 'A1', 'arg3': 'Z2', 'arg4': 'always'},
    {'arg1': 2, 'arg2': 'B1', 'arg3': 9999, 'arg4': 'always'},
    {'arg1': 2, 'arg2': 'B1', 'arg3': 'Z2', 'arg4': 'always'},
    {'arg1': 3, 'arg2': 'A1', 'arg3': 9999, 'arg4': 'always'},
    {'arg1': 3, 'arg2': 'A1', 'arg3': 'Z2', 'arg4': 'always'},
    {'arg1': 3, 'arg2': 'B1', 'arg3': 9999, 'arg4': 'always'},
    {'arg1': 3, 'arg2': 'B1', 'arg3': 'Z2', 'arg4': 'always'}
]

Example

>>> import ubelt as ub
>>> list(ub.named_product(a=[1, 2, 3]))
[{'a': 1}, {'a': 2}, {'a': 3}]
>>> # xdoctest: +IGNORE_WANT
>>> list(ub.named_product(a=[1, 2, 3], b=[4, 5]))
[{'a': 1, 'b': 4},
 {'a': 1, 'b': 5},
 {'a': 2, 'b': 4},
 {'a': 2, 'b': 5},
 {'a': 3, 'b': 4},
 {'a': 3, 'b': 5}]
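
The relationship to itertools.product() mentioned in the note can be made concrete with a few lines. The generator named_product_sketch is a hypothetical illustration that accepts keyword arguments only.

```python
import itertools


def named_product_sketch(**basis):
    keys = list(basis)
    # Pair each tuple from the plain product back up with its axis name.
    for values in itertools.product(*basis.values()):
        yield dict(zip(keys, values))


grid = list(named_product_sketch(a=[1, 2, 3], b=[4, 5]))
```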

ubelt.odict¶: alias of OrderedDict

ubelt.oset¶: alias of OrderedSet

ubelt.paragraph(text)[source]¶

Wraps multi-line strings and restructures the text to remove all newlines, leading and trailing whitespace, and double spaces.

Useful for writing log messages

Parameters: text (str) – typically a multiline string
Returns: the reduced text block
Return type: str

Example

>>> import ubelt as ub
>>> text = (
>>>     '''
>>>     Lorem ipsum dolor sit amet, consectetur adipiscing
>>>     elit, sed do eiusmod tempor incididunt ut labore et
>>>     dolore magna aliqua.
>>>     ''')
>>> out = ub.paragraph(text)
>>> assert chr(10) in text
>>> assert chr(10) not in out
>>> print('text = {!r}'.format(text))
>>> print('out = {!r}'.format(out))
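
A comparable effect can be had in plain Python by splitting on any whitespace and rejoining with single spaces. The function paragraph_sketch is hypothetical and is offered as a rough stdlib equivalent, not as the implementation behind the function above.

```python
def paragraph_sketch(text):
    # str.split() with no arguments drops newlines, indentation and
    # repeated spaces; joining restores single spaces between words.
    return ' '.join(text.split())


text = '''
    Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do   eiusmod tempor.
    '''
out = paragraph_sketch(text)
```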

ubelt.peek(iterable, default=NoParam)[source]¶

Look at the first item of an iterable. If the input is an iterator, then the next element is exhausted (i.e. a pop operation).

Parameters

iterable (Iterable[T]) – an iterable
default (T) – default item to return if the iterable is empty, otherwise a StopIteration error is raised

Returns

item - the first item of an ordered sequence, a popped item from an iterator, or an arbitrary item from an unordered collection.

Return type

T

Notes

Similar to more_itertools.peekable()

Example

>>> import ubelt as ub
>>> data = [0, 1, 2]
>>> ub.peek(data)
0
>>> iterator = iter(data)
>>> print(ub.peek(iterator))
0
>>> print(ub.peek(iterator))
1
>>> print(ub.peek(iterator))
2
>>> ub.peek(range(3))
0
>>> ub.peek([], 3)
3
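
The difference between peeking at a sequence and peeking at an iterator follows directly from next(iter(...)), as this small sketch shows. The function peek_sketch and the sentinel _MISSING are hypothetical names standing in for the documented function and its NoParam default.

```python
_MISSING = object()  # sentinel meaning "no default was given"


def peek_sketch(iterable, default=_MISSING):
    try:
        # For an iterator, iter() returns the same object, so this consumes an item.
        return next(iter(iterable))
    except StopIteration:
        if default is _MISSING:
            raise
        return default


data = [0, 1, 2]
from_list = [peek_sketch(data) for _ in range(3)]
iterator = iter(data)
from_iterator = [peek_sketch(iterator) for _ in range(3)]
```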

ubelt.platform_cache_dir()[source]¶

Returns a directory which should be writable for any application. This should be used for temporary deletable data.

Returns: path to the cache dir used by the current operating system
Return type: str

ubelt.platform_config_dir()[source]¶

Returns a directory which should be writable for any application. This should be used for persistent configuration files.

Returns: path to the config dir used by the current operating system
Return type: str

ubelt.platform_data_dir()[source]¶

Returns path for user-specific data files

Returns: path to the data dir used by the current operating system
Return type: str

ubelt.readfrom(fpath, aslines=False, errors='replace', verbose=None)[source]¶

Reads (utf8) text from a file.

Note

You probably should use ub.Path(<fpath>).read_text() instead. This function exists as a convenience for writing in Python2. After 2020-01-01, we may consider deprecating the function.

Parameters

fpath (str | PathLike) – file path
aslines (bool) – if True returns list of lines
verbose (bool) – verbosity flag

Returns

text from fpath (this is unicode)

Return type

str

ubelt.repr2(data, **kwargs)[source]¶

Makes a pretty string representation of data.

Makes a pretty and easy-to-doctest string representation. Has nice handling of common nested datatypes. This is an alternative to repr, and pprint.pformat().

The output of this function is configurable. By default it aims to produce strings that are consistent, compact, and executable. This makes them great for doctests.

Note

This function has many keyword arguments that can be used to customize the final representation. For convenience some of the more frequently used kwargs have short aliases. See “Kwargs” for more details.

Parameters: data (object) – an arbitrary python object to form the string “representation” of

Kwargs:

si, stritems (bool):

dict/list items use str instead of repr

strkeys, sk (bool):

dict keys use str instead of repr

strvals, sv (bool):

dict values use str instead of repr

nl, newlines (int | bool):

number of top level nestings to place a newline after. If true all items are followed by newlines regardless of nesting level. Defaults to 1 for lists and True for dicts.

nobr, nobraces (bool, default=False):

if True, text will not contain outer braces for containers

cbr, compact_brace (bool, default=False):

if True, braces are compactified (i.e. they will not have newlines placed directly after them, think java / K&R / 1TBS)

trailsep, trailing_sep (bool):

if True, a separator is placed after the last item in a sequence. By default this is True if there are any nl > 0.

explicit (bool, default=False):

changes dict representation from {k1: v1, ...} to dict(k1=v1, ...).

Modifies: the default kvsep becomes '=', and dict braces change from {} to dict().

compact (bool, default=False):

Produces values more suitable for space-constrained environments

Modifies: the default kvsep becomes '=', itemsep becomes '', nobraces becomes 1, newlines becomes 0, and strkeys and strvals become True.

precision (int, default=None):

if specified floats are formatted with this precision

kvsep (str, default=': '):

separator between keys and values

itemsep (str, default=' '):

separator between items. This separator is placed after commas, which are currently not configurable. This may be modified in the future.

sort (bool | callable, default=None):

if None, then sort unordered collections, but keep the ordering of ordered collections. This option attempts to be deterministic in most cases.

New in 0.8.0: if sort is callable, it will be used as a key-function to sort all collections.

if False, then nothing will be sorted, and the representation of unordered collections will be arbitrary and possibly non-deterministic.

if True, attempts to sort all collections in the returned text. Currently if True this WILL sort lists. Currently if True this WILL NOT sort OrderedDicts.

NOTE:: The previous behavior may not be intuitive, as such the behavior of this arg is subject to change.

suppress_small (bool):

passed to numpy.array2string() for ndarrays

max_line_width (int):

passed to numpy.array2string() for ndarrays

with_dtype (bool):

only relevant to numpy.ndarrays. if True includes the dtype. Defaults to not strvals.

align (bool | str, default=False):

if True, will align multi-line dictionaries by the kvsep

extensions (FormatterExtensions):

a custom FormatterExtensions instance that can overwrite or define how different types of objects are formatted.

Returns: outstr - output string
Return type: str

Note

There are also internal kwargs, which should not be used:

_return_info (bool): return information about child context

_root_info (depth): information about parent context

RelatedWork: rich.pretty.pretty_repr(), pprint.pformat()

Example

>>> import ubelt as ub
>>> dict_ = {
...     'custom_types': [slice(0, 1, None), 1/3],
...     'nest_dict': {'k1': [1, 2, {3: {4, 5}}],
...                   'key2': [1, 2, {3: {4, 5}}],
...                   'key3': [1, 2, {3: {4, 5}}],
...                   },
...     'nest_dict2': {'k': [1, 2, {3: {4, 5}}]},
...     'nested_tuples': [tuple([1]), tuple([2, 3]), frozenset([4, 5, 6])],
...     'one_tup': tuple([1]),
...     'simple_dict': {'spam': 'eggs', 'ham': 'jam'},
...     'simple_list': [1, 2, 'red', 'blue'],
...     'odict': ub.odict([(1, '1'), (2, '2')]),
... }
>>> # In the interest of saving space we are only going to show the
>>> # output for the first example.
>>> result = ub.repr2(dict_, nl=1, precision=2)
>>> print(result)
{
    'custom_types': [slice(0, 1, None), 0.33],
    'nest_dict': {'k1': [1, 2, {3: {4, 5}}], 'key2': [1, 2, {3: {4, 5}}], 'key3': [1, 2, {3: {4, 5}}]},
    'nest_dict2': {'k': [1, 2, {3: {4, 5}}]},
    'nested_tuples': [(1,), (2, 3), {4, 5, 6}],
    'odict': {1: '1', 2: '2'},
    'one_tup': (1,),
    'simple_dict': {'ham': 'jam', 'spam': 'eggs'},
    'simple_list': [1, 2, 'red', 'blue'],
}
>>> # You can try the rest yourself.
>>> result = ub.repr2(dict_, nl=3, precision=2); print(result)
>>> result = ub.repr2(dict_, nl=2, precision=2); print(result)
>>> result = ub.repr2(dict_, nl=1, precision=2, itemsep='', explicit=True); print(result)
>>> result = ub.repr2(dict_, nl=1, precision=2, nobr=1, itemsep='', explicit=True); print(result)
>>> result = ub.repr2(dict_, nl=3, precision=2, cbr=True); print(result)
>>> result = ub.repr2(dict_, nl=3, precision=2, si=True); print(result)
>>> result = ub.repr2(dict_, nl=3, sort=True); print(result)
>>> result = ub.repr2(dict_, nl=3, sort=False, trailing_sep=False); print(result)
>>> result = ub.repr2(dict_, nl=3, sort=False, trailing_sep=False, nobr=True); print(result)

Example

>>> import ubelt as ub
>>> def _nest(d, w):
...     if d == 0:
...         return {}
...     else:
...         return {'n{}'.format(d): _nest(d - 1, w + 1), 'm{}'.format(d): _nest(d - 1, w + 1)}
>>> dict_ = _nest(d=4, w=1)
>>> result = ub.repr2(dict_, nl=6, precision=2, cbr=1)
>>> print('---')
>>> print(result)
>>> result = ub.repr2(dict_, nl=-1, precision=2)
>>> print('---')
>>> print(result)

Example

>>> import ubelt as ub
>>> data = {'a': 100, 'b': [1, '2', 3], 'c': {20:30, 40: 'five'}}
>>> print(ub.repr2(data, nl=1))
{
    'a': 100,
    'b': [1, '2', 3],
    'c': {20: 30, 40: 'five'},
}
>>> # Compact is useful for things like timerit.Timerit labels
>>> print(ub.repr2(data, compact=True))
a=100,b=[1,2,3],c={20=30,40=five}
>>> print(ub.repr2(data, compact=True, nobr=False))
{a=100,b=[1,2,3],c={20=30,40=five}}
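
The default sorting behaviour visible in the first example (plain dicts and sets come out sorted, while the OrderedDict keeps its order) can be imitated with a small recursive formatter. The function stable_repr is a hypothetical sketch written to match that printed output only; it sorts by the repr of each key or item, so mixed or multi-digit numeric keys may order differently than a real implementation would.

```python
from collections import OrderedDict


def stable_repr(data):
    if isinstance(data, dict):
        items = list(data.items())
        if not isinstance(data, OrderedDict):
            # Plain dicts are treated as unordered and sorted by key repr.
            items.sort(key=lambda kv: repr(kv[0]))
        body = ', '.join(
            '{}: {}'.format(stable_repr(k), stable_repr(v)) for k, v in items)
        return '{' + body + '}'
    if isinstance(data, (set, frozenset)) and data:
        return '{' + ', '.join(sorted(stable_repr(x) for x in data)) + '}'
    if isinstance(data, list):
        return '[' + ', '.join(stable_repr(x) for x in data) + ']'
    return repr(data)


plain = stable_repr({'spam': 'eggs', 'ham': 'jam'})
ordered = stable_repr(OrderedDict([(2, 'b'), (1, 'a')]))
nested = stable_repr([1, 2, {3: {5, 4}}])
```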

ubelt.schedule_deprecation(modname, name='?', type='?', migration='', deprecate=None, error=None, remove=None, warncls=<class 'DeprecationWarning'>)[source]¶

Deprecation machinery to help provide users with a smoother transition.

This function provides a concise way to mark a feature as deprecated by providing a description of the deprecated feature, documentation on how to migrate away from the deprecated feature, and the versions that the feature is scheduled for deprecation and eventual removal. Based on the version of the library and the specified schedule this function will either do nothing, emit a warning, or raise an error with helpful messages for both users and developers.

Parameters

modname (str) – The name of the underlying module associated with the feature to be deprecated. The module must already be imported and have a passable __version__ attribute.
name (str) – The name of the feature to deprecate. This is usually a function or argument name.
type (str) – A description of what the feature is. This is not a formal type, but rather a prose description: e.g. “argument to my_func”.
migration (str) – A description that lets users know what they should do instead of using the deprecated feature.
deprecate (str | None) – The version when the feature is officially deprecated and this function should start to emit a deprecation warning.
error (str | None) – The version when the feature is officially no longer supported, and will start to raise a RuntimeError.
remove (str | None) – The version when the feature is completely removed. An AssertionError will be raised if this function is still present reminding the developer to remove the feature (or extend the remove version).
warncls (type) – This is the category of warning to use. Defaults to DeprecationWarning.

Note

The DeprecationWarning is not visible by default. https://docs.python.org/3/library/warnings.html

Example

>>> from ubelt import schedule_deprecation
>>> import sys
>>> import types
>>> import pytest
>>> dummy_module = sys.modules['dummy_module'] = types.ModuleType('dummy_module')
>>> # When less than the deprecated version this does nothing
>>> dummy_module.__version__ = '1.0.0'
>>> schedule_deprecation(
...     'dummy_module', 'myfunc', 'function', 'do something else',
...     deprecate='1.1.0', error='1.2.0', remove='1.3.0')
>>> # Now this raises warning
>>> with pytest.warns(DeprecationWarning):
...     dummy_module.__version__ = '1.1.0'
...     schedule_deprecation(
...         'dummy_module', 'myfunc', 'function', 'do something else',
...         deprecate='1.1.0', error='1.2.0', remove='1.3.0')
>>> # Now this raises an error for the user
>>> with pytest.raises(RuntimeError):
...     dummy_module.__version__ = '1.2.0'
...     schedule_deprecation(
...         'dummy_module', 'myfunc', 'function', 'do something else',
...         deprecate='1.1.0', error='1.2.0', remove='1.3.0')
>>> # Now this raises an error for the developer
>>> with pytest.raises(AssertionError):
...     dummy_module.__version__ = '1.3.0'
...     schedule_deprecation(
...         'dummy_module', 'myfunc', 'function', 'do something else',
...         deprecate='1.1.0', error='1.2.0', remove='1.3.0')
>>> # When no versions are specified, it simply emits the warning
>>> with pytest.warns(DeprecationWarning):
...     dummy_module.__version__ = '1.1.0'
...     schedule_deprecation(
...         'dummy_module', 'myfunc', 'function', 'do something else')

ubelt.sdict¶: alias of SetDict

ubelt.shrinkuser(path, home='~')[source]¶

Inverse of os.path.expanduser().

Parameters

path (str | PathLike) – path in system file structure
home (str) – symbol used to replace the home path. Defaults to ‘~’, but you might want to use ‘$HOME’ or ‘%USERPROFILE%’ instead.

Returns

path - shortened path replacing the home directory with a symbol

Return type

str

Example

>>> from ubelt.util_path import *  # NOQA
>>> from os.path import expanduser, join
>>> path = expanduser('~')
>>> assert path != '~'
>>> assert shrinkuser(path) == '~'
>>> assert shrinkuser(path + '1') == path + '1'
>>> assert shrinkuser(path + '/1') == join('~', '1')
>>> assert shrinkuser(path + '/1', '$HOME') == join('$HOME', '1')
>>> assert shrinkuser('.') == '.'
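The idea behind shrinking a path can be sketched with the standard library alone. This is an illustrative approximation, not ubelt's code: shrinkuser_sketch and its userhome parameter are hypothetical names, and the sketch assumes the path uses the native separator.

```python
import os
from os.path import expanduser, join


def shrinkuser_sketch(path, home='~', userhome=None):
    """Hypothetical sketch: replace a leading home directory with a symbol."""
    path = os.fspath(path)
    if userhome is None:
        userhome = expanduser('~')
    if path == userhome:
        return home
    # Only shorten when the home directory is a complete leading component,
    # so that e.g. '/home/alice1' is left untouched.
    prefix = userhome + os.sep
    if path.startswith(prefix):
        return join(home, path[len(prefix):])
    return path


fake_home = join('homes', 'alice')  # assumed home directory for the demo
short = shrinkuser_sketch(join(fake_home, 'docs'), userhome=fake_home)
```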

ubelt.sorted_keys(dict_, key=None, reverse=False, cls=<class 'collections.OrderedDict'>)[source]¶

Return an ordered dictionary sorted by its keys

Parameters

dict_ (Dict[KT, VT]) – dictionary to sort. The keys must be of comparable types.
key (Callable[[KT], Any] | None) – If given as a callable, customizes the sorting by ordering using transformed keys.
reverse (bool, default=False) – if True returns in descending order
cls (type) – specifies the dict return type

Returns

new dictionary where the keys are ordered

Return type

OrderedDict[KT, VT]

Example

>>> import ubelt as ub
>>> dict_ = {'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}
>>> newdict = ub.sorted_keys(dict_)
>>> print(ub.repr2(newdict, nl=0))
{'eggs': 1.2, 'jam': 2.92, 'spam': 2.62}
>>> newdict = ub.sorted_keys(dict_, reverse=True)
>>> print(ub.repr2(newdict, nl=0))
{'spam': 2.62, 'jam': 2.92, 'eggs': 1.2}
>>> newdict = ub.sorted_keys(dict_, key=lambda x: sum(map(ord, x)))
>>> print(ub.repr2(newdict, nl=0))
{'jam': 2.92, 'eggs': 1.2, 'spam': 2.62}
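For readers who want to see what sorting by key amounts to, here is a minimal standard-library sketch. It is an assumption about equivalent behavior rather than ubelt's source, and sorted_keys_sketch is a hypothetical name.

```python
from collections import OrderedDict


def sorted_keys_sketch(dict_, key=None, reverse=False, cls=OrderedDict):
    """Hypothetical sketch: rebuild the mapping with its keys in sorted order."""
    ordered_keys = sorted(dict_, key=key, reverse=reverse)
    return cls((k, dict_[k]) for k in ordered_keys)


dict_ = {'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}
ascending = list(sorted_keys_sketch(dict_))
by_ord_sum = list(sorted_keys_sketch(dict_, key=lambda x: sum(map(ord, x))))
```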

ubelt.sorted_vals(dict_, key=None, reverse=False, cls=<class 'collections.OrderedDict'>)¶

Return an ordered dictionary sorted by its values

Parameters

dict_ (Dict[KT, VT]) – dictionary to sort. The values must be of comparable types.
key (Callable[[VT], Any] | None) – If given as a callable, customizes the sorting by ordering using transformed values.
reverse (bool, default=False) – if True returns in descending order
cls (type) – specifies the dict return type

Returns

new dictionary where the values are ordered

Return type

OrderedDict[KT, VT]

Example

>>> import ubelt as ub
>>> dict_ = {'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}
>>> newdict = ub.sorted_vals(dict_)
>>> print(ub.repr2(newdict, nl=0))
{'eggs': 1.2, 'spam': 2.62, 'jam': 2.92}
>>> newdict = ub.sorted_vals(dict_, reverse=True)
>>> print(ub.repr2(newdict, nl=0))
{'jam': 2.92, 'spam': 2.62, 'eggs': 1.2}
>>> newdict = ub.sorted_vals(dict_, key=lambda x: x % 1.6)
>>> print(ub.repr2(newdict, nl=0))
{'spam': 2.62, 'eggs': 1.2, 'jam': 2.92}

ubelt.sorted_values(dict_, key=None, reverse=False, cls=<class 'collections.OrderedDict'>)[source]¶

Return an ordered dictionary sorted by its values

Parameters

dict_ (Dict[KT, VT]) – dictionary to sort. The values must be of comparable types.
key (Callable[[VT], Any] | None) – If given as a callable, customizes the sorting by ordering using transformed values.
reverse (bool, default=False) – if True returns in descending order
cls (type) – specifies the dict return type

Returns

new dictionary where the values are ordered

Return type

OrderedDict[KT, VT]

Example

>>> import ubelt as ub
>>> dict_ = {'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}
>>> newdict = ub.sorted_values(dict_)
>>> print(ub.repr2(newdict, nl=0))
{'eggs': 1.2, 'spam': 2.62, 'jam': 2.92}
>>> newdict = ub.sorted_values(dict_, reverse=True)
>>> print(ub.repr2(newdict, nl=0))
{'jam': 2.92, 'spam': 2.62, 'eggs': 1.2}
>>> newdict = ub.sorted_values(dict_, key=lambda x: x % 1.6)
>>> print(ub.repr2(newdict, nl=0))
{'spam': 2.62, 'eggs': 1.2, 'jam': 2.92}
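The cls parameter is easiest to understand with a small stand-in. The sketch below is a hypothetical standard-library approximation (sorted_values_sketch is not a ubelt name) showing that a plain dict can be requested as the return type, which keeps insertion order on Python 3.7 and later.

```python
from collections import OrderedDict
from operator import itemgetter


def sorted_values_sketch(dict_, key=None, reverse=False, cls=OrderedDict):
    """Hypothetical sketch: order the items of a mapping by their values."""
    if key is None:
        sortkey = itemgetter(1)
    else:
        def sortkey(item):
            # Apply the user-supplied transform to the value only
            return key(item[1])
    return cls(sorted(dict_.items(), key=sortkey, reverse=reverse))


prices = {'spam': 2.62, 'eggs': 1.20, 'jam': 2.92}
cheapest_first = list(sorted_values_sketch(prices))
as_plain_dict = sorted_values_sketch(prices, reverse=True, cls=dict)
```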

ubelt.split_archive(fpath, ext='.zip')[source]¶

If fpath specifies a file inside a zipfile, this breaks it into two parts: the path to the zipfile and the internal path within the zipfile.

Example

>>> import ubelt as ub
>>> ub.split_archive('/a/b/foo.txt')
>>> ub.split_archive('/a/b/foo.zip/bar.txt')
>>> ub.split_archive('/a/b/foo.zip/baz/biz.zip/bar.py')
>>> ub.split_archive('archive.zip')
>>> ub.split_archive(ub.Path('/a/b/foo.zip/baz/biz.zip/bar.py'))
>>> ub.split_archive('/a/b/foo.zip/baz.pt/bar.zip/bar.zip', '.pt')

Todo

Fix got/want for win32

(None, None)
('/a/b/foo.zip', 'bar.txt')
('/a/b/foo.zip/baz/biz.zip', 'bar.py')
('archive.zip', None)
('/a/b/foo.zip/baz/biz.zip', 'bar.py')
('/a/b/foo.zip/baz.pt', 'bar.zip/bar.zip')
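The splitting rule implied by these results can be approximated in a few lines. This is a hedged sketch that only handles forward-slash separators; split_archive_sketch is a hypothetical name and the real function may be implemented differently.

```python
import os
from pathlib import PurePosixPath


def split_archive_sketch(fpath, ext='.zip'):
    """Hypothetical sketch: split at the last '<ext>/' boundary in the path."""
    fpath = os.fspath(fpath)
    idx = fpath.rfind(ext + '/')
    if idx >= 0:
        cut = idx + len(ext)
        # Everything up to the extension is the archive, the rest is internal
        return fpath[:cut], fpath[cut + 1:]
    if fpath.endswith(ext):
        # The path is the archive itself
        return fpath, None
    return None, None


nested = split_archive_sketch(PurePosixPath('/a/b/foo.zip/baz/biz.zip/bar.py'))
custom_ext = split_archive_sketch('/a/b/foo.zip/baz.pt/bar.zip/bar.zip', '.pt')
```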

ubelt.split_modpath(modpath, check=True)[source]¶

Splits the modpath into the dir that must be in PYTHONPATH for the module to be imported and the modulepath relative to this directory.

Parameters

modpath (str) – module filepath
check (bool) – if False, does not raise an error if modpath is a directory and does not contain an __init__.py file.

Returns

(directory, rel_modpath)

Return type

Tuple[str, str]

Raises

ValueError – if modpath does not exist or is not a package

Example

>>> import ubelt as ub
>>> from os.path import abspath, join
>>> from xdoctest import static_analysis
>>> modpath = static_analysis.__file__.replace('.pyc', '.py')
>>> modpath = abspath(modpath)
>>> dpath, rel_modpath = ub.split_modpath(modpath)
>>> recon = join(dpath, rel_modpath)
>>> assert recon == modpath
>>> assert rel_modpath == join('xdoctest', 'static_analysis.py')
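One way to picture the split is to walk upward from the module file for as long as the containing directory is a package. The following is a minimal sketch under that assumption; split_modpath_sketch is a hypothetical name, it performs no error checking, and it builds a throwaway package in a temporary directory for the demo.

```python
import os
import tempfile


def split_modpath_sketch(modpath):
    """Hypothetical sketch: walk upward while the directory is a package."""
    modpath = os.path.abspath(modpath)
    dpath, name = os.path.split(modpath)
    parts = [name]
    # A directory is treated as a package if it contains __init__.py
    while os.path.exists(os.path.join(dpath, '__init__.py')):
        parent, name = os.path.split(dpath)
        if not name:  # reached the filesystem root
            break
        parts.append(name)
        dpath = parent
    return dpath, os.path.join(*reversed(parts))


# Demo on a throwaway layout: <root>/pkg/sub/mod.py
with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, 'pkg', 'sub')
    os.makedirs(sub)
    for pkg_dir in [os.path.join(root, 'pkg'), sub]:
        open(os.path.join(pkg_dir, '__init__.py'), 'w').close()
    modpath = os.path.join(sub, 'mod.py')
    open(modpath, 'w').close()
    dpath, rel_modpath = split_modpath_sketch(modpath)
    recon = os.path.join(dpath, rel_modpath)
    expected = os.path.abspath(modpath)
```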

ubelt.symlink(real_path, link_path, overwrite=False, verbose=0)[source]¶

Create a link link_path that mirrors real_path.

This function attempts to create a real symlink, but will fall back on a hard link or junction if symlinks are not supported.

Parameters

real_path (str | PathLike) – path to real file or directory
link_path (str | PathLike) – path to desired location for symlink
overwrite (bool, default=False) – overwrite existing symlinks. This will not overwrite real files on systems with proper symlinks. However, on older versions of Windows, junctions are indistinguishable from real files, so we cannot make this guarantee.
verbose (int, default=0) – verbosity level

Returns

link path

Return type

str | PathLike

Note

On systems that do not support symlinks (e.g. some versions / configurations of Windows), this function will fall back on hard links or junctions [WikiNTFSLinks], [WikiHardLink]. The differences between the two are explained in [WikiSymLink].

If symlinks are not available, then link_path and real_path must exist on the same filesystem. Given that, this function always works in the sense that (1) link_path will mirror the data from real_path, (2) updates to one will affect the other, and (3) no extra space will be used.

More details can be found in ubelt._win32_links. On systems that support symlinks (e.g. Linux), none of the above applies.
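The fallback behavior described in this note can be illustrated with the standard library. This is only a sketch for regular files: link_with_fallback is a hypothetical helper, it does not handle junctions or directories, and it is not the logic in ubelt._win32_links.

```python
import os
import tempfile


def link_with_fallback(real_path, link_path):
    """Hypothetical sketch: prefer a symlink, fall back to a hard link."""
    try:
        os.symlink(real_path, link_path)
        return 'symlink'
    except (OSError, NotImplementedError):
        # Hard links only work for files on the same filesystem
        os.link(real_path, link_path)
        return 'hardlink'


with tempfile.TemporaryDirectory() as dpath:
    real_path = os.path.join(dpath, 'real_file.txt')
    link_path = os.path.join(dpath, 'link_file.txt')
    with open(real_path, 'w') as file:
        file.write('foo')
    kind = link_with_fallback(real_path, link_path)
    with open(link_path) as file:
        text_via_link = file.read()
    # Writing through the link updates the real file as well
    with open(link_path, 'w') as file:
        file.write('bar')
    with open(real_path) as file:
        text_in_real = file.read()
```

Either kind of link satisfies the mirroring properties listed above for a single file, which is why the caller does not need to know which one was created.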

References

WikiSymLink: https://en.wikipedia.org/wiki/Symbolic_link
WikiHardLink: https://en.wikipedia.org/wiki/Hard_link
WikiNTFSLinks: https://en.wikipedia.org/wiki/NTFS_links

Example

>>> import ubelt as ub
>>> dpath = ub.Path.appdir('ubelt', 'test_symlink0').delete().ensuredir()
>>> real_path = (dpath / 'real_file.txt')
>>> link_path = (dpath / 'link_file.txt')
>>> real_path.write_text('foo')
>>> result = ub.symlink(real_path, link_path)
>>> assert ub.Path(result).read_text() == 'foo'
>>> dpath.delete()  # cleanup

Example

>>> import ubelt as ub
>>> from ubelt.util_links import _dirstats
>>> dpath = ub.Path.appdir('ubelt', 'test_symlink1').delete().ensuredir()
>>> _dirstats(dpath)
>>> real_dpath = (dpath / 'real_dpath').ensuredir()
>>> link_dpath = real_dpath.augment(stem='link_dpath')
>>> real_path = (real_dpath / 'afile.txt')
>>> link_path = (link_dpath / 'afile.txt')
>>> real_path.write_text('foo')
>>> result = ub.symlink(real_dpath, link_dpath)
>>> assert link_path.read_text() == 'foo', 'read should be same'
>>> link_path.write_text('bar')
>>> _dirstats(dpath)
>>> assert link_path.read_text() == 'bar', 'very bad bar'
>>> assert real_path.read_text() == 'bar', 'changing link did not change real'
>>> real_path.write_text('baz')
>>> _dirstats(dpath)
>>> assert real_path.read_text() == 'baz', 'very bad baz'
>>> assert link_path.read_text() == 'baz', 'changing real did not change link'
>>> ub.delete(link_dpath, verbose=1)
>>> _dirstats(dpath)
>>> assert not link_dpath.exists(), 'link should not exist'
>>> assert real_path.exists(), 'real path should exist'
>>> _dirstats(dpath)
>>> ub.delete(dpath, verbose=1)
>>> _dirstats(dpath)
>>> assert not real_path.exists()

Example

>>> # Specifying bad paths should error.
>>> import ubelt as ub
>>> import pytest
>>> dpath = ub.Path.appdir('ubelt', 'test_symlink2').ensuredir()
>>> real_path = dpath / 'real_file.txt'
>>> link_path = dpath / 'link_file.txt'
>>> real_path.write_text('foo')
>>> with pytest.raises(ValueError, match='link_path .* cannot be empty'):
>>>     ub.symlink(real_path, '')
>>> with pytest.raises(ValueError, match='real_path .* cannot be empty'):
>>>     ub.symlink('', link_path)

ubelt.take(items, indices, default=NoParam)[source]¶

Look up a subset of an indexable object using a sequence of indices.

The items input is usually a list or dictionary. When items is a list, indices should be a sequence of integers. When items is a dict, indices is a list of keys to look up in that dictionary.

For dictionaries, a default may be specified as a placeholder to use if a key from indices is not in items.

Parameters

items (Sequence[VT] | Mapping[KT, VT]) – An indexable object to select items from.
indices (Iterable[int | KT]) – A sequence of indexes into items.
default (Any, default=NoParam) – if specified, items must support the get method.

Yields

VT – a selected item within the list

See Also

ubelt.dict_subset()

Note

ub.take(items, indices) is equivalent to (items[i] for i in indices) when default is unspecified.

Notes

This is based on the numpy.take() function, but written in pure python.

Do not confuse this with more_itertools.take(); the behavior is very different.
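The equivalence stated in the note above, plus the default handling for mappings, can be written out as a short generator. This is a sketch with hypothetical names (take_sketch and the _MISSING sentinel stand in for the real function and NoParam).

```python
_MISSING = object()  # hypothetical sentinel standing in for NoParam


def take_sketch(items, indices, default=_MISSING):
    """Hypothetical sketch: yield items[i] for each i in indices."""
    if default is _MISSING:
        for index in indices:
            # Plain indexing: raises IndexError / KeyError when missing
            yield items[index]
    else:
        for index in indices:
            # Requires a mapping-like ``get`` method
            yield items.get(index, default)


from_list = list(take_sketch([0, 1, 2, 3], [2, 0]))
from_dict = list(take_sketch({1: 'a', 2: 'b', 3: 'c'}, [1, 2, 3, 4, 5], None))
```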

Example

>>> import ubelt as ub
>>> items = [0, 1, 2, 3]
>>> indices = [2, 0]
>>> list(ub.take(items, indices))
[2, 0]

Example

>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> keys = [1, 2, 3, 4, 5]
>>> result = list(ub.take(dict_, keys, None))
>>> assert result == ['a', 'b', 'c', None, None]

Example

>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> keys = [1, 2, 3, 4, 5]
>>> try:
>>>     print(list(ub.take(dict_, keys)))
>>>     raise AssertionError('did not get key error')
>>> except KeyError:
>>>     print('correctly got key error')

ubelt.timeparse(stamp, default_timezone='local', allow_dateutil=True)[source]¶

Create a datetime.datetime object from a string timestamp.

Without any extra dependencies this will parse the output of ubelt.util_time.timestamp() into a datetime object. In the case where the format differs, dateutil.parser.parse will be used if the python-dateutil package is installed.

Parameters

stamp (str) – a string encoded timestamp
default_timezone (str) – if the input does not specify a timezone, assume this one. Can be “local” or “utc”.
allow_dateutil (bool) – if False we only use the minimal parsing and do not allow a fallback to dateutil.

Returns

the parsed datetime

Return type

datetime.datetime

Raises

ValueError – if parsing fails.

Todo

[ ] Allow defaulting to local or utc timezone (currently default is local)
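To make the role of default_timezone concrete, here is a standard-library sketch that parses a small subset of the compact stamp format. It is an illustration under stated assumptions: parse_compact_stamp is a hypothetical name, it accepts only full four-digit offsets such as +0530, and it is far less permissive than the function documented here.

```python
from datetime import datetime, timedelta, timezone


def parse_compact_stamp(stamp, default_timezone='local'):
    """Hypothetical stdlib-only parser for stamps like '2000-11-22T111111+0530'."""
    for fmt in ('%Y-%m-%dT%H%M%S%z', '%Y-%m-%dT%H%M%S', '%Y-%m-%d'):
        try:
            parsed = datetime.strptime(stamp, fmt)
            break
        except ValueError:
            continue
    else:
        raise ValueError('unable to parse stamp: {!r}'.format(stamp))
    if parsed.tzinfo is None:
        # The stamp had no timezone, so assume the requested default
        if default_timezone == 'utc':
            parsed = parsed.replace(tzinfo=timezone.utc)
        else:
            # astimezone() on a naive datetime assumes system local time
            parsed = parsed.astimezone()
    return parsed


with_offset = parse_compact_stamp('2000-11-22T111111+0530')
assumed_utc = parse_compact_stamp('2000-11-22T111111', default_timezone='utc')
```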

Example

>>> import ubelt as ub
>>> # Demonstrate a round trip of timestamp and timeparse
>>> stamp = ub.timestamp()
>>> datetime = ub.timeparse(stamp)
>>> assert ub.timestamp(datetime) == stamp
>>> # Round trip with precision
>>> stamp = ub.timestamp(precision=4)
>>> datetime = ub.timeparse(stamp)
>>> assert ub.timestamp(datetime, precision=4) == stamp

Example

>>> import ubelt as ub
>>> # We should always be able to parse these
>>> good_stamps = [
>>>     '2000-11-22',
>>>     '2000-11-22T111111.44444Z',
>>>     '2000-11-22T111111.44444+5',
>>>     '2000-11-22T111111.44444-05',
>>>     '2000-11-22T111111.44444-0500',
>>>     '2000-11-22T111111.44444+0530',
>>>     '2000-11-22T111111Z',
>>>     '2000-11-22T111111+5',
>>>     '2000-11-22T111111+0530',
>>> ]
>>> for stamp in good_stamps:
>>>     print(f'----')
>>>     print(f'stamp={stamp}')
>>>     result = ub.timeparse(stamp, allow_dateutil=0)
>>>     print(f'result={result!r}')
>>>     recon = ub.timestamp(result)
>>>     print(f'recon={recon}')

Example

>>> import ubelt as ub
>>> # We require dateutil to handle these types of stamps
>>> import pytest
>>> conditional_stamps = [
>>>         '2000-01-02T11:23:58.12345+5:30',
>>>         '09/25/2003',
>>>         'Thu Sep 25 10:36:28 2003',
>>> ]
>>> for stamp in conditional_stamps:
>>>     with pytest.raises(ValueError):
>>>         result = ub.timeparse(stamp, allow_dateutil=False)
>>> have_dateutil = bool(ub.modname_to_modpath('dateutil'))
>>> if have_dateutil:
>>>     for stamp in conditional_stamps:
>>>         result = ub.timeparse(stamp)

ubelt.timestamp(datetime=None, precision=0, default_timezone='local', allow_dateutil=True)[source]¶

Make a concise ISO 8601 timestamp suitable for use in filenames.

Parameters

datetime (datetime.datetime | datetime.date | None) – A datetime to format into a timestamp. If unspecified, the current local time is used. If given as a date, the time 00:00 is used.
precision (int) – if non-zero, adds up to 6 digits of sub-second precision.
default_timezone (str | datetime.timezone) – if the input does not specify a timezone, assume this one. Can be “local” or “utc”, or a standardized code if dateutil is installed.
allow_dateutil (bool) – if True, will use dateutil to lookup the default timezone if needed

Returns

The timestamp, which will always contain a date, time, and timezone.

Return type

str

Note

For more info see [WikiISO8601], [PyStrptime], [PyTime].

References

WikiISO8601: https://en.wikipedia.org/wiki/ISO_8601
PyStrptime: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
PyTime: https://docs.python.org/3/library/time.html
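A filename-safe stamp of this general shape can also be produced with strftime directly. The sketch below is a standard-library approximation, not the function documented here: filename_stamp_sketch is a hypothetical name, naive datetimes are assumed to be UTC, and the offset is always written with four digits (for example -0400) rather than in the shorter form shown in the examples.

```python
from datetime import datetime, timedelta, timezone


def filename_stamp_sketch(dt=None):
    """Hypothetical sketch: date, time, and UTC offset with no colons."""
    if dt is None:
        dt = datetime.now(timezone.utc)
    elif dt.tzinfo is None:
        # Assumption for this sketch only: naive datetimes are UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.strftime('%Y-%m-%dT%H%M%S%z')


ast = timezone(timedelta(hours=-4), 'AST')
stamp = filename_stamp_sketch(datetime(1973, 11, 29, 21, 33, 9, tzinfo=ast))
```

Avoiding colons is what keeps the result usable as a filename on filesystems that reserve that character.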

Example

>>> import ubelt as ub
>>> stamp = ub.timestamp()
>>> print('stamp = {!r}'.format(stamp))
stamp = ...-...-...T...

Example

>>> import ubelt as ub
>>> import datetime as datetime_mod
>>> from datetime import datetime as datetime_cls
>>> # Create a datetime object with timezone information
>>> ast_tzinfo = datetime_mod.timezone(datetime_mod.timedelta(hours=-4), 'AST')
>>> datetime = datetime_cls.utcfromtimestamp(123456789.123456789).replace(tzinfo=ast_tzinfo)
>>> stamp = ub.timestamp(datetime, precision=2)
>>> print('stamp = {!r}'.format(stamp))
stamp = '1973-11-29T213309.12-4'

>>> # Demo with a fractional hour timezone
>>> act_tzinfo = datetime_mod.timezone(datetime_mod.timedelta(hours=+9.5), 'ACT')
>>> datetime = datetime_cls.utcfromtimestamp(123456789.123456789).replace(tzinfo=act_tzinfo)
>>> stamp = ub.timestamp(datetime, precision=2)
>>> print('stamp = {!r}'.format(stamp))
stamp = '1973-11-29T213309.12+0930'

>>> # Can accept datetime or date objects with local, utc, or custom default timezones
>>> act_tzinfo = datetime_mod.timezone(datetime_mod.timedelta(hours=+9.5), 'ACT')
>>> datetime_utc = ub.timeparse('2020-03-05T112233', default_timezone='utc')
>>> datetime_act = ub.timeparse('2020-03-05T112233', default_timezone=act_tzinfo)
>>> datetime_notz = datetime_utc.replace(tzinfo=None)
>>> date = datetime_utc.date()
>>> stamp_utc = ub.timestamp(datetime_utc)
>>> stamp_act = ub.timestamp(datetime_act)
>>> stamp_date_utc = ub.timestamp(date, default_timezone='utc')
>>> print(f'stamp_utc      = {stamp_utc}')
>>> print(f'stamp_act      = {stamp_act}')
>>> print(f'stamp_date_utc = {stamp_date_utc}')
stamp_utc      = 2020-03-05T112233+0
stamp_act      = 2020-03-05T112233+0930
stamp_date_utc = 2020-03-05T000000+0

Example

>>> # xdoctest: +REQUIRES(module:dateutil)
>>> # Make sure we are compatible with dateutil
>>> import ubelt as ub
>>> from dateutil.tz import tzlocal
>>> import datetime as datetime_mod
>>> from datetime import datetime as datetime_cls
>>> tz_act = datetime_mod.timezone(datetime_mod.timedelta(hours=+9.5), 'ACT')
>>> tzinfo_list = [
>>>     tz_act,
>>>     datetime_mod.timezone(datetime_mod.timedelta(hours=-4), 'AST'),
>>>     datetime_mod.timezone(datetime_mod.timedelta(hours=0), 'UTC'),
>>>     datetime_mod.timezone.utc,
>>>     None,
>>>     tzlocal()
>>> ]
>>> # Note: there is a win32 bug here
>>> # https://bugs.python.org/issue37 that means we cant use
>>> # dates close to the epoch
>>> datetime_list = [
>>>     datetime_cls.utcfromtimestamp(123456789.123456789 + 315360000),
>>>     datetime_cls.utcfromtimestamp(0 + 315360000),
>>> ]
>>> basis = {
>>>     'precision': [0, 3, 9],
>>>     'tzinfo': tzinfo_list,
>>>     'datetime': datetime_list,
>>>     'default_timezone': ['local', 'utc', tz_act],
>>> }
>>> for params in ub.named_product(basis):
>>>     dtime = params['datetime'].replace(tzinfo=params['tzinfo'])
>>>     precision = params.get('precision', 0)
>>>     stamp = ub.timestamp(datetime=dtime, precision=precision)
>>>     recon = ub.timeparse(stamp)
>>>     alt = recon.strftime('%Y-%m-%dT%H%M%S.%f%z')
>>>     print('---')
>>>     print('params = {}'.format(ub.repr2(params, nl=1)))
>>>     print(f'dtime={dtime}')
>>>     print(f'stamp={stamp}')
>>>     print(f'recon={recon}')
>>>     print(f'alt  ={alt}')
>>>     shift = 10 ** precision
>>>     a = int(dtime.timestamp() * shift)
>>>     b = int(recon.timestamp() * shift)
>>>     assert a == b, f'{a} != {b}'

ubelt.touch(fpath, mode=438, dir_fd=None, verbose=0, **kwargs)[source]¶

Change file timestamps.

Works like the Unix touch utility.

Parameters

fpath (str | PathLike) – name of the file
mode (int) – file permissions (python3 and unix only)
dir_fd (io.IOBase) – optional directory file descriptor. If specified, fpath is interpreted as relative to this descriptor (python 3 only).
verbose (int) – verbosity
**kwargs – extra args passed to os.utime() (python 3 only).

Returns

path to the file

Return type

str

References

SO_1158076: https://stackoverflow.com/questions/1158076/implement-touch-using-python

Example

>>> import os
>>> import ubelt as ub
>>> from os.path import join, exists
>>> dpath = ub.Path.appdir('ubelt').ensuredir()
>>> fpath = join(dpath, 'touch_file')
>>> assert not exists(fpath)
>>> ub.touch(fpath)
>>> assert exists(fpath)
>>> os.unlink(fpath)
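For reference, touch-like behavior can be sketched with os.open and os.utime. This is a simplified illustration, not the library's code: touch_sketch is a hypothetical name and it ignores dir_fd and the extra keyword arguments. Note that the default mode=438 in the signature is the decimal form of the octal permission 0o666.

```python
import os
import tempfile


def touch_sketch(fpath, mode=0o666):
    """Hypothetical sketch: create the file if needed and update its times."""
    # O_CREAT creates the file if missing; O_APPEND avoids truncating it
    fd = os.open(fpath, os.O_CREAT | os.O_APPEND, mode)
    try:
        # Use the descriptor when the platform supports it, else the path
        target = fd if os.utime in os.supports_fd else fpath
        os.utime(target)  # times=None means "now"
    finally:
        os.close(fd)
    return fpath


with tempfile.TemporaryDirectory() as dpath:
    fpath = os.path.join(dpath, 'touch_file')
    existed_before = os.path.exists(fpath)
    touch_sketch(fpath)
    existed_after = os.path.exists(fpath)
    size_after = os.path.getsize(fpath)
```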

ubelt.udict¶: alias of UDict

ubelt.unique(items, key=None)[source]¶

Generates unique items in the order they appear.

Parameters

items (Iterable[T]) – list of items
key (Callable[[T], Any], default=None) – custom normalization function. If specified returns items where key(item) is unique.

Yields

T – a unique item from the input sequence

Notes

Functionally equivalent to more_itertools.unique_everseen().

Example

>>> import ubelt as ub
>>> items = [4, 6, 6, 0, 6, 1, 0, 2, 2, 1]
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == [4, 6, 0, 1, 2]

Example

>>> import ubelt as ub
>>> items = ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'D', 'E']
>>> unique_items = list(ub.unique(items, key=str.lower))
>>> assert unique_items == ['A', 'b', 'C', 'D', 'e']
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'E']
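The order-preserving behavior shown in these examples can be sketched with a set of already-seen keys. This is an illustrative stand-in (unique_sketch is a hypothetical name) that assumes the items, or their normalized keys, are hashable.

```python
def unique_sketch(items, key=None):
    """Hypothetical sketch: yield each item the first time its key is seen."""
    seen = set()
    for item in items:
        norm = item if key is None else key(item)
        if norm not in seen:
            seen.add(norm)
            yield item


letters = ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'D', 'E']
case_insensitive = list(unique_sketch(letters, key=str.lower))
```

Because it is a generator, the sketch can be consumed lazily and stopped early on long inputs.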

ubelt.unique_flags(items, key=None)[source]¶

Returns a list of booleans corresponding to the first instance of each unique item.

Parameters

items (Sequence[VT]) – indexable collection of items
key (Callable[[VT], Any] | None, default=None) – custom normalization function. If specified returns items where key(item) is unique.

Returns

flags the items that are unique

Return type

List[bool]

Example

>>> import ubelt as ub
>>> items = [0, 2, 1, 1, 0, 9, 2]
>>> flags = ub.unique_flags(items)
>>> assert flags == [True, True, True, False, False, True, False]
>>> flags = ub.unique_flags(items, key=lambda x: x % 2 == 0)
>>> assert flags == [True, False, True, False, False, False, False]
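The flags are convenient as a mask over the original sequence. The sketch below uses hypothetical names (unique_flags_sketch is not the library function) to show one way such flags could be computed and then applied with itertools.compress.

```python
from itertools import compress


def unique_flags_sketch(items, key=None):
    """Hypothetical sketch: True where an item (or its key) first appears."""
    seen = set()
    flags = []
    for item in items:
        norm = item if key is None else key(item)
        flags.append(norm not in seen)
        seen.add(norm)
    return flags


items = [0, 2, 1, 1, 0, 9, 2]
flags = unique_flags_sketch(items)
# Keep only the first occurrence of each item
first_occurrences = list(compress(items, flags))
```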

ubelt.userhome(username=None)[source]¶

Returns the path to some user’s home directory.

Parameters

username (str | None) – name of a user on the system. If not specified, the current user is inferred.

Returns

userhome_dpath - path to the specified home directory

Return type