ubelt.util_format module

Defines the function repr2(), which allows for a bit more customization than repr() or pprint(). See the docstring for more details.

The two main goals of repr2 are to provide readable string representations of nested data structures and to make them “eval-able” whenever possible. For example, the value float('inf') normally has the repr inf, which cannot be evaluated:

>>> import ubelt as ub
>>> ub.repr2(float('inf'))
"float('inf')"

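The difference can be checked with plain Python. The sketch below uses only the standard library (it does not call ubelt) and shows that the built-in repr of infinity cannot be passed back through eval, while the string shown above can.

```python
import math

# The built-in repr of infinity is the bare name `inf` ...
plain = repr(float('inf'))

# ... which is not a defined name, so evaluating it fails.
try:
    eval(plain, {})
    plain_is_evalable = True
except NameError:
    plain_is_evalable = False

# The string form shown above evaluates back to the original value.
evalable = "float('inf')"
roundtrip = eval(evalable, {})

print(plain, plain_is_evalable, math.isinf(roundtrip))
```
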
The newlines (or nl) keyword argument controls how deep in the nesting newlines are allowed.

>>> print(ub.repr2({1: float('nan'), 2: float('inf'), 3: 3.0}))
{
    1: float('nan'),
    2: float('inf'),
    3: 3.0,
}
>>> print(ub.repr2({1: float('nan'), 2: float('inf'), 3: 3.0}, nl=0))
{1: float('nan'), 2: float('inf'), 3: 3.0}
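
One way to picture the nl depth limit is a recursive formatter that breaks lines only while the current nesting depth is below nl. The function sketch_repr below is a hypothetical toy written for this illustration. It handles only dicts and lists, and it is not how ubelt implements repr2.

```python
def sketch_repr(data, nl=1, _depth=0):
    """Toy formatter: break lines only in the first `nl` nesting levels."""
    if isinstance(data, dict):
        items = ['{!r}: {}'.format(k, sketch_repr(v, nl, _depth + 1))
                 for k, v in data.items()]
        lbr, rbr = '{', '}'
    elif isinstance(data, list):
        items = [sketch_repr(v, nl, _depth + 1) for v in data]
        lbr, rbr = '[', ']'
    else:
        # Leaves fall back to the built-in repr.
        return repr(data)

    if _depth < nl and items:
        # Multi-line form: one item per line, each followed by a comma.
        # Nested multi-line items are indented one extra level.
        lines = []
        for item in items:
            for line in (item + ',').splitlines():
                lines.append('    ' + line)
        return lbr + '\n' + '\n'.join(lines) + '\n' + rbr

    # Single-line form once the depth limit is reached.
    return lbr + ', '.join(items) + rbr


data = {'a': [1, 2], 'b': {'c': 3}}
flat = sketch_repr(data, nl=0)
one_level = sketch_repr(data, nl=1)
two_levels = sketch_repr(data, nl=2)
print(one_level)
print(two_levels)
```

With nl=0 everything stays on one line, with nl=1 only the outer dict is split, and with nl=2 the inner list and dict are split as well.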

You can also define or overwrite how representations for different types are created. You can either create your own extension object, or you can monkey-patch ub.util_format._FORMATTER_EXTENSIONS without specifying the extensions keyword argument (although this will be a global change).

>>> extensions = ub.util_format.FormatterExtensions()
>>> @extensions.register(float)
>>> def my_float_formater(data, **kw):
>>>     return "monkey({})".format(data)
>>> print(ub.repr2({1: float('nan'), 2: float('inf'), 3: 3.0}, nl=0, extensions=extensions))
{1: monkey(nan), 2: monkey(inf), 3: monkey(3.0)}

As of ubelt 1.1.0, you can access and update the default extensions via the repr2 function itself.

>>> # xdoctest: +SKIP
>>> # We skip this at test time to not modify global state
>>> @ub.repr2.EXTENSIONS.register(float)
>>> def my_float_formater(data, **kw):
>>>     return "monkey2({})".format(data)
>>> print(ub.repr2({1: float('nan'), 2: float('inf'), 3: 3.0}, nl=0))

ubelt.util_format.repr2(data, **kwargs)

Makes a pretty string representation of data.

Makes a pretty and easy-to-doctest string representation. Has nice handling of common nested datatypes. This is an alternative to repr and pprint.pformat().

The output of this function is configurable. By default it aims to produce strings that are consistent, compact, and executable, which makes them well suited to doctests.

Note

This function has many keyword arguments that can be used to customize the final representation. For convenience some of the more frequently used kwargs have short aliases. See “Kwargs” for more details.

Parameters

data (object) – an arbitrary python object to form the string “representation” of

Kwargs:
si, stritems (bool):

dict/list items use str instead of repr

strkeys, sk (bool):

dict keys use str instead of repr

strvals, sv (bool):

dict values use str instead of repr

nl, newlines (int | bool):

number of top-level nestings to place a newline after. If True, all items are followed by newlines regardless of nesting level. Defaults to 1 for lists and True for dicts.

nobr, nobraces (bool, default=False):

if True, text will not contain outer braces for containers

cbr, compact_brace (bool, default=False):

if True, braces are compactified (i.e. they will not have newlines placed directly after them; think Java / K&R / 1TBS style)

trailsep, trailing_sep (bool):

if True, a separator is placed after the last item in a sequence. By default this is True whenever nl > 0.

explicit (bool, default=False):

changes dict representation from {k1: v1, ...} to dict(k1=v1, ...).

Modifies:

default kvsep is modified to '='
dict braces are changed from {} to dict()

compact (bool, default=False):

Produces values more suitable for space-constrained environments

Modifies:

default kvsep is modified to '='
default itemsep is modified to ''
default nobraces is modified to 1
default newlines is modified to 0
default strkeys is modified to True
default strvals is modified to True

precision (int, default=None):

if specified, floats are formatted with this precision

kvsep (str, default=': '):

separator between keys and values

itemsep (str, default=' '):

separator between items. This separator is placed after commas, which are currently not configurable. This may be modified in the future.

sort (bool | callable, default=None):

if None, then sort unordered collections, but keep the ordering of ordered collections. This option attempts to be deterministic in most cases.

New in 0.8.0: if sort is callable, it will be used as a key-function to sort all collections.

if False, then nothing will be sorted, and the representation of unordered collections will be arbitrary and possibly non-deterministic.

if True, attempts to sort all collections in the returned text. Currently if True this WILL sort lists. Currently if True this WILL NOT sort OrderedDicts.

NOTE:

The behavior described above may not be intuitive; as such, the behavior of this arg is subject to change.

suppress_small (bool):

passed to numpy.array2string() for ndarrays

max_line_width (int):

passed to numpy.array2string() for ndarrays

with_dtype (bool):

only relevant to numpy.ndarrays. If True, includes the dtype. Defaults to not strvals.

align (bool | str, default=False):

if True, will align multi-line dictionaries by the kvsep

extensions (FormatterExtensions):

a custom FormatterExtensions instance that can overwrite or define how different types of objects are formatted.
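
To make the interaction between kvsep, itemsep, precision and explicit more concrete, here is a rough sketch for a flat dict. The helper format_flat_dict is hypothetical and written only for this illustration. It mimics the documented behavior (itemsep is placed after each comma, explicit switches kvsep to '=' and the braces to dict()), but it is not ubelt's implementation and ignores nesting, sorting and the other kwargs.

```python
def format_flat_dict(data, kvsep=': ', itemsep=' ', precision=None,
                     explicit=False):
    """Toy formatter for a flat dict (hypothetical, not part of ubelt)."""

    def fmt_val(value):
        # Floats honor `precision` when it is given; everything else uses repr.
        if isinstance(value, float) and precision is not None:
            return '{:.{p}f}'.format(value, p=precision)
        return repr(value)

    if explicit:
        # dict(k1=v1, ...) form: keys are written as bare names.
        kvsep = '='
        parts = [str(k) + kvsep + fmt_val(v) for k, v in data.items()]
        lbr, rbr = 'dict(', ')'
    else:
        parts = [repr(k) + kvsep + fmt_val(v) for k, v in data.items()]
        lbr, rbr = '{', '}'

    # The item separator always follows a comma.
    return lbr + (',' + itemsep).join(parts) + rbr


data = {'a': 1, 'b': 1 / 3}
print(format_flat_dict(data, precision=2))
print(format_flat_dict(data, precision=2, itemsep=''))
print(format_flat_dict(data, precision=2, explicit=True))
```

In this sketch the explicit form remains eval-able as long as every key is a valid Python identifier.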

Returns

outstr - output string

Return type

str

Note

There are also internal kwargs, which should not be used:

_return_info (bool): return information about child context

_root_info (depth): information about parent context

RelatedWork:

rich.pretty.pretty_repr()
pprint.pformat()

Example

>>> import ubelt as ub
>>> dict_ = {
...     'custom_types': [slice(0, 1, None), 1/3],
...     'nest_dict': {'k1': [1, 2, {3: {4, 5}}],
...                   'key2': [1, 2, {3: {4, 5}}],
...                   'key3': [1, 2, {3: {4, 5}}],
...                   },
...     'nest_dict2': {'k': [1, 2, {3: {4, 5}}]},
...     'nested_tuples': [tuple([1]), tuple([2, 3]), frozenset([4, 5, 6])],
...     'one_tup': tuple([1]),
...     'simple_dict': {'spam': 'eggs', 'ham': 'jam'},
...     'simple_list': [1, 2, 'red', 'blue'],
...     'odict': ub.odict([(1, '1'), (2, '2')]),
... }
>>> # In the interest of saving space we are only going to show the
>>> # output for the first example.
>>> result = ub.repr2(dict_, nl=1, precision=2)
>>> print(result)
{
    'custom_types': [slice(0, 1, None), 0.33],
    'nest_dict': {'k1': [1, 2, {3: {4, 5}}], 'key2': [1, 2, {3: {4, 5}}], 'key3': [1, 2, {3: {4, 5}}]},
    'nest_dict2': {'k': [1, 2, {3: {4, 5}}]},
    'nested_tuples': [(1,), (2, 3), {4, 5, 6}],
    'odict': {1: '1', 2: '2'},
    'one_tup': (1,),
    'simple_dict': {'ham': 'jam', 'spam': 'eggs'},
    'simple_list': [1, 2, 'red', 'blue'],
}
>>> # You can try the rest yourself.
>>> result = ub.repr2(dict_, nl=3, precision=2); print(result)
>>> result = ub.repr2(dict_, nl=2, precision=2); print(result)
>>> result = ub.repr2(dict_, nl=1, precision=2, itemsep='', explicit=True); print(result)
>>> result = ub.repr2(dict_, nl=1, precision=2, nobr=1, itemsep='', explicit=True); print(result)
>>> result = ub.repr2(dict_, nl=3, precision=2, cbr=True); print(result)
>>> result = ub.repr2(dict_, nl=3, precision=2, si=True); print(result)
>>> result = ub.repr2(dict_, nl=3, sort=True); print(result)
>>> result = ub.repr2(dict_, nl=3, sort=False, trailing_sep=False); print(result)
>>> result = ub.repr2(dict_, nl=3, sort=False, trailing_sep=False, nobr=True); print(result)

Example

>>> import ubelt as ub
>>> def _nest(d, w):
...     if d == 0:
...         return {}
...     else:
...         return {'n{}'.format(d): _nest(d - 1, w + 1), 'm{}'.format(d): _nest(d - 1, w + 1)}
>>> dict_ = _nest(d=4, w=1)
>>> result = ub.repr2(dict_, nl=6, precision=2, cbr=1)
>>> print('---')
>>> print(result)
>>> result = ub.repr2(dict_, nl=-1, precision=2)
>>> print('---')
>>> print(result)

Example

>>> import ubelt as ub
>>> data = {'a': 100, 'b': [1, '2', 3], 'c': {20:30, 40: 'five'}}
>>> print(ub.repr2(data, nl=1))
{
    'a': 100,
    'b': [1, '2', 3],
    'c': {20: 30, 40: 'five'},
}
>>> # Compact is useful for things like timerit.Timerit labels
>>> print(ub.repr2(data, compact=True))
a=100,b=[1,2,3],c={20=30,40=five}
>>> print(ub.repr2(data, compact=True, nobr=False))
{a=100,b=[1,2,3],c={20=30,40=five}}

class ubelt.util_format.FormatterExtensions

Bases: object

Helper class for managing non-builtin (e.g. numpy) format types.

This module (ubelt.util_format) maintains a global set of basic extensions, but it is also possible to create a locally scoped set of extensions and explicitly pass it to repr2. The following example demonstrates this.

Example

>>> import ubelt as ub
>>> class MyObject(object):
>>>     pass
>>> data = {'a': [1, 2.2222, MyObject()], 'b': MyObject()}
>>> # Create a custom set of extensions
>>> extensions = ub.FormatterExtensions()
>>> # Register a function to format your specific type
>>> @extensions.register(MyObject)
>>> def format_myobject(data, **kwargs):
>>>     return 'I can do anything here'
>>> # Repr2 will now respect the passed custom extensions
>>> # Note that the global extensions will still be respected
>>> # unless they are overloaded.
>>> print(ub.repr2(data, nl=-1, precision=1, extensions=extensions))
{
    'a': [1, 2.2, I can do anything here],
    'b': I can do anything here
}
>>> # Overload the formatter for float and int
>>> @extensions.register((float, int))
>>> def format_myobject(data, **kwargs):
>>>     return str((data + 10) // 2)
>>> print(ub.repr2(data, nl=-1, precision=1, extensions=extensions))
{
    'a': [5, 6.0, I can do anything here],
    'b': I can do anything here
}

register(key)

Registers a custom formatting function with ub.repr2.

Parameters

key (Type | Tuple[Type] | str) – indicator of the type

Returns

decorator function

Return type

Callable

lookup(data)

Returns an appropriate function to format data if one has been registered.
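
The register / lookup pair can be pictured as a small type-keyed registry. The class MiniExtensions below is a hypothetical stand-in written for this illustration. It accepts only a type or a tuple of types as the key (the str form of key is not handled), and the lookup strategy of walking the method resolution order is an assumption, since the text above does not say how FormatterExtensions resolves a match.

```python
class MiniExtensions:
    """Toy type-keyed formatter registry (hypothetical, not part of ubelt)."""

    def __init__(self):
        self._funcs = {}

    def register(self, key):
        # Accept a single type or a tuple of types.
        keys = key if isinstance(key, tuple) else (key,)

        def decorator(func):
            for typ in keys:
                self._funcs[typ] = func
            return func

        return decorator

    def lookup(self, data):
        # Assumption: the most specific registered base class wins.
        for typ in type(data).__mro__:
            if typ in self._funcs:
                return self._funcs[typ]
        return None


extensions = MiniExtensions()


@extensions.register((float, int))
def format_number(data, **kwargs):
    return 'monkey({})'.format(data)


func = extensions.lookup(3.0)
print(func(3.0))
print(extensions.lookup('no formatter registered for str'))
```

Returning the undecorated function from the decorator keeps format_number usable as an ordinary function after registration.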