ubelt.util_repr module
Defines the function urepr(), which allows for a bit more customization than repr() or pprint.pformat(). See the docstring for more details.
Two main goals of urepr are to provide nice string representations of nested data structures and to make those “eval-able” whenever possible. As an example, take the value float('inf'), which normally has a non-evalable repr of inf:
>>> import ubelt as ub
>>> ub.urepr(float('inf'))
"float('inf')"
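The practical payoff of an eval-able representation is that the text round-trips through eval(). The following minimal sketch uses only the standard library and the string literal shown in the example above (it does not call ubelt itself) to contrast that text with the builtin repr. The variable names are illustrative.

```python
# Contrast the builtin repr of infinity with an eval-able spelling.
value = float('inf')

builtin_text = repr(value)       # the builtin repr is the bare word: inf
evalable_text = "float('inf')"   # the text shown in the urepr example above

# The bare word is not a defined name, so eval fails on the builtin repr.
try:
    eval(builtin_text)
    builtin_round_trips = True
except NameError:
    builtin_round_trips = False

# The eval-able form reconstructs the original value.
recovered = eval(evalable_text)

print(builtin_round_trips)
print(recovered == value)
```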
The newlines (or nl) keyword argument can control how deep in the nesting newlines are allowed.
>>> print(ub.urepr({1: float('nan'), 2: float('inf'), 3: 3.0}))
{
    1: float('nan'),
    2: float('inf'),
    3: 3.0,
}
>>> print(ub.urepr({1: float('nan'), 2: float('inf'), 3: 3.0}, nl=0))
{1: float('nan'), 2: float('inf'), 3: 3.0}
You can also define or overwrite how representations for different types are created. You can either create your own extension object, or you can monkey-patch ub.util_repr._REPR_EXTENSIONS without specifying the extensions keyword argument (although this will be a global change).
>>> import ubelt as ub
>>> extensions = ub.util_repr.ReprExtensions()
>>> @extensions.register(float)
>>> def my_float_formater(data, **kw):
>>>     return "monkey({})".format(data)
>>> print(ub.urepr({1: float('nan'), 2: float('inf'), 3: 3.0}, nl=0, extensions=extensions))
{1: monkey(nan), 2: monkey(inf), 3: monkey(3.0)}
As of ubelt 1.1.0 you can now access and update the default extensions via the EXTENSIONS attribute of the urepr() function itself.
>>> # xdoctest: +SKIP
>>> # We skip this at test time to not modify global state
>>> import ubelt as ub
>>> @ub.urepr.EXTENSIONS.register(float)
>>> def my_float_formater(data, **kw):
>>>     return "monkey2({})".format(data)
>>> print(ub.urepr({1: float('nan'), 2: float('inf'), 3: 3.0}, nl=0))
- ubelt.util_repr.urepr(data, **kwargs)
Makes a pretty string representation of data.
Makes a pretty and easy-to-doctest string representation. Has nice handling of common nested datatypes. This is an alternative to repr() and pprint.pformat().
The output of this function is configurable. By default it aims to produce strings that are consistent, compact, and executable. This makes them great for doctests.
Note
This function has many keyword arguments that can be used to customize the final representation. For convenience some of the more frequently used kwargs have short aliases. See “Kwargs” for more details.
Note
For large data items, this can be noticeably slower than pprint.pformat and much slower than the builtin repr. Benchmarks exist in the repo under dev/bench/bench_urepr_vs_alternatives.py
- Parameters:
data (object) – an arbitrary python object to form the string “representation” of
- Kwargs:
- si, stritems (bool):
dict/list items use str instead of repr
- strkeys, sk (bool):
dict keys use str instead of repr
- strvals, sv (bool):
dict values use str instead of repr
- nl, newlines (int | bool):
number of top level nestings to place a newline after. If true all items are followed by newlines regardless of nesting level. Defaults to 1 for lists and True for dicts.
- nobr, nobraces (bool):
if True, text will not contain outer braces for containers. Defaults to False.
- cbr, compact_brace (bool):
if True, braces are compactified (i.e. they will not have newlines placed directly after them, think java / K&R / 1TBS). Defaults to False.
- trailsep, trailing_sep (bool):
if True, a separator is placed after the last item in a sequence. By default this is True whenever nl > 0.
- explicit (bool):
changes dict representation from {k1: v1, ...} to dict(k1=v1, ...). Defaults to False.
- Modifies:
default kvsep is modified to '='
dict braces from {} to dict().
- compact (bool):
Produces values more suitable for space constrained environments. Defaults to False.
- Modifies:
default kvsep is modified to '='
default itemsep is modified to ''
default nobraces is modified to 1
default newlines is modified to 0
default strkeys to True
default strvals to True
- precision (int | None):
if specified, floats are formatted with this precision. Defaults to None.
- kvsep (str):
separator between keys and values. Defaults to ': '.
- itemsep (str):
separator between items. This separator is placed after commas, which are currently not configurable. This may be modified in the future. Defaults to ' '.
- sort (bool | callable | None):
if 'auto', then sort unordered collections, but keep the ordering of ordered collections. This option attempts to be deterministic in most cases. If True, then ALL collections will be sorted in the returned text. Defaults to None.
- suppress_small (bool):
passed to numpy.array2string() for ndarrays
- max_line_width (int):
passed to numpy.array2string() for ndarrays
- with_dtype (bool):
only relevant to numpy.ndarrays. If True, includes the dtype. Defaults to not strvals.
- align (bool | str):
if True, will align multi-line dictionaries by the kvsep. Defaults to False.
- extensions (ReprExtensions):
a custom ReprExtensions instance that can overwrite or define how different types of objects are formatted.
- Returns:
outstr - output string
- Return type:
str
Note
There are also internal kwargs, which should not be used:
_return_info (bool): return information about child context
_root_info (depth): information about parent context
- RelatedWork:
Example
>>> import ubelt as ub
>>> dict_ = {
...     'custom_types': [slice(0, 1, None), 1/3],
...     'nest_dict': {'k1': [1, 2, {3: {4, 5}}],
...                   'key2': [1, 2, {3: {4, 5}}],
...                   'key3': [1, 2, {3: {4, 5}}],
...                   },
...     'nest_dict2': {'k': [1, 2, {3: {4, 5}}]},
...     'nested_tuples': [tuple([1]), tuple([2, 3]), frozenset([4, 5, 6])],
...     'one_tup': tuple([1]),
...     'simple_dict': {'spam': 'eggs', 'ham': 'jam'},
...     'simple_list': [1, 2, 'red', 'blue'],
...     'odict': ub.odict([(2, '1'), (1, '2')]),
... }
>>> # In the interest of saving space we are only going to show the
>>> # output for the first example.
>>> result = ub.urepr(dict_, nl=1, precision=2)
>>> import pytest
>>> import sys
>>> if sys.version_info[0:2] <= (3, 6):
>>>     # dictionary order is not guaranteed in 3.6 use repr2 instead
>>>     pytest.skip()
>>> print(result)
{
    'custom_types': [slice(0, 1, None), 0.33],
    'nest_dict': {'k1': [1, 2, {3: {4, 5}}], 'key2': [1, 2, {3: {4, 5}}], 'key3': [1, 2, {3: {4, 5}}]},
    'nest_dict2': {'k': [1, 2, {3: {4, 5}}]},
    'nested_tuples': [(1,), (2, 3), {4, 5, 6}],
    'one_tup': (1,),
    'simple_dict': {'spam': 'eggs', 'ham': 'jam'},
    'simple_list': [1, 2, 'red', 'blue'],
    'odict': {2: '1', 1: '2'},
}
>>> # You can try the rest yourself.
>>> result = ub.urepr(dict_, nl=3, precision=2); print(result)
>>> result = ub.urepr(dict_, nl=2, precision=2); print(result)
>>> result = ub.urepr(dict_, nl=1, precision=2, itemsep='', explicit=True); print(result)
>>> result = ub.urepr(dict_, nl=1, precision=2, nobr=1, itemsep='', explicit=True); print(result)
>>> result = ub.urepr(dict_, nl=3, precision=2, cbr=True); print(result)
>>> result = ub.urepr(dict_, nl=3, precision=2, si=True); print(result)
>>> result = ub.urepr(dict_, nl=3, sort=True); print(result)
>>> result = ub.urepr(dict_, nl=3, sort=False, trailing_sep=False); print(result)
>>> result = ub.urepr(dict_, nl=3, sort=False, trailing_sep=False, nobr=True); print(result)
Example
>>> import ubelt as ub
>>> def _nest(d, w):
...     if d == 0:
...         return {}
...     else:
...         return {'n{}'.format(d): _nest(d - 1, w + 1), 'm{}'.format(d): _nest(d - 1, w + 1)}
>>> dict_ = _nest(d=4, w=1)
>>> result = ub.urepr(dict_, nl=6, precision=2, cbr=1)
>>> print('---')
>>> print(result)
>>> result = ub.urepr(dict_, nl=-1, precision=2)
>>> print('---')
>>> print(result)
Example
>>> import ubelt as ub
>>> data = {'a': 100, 'b': [1, '2', 3], 'c': {20:30, 40: 'five'}}
>>> print(ub.urepr(data, nl=1))
{
    'a': 100,
    'b': [1, '2', 3],
    'c': {20: 30, 40: 'five'},
}
>>> # Compact is useful for things like timerit.Timerit labels
>>> print(ub.urepr(data, compact=True))
a=100,b=[1,2,3],c={20=30,40=five}
>>> print(ub.urepr(data, compact=True, nobr=False))
{a=100,b=[1,2,3],c={20=30,40=five}}
- class ubelt.util_repr.ReprExtensions
Bases: object
Helper class for managing non-builtin (e.g. numpy) format types.
This module (ubelt.util_repr) maintains a global set of basic extensions, but it is also possible to create a locally scoped set of extensions and explicitly pass it to urepr. The following example demonstrates this.
Example
>>> import ubelt as ub
>>> class MyObject(object):
>>>     pass
>>> data = {'a': [1, 2.2222, MyObject()], 'b': MyObject()}
>>> # Create a custom set of extensions
>>> extensions = ub.ReprExtensions()
>>> # Register a function to format your specific type
>>> @extensions.register(MyObject)
>>> def format_myobject(data, **kwargs):
>>>     return 'I can do anything here'
>>> # Repr2 will now respect the passed custom extensions
>>> # Note that the global extensions will still be respected
>>> # unless they are overloaded.
>>> print(ub.urepr(data, nl=-1, precision=1, extensions=extensions))
{
    'a': [1, 2.2, I can do anything here],
    'b': I can do anything here
}
>>> # Overload the formatter for float and int
>>> @extensions.register((float, int))
>>> def format_myobject(data, **kwargs):
>>>     return str((data + 10) // 2)
>>> print(ub.urepr(data, nl=-1, precision=1, extensions=extensions))
{
    'a': [5, 6.0, I can do anything here],
    'b': I can do anything here
}
- register(key)
Registers a custom formatting function with ub.urepr
- Parameters:
key (Type | Tuple[Type] | str) – indicator of the type
- Returns:
decorator function
- Return type:
Callable
- lookup(data)
Returns an appropriate function to format data if one has been registered.
- Parameters:
data (Any) – an instance that may have a registered formatter
- Returns:
the formatter for the given type
- Return type:
Callable
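The register/lookup pair can be pictured as a type-keyed registry. The following is a simplified, hypothetical stand-in written for illustration only; it is not ubelt's implementation, the class and function names (TinyExtensions, format_number) are invented, and it ignores the str form of key that register also accepts.

```python
class TinyExtensions:
    """Hypothetical, simplified stand-in for a type-keyed formatter registry."""

    def __init__(self):
        # Maps a type (or tuple of types) to a formatting function.
        self._type_registry = {}

    def register(self, key):
        """Return a decorator that stores the decorated function under key."""
        def _decorator(func):
            self._type_registry[key] = func
            return func
        return _decorator

    def lookup(self, data):
        """Return the first formatter whose key matches data, else None."""
        for key, func in self._type_registry.items():
            if isinstance(data, key):
                return func
        return None


extensions = TinyExtensions()


@extensions.register((float, int))
def format_number(data, **kwargs):
    return 'num({})'.format(data)


formatter = extensions.lookup(3.5)
print(formatter(3.5))               # num(3.5)
print(extensions.lookup('text'))    # None
```

Returning the undecorated function from the decorator keeps format_number usable as a plain function after registration.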
- _register_pandas_extensions()
Example
>>> # xdoctest: +REQUIRES(module:pandas)
>>> # xdoctest: +IGNORE_WHITESPACE
>>> import pandas as pd
>>> import numpy as np
>>> import ubelt as ub
>>> rng = np.random.RandomState(0)
>>> data = pd.DataFrame(rng.rand(3, 3))
>>> print(ub.urepr(data))
>>> print(ub.urepr(data, precision=2))
>>> print(ub.urepr({'akeyfdfj': data}, precision=2))
- _register_numpy_extensions()
Example
>>> # xdoctest: +REQUIRES(module:numpy)
>>> import sys
>>> import pytest
>>> import ubelt as ub
>>> if not ub.modname_to_modpath('numpy'):
...     raise pytest.skip()
>>> # xdoctest: +IGNORE_WHITESPACE
>>> import numpy as np
>>> data = np.array([[.2, 42, 5], [21.2, 3, .4]])
>>> print(ub.urepr(data))
np.array([[ 0.2, 42. ,  5. ],
          [21.2,  3. ,  0.4]], dtype=np.float64)
>>> print(ub.urepr(data, with_dtype=False))
np.array([[ 0.2, 42. ,  5. ],
          [21.2,  3. ,  0.4]])
>>> print(ub.urepr(data, strvals=True))
[[ 0.2, 42. ,  5. ],
 [21.2,  3. ,  0.4]]
>>> data = np.empty((0, 10), dtype=np.float64)
>>> print(ub.urepr(data, strvals=False))
np.empty((0, 10), dtype=np.float64)
>>> print(ub.urepr(data, strvals=True))
[]
>>> data = np.ma.empty((0, 10), dtype=np.float64)
>>> print(ub.urepr(data, strvals=False))
np.ma.empty((0, 10), dtype=np.float64)