ubelt.util_cache module

This module exposes Cacher and CacheStamp classes, which provide a simple API for on-disk caching.

The Cacher class is the simplest and most direct method of caching. In fact, it only requires four lines of boilerplate, which is the smallest general and robust way that I (Jon Crall) have achieved, and I don’t think it’s possible to do better. These four lines implement the following necessary and sufficient steps for general robust on-disk caching.

Defining the cache dependencies

Checking if the cache missed

Loading the cache on a hit

Executing the process and saving the result on a miss.

The following example illustrates these four points.

Example

>>> import ubelt as ub
>>> # Define a cache name and dependencies (which is fed to `ub.hash_data`)
>>> cacher = ub.Cacher('name', depends='set-of-deps')  # boilerplate:1
>>> # Calling tryload will return your data on a hit and None on a miss
>>> data = cacher.tryload(on_error='clear')            # boilerplate:2
>>> # Check if you need to recompute your data
>>> if data is None:                                   # boilerplate:3
>>>     # Your code to recompute data goes here (this is not boilerplate).
>>>     data = 'mydata'
>>>     # Cache the computation result (via pickle)
>>>     cacher.save(data)                              # boilerplate:4

Surprisingly, this uses just as many boilerplate lines as a decorator-style cacher, but it is much more extensible. It is possible to use Cacher in more sophisticated ways (e.g. metadata), but the simple in-line use is often easier and cleaner. The following example illustrates this:

Example

>>> import ubelt as ub

>>> @ub.Cacher('name', depends={'dep1': 1, 'dep2': 2})  # boilerplate:1
>>> def func():                                         # boilerplate:2
>>>     data = 'mydata'
>>>     return data                                     # boilerplate:3
>>> data = func()                                       # boilerplate:4

>>> cacher = ub.Cacher('name', depends=['dependencies'])  # boilerplate:1
>>> data = cacher.tryload(on_error='clear')               # boilerplate:2
>>> if data is None:                                      # boilerplate:3
>>>     data = 'mydata'
>>>     cacher.save(data)                                 # boilerplate:4

While the above two are equivalent, the second version provides a simpler traceback, explicit procedures, and makes it easier to use breakpoint debugging (because there is no closure scope).

While Cacher is used to store direct results of in-line code in a pickle format, the CacheStamp object is used to cache processes that produce on-disk side effects other than the main return value. For instance, consider the following example:

Example

>>> import ubelt as ub
>>> def compute_many_files(dpath):
...     for i in range(10):
...         fpath = '{}/file{}.txt'.format(dpath, i)
...         with open(fpath, 'w') as file:
...             file.write('foo' + str(i))
>>> dpath = ub.Path.appdir('ubelt/demo/cache').delete().ensuredir()
>>> # You must specify a directory, unlike in Cacher where it is optional
>>> self = ub.CacheStamp('name', dpath=dpath, depends={'a': 1, 'b': 2})
>>> if self.expired():
>>>     compute_many_files(dpath)
>>>     # Instead of caching the whole process, we just write a file
>>>     # that signals the process has been done.
>>>     self.renew()
>>> assert not self.expired()

The CacheStamp is lightweight in that it simply marks that a process has been completed, but the job of saving / loading the actual data is left to the developer. The expired method checks if the stamp exists, and renew writes the stamp to disk.

In ubelt version 1.1.0, several additional features were added to CacheStamp. In addition to specifying parameters via depends, it is also possible for CacheStamp to determine if an associated file has been modified. To do this, the paths of the files must be known a priori and passed to CacheStamp via the product argument. This allows the CacheStamp to detect if the files have been modified since the renew method was called. It does this by remembering the size, modified time, and checksum of each file. If the expected hash of each product is known in advance, it is also possible to specify it via hash_prefix. In this case, renew will raise an Exception if the specified hash prefix does not match the files on disk. Lastly, it is possible to specify an expiration time via expires, after which the CacheStamp will always be marked as invalid. This is now the mechanism via which the cache in ubelt.util_download.grabdata() works.

Example

>>> import ubelt as ub
>>> dpath = ub.Path.appdir('ubelt/demo/cache').delete().ensuredir()
>>> params = {'a': 1, 'b': 2}
>>> expected_fpaths = [dpath / 'file{}.txt'.format(i) for i in range(2)]
>>> hash_prefix = ['a7a8a91659601590e17191301dc1',
...                '55ae75d991c770d8f3ef07cbfde1']
>>> self = ub.CacheStamp('name', dpath=dpath, depends=params,
>>>                      hash_prefix=hash_prefix, hasher='sha256',
>>>                      product=expected_fpaths, expires='2101-01-01T000000Z')
>>> if self.expired():
>>>     for fpath in expected_fpaths:
...         fpath.write_text(fpath.name)
>>>     self.renew()
>>> # modifying or removing the file will cause the stamp to expire
>>> expected_fpaths[0].write_text('corrupted')
>>> assert self.expired()

RelatedWork:: https://github.com/shaypal5/cachier

class ubelt.util_cache.Cacher(fname, depends=None, dpath=None, appname='ubelt', ext='.pkl', meta=None, verbose=None, enabled=True, log=None, hasher='sha1', protocol=-1, cfgstr=None, backend='auto')

Bases: object

Saves data to disk and reloads it based on specified dependencies.

Cacher uses pickle to save/load data to/from disk. Dependencies of the cached process can be specified, which ensures the cached data is recomputed if the dependencies change. If the location of the cache is not specified, it will default to the system user’s cache directory.

Related:: [JobLibMemory] https://joblib.readthedocs.io/en/stable/memory.html

Example

>>> import ubelt as ub
>>> depends = 'repr-of-params-that-uniquely-determine-the-process'
>>> # Create a cacher and try loading the data
>>> cacher = ub.Cacher('demo_process', depends, verbose=4)
>>> cacher.clear()
>>> print(f'cacher.fpath={cacher.fpath}')
>>> data = cacher.tryload()
>>> if data is None:
>>>     # Put expensive functions in if block when cacher misses
>>>     myvar1 = 'result of expensive process'
>>>     myvar2 = 'another result'
>>>     # Tell the cacher to write at the end of the if block
>>>     # It is idiomatic to put results in an object named data
>>>     data = myvar1, myvar2
>>>     cacher.save(data)
>>> # Last part of the Cacher pattern is to unpack the data object
>>> myvar1, myvar2 = data
>>> #
>>> # If we know the data exists, we can also simply call load
>>> data = cacher.load()

Example

>>> # The previous example can be shortened if only a single value is cached
>>> from ubelt.util_cache import Cacher
>>> depends = 'repr-of-params-that-uniquely-determine-the-process'
>>> # Create a cacher and try loading the data
>>> cacher = Cacher('demo_process', depends)
>>> myvar = cacher.tryload()
>>> if myvar is None:
>>>     myvar = ('result of expensive process', 'another result')
>>>     cacher.save(myvar)
>>> assert cacher.exists(), 'should now exist'

Parameters:

fname (str) – A file name. This is the prefix that will be used by the cache. It will always be used as-is.
depends (str | List[str] | None) – Indicate dependencies of this cache. If the dependencies change, then the cache is recomputed. New in version 0.8.9, replaces cfgstr.
dpath (str | PathLike | None) – Specifies where to save the cache. If unspecified, Cacher defaults to an application cache dir as given by appname. See ub.get_app_cache_dir() for more details.
appname (str) – Application name. Specifies a folder in the application cache directory in which to cache the data if dpath is not specified. Defaults to ‘ubelt’.
ext (str) – File extension for the cache format. Can be '.pkl' or '.json'. Defaults to '.pkl'.
meta (object | None) – Metadata that is also saved with the cfgstr. This can be useful to indicate how the cfgstr was constructed. Note: this is a candidate for deprecation.
verbose (int) – Level of verbosity. Can be 1, 2 or 3. Defaults to 1.
enabled (bool) – If set to False, then the load and save methods will do nothing. Defaults to True.
log (Callable[[str], Any]) – Overloads the print function. Useful for sending output to loggers (e.g. logging.info, tqdm.tqdm.write, …)
hasher (str) – Type of hashing algorithm to use if cfgstr needs to be condensed to less than 49 characters. Defaults to sha1.
protocol (int) – Protocol version used by pickle. Defaults to -1, which selects the latest protocol.
backend (str) – Set to either 'pickle' or 'json' to force backend. Defaults to auto which chooses one based on the extension.
cfgstr (str | None) – Deprecated in favor of depends.

VERBOSE = 1

FORCE_DISABLE = False

_rectify_cfgstr(cfgstr=None)

_condense_cfgstr(cfgstr=None)

property fpath: PathLike

get_fpath(cfgstr=None)

Reports the filepath that the cacher will use.

It will attempt to use ‘{fname}_{cfgstr}{ext}’ unless that is too long. Then cfgstr will be hashed.

Parameters:: cfgstr (str | None) – overrides the instance-level cfgstr
Returns:: str | PathLike
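
The naming rule described above can be sketched with the standard library alone. This is a rough illustration, not ubelt's implementation: the helper name `sketch_cache_fname` is hypothetical, the 49-character threshold is borrowed from the `hasher` parameter description, and the exact condensing rule is an assumption.

```python
import hashlib


def sketch_cache_fname(fname, cfgstr, ext='.pkl', max_len=49, hasher='sha1'):
    """Hypothetical helper (not part of ubelt) that builds a cache file name."""
    if len(cfgstr) > max_len:
        # Assumption: an overly long dependency string is replaced by a
        # fixed-size hex digest so the file name stays short.
        digest = hashlib.new(hasher, cfgstr.encode('utf8')).hexdigest()
        cfgstr = digest[:max_len]
    return f'{fname}_{cfgstr}{ext}'


# A short dependency string is used as-is.
short_name = sketch_cache_fname('test_cacher2', 'cfg1')
# A long dependency string is hashed before it is embedded in the name.
long_name = sketch_cache_fname('test_cacher3', 'cfg1' * 32)
print(short_name)
print(long_name)
```

Because the digest depends only on the dependency string, the same dependencies always map to the same file, while any change in them selects a different file.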

Example

>>> # xdoctest: +REQUIRES(module:pytest)
>>> from ubelt.util_cache import Cacher
>>> import pytest
>>> #with pytest.warns(UserWarning):
>>> if 1:  # we no longer warn here
>>>     cacher = Cacher('test_cacher1')
>>>     cacher.get_fpath()
>>> self = Cacher('test_cacher2', depends='cfg1')
>>> self.get_fpath()
>>> self = Cacher('test_cacher3', depends='cfg1' * 32)
>>> self.get_fpath()

exists(cfgstr=None)

Check to see if the cache exists

Parameters:: cfgstr (str | None) – overrides the instance-level cfgstr
Returns:: bool

existing_versions()

Returns data with different cfgstr values that were previously computed with this cacher.

Yields:: str – paths to cached files corresponding to this cacher

Example

>>> # Ensure that some data exists
>>> import ubelt as ub
>>> dpath = ub.Path.appdir(
>>>     'ubelt/tests/util_cache',
>>>     'test-existing-versions').delete().ensuredir()
>>> cacher = ub.Cacher('versioned_data_v2', depends='1', dpath=dpath)
>>> cacher.ensure(lambda: 'data1')
>>> known_fpaths = set()
>>> known_fpaths.add(cacher.get_fpath())
>>> cacher = ub.Cacher('versioned_data_v2', depends='2', dpath=dpath)
>>> cacher.ensure(lambda: 'data2')
>>> known_fpaths.add(cacher.get_fpath())
>>> # List previously computed configs for this type
>>> from os.path import basename
>>> cacher = ub.Cacher('versioned_data_v2', depends='2', dpath=dpath)
>>> exist_fpaths = set(cacher.existing_versions())
>>> exist_fnames = list(map(basename, exist_fpaths))
>>> print('exist_fnames = {!r}'.format(exist_fnames))
>>> print('exist_fpaths = {!r}'.format(exist_fpaths))
>>> print('known_fpaths={!r}'.format(known_fpaths))
>>> assert exist_fpaths.issubset(known_fpaths)

clear(cfgstr=None)

Removes the saved cache and metadata from disk

Parameters:: cfgstr (str | None) – overrides the instance-level cfgstr

tryload(cfgstr=None, on_error='raise')

Like load, but returns None if the load fails due to a cache miss.

Parameters:

cfgstr (str | None) – overrides the instance-level cfgstr
on_error (str) – How to handle non-IO errors. Either ‘raise’, which re-raises the exception, or ‘clear’, which deletes the cache and returns None. Defaults to ‘raise’.

Returns:

the cached data if it exists, otherwise returns None

Return type:

None | object
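
To make the `on_error` semantics concrete, here is a minimal standard-library sketch. The function `tryload_sketch` is a hypothetical stand-in written for illustration; it only mirrors the behavior documented above (None on a miss, re-raise or delete-and-return-None on a corrupted file) and is not ubelt's code.

```python
import os
import pickle
import tempfile


def tryload_sketch(fpath, on_error='raise'):
    """Hypothetical stand-in for the documented tryload behavior."""
    if not os.path.exists(fpath):
        return None  # cache miss: nothing was saved yet
    try:
        with open(fpath, 'rb') as file:
            return pickle.load(file)
    except Exception:
        if on_error == 'clear':
            # Drop the unreadable cache file and report a miss instead.
            os.remove(fpath)
            return None
        raise


with tempfile.TemporaryDirectory() as dpath:
    fpath = os.path.join(dpath, 'demo.pkl')
    miss = tryload_sketch(fpath)
    with open(fpath, 'wb') as file:
        pickle.dump({'a': 1}, file)
    hit = tryload_sketch(fpath)
    # Overwrite the cache with bytes that are not a valid pickle stream.
    with open(fpath, 'wb') as file:
        file.write(b'\x00not a pickle')
    cleared = tryload_sketch(fpath, on_error='clear')
    removed = not os.path.exists(fpath)
```

With `on_error='clear'` a corrupted cache behaves like a miss, so the usual `if data is None:` branch recomputes and re-saves the result.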

load(cfgstr=None)

Load the cached data and raise an error if something goes wrong.

Parameters:: cfgstr (str | None) – overrides the instance-level cfgstr
Returns:: the cached data
Return type:: object
Raises:: IOError – if the data is unable to be loaded. This could be due to a cache miss or because the cache is disabled.

Example

>>> from ubelt.util_cache import *  # NOQA
>>> # Setting the cacher as enabled=False turns it off
>>> cacher = Cacher('test_disabled_load', '', enabled=True,
>>>                 appname='ubelt/tests/util_cache')
>>> cacher.save('data')
>>> assert cacher.load() == 'data'
>>> cacher.enabled = False
>>> assert cacher.tryload() is None

save(data, cfgstr=None)

Writes data to path specified by self.fpath.

Metadata containing information about the cache will also be appended to an adjacent file with the .meta suffix.

Parameters:

data (object) – arbitrary pickleable object to be cached
cfgstr (str | None) – overrides the instance-level cfgstr

Example

>>> from ubelt.util_cache import *  # NOQA
>>> from os.path import exists
>>> # Normal functioning
>>> depends = 'long-cfg' * 32
>>> cacher = Cacher('test_enabled_save', depends=depends,
>>>                 appname='ubelt/tests/util_cache')
>>> cacher.save('data')
>>> assert exists(cacher.get_fpath()), 'should be enabled'
>>> assert exists(cacher.get_fpath() + '.meta'), 'missing metadata'
>>> # Setting the cacher as enabled=False turns it off
>>> cacher2 = Cacher('test_disabled_save', 'params', enabled=False,
>>>                  appname='ubelt/tests/util_cache')
>>> cacher2.save('data')
>>> assert not exists(cacher2.get_fpath()), 'should be disabled'

_backend_load(data_fpath)

Example

>>> import ubelt as ub
>>> cacher = ub.Cacher('test_other_backend', depends=['a'], ext='.json')
>>> cacher.save(['data'])
>>> cacher.tryload()

>>> import ubelt as ub
>>> cacher = ub.Cacher('test_other_backend2', depends=['a'], ext='.yaml', backend='json')
>>> cacher.save({'data': [1, 2, 3]})
>>> cacher.tryload()

>>> import pytest
>>> with pytest.raises(ValueError):
>>>     ub.Cacher('test_other_backend2', depends=['a'], ext='.yaml', backend='does-not-exist')
>>> cacher = ub.Cacher('test_other_backend2', depends=['a'], ext='.really-a-pickle', backend='auto')
>>> assert cacher.backend == 'pickle', 'should be default'

_backend_dump(data_fpath, data)

ensure(func, *args, **kwargs)

Wraps around a function. A cfgstr must be stored in the base cacher.

Parameters:

func (Callable) – function that will compute data on cache miss
*args – passed to func
**kwargs – passed to func
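
The load-or-compute pattern that ensure wraps can be sketched with pickle alone. The helper `ensure_sketch` and the `expensive` function are hypothetical names introduced for illustration, and this sketch detects a hit by file existence, which is a simplifying assumption rather than how ubelt does it.

```python
import os
import pickle
import tempfile


def ensure_sketch(fpath, func, *args, **kwargs):
    """Hypothetical helper: return cached data, computing it on a miss."""
    if os.path.exists(fpath):
        with open(fpath, 'rb') as file:
            return pickle.load(file)  # hit: skip the computation
    data = func(*args, **kwargs)  # miss: run the expensive function
    with open(fpath, 'wb') as file:
        pickle.dump(data, file)
    return data


calls = []


def expensive(x):
    calls.append(x)  # record each real invocation
    return x * 2


with tempfile.TemporaryDirectory() as dpath:
    fpath = os.path.join(dpath, 'ensure_demo.pkl')
    data1 = ensure_sketch(fpath, expensive, 21)
    data2 = ensure_sketch(fpath, expensive, 21)
```

The second call returns the pickled result without invoking `expensive` again, which is the behavior the doctest below checks with `data1 == data2`.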

Example

>>> from ubelt.util_cache import *  # NOQA
>>> def func():
>>>     return 'expensive result'
>>> fname = 'test_cacher_ensure'
>>> depends = 'func params'
>>> cacher = Cacher(fname, depends=depends)
>>> cacher.clear()
>>> data1 = cacher.ensure(func)
>>> data2 = cacher.ensure(func)
>>> assert data1 == 'expensive result'
>>> assert data1 == data2
>>> cacher.clear()

class ubelt.util_cache.CacheStamp(fname, dpath, cfgstr=None, product=None, hasher='sha1', verbose=None, enabled=True, depends=None, meta=None, hash_prefix=None, expires=None, ext='.pkl')

Bases: object

Quickly determine if a file-producing computation has been done.

Check if the computation needs to be redone by calling expired. If the stamp is not expired, the user can expect that the results exist and could be loaded. If the stamp is expired, the computation should be redone. After the result is updated, the user calls renew, which writes a “stamp” file to disk that marks that the procedure has been done.

There are several ways to control how a stamp expires. At a bare minimum, removing the stamp file will force expiration. However, in this circumstance CacheStamp only knows that something has been done, but it doesn’t have any information about what was done, so in general this is not sufficient.

To achieve more robust expiration behavior, the user should specify the product argument, which is a list of file paths that are expected to exist whenever the stamp is renewed. When this is specified the CacheStamp will expire if any of these products are deleted, their size changes, their modified timestamp changes, or their hash (i.e. checksum) changes. Note that by setting hasher=None, running and verifying checksums can be disabled.
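
The size, modified-time, and checksum comparison can be sketched with the standard library. The helpers `product_info` and `expired_reason` are hypothetical and this is not ubelt's implementation; the reason strings `size_diff`, `mtime_diff`, and `hash_diff` are borrowed from the expired doctest further down, while `missing_product` is an invented label.

```python
import hashlib
import os
import tempfile


def product_info(fpath, hasher='sha256'):
    """Hypothetical helper: record what a stamp would remember about a file."""
    stat = os.stat(fpath)
    info = {'size': stat.st_size, 'mtime_ns': stat.st_mtime_ns}
    if hasher is not None:
        with open(fpath, 'rb') as file:
            info['hash'] = hashlib.new(hasher, file.read()).hexdigest()
    return info


def expired_reason(fpath, recorded, hasher='sha256'):
    """Return False if the file matches the record, else a reason string."""
    if not os.path.exists(fpath):
        return 'missing_product'  # invented label, for illustration only
    current = product_info(fpath, hasher)
    if current['size'] != recorded['size']:
        return 'size_diff'
    if current['mtime_ns'] != recorded['mtime_ns']:
        return 'mtime_diff'
    if hasher is not None and current['hash'] != recorded['hash']:
        return 'hash_diff'
    return False


with tempfile.TemporaryDirectory() as dpath:
    fpath = os.path.join(dpath, 'product.txt')
    with open(fpath, 'w') as file:
        file.write('corrupted')
    recorded = product_info(fpath)
    fresh = expired_reason(fpath, recorded)
    # Rewrite with same-size content and restore the timestamps, so only
    # the checksum can reveal the change.
    stat = os.stat(fpath)
    with open(fpath, 'w') as file:
        file.write('corrApted')
    os.utime(fpath, ns=(stat.st_atime_ns, stat.st_mtime_ns))
    reason = expired_reason(fpath, recorded)
```

The cheap size and mtime checks run first; the checksum is the only check that catches a same-size rewrite with a restored timestamp, which is why disabling the hasher trades safety for speed.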

If the user knows what the hash of the file should be, this can be specified to prevent renewal of the stamp unless it matches the files on disk. This can be useful for security purposes.
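
A hash-prefix check of this kind might look like the following sketch. The helper `check_hash_prefix` is hypothetical; raising RuntimeError mirrors what the expired doctest below expects from renew, but the rest is an assumption about the mechanism rather than ubelt's code.

```python
import hashlib
import os
import tempfile


def check_hash_prefix(fpath, hash_prefix, hasher='sha256'):
    """Hypothetical helper: verify a file digest starts with a known prefix."""
    with open(fpath, 'rb') as file:
        digest = hashlib.new(hasher, file.read()).hexdigest()
    if not digest.startswith(hash_prefix):
        raise RuntimeError(
            f'hash prefix mismatch: expected {hash_prefix!r}, got {digest!r}')
    return digest


with tempfile.TemporaryDirectory() as dpath:
    fpath = os.path.join(dpath, 'product.txt')
    with open(fpath, 'wb') as file:
        file.write(b'hello')
    # Pretend the publisher advertised the first 16 hex digits of the hash.
    known = hashlib.sha256(b'hello').hexdigest()
    verified = check_hash_prefix(fpath, known[:16])
    try:
        check_hash_prefix(fpath, 'not-a-hash')
    except RuntimeError:
        rejected = True
    else:
        rejected = False
```

Checking a prefix rather than the full digest lets a caller pin a file with a short, readable string while still rejecting content that does not match.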

The stamp can also be set to expire at a specified time or after a specified duration using the expires argument.
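
As a rough sketch of how the accepted expires types could be coerced to one absolute time, consider the hypothetical helper `coerce_expires` below. It uses UTC for simplicity, and the string format is assumed from the timestamps that appear in this page's examples (such as '2101-01-01T000000Z'); ubelt's own parsing may accept more.

```python
from datetime import datetime, timedelta, timezone


def coerce_expires(expires, now=None):
    """Hypothetical helper: turn an expires value into an absolute datetime."""
    if expires is None:
        return None  # the stamp never expires on its own
    if now is None:
        now = datetime.now(timezone.utc)
    if isinstance(expires, datetime):
        return expires  # already absolute
    if isinstance(expires, timedelta):
        return now + expires  # offset relative to the renewal time
    if isinstance(expires, int):
        return now + timedelta(seconds=expires)  # int means seconds
    if isinstance(expires, str):
        # Assumed format, taken from the examples in this document.
        parsed = datetime.strptime(expires, '%Y-%m-%dT%H%M%SZ')
        return parsed.replace(tzinfo=timezone.utc)
    raise TypeError(f'unsupported expires type: {type(expires)!r}')


now = datetime(2020, 1, 1, tzinfo=timezone.utc)
from_int = coerce_expires(10, now)
from_delta = coerce_expires(timedelta(seconds=-10), now)
from_str = coerce_expires('2101-01-01T000000Z', now)
```

Resolving offsets against the renewal time means a duration such as 10 seconds always counts from the moment the stamp was last written.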

Notes

The size, mtime, and hash mechanism is similar to how Makefile and redo caches work.

Variables:: cacher (Cacher) – underlying cacher object

Example

>>> import ubelt as ub
>>> # Stamp the computation of expensive-to-compute.txt
>>> dpath = ub.Path.appdir('ubelt/tests/cache-stamp')
>>> dpath.delete().ensuredir()
>>> product = dpath / 'expensive-to-compute.txt'
>>> self = ub.CacheStamp('somedata', depends='someconfig', dpath=dpath,
>>>                      product=product, hasher='sha256')
>>> self.clear()
>>> print(f'self.fpath={self.fpath}')
>>> if self.expired():
>>>     product.write_text('very expensive')
>>>     self.renew()
>>> assert not self.expired()
>>> # corrupting the output will cause the stamp to expire
>>> product.write_text('very corrupted')
>>> assert self.expired()

Parameters:

fname (str) – Name of the stamp file
dpath (str | PathLike | None) – Where to store the cached stamp file
product (str | PathLike | Sequence[str | PathLike] | None) – Path or paths that we expect the computation to produce. If specified the hash of the paths are stored.
hasher (str) – The type of hasher used to compute the file hash of product. If None, then we assume the file has not been corrupted or changed if the mtime and size are the same. Defaults to sha1.
verbose (bool | None) – Passed to internal ubelt.Cacher object. Defaults to None.
enabled (bool) – if False, expired always returns True. Defaults to True.
depends (str | List[str] | None) – Indicate dependencies of this cache. If the dependencies change, then the cache is recomputed. New to CacheStamp in version 0.9.2.
meta (object | None) – Metadata that is also saved as a sidecar file. New to CacheStamp in version 0.9.2. Note: this is a candidate for deprecation.
expires (str | int | datetime.datetime | datetime.timedelta | None) – If specified, sets an expiration date for the certificate. This can be an absolute datetime or a timedelta offset. If specified as an int, this is interpreted as a time delta in seconds. If specified as a str, this is interpreted as an absolute timestamp. Time delta offsets are coerced to absolute times at “renew” time.
hash_prefix (None | str | List[str]) – If specified, we verify that these match the hash(s) of the product(s) in the stamp certificate.
ext (str) – File extension for the cache format. Can be '.pkl' or '.json'. Defaults to '.pkl'.
cfgstr (str | None) – DEPRECATED.

property fpath

clear(): Delete the stamp (the products are untouched)

_get_certificate(cfgstr=None): Returns the stamp certificate if it exists

_rectify_products(product=None)

puts products in a normalized format

Returns:: List[Path]

_rectify_hash_prefixes(): puts hash prefixes in a normalized format

_product_info(product=None): Compute summary info about each product on disk.

_product_file_stats(product=None)

_product_file_hash(product=None)

expired(cfgstr=None, product=None)

Check to see if a previously existing stamp is still valid, if the expected result of that computation still exists, and if all other expiration criteria are met.

Parameters:

cfgstr (Any) – DEPRECATED
product (Any) – DEPRECATED

Returns:

A truthy value if the stamp is invalid, expired, or does not exist. When the stamp is expired, the reason for expiration is returned as a string. If the stamp is still valid, False is returned.

Return type:

bool | str

Example

>>> import ubelt as ub
>>> import time
>>> import os
>>> # Stamp the computation of expensive-to-compute.txt
>>> dpath = ub.Path.appdir('ubelt/tests/cache-stamp-expired')
>>> dpath.delete().ensuredir()
>>> products = [
>>>     dpath / 'product1.txt',
>>>     dpath / 'product2.txt',
>>> ]
>>> self = ub.CacheStamp('myname', depends='myconfig', dpath=dpath,
>>>                      product=products, hasher='sha256',
>>>                      expires=0)
>>> if self.expired():
>>>     for fpath in products:
>>>         fpath.write_text(fpath.name)
>>>     self.renew()
>>> fpath = products[0]
>>> # Because we set the expiration delta to 0, we should already be expired
>>> assert self.expired() == 'expired_cert'
>>> # Disable the expiration date, renew and we should be ok
>>> self.expires = None
>>> self.renew()
>>> assert not self.expired()
>>> # Modify the mtime to cause expiration
>>> orig_atime = fpath.stat().st_atime
>>> orig_mtime = fpath.stat().st_mtime
>>> os.utime(fpath, (orig_atime, orig_mtime + 200))
>>> assert self.expired() == 'mtime_diff'
>>> self.renew()
>>> assert not self.expired()
>>> # rewriting the file will cause the size constraint to fail
>>> # even if we hack the mtime to be the same
>>> orig_atime = fpath.stat().st_atime
>>> orig_mtime = fpath.stat().st_mtime
>>> fpath.write_text('corrupted')
>>> os.utime(fpath, (orig_atime, orig_mtime))
>>> assert self.expired() == 'size_diff'
>>> self.renew()
>>> assert not self.expired()
>>> # Force a situation where the hash is the only thing
>>> # that saves us, write a different file with the same
>>> # size and mtime.
>>> orig_atime = fpath.stat().st_atime
>>> orig_mtime = fpath.stat().st_mtime
>>> fpath.write_text('corrApted')
>>> os.utime(fpath, (orig_atime, orig_mtime))
>>> assert self.expired() == 'hash_diff'
>>> # Test that a wrong hash prefix causes expiration
>>> certificate = self.renew()
>>> self.hash_prefix = certificate['hash']
>>> self.expired()
>>> self.hash_prefix = ['bad', 'hashes']
>>> self.expired()
>>> # A bad hash will not allow us to renew
>>> import pytest
>>> with pytest.raises(RuntimeError):
...     self.renew()

_check_certificate_hashes(certificate)

_expires(now=None)

Returns:: the absolute local time when the stamp expires
Return type:: datetime.datetime

Example

>>> import ubelt as ub
>>> dpath = ub.Path.appdir('ubelt/tests/cache-stamp-expires')
>>> self = ub.CacheStamp('myname', depends='myconfig', dpath=dpath)
>>> # Test str input
>>> self.expires = '2020-01-01T000000Z'
>>> assert self._expires().replace(tzinfo=None).isoformat() == '2020-01-01T00:00:00'
>>> # Test datetime input
>>> dt = ub.timeparse(ub.timestamp())
>>> self.expires = dt
>>> assert self._expires() == dt
>>> # Test None input
>>> self.expires = None
>>> assert self._expires() is None
>>> # Test int input
>>> self.expires = 0
>>> assert self._expires(dt) == dt
>>> self.expires = 10
>>> assert self._expires(dt) > dt
>>> self.expires = -10
>>> assert self._expires(dt) < dt
>>> # Test timedelta input
>>> import datetime as datetime_mod
>>> self.expires = datetime_mod.timedelta(seconds=-10)
>>> assert self._expires(dt) == dt + self.expires

_new_certificate(cfgstr=None, product=None)

Returns:: certificate information
Return type:: dict

Example

>>> import ubelt as ub
>>> # Stamp the computation of expensive-to-compute.txt
>>> dpath = ub.Path.appdir('ubelt/tests/cache-stamp-cert').ensuredir()
>>> product = dpath / 'product1.txt'
>>> product.write_text('hi')
>>> self = ub.CacheStamp('myname', depends='myconfig', dpath=dpath,
>>>                      product=product)
>>> cert = self._new_certificate()
>>> assert cert['expires'] is None
>>> self.expires = '2020-01-01T000000'
>>> self.renew()
>>> cert = self._new_certificate()
>>> assert cert['expires'] is not None

renew(cfgstr=None, product=None)

Recertify that the product has been recomputed by writing a new certificate to disk.

Parameters:

cfgstr (None | str) – deprecated, do not use.
product (None | str | List) – deprecated, do not use.

Returns:

certificate information if enabled otherwise None.

Return type:

None | dict

Example

>>> # Test that renew does nothing when the cacher is disabled
>>> import ubelt as ub
>>> dpath = ub.Path.appdir('ubelt/tests/cache-stamp-renew').ensuredir()
>>> self = ub.CacheStamp('foo', dpath=dpath, enabled=False)
>>> assert self.renew() is None

ubelt.util_cache._localnow()

ubelt.util_cache._byte_str(num, unit='auto', precision=2)

Automatically chooses the relevant unit (KB, MB, GB, or TB) for displaying some number of bytes.

Parameters:

num (int) – number of bytes
unit (str) – which unit to use, can be auto, B, KB, MB, GB, or TB

References

[WikiOrdersOfMag]

https://en.wikipedia.org/wiki/Orders_of_magnitude_(data)

Returns:: string representing the number of bytes with appropriate units
Return type:: str
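
The unit selection can be approximated as follows. This is a sketch, not ubelt's source: the helper name `byte_str_sketch` is hypothetical, and the assumptions that units are powers of 1024 and that KB is the smallest automatic unit are inferred from the documented example output.

```python
def byte_str_sketch(num, unit='auto', precision=2):
    """Hypothetical re-creation of the documented formatting behavior."""
    # Assumption: units are binary multiples (1KB = 1024 bytes).
    factors = {'B': 1, 'KB': 2 ** 10, 'MB': 2 ** 20, 'GB': 2 ** 30, 'TB': 2 ** 40}
    if unit == 'auto':
        magnitude = abs(num)
        # Assumption: KB is the smallest unit chosen automatically.
        if magnitude < 2 ** 20:
            unit = 'KB'
        elif magnitude < 2 ** 30:
            unit = 'MB'
        elif magnitude < 2 ** 40:
            unit = 'GB'
        else:
            unit = 'TB'
    if unit not in factors:
        raise ValueError(f'unknown unit: {unit!r}')
    value = num / factors[unit]
    return f'{value:.{precision}f}{unit}'


sizes = [1, 100, 1024, 1048576, 1073741824, 1099511627776]
formatted = [byte_str_sketch(n) for n in sizes]
print(formatted)
```

Passing an explicit unit such as `'B'` bypasses the automatic choice, which is useful when values must be compared in a fixed scale.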

Example

>>> from ubelt.util_cache import _byte_str
>>> import ubelt as ub
>>> num_list = [1, 100, 1024,  1048576, 1073741824, 1099511627776]
>>> result = ub.urepr(list(map(_byte_str, num_list)), nl=0)
>>> print(result)
['0.00KB', '0.10KB', '1.00KB', '1.00MB', '1.00GB', '1.00TB']
>>> _byte_str(10, unit='B')
10.00B