law.util#

Helpful utility functions.

Data:

no_value

Unique dummy value that is used to denote missing values and always evaluates to False.

Functions:

`rel_path`(anchor, *paths)	Returns a path made of fragment paths relative to an anchor path.
`law_src_path`(*paths)	Returns the law installation directory, optionally joined with paths.
`law_home_path`(*paths)	Returns the law home directory, optionally joined with paths.
`law_run`(argv, **kwargs)	Runs a task with certain parameters as defined in argv, which can be a string or a list of strings.
`print_err`(*args[, flush])	Same as print, but outputs to stderr.
`abort`([msg, exitcode, color])	Aborts the process (sys.exit) with an exitcode.
`import_file`(path[, attr])	Loads the content of a python file located at path and returns its package content as a dictionary.
`get_terminal_width`([fallback])	Returns the terminal width when possible, and None otherwise.
`is_classmethod`(func[, cls])	Returns True if func is a classmethod of cls, and False otherwise.
`is_number`(n)	Returns True if n is a number, i.e., integer or float, and in particular no boolean.
`is_float`(v)	Takes any value v and tries to convert it to a float.
`try_int`(n)	Takes a number n and tries to convert it to an integer.
`round_discrete`(n[, base, round_fn])	Rounds a number n to a discrete base.
`str_to_int`(s)	Converts a string s into an integer under consideration of binary, octal, decimal and hexadecimal representations, such as `"0o0660"`.
`flag_to_bool`(s[, silent])	Takes a string flag s and returns whether it evaluates to True (values `"1"`, `"true"`, `"yes"`, `"y"`, `"on"`, case-insensitive) or False (values `"0"`, `"false"`, `"no"`, `"n"`, `"off"`, case-insensitive).
`empty_context`()	Yields an empty context that can be used in case of dynamically choosing context managers while maintaining code structure.
`common_task_params`(task_instance, task_cls)	Returns the parameters that are common between a task_instance and a task_cls in a dictionary with values taken directly from the task instance.
`colored`(msg[, color, background, style, force])	Return the colored version of a string msg.
`uncolored`(s)	Removes all color codes from a string s and returns it.
`query_choice`(msg, choices[, default, ...])	Interactively query a choice from the prompt until the input matches one of the choices.
`is_pattern`(s)	Returns True if the string s represents a pattern, i.e., if it contains characters such as `"*"` or `"?"`.
`brace_expand`(s[, split_csv, escape_csv_sep])	Expands brace statements in a string s and returns a list containing all possible string combinations.
`range_expand`(s[, include_end, min_value, ...])	Takes a string, or a sequence of strings in the format `"1:3"`, or a tuple or a sequence of tuples containing start and stop values of a range and returns a list of all intermediate values.
`range_join`(numbers[, to_str, include_end, ...])	Takes a sequence of positive integer numbers given either as integer or string types, and returns a sequence of 1- and 2-tuples, denoting either single numbers or start and end values of possible ranges.
`multi_match`(name, patterns[, mode, regex])	Compares name to multiple patterns and returns True in case of at least one match (mode = any, the default), or in case all patterns match (mode = all).
`is_iterable`(obj)	Returns True when an object obj is iterable and False otherwise.
`is_lazy_iterable`(obj)	Returns whether obj is iterable lazily, such as generators, range objects, maps, etc.
`make_list`(obj[, cast])	Converts an object obj to a list and returns it.
`make_tuple`(obj[, cast])	Converts an object obj to a tuple and returns it.
`make_set`(obj[, cast])	Converts an object obj to a set and returns it.
`make_unique`(obj)	Takes a list or tuple obj, removes duplicate elements in order of their appearance and returns the sequence of remaining, unique elements.
`is_nested`(obj)	Takes a list or tuple obj and checks whether it only contains items of types list and tuple.
`flatten`(*structs[, flatten_dict, ...])	Takes one or multiple complex structured objects structs, flattens them, and returns a single list.
`merge_dicts`(*dicts[, inplace, cls, deep])	Takes multiple dicts and returns a single merged dict.
`unzip`(struct[, fill_none])	Unzips a struct consisting of sequences with equal lengths and returns lists with 1st, 2nd, etc elements.
`which`(prog)	Pythonic `which` implementation.
`map_verbose`(func, seq[, msg, every, start, ...])	Same as the built-in map function but prints a msg after chunks of size every iterations.
`map_struct`(func, struct[, map_dict, ...])	Applies a function func to each value of a complex structured object struct and returns the output in the same structure.
`mask_struct`(mask, struct[, replace, ...])	Masks a complex structured object struct with a mask and returns the remaining values.
`tmp_file`(*args, **kwargs)	Context manager that creates an empty, temporary file, yields the file descriptor number and temporary path, and eventually removes it.
`perf_counter`()	Returns `time.perf_counter()` for python 3 and `time.time()` for python 2.
`interruptable_popen`(*args, **kwargs)	Shorthand to `Popen` followed by `Popen.communicate()` which can be interrupted by KeyboardInterrupt.
`readable_popen`(*args, **kwargs)	Creates a `Popen` object and a generator function yielding the output line-by-line as it comes in.
`create_hash`(inp[, l, algo, to_int])	Takes an arbitrary input inp and creates a hexadecimal string hash based on an algorithm algo.
`create_random_string`([prefix, l])	Creates and returns a random string consisting of l characters using a uuid4 hash.
`copy_no_perm`(src, dst)	Copies a file from src to dst including meta data except for permission bits.
`makedirs`(path[, perm])	Recursively creates directories up to path.
`user_owns_file`(path[, uid])	Returns whether a file located at path is owned by the user with uid.
`iter_chunks`(l, size)	Returns a generator that yields chunks of length size from a list, integer or generator l.
`human_bytes`(n[, unit, fmt])	Takes a number of bytes n, assigns the best matching unit and returns the respective number and unit string in a tuple.
`parse_bytes`(s[, input_unit, unit])	Takes a string s, interprets it as a size with an optional unit, and returns a float that represents that size in a given unit.
`human_duration`([colon_format, plural])	Returns a human readable duration.
`parse_duration`(s[, input_unit, unit])	Takes a string s, interprets it as a duration with an optional unit, and returns a float that represents that duration in a given unit.
`is_file_exists_error`(e)	Returns whether the exception e was raised due to an already existing file or directory.
`send_mail`(recipient, sender[, subject, ...])	Lightweight mail functionality.
`open_compat`(path, *args, **kwargs)	Polyfill for python's `open` factory, returning the plain `open` in python 3, and `io.open` in python 2 with a patched `write` method that internally handles unicode conversion of its first argument.
`patch_object`(obj, attr, value[, reset, ...])	Context manager that temporarily patches an object obj by replacing its attribute attr with value.
`join_generators`(*generators[, on_error])	Joins multiple generators and returns a single generator for simplified iteration.
`quote_cmd`(cmd)	Takes a shell command cmd given as a list and returns a single string representation of that command with proper quoting.
`escape_markdown`(s)	Escapes all characters in a string s that could be confused for markdown formatting strings and returns it.
`classproperty`(func)	Property decorator for class-level methods.

Classes:

`DotDict`	Subclass of OrderedDict that provides read access for items via attributes by implementing `__getattr__`.
`ShorthandDict`(**kwargs)	Subclass of OrderedDict that implements `__getattr__` and `__setattr__` for a configurable list of attributes.
`TeeStream`(*consumers[, mode])	Multi-stream object that forwards calls to `write()` and `flush()` to all registered consumer streams.
`FilteredStream`(stream, filter_fn, **kwargs)	Stream object that accepts an input stream and a function filter_fn which is called upon every call to `write()`.

no_value = law.util.no_value#: Unique dummy value that is used to denote missing values and always evaluates to False.

rel_path(anchor, *paths)[source]#: Returns a path made of fragment paths relative to an anchor path. When anchor is a file, its absolute directory is used instead.

law_src_path(*paths)[source]#: Returns the law installation directory, optionally joined with paths.

law_home_path(*paths)[source]#: Returns the law home directory, optionally joined with paths.

law_run(argv, **kwargs)[source]#

Runs a task with certain parameters as defined in argv, which can be a string or a list of strings. It must start with the family of the task to run, followed by the desired parameters. All kwargs are forwarded to luigi.interface.run(). Example:

law_run(["MyTask", "--param", "value"])
law_run("MyTask --param value")

print_err(*args, flush=False)[source]#: Same as print, but outputs to stderr. If flush is True, stderr is flushed after printing.

abort(msg=None, exitcode=1, color=True)[source]#: Aborts the process (sys.exit) with an exitcode. If msg is not None, it is printed first to stdout if exitcode is 0 or None, and to stderr otherwise. When color is True and exitcode is not 0 or None, the message is printed in red.

import_file(path, attr=None)[source]#

Loads the content of a python file located at path and returns its package content as a dictionary. When attr is set, only the attribute with that name is returned.

The file is not required to be importable as its content is loaded directly into the interpreter. While this approach is not necessarily clean, it can be useful in places where custom code must be loaded.

get_terminal_width(fallback=False)[source]#: Returns the terminal width when possible, and None otherwise. By default, the width is obtained through os.get_terminal_size, querying sys.__stdout__, which might fail in case no valid output device is connected. However, when fallback is True, shutil.get_terminal_size is used instead, which prioritizes the COLUMNS variable if set.

is_classmethod(func, cls=None)[source]#: Returns True if func is a classmethod of cls, and False otherwise. When cls is None, it is extracted from the function’s qualified name and module name.

is_number(n)[source]#: Returns True if n is a number, i.e., integer or float, and in particular no boolean.

is_float(v)[source]#: Takes any value v and tries to convert it to a float. Returns True on success, and False otherwise.

try_int(n)[source]#: Takes a number n and tries to convert it to an integer. When n has no decimals, an integer is returned with the same value as n. Otherwise, a float is returned.

round_discrete(n, base=1.0, round_fn='round')[source]#

Rounds a number n to a discrete base. round_fn can be a function used for rounding and defaults to the built-in round function. It also accepts string values "round", "floor" and "ceil" which are resolved to the corresponding math functions. Example:

round_discrete(17, 5)
# -> 15.0

round_discrete(17, 2.5)
# -> 17.5

round_discrete(17, 2.5, math.floor)
round_discrete(17, 2.5, "floor")
# -> 15.0
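
The rounding rule can be pictured as dividing by the base, rounding the quotient, and scaling back. The following is a minimal sketch under that assumption, not law's actual implementation; the name `round_discrete_sketch` is hypothetical.

```python
import math


def round_discrete_sketch(n, base=1.0, round_fn="round"):
    """Hypothetical re-implementation, for illustration only."""
    # resolve the string shortcuts to the corresponding functions
    if isinstance(round_fn, str):
        round_fn = {"round": round, "floor": math.floor, "ceil": math.ceil}[round_fn]
    # round the quotient, then scale back onto the grid defined by base
    return float(base) * round_fn(n / float(base))


print(round_discrete_sketch(17, 5))
print(round_discrete_sketch(17, 2.5, "floor"))
```

Converting the base to a float is a choice made here so that the result is always a float, as in the examples above.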

str_to_int(s)[source]#: Converts a string s into an integer under consideration of binary, octal, decimal and hexadecimal representations, such as "0o0660".
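
For orientation, Python's built-in `int` already infers the base from a literal prefix when called with base 0. Whether law relies on this internally is an assumption; the snippet only illustrates the kinds of strings the description refers to.

```python
# base 0 lets int() infer the base from the prefix of the literal
literals = ["0b1010", "0o0660", "42", "0x1F"]
values = {s: int(s, 0) for s in literals}

for literal, value in values.items():
    print(literal, "->", value)
```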

flag_to_bool(s, silent=False)[source]#: Takes a string flag s and returns whether it evaluates to True (values "1", "true", "yes", "y", "on", case-insensitive) or False (values "0", "false", "no", "n", "off", case-insensitive). When s is already a boolean, it is returned unchanged. An error is thrown when s is not one of the allowed values and silent is False. Otherwise, None is returned.
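
The decision logic described above can be sketched as follows. This is a hypothetical stand-in (`flag_to_bool_sketch`), not law's code, and the use of ValueError is an assumption since the text only states that an error is thrown.

```python
TRUE_FLAGS = {"1", "true", "yes", "y", "on"}
FALSE_FLAGS = {"0", "false", "no", "n", "off"}


def flag_to_bool_sketch(s, silent=False):
    # booleans pass through unchanged
    if isinstance(s, bool):
        return s
    # compare case-insensitively against the allowed values
    flag = str(s).lower()
    if flag in TRUE_FLAGS:
        return True
    if flag in FALSE_FLAGS:
        return False
    if silent:
        return None
    # the exception type is an assumption of this sketch
    raise ValueError("cannot convert {!r} to bool".format(s))


print(flag_to_bool_sketch("YES"), flag_to_bool_sketch("off"))
```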

empty_context()[source]#: Yields an empty context that can be used in case of dynamically choosing context managers while maintaining code structure.
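
A typical use is to pick a context manager at runtime while keeping a single `with` block. The sketch below uses a hypothetical stand-in, `empty_context_sketch`, built with contextlib; the `compute` function and its `quiet` flag are made up for the example.

```python
import contextlib
import io


@contextlib.contextmanager
def empty_context_sketch():
    # does nothing on enter and on exit
    yield


def compute(quiet):
    # choose the context dynamically, the code structure stays the same
    ctx = contextlib.redirect_stdout(io.StringIO()) if quiet else empty_context_sketch()
    with ctx:
        print("working ...")
        result = 6 * 7
    return result


print(compute(quiet=True))
```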

common_task_params(task_instance, task_cls)[source]#: Returns the parameters that are common between a task_instance and a task_cls in a dictionary with values taken directly from the task instance. The difference with respect to luigi.util.common_params is that the values are not parsed using the parameter objects of the task class, which might be faster for some purposes.

colored(msg, color=None, background=None, style=None, force=False)[source]#: Return the colored version of a string msg. For color, background and style options, see https://misc.flogisoft.com/bash/tip_colors_and_formatting. They can also be explicitly set to "random" to get a random value. Unless force is True, the msg string is returned unchanged in case the output is neither a tty nor an IPython output stream.

uncolored(s)[source]#: Removes all color codes from a string s and returns it.

query_choice(msg, choices, default=None, descriptions=None, lower=True)[source]#: Interactively query a choice from the prompt until the input matches one of the choices. The prompt can be configured using msg and descriptions, which, if set, must have the same length as choices. When default is not None it must be one of the choices and is used when the input is empty. When lower is True, the input is compared to the choices in lower case.

is_pattern(s)[source]#: Returns True if the string s represents a pattern, i.e., if it contains characters such as "*" or "?".

brace_expand(s, split_csv=False, escape_csv_sep=True)[source]#

Expands brace statements in a string s and returns a list containing all possible string combinations. When split_csv is True, the input string is split by all comma characters located outside braces, except for escaped ones when escape_csv_sep is True, and the expansion is performed sequentially on all elements. Example:

brace_expand("A{1,2}B")
# -> ["A1B", "A2B"]

brace_expand("A{1,2}B{3,4}C")
# -> ["A1B3C", "A1B4C", "A2B3C", "A2B4C"]

brace_expand("A{1,2}B,C{3,4}D")
# note the full 2x2 expansion
# -> ["A1B,C3D", "A1B,C4D", "A2B,C3D", "A2B,C4D"]

brace_expand("A{1,2}B,C{3,4}D", split_csv=True)
# note the 2+2 sequential expansion
# -> ["A1B", "A2B", "C3D", "C4D"]

brace_expand("A{1,2}B,C{3}D", split_csv=True)
# note the 2+1 sequential expansion
# -> ["A1B", "A2B", "C3D"]
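
The full expansion of non-nested braces amounts to a Cartesian product over the alternatives of each brace group. The sketch below illustrates only that part; it is a hypothetical simplification (`brace_expand_sketch`) that does not handle nested braces, escaping, or the split_csv option.

```python
import itertools
import re


def brace_expand_sketch(s):
    # splitting on a capturing group keeps the brace contents at odd indices
    parts = re.split(r"\{([^{}]*)\}", s)
    # literal parts have one choice, brace parts one choice per comma-separated item
    choices = [p.split(",") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combination) for combination in itertools.product(*choices)]


print(brace_expand_sketch("A{1,2}B{3,4}C"))
```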

range_expand(s, include_end=False, min_value=None, max_value=None, sep=':')[source]#

Takes a string, or a sequence of strings in the format "1:3", or a tuple or a sequence of tuples containing start and stop values of a range and returns a list of all intermediate values. When include_end is True, the end value is included.

One sided range expressions such as ":4" or "4:" for strings and (None, 4) or (4, None) for tuples are also expanded but they require min_value and max_value to be set (an exception is raised otherwise), with max_value being either included or not, depending on include_end.

Also, when a min_value (max_value) is set, the minimum (maximum) of the expanded range is limited to this value.

Example:

range_expand("5:8")
# -> [5, 6, 7]

range_expand((6, 9))
# -> [6, 7, 8]

range_expand("5:8", include_end=True)
# -> [5, 6, 7, 8]

range_expand(["5:8", "10"])
# -> [5, 6, 7, 10]

range_expand(["5:8", "10:"])
# -> Exception, no max_value set

range_expand(["5:8", "10:"], max_value=12)
# -> [5, 6, 7, 10, 11]

range_expand(["5:8", "10:"], max_value=12, include_end=True)
# -> [5, 6, 7, 8, 10, 11, 12]
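
To make the handling of one-sided expressions concrete, here is a minimal sketch for string inputs only, using the default separator ":". `range_expand_sketch` is a hypothetical helper, it ignores tuples and the clamping described above, and raising ValueError is an assumption.

```python
def range_expand_sketch(items, include_end=False, min_value=None, max_value=None, sep=":"):
    if isinstance(items, str):
        items = [items]
    numbers = []
    for item in items:
        # plain numbers are taken as they are
        if sep not in item:
            numbers.append(int(item))
            continue
        start, stop = item.split(sep, 1)
        # one-sided expressions need the corresponding limit
        if (not start and min_value is None) or (not stop and max_value is None):
            raise ValueError("one-sided range {!r} requires min_value / max_value".format(item))
        first = int(start) if start else min_value
        last = int(stop) if stop else max_value
        numbers.extend(range(first, last + 1 if include_end else last))
    return numbers


print(range_expand_sketch(["5:8", "10:"], max_value=12))
```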

range_join(numbers, to_str=False, include_end=False, sep=',', range_sep=':')[source]#

Takes a sequence of positive integer numbers given either as integer or string types, and returns a sequence of 1- and 2-tuples, denoting either single numbers or start and end values of possible ranges. Unless include_end is True, end values are not included. When to_str is True, a string is returned in a format consistent with range_expand(), with ranges constructed by range_sep and merged with sep. Example:

range_join([1, 2, 3, 5])
# -> [(1, 4), (5,)]

range_join([1, 2, 3, 5], include_end=True)
# -> [(1, 3), (5,)]

range_join([1, 2, 3, 5, 7, 8, 9])
# -> [(1, 4), (5,), (7, 10)]

range_join([1, 2, 3, 5, 7, 8, 9], to_str=True)
# -> "1:4,5,7:10"
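
The grouping into runs of consecutive numbers can be sketched as below. This is an illustrative re-implementation under the assumption that inputs are sorted and de-duplicated first; `range_join_sketch` is a hypothetical name and the to_str option is left out.

```python
def range_join_sketch(numbers, include_end=False):
    runs = []
    for n in sorted(set(int(n) for n in numbers)):
        if runs and n == runs[-1][1] + 1:
            # n continues the current run, so move its (inclusive) end
            runs[-1][1] = n
        else:
            # n starts a new run
            runs.append([n, n])
    # exclusive ends are one past the last number of a run
    offset = 0 if include_end else 1
    return [(a,) if a == b else (a, b + offset) for a, b in runs]


print(range_join_sketch([1, 2, 3, 5, 7, 8, 9]))
```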

multi_match(name, patterns, mode=any, regex=False)[source]#: Compares name to multiple patterns and returns True in case of at least one match (mode = any, the default), or in case all patterns match (mode = all). Otherwise, False is returned. When regex is True, re.match is used instead of fnmatch.fnmatch.
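
Since mode is one of the built-ins any or all, the behavior can be sketched by feeding it a generator of per-pattern results. `multi_match_sketch` is a hypothetical stand-in based only on the description above.

```python
import fnmatch
import re


def multi_match_sketch(name, patterns, mode=any, regex=False):
    if regex:
        # re.match anchors at the start of name
        return mode(re.match(p, name) is not None for p in patterns)
    # shell-style wildcard matching
    return mode(fnmatch.fnmatch(name, p) for p in patterns)


print(multi_match_sketch("foo123", ["bar*", "foo*"]))
print(multi_match_sketch("foo123", ["bar*", "foo*"], mode=all))
```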

is_iterable(obj)[source]#: Returns True when an object obj is iterable and False otherwise.

is_lazy_iterable(obj)[source]#: Returns whether obj is iterable lazily, such as generators, range objects, maps, etc.

make_list(obj, cast=True)[source]#: Converts an object obj to a list and returns it. Objects of types tuple and set are converted if cast is True. Otherwise, and for all other types, obj is put in a new list.

make_tuple(obj, cast=True)[source]#: Converts an object obj to a tuple and returns it. Objects of types list and set are converted if cast is True. Otherwise, and for all other types, obj is put in a new tuple.

make_set(obj, cast=True)[source]#: Converts an object obj to a set and returns it. Objects of types list and tuple are converted if cast is True. Otherwise, and for all other types, obj is put in a new set.

make_unique(obj)[source]#: Takes a list or tuple obj, removes duplicate elements in order of their appearance and returns the sequence of remaining, unique elements. The sequence type is preserved. When obj is neither a list nor a tuple, but iterable, a list is returned. Otherwise, a TypeError is raised.

is_nested(obj)[source]#: Takes a list or tuple obj and checks whether it only contains items of types list and tuple.

flatten(*structs, flatten_dict=True, flatten_list=True, flatten_tuple=True, flatten_set=True)[source]#: Takes one or multiple complex structured objects structs, flattens them, and returns a single list. flatten_dict, flatten_list, flatten_tuple and flatten_set configure if objects of the respective types are flattened (the default). If not, they are returned unchanged.

merge_dicts(*dicts, inplace=False, cls=None, deep=False)[source]#

Takes multiple dicts and returns a single merged dict. The merging takes place in order of the passed dicts and therefore, values of rear objects have precedence in case of field collisions.

By default, a new dictionary is returned. However, when inplace is True, all update operations are performed inplace on the first object in dicts.

When not inplace, the class of the returned merged dict is configurable via cls. If it is None, the class is inferred from the first dict object in dicts.

When deep is True, dictionary types within the dictionaries to merge are updated recursively such that their fields are merged. This is only possible when input dictionaries have a similar structure. Example:

merge_dicts({"foo": 1, "bar": {"a": 1, "b": 2}}, {"bar": {"c": 3}})
# -> {"foo": 1, "bar": {"c": 3}}  # fully replaced "bar"

merge_dicts({"foo": 1, "bar": {"a": 1, "b": 2}}, {"bar": {"c": 3}}, deep=True)
# -> {"foo": 1, "bar": {"a": 1, "b": 2, "c": 3}}  # inserted entry bar.c

merge_dicts({"foo": 1, "bar": {"a": 1, "b": 2}}, {"bar": 2}, deep=True)
# -> {"foo": 1, "bar": 2}  # "bar" has a different type, so this just uses the rear value
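
The difference between shallow and deep merging can be sketched with a short recursive function. This is an illustration only (`merge_dicts_sketch` is hypothetical) and it omits the inplace and cls options.

```python
def merge_dicts_sketch(*dicts, deep=False):
    merged = {}
    for d in dicts:
        for key, value in d.items():
            if deep and isinstance(value, dict) and isinstance(merged.get(key), dict):
                # both sides are dictionaries, so merge their fields recursively
                merged[key] = merge_dicts_sketch(merged[key], value, deep=True)
            else:
                # rear values take precedence on collisions
                merged[key] = value
    return merged


first = {"foo": 1, "bar": {"a": 1, "b": 2}}
print(merge_dicts_sketch(first, {"bar": {"c": 3}}, deep=True))
```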

unzip(struct, fill_none=False)[source]#

Unzips a struct consisting of sequences with equal lengths and returns lists with 1st, 2nd, etc elements. This function can be thought of as the opposite of the zip builtin.

The number of elements per returned list is determined by the length of the first sequence in struct. If a sequence contains fewer items, an exception is raised. However, if fill_none is True, None is inserted instead.

unzip([(1, 2), (3, 4)])
# -> ([1, 3], [2, 4])

unzip([(1, 2), (3,)])
# -> ValueError

unzip([(1, 2), (3,)], fill_none=True)
# -> ([1, 3], [2, None])
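
A minimal sketch of this behavior, assuming that the first sequence fixes the number of returned lists and that surplus items in later sequences are ignored (the latter is an assumption of the sketch, as is the name `unzip_sketch`):

```python
def unzip_sketch(struct, fill_none=False):
    struct = list(struct)
    # the first sequence defines how many lists are returned
    n = len(struct[0]) if struct else 0
    columns = tuple([] for _ in range(n))
    for seq in struct:
        if len(seq) < n and not fill_none:
            raise ValueError("sequence {!r} has fewer than {} items".format(seq, n))
        for i in range(n):
            # pad short sequences with None
            columns[i].append(seq[i] if i < len(seq) else None)
    return columns


print(unzip_sketch([(1, 2), (3,)], fill_none=True))
```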

which(prog)[source]#: Pythonic which implementation. Returns the path to an executable prog by searching in PATH, or None when it could not be found.

map_verbose(func, seq, msg='{}', every=25, start=True, end=True, offset=0, callback=None)[source]#

Same as the built-in map function but prints a msg after chunks of size every iterations. When start (end) is True, the msg is also printed after the first (last) iteration. Note that msg is supposed to be a template string that will be formatted with the current iteration number (starting at 0) plus offset using str.format. When callback is callable, it is invoked instead of the default print method with the current iteration number (without offset) as the only argument. Example:

func = lambda x: x ** 2
msg = "computing square of {}"
squares = map_verbose(func, range(7), msg, every=3)
# ->
# computing square of 0
# computing square of 2
# computing square of 5
# computing square of 6

map_struct(func, struct, map_dict=True, map_list=True, map_tuple=False, map_set=False, cls=None, custom_mappings=None)[source]#

Applies a function func to each value of a complex structured object struct and returns the output in the same structure. Example:

struct = {"foo": [123, 456], "bar": [{"1": 1}, {"2": 2}]}
def times_two(i):
    return i * 2

map_struct(times_two, struct)
# -> {"foo": [246, 912], "bar": [{"1": 2}, {"2": 4}]}

map_dict, map_list, map_tuple and map_set configure if objects of the respective types are traversed or mapped as a whole. They can be booleans or integer values defining the depth of that setting in the struct. When cls is not None, it exclusively defines the class of objects that func is applied on. All other objects are unchanged. custom_mappings can be a dictionary that maps custom types to custom object traversal methods. The following example would traverse lists backwards:

def traverse_lists(func, l, **kwargs):
    return [map_struct(func, v, **kwargs) for v in l[::-1]]

map_struct(times_two, struct, custom_mappings={list: traverse_lists})
# -> {"foo": [912, 246], "bar": [{"1": 2}, {"2": 4}]}

mask_struct(mask, struct, replace=law.util.no_value, keep_missing=True, convert_types=None)[source]#

Masks a complex structured object struct with a mask and returns the remaining values. When replace is set, masked values are replaced with that value instead of being removed. The mask can have a complex structure as well.

In case an item in struct is not matched by a value in mask, the item is kept unless keep_missing is False, in which case unmatched items are removed.

convert_types can be a dictionary containing conversion functions mapped to types (or tuples thereof) that are applied to objects during the struct traversal if their types match.

Examples:

struct = {"a": [1, 2], "b": [3, ["foo", "bar"]]}

# simple example
mask_struct({"a": [False, True], "b": False}, struct)
# => {"a": [2]}

# omitting mask information results in kept values
mask_struct({"a": [False, True]}, struct)
# => {"a": [2], "b": [3, ["foo", "bar"]]}

tmp_file(*args, **kwargs)[source]#: Context manager that creates an empty, temporary file, yields the file descriptor number and temporary path, and eventually removes it. All args and kwargs are passed to tempfile.mkstemp(). The behavior of this function is similar to tempfile.NamedTemporaryFile which, however, yields an already opened file object.
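
The pattern described here can be sketched with contextlib and tempfile.mkstemp(). `tmp_file_sketch` is a hypothetical stand-in; closing the descriptor on exit is a choice of this sketch and not something the description states.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def tmp_file_sketch(*args, **kwargs):
    # mkstemp creates the file and returns an OS-level descriptor and the path
    fileno, path = tempfile.mkstemp(*args, **kwargs)
    try:
        yield fileno, path
    finally:
        # closing the descriptor here is an assumption of this sketch
        try:
            os.close(fileno)
        except OSError:
            pass
        if os.path.exists(path):
            os.remove(path)


with tmp_file_sketch(suffix=".txt") as (fileno, path):
    # the path can be opened like any other file
    with open(path, "w") as f:
        f.write("hello")
    size_inside = os.path.getsize(path)

print(size_inside, os.path.exists(path))
```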

perf_counter()[source]#: Returns time.perf_counter() for python 3 and time.time() for python 2.

interruptable_popen(*args, **kwargs)[source]#

interruptable_popen(args, stdin_callback=None, stdin_delay=0, interrupt_callback=None, kill_timeout=None, **kwargs)

Shorthand to Popen followed by Popen.communicate() which can be interrupted by KeyboardInterrupt. The return code, standard output and standard error are returned in a 3-tuple.

stdin_callback can be a function accepting no arguments and whose return value is passed to communicate after a delay of stdin_delay to feed data input to the subprocess.

interrupt_callback can be a function, accepting the process instance as an argument, that is called immediately after a KeyboardInterrupt occurs. After that, a SIGTERM signal is sent to the subprocess to allow it to shut down gracefully.

When kill_timeout is set, and the process is still alive after that period (in seconds), a SIGKILL signal is sent to force the process termination.

All other args and kwargs are forwarded to the Popen constructor.

readable_popen(*args, **kwargs)[source]#

Creates a Popen object and a generator function yielding the output line-by-line as it comes in. All args and kwargs are forwarded to the Popen constructor. Example:

# create the popen object and line generator
p, lines = readable_popen(["some_executable", "--args"])

# loop through output lines as they come in
for line in lines:
    print(line)

if p.returncode != 0:
    raise Exception("complain ...")

communicate() is called automatically after the output iteration terminates which sets the subprocess’ returncode member.

create_hash(inp, l=10, algo='sha256', to_int=False)[source]#: Takes an arbitrary input inp and creates a hexadecimal string hash based on an algorithm algo. For valid algorithms, see python’s hashlib. l corresponds to the maximum length of the returned hash and is limited by the length of the hexadecimal representation produced by the hashing algorithm. When to_int is True, the decimal integer representation is returned.
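
In terms of hashlib, the description corresponds roughly to truncating a hex digest. The sketch below assumes that the input is converted with str() and encoded as UTF-8 before hashing; `create_hash_sketch` is a hypothetical name and its digests need not agree with those produced by law.

```python
import hashlib


def create_hash_sketch(inp, l=10, algo="sha256", to_int=False):
    # hashing the string representation is an assumption of this sketch
    digest = hashlib.new(algo, str(inp).encode("utf-8")).hexdigest()
    # slicing caps the length at whatever the algorithm produces
    h = digest[:l]
    return int(h, 16) if to_int else h


h = create_hash_sketch(("some", "input"))
print(len(h), len(create_hash_sketch("x", l=100)))
```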

create_random_string(prefix='', l=10)[source]#: Creates and returns a random string consisting of l characters using a uuid4 hash. When prefix is given, the string will have the format <prefix>_<random_string>.

copy_no_perm(src, dst)[source]#: Copies a file from src to dst including meta data except for permission bits.

makedirs(path, perm=None)[source]#: Recursively creates directories up to path. No exception is raised if path refers to an existing directory. If perm is set, the permissions of all newly created directories are set to this value.

user_owns_file(path, uid=None)[source]#: Returns whether a file located at path is owned by the user with uid. When uid is None, the user id of the current process is used.

iter_chunks(l, size)[source]#: Returns a generator that yields chunks of length size from a list, integer or generator l. A size smaller than 1 results in no chunking at all.
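
A minimal sketch of such a chunking generator follows. It assumes that an integer l stands for range(l) and that "no chunking" means yielding all items as one list; both are assumptions, and `iter_chunks_sketch` is a hypothetical name.

```python
def iter_chunks_sketch(l, size):
    # treat an integer as the sequence 0, 1, ..., l - 1 (assumption)
    if isinstance(l, int):
        l = range(l)
    if size < 1:
        # no chunking: everything in a single chunk
        yield list(l)
        return
    chunk = []
    for item in l:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    # emit the last, possibly shorter chunk
    if chunk:
        yield chunk


print(list(iter_chunks_sketch(7, 3)))
```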

human_bytes(n, unit=None, fmt=False)[source]#

Takes a number of bytes n, assigns the best matching unit and returns the respective number and unit string in a tuple. When unit is set, that unit is used. When fmt is set, it is expected to be a string template with two elements that are filled via str.format. It can also be a boolean value in which case the template defaults to "{:.1f} {}" when True. Example:

human_bytes(3407872)
# -> (3.25, "MB")

human_bytes(3407872, "kB")
# -> (3328.0, "kB")

human_bytes(3407872, fmt="{:.2f} -- {}")
# -> "3.25 -- MB"

human_bytes(3407872, fmt=True)
# -> "3.2 MB"

parse_bytes(s, input_unit='bytes', unit='bytes')[source]#

Takes a string s, interprets it as a size with an optional unit, and returns a float that represents that size in a given unit. When no unit is found in s, input_unit is used as a default. A ValueError is raised, when s cannot be successfully converted. Example:

parse_bytes("100")
# -> 100.0

parse_bytes("2048", unit="kB")
# -> 2.0

parse_bytes("2048 kB", unit="kB")
# -> 2048.0

parse_bytes("2048 kB", unit="MB")
# -> 2.0

parse_bytes("2048", "kB", unit="MB")
# -> 2.0

parse_bytes(2048, "kB", unit="MB")  # note the numeric type of the first argument
# -> 2.0

human_duration(colon_format=False, plural=True, **kwargs)[source]#

Returns a human readable duration. The largest unit is days. When colon_format is True, the return value has the format "[d-][hh:]mm:ss[.ms]". colon_format can also be a string value referring to a limiting unit. In that case, the returned time string has no field above that unit, e.g. passing "m" results in a string "mm:ss[.ms]" where the minute field is potentially larger than 60. Passing "s" is a special case. Since the colon format always has a minute field (to mark it as colon format in the first place), the returned string will have the format "00:ss[.ms]". Unless plural is False, units corresponding to values other than exactly one are used in plural e.g. "1 second" but "1.5 seconds". All other kwargs are passed to datetime.timedelta to get the total duration in seconds. Example:

human_duration(seconds=1233)
# -> "20 minutes, 33 seconds"

human_duration(seconds=90001)
# -> "1 day, 1 hour, 1 second"

human_duration(seconds=1233, colon_format=True)
# -> "20:33"

human_duration(seconds=-1233, colon_format=True)
# -> "-20:33"

human_duration(seconds=90001, colon_format=True)
# -> "1-01:00:01"

human_duration(seconds=90001, colon_format="h")
# -> "25:00:01"

human_duration(seconds=65, colon_format="s")
# -> "00:65"

human_duration(minutes=15, colon_format=True)
# -> "15:00"

human_duration(minutes=15)
# -> "15 minutes"

human_duration(minutes=15, plural=False)
# -> "15 minute"

human_duration(minutes=-15)
# -> "minus 15 minutes"

parse_duration(s, input_unit='s', unit='s')[source]#

Takes a string s, interprets it as a duration with an optional unit, and returns a float that represents that duration in a given unit. When no unit is found in s, input_unit is used as a default. A ValueError is raised when s cannot be successfully converted. Multiple input formats are parsed. Example:

# plain number
parse_duration(100)
# -> 100.0

parse_duration(100, unit="min")
# -> 1.667

parse_duration(100, input_unit="min")
# -> 6000.0

parse_duration(-100, input_unit="min")
# -> -6000.0

# strings in the format [d-][h:][m:]s[.ms] are interpreted with input_unit disregarded
parse_duration("2:1")
# -> 121.0

parse_duration("04:02:01.1")
# -> 14521.1

parse_duration("04:02:01.1", unit="min")
# -> 242.0183

parse_duration("0-4:2:1.1")
# -> 14521.1

# human-readable string, optionally multiple of them separated by comma
# missing units are interpreted as input_unit, unit works as above
parse_duration("10 mins")
# -> 600.0

parse_duration("10 mins", unit="min")
# -> 10.0

parse_duration("10", unit="min")
# -> 0.167

parse_duration("10", input_unit="min", unit="min")
# -> 10.0

parse_duration("10 mins, 15 secs")
# -> 615.0

parse_duration("10 mins and 15 secs")
# -> 615.0

parse_duration("minus 10 mins and 15 secs")
# -> -615.0

is_file_exists_error(e)[source]#: Returns whether the exception e was raised due to an already existing file or directory.

send_mail(recipient, sender, subject='', content='', smtp_host='127.0.0.1', smtp_port=25)[source]#: Lightweight mail functionality. Sends a mail from sender to recipient with subject and content. smtp_host and smtp_port are forwarded to the smtplib.SMTP constructor. True is returned on success, False otherwise.

class DotDict[source]#

Bases: OrderedDict

Subclass of OrderedDict that provides read access for items via attributes by implementing __getattr__. In case an item is accessed via attribute and it does not exist, an AttributeError is raised rather than a KeyError. Example:

d = DotDict()
d["foo"] = 1

print(d["foo"])
# => 1

print(d.foo)
# => 1

print(d["bar"])
# => KeyError

print(d.bar)
# => AttributeError
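
The mechanism behind this behavior can be sketched in a few lines. `DotDictSketch` is a hypothetical stand-in, and its `wrap` takes a single dictionary, which simplifies the signature documented for this class.

```python
from collections import OrderedDict


class DotDictSketch(OrderedDict):
    def __getattr__(self, attr):
        # only invoked when the regular attribute lookup fails
        try:
            return self[attr]
        except KeyError:
            # translate the missing key into an AttributeError
            raise AttributeError(
                "'{}' object has no attribute '{}'".format(type(self).__name__, attr)
            )

    @classmethod
    def wrap(cls, d):
        # recursively convert nested dictionaries for deep attribute access
        return cls((k, cls.wrap(v) if isinstance(v, dict) else v) for k, v in d.items())


config = DotDictSketch.wrap({"foo": {"bar": 1}})
print(config.foo.bar)
```

Raising AttributeError instead of KeyError keeps helpers such as hasattr() and getattr() with a default working as expected.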

Methods:

wrap(*args, **kwargs)

Takes a dictionary d and recursively replaces it and all other nested dictionary types with DotDict's for deep attribute-style access.

classmethod wrap(*args, **kwargs)[source]#: Takes a dictionary d and recursively replaces it and all other nested dictionary types with DotDict’s for deep attribute-style access.

class ShorthandDict(**kwargs)[source]#

Bases: OrderedDict

Subclass of OrderedDict that implements __getattr__ and __setattr__ for a configurable list of attributes. Example:

class MyDict(ShorthandDict):
    attributes = {"foo": 1, "bar": 2}

d = MyDict(foo=9)

print(d.foo)
# => 9

print(d.bar)
# => 2

d.foo = 3
print(d.foo)
# => 3

open_compat(path, *args, **kwargs)[source]#: Polyfill for python’s open factory, returning the plain open in python 3, and io.open in python 2 with a patched write method that internally handles unicode conversion of its first argument. All args and kwargs are forwarded.

patch_object(obj, attr, value, reset=True, orig=law.util.no_value, lock=False)[source]#: Context manager that temporarily patches an object obj by replacing its attribute attr with value. The original value is set again when the context is closed unless reset is False. The original value is obtained through getattr or taken from orig if set. When lock is True, the default_lock object is used to ensure the patch is thread-safe. When lock is a lock instance, this object is used instead.

join_generators(*generators, on_error=None)[source]#: Joins multiple generators and returns a single generator for simplified iteration. Yielded objects are transparently sent back to yield assignments of the same generator. When on_error is callable, it is invoked in case an exception is raised while iterating, including KeyboardInterrupt’s. If its return value evaluates to True, the state is reset and iterations continue. Otherwise, the exception is raised.

quote_cmd(cmd)[source]#

Takes a shell command cmd given as a list and returns a single string representation of that command with proper quoting. To denote nested commands (such as shown below), cmd can also contain nested lists. Example:

print(quote_cmd(["bash", "-c", "echo", "foobar"]))
# -> "bash -c echo foobar"

print(quote_cmd(["bash", "-c", ["echo", "foobar"]]))
# -> "bash -c 'echo foobar'"
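
The nesting rule can be sketched with the standard library's shlex.quote(): a nested list is first joined into one string and then quoted as a single argument. `quote_cmd_sketch` is a hypothetical stand-in based on the description above.

```python
import shlex


def quote_cmd_sketch(cmd):
    parts = []
    for c in cmd:
        # nested commands are joined first, then quoted as one argument
        s = quote_cmd_sketch(c) if isinstance(c, (list, tuple)) else str(c)
        parts.append(shlex.quote(s))
    return " ".join(parts)


print(quote_cmd_sketch(["bash", "-c", ["echo", "foobar"]]))
```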

escape_markdown(s)[source]#: Escapes all characters in a string s that could be confused for markdown formatting strings and returns it.

classproperty(func)[source]#: Property decorator for class-level methods.

class TeeStream(*consumers, mode='w', **kwargs)[source]#

Bases: BaseStream

Multi-stream object that forwards calls to write() and flush() to all registered consumer streams. When a consumer is a string, it is interpreted as a file which is opened for writing (similar to tee in bash). All kwargs are forwarded to the BaseStream constructor.

Example:

tee = TeeStream("/path/to/log.txt", sys.__stdout__)
sys.stdout = tee

class FilteredStream(stream, filter_fn, **kwargs)[source]#

Bases: BaseStream

Stream object that accepts an input stream and a function filter_fn which is called upon every call to write(). The payload is written when the returned value evaluates to True. All kwargs are forwarded to the BaseStream constructor.
