tasks

Tasks that provide common, frequently used functionality.

Task RunOnceTask

class RunOnceTask(*args, **kwargs)

Bases: Task

complete()

Returns True if the task has outputs and all of them exist; otherwise returns False.

However, you may freely override this method with custom logic.
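The default rule and a custom override can be sketched in plain Python. This is only an illustration: `SketchTarget`, `SketchTask`, `FileTask` and `FlagTask` are hypothetical stand-ins, not classes of this package, and the real implementation may differ in its details.

```python
import os
import tempfile


class SketchTarget:
    """Hypothetical stand-in for a target backed by a local path."""

    def __init__(self, path):
        self.path = path

    def exists(self):
        return os.path.exists(self.path)


class SketchTask:
    """Hypothetical stand-in that mirrors the documented default rule."""

    def output(self):
        # No outputs by default.
        return []

    def complete(self):
        outputs = self.output()
        if not outputs:
            # A task without outputs is never reported as complete.
            return False
        return all(target.exists() for target in outputs)


class FileTask(SketchTask):
    """Uses the default rule: complete once its single output file exists."""

    def __init__(self, path):
        self.path = path

    def output(self):
        return [SketchTarget(self.path)]


class FlagTask(SketchTask):
    """Custom logic: completion is tracked by an in-memory flag instead."""

    def __init__(self):
        self.has_run = False

    def run(self):
        self.has_run = True

    def complete(self):
        return self.has_run
```

With these stand-ins, a `FlagTask` reports completion only after `run()` has been called on that instance, whereas a `FileTask` reports completion as soon as its file exists on disk.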

Task TransferLocalFile

class TransferLocalFile(*args, **kwargs)

Bases: Task

output()

The output that this Task produces.

The output of the Task determines whether the Task needs to be run: the task is considered finished if and only if all of its outputs exist. Subclasses should override this method to return a single Target or a list of Target instances.

Implementation note

When running multiple workers, the output must be a resource that is accessible to all workers, such as a distributed file system (DFS) or a database. Otherwise, workers might compute the same output because they cannot see the work done by other workers.

See Task.output
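The "finished if and only if all outputs exist" rule, applied to both a single target and a list of targets, can be illustrated with a small stand-alone sketch. `PathTarget`, `as_list` and `is_finished` are hypothetical names introduced here for illustration; they are not part of this package.

```python
import os
import tempfile


class PathTarget:
    """Hypothetical stand-in for a Target backed by a local path."""

    def __init__(self, path):
        self.path = path

    def exists(self):
        return os.path.exists(self.path)


def as_list(output):
    """Normalize a single target or a list/tuple of targets to a list."""
    if isinstance(output, (list, tuple)):
        return list(output)
    return [output]


def is_finished(output):
    """True only if there is at least one target and all targets exist."""
    targets = as_list(output)
    return bool(targets) and all(target.exists() for target in targets)


with tempfile.TemporaryDirectory() as workdir:
    single = PathTarget(os.path.join(workdir, "a.txt"))
    both = [single, PathTarget(os.path.join(workdir, "b.txt"))]

    # Create only the first file.
    open(single.path, "w").close()

    single_done = is_finished(single)  # the only output exists
    both_done = is_finished(both)      # b.txt is still missing
```

The sketch uses a temporary local directory for brevity. With several workers, the paths would have to point to storage that every worker can see, as the implementation note explains.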

run()

The task run method, to be overridden in a subclass.

See Task.run

Task ForestMerge

class ForestMerge(*args, **kwargs)

Bases: LocalWorkflow

classmethod modify_param_values(params)

Hook to modify command line arguments before instances of this class are created.
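As a rough illustration of such a hook, the sketch below assumes that `params` is a dictionary mapping parameter names to values and that the hook returns the (possibly modified) dictionary. Both points are assumptions, as are the stand-in classes `SketchTask` and `LabeledTask`, the `create` helper and the `label` parameter.

```python
class SketchTask:
    """Hypothetical stand-in for a task base class with a parameter hook."""

    def __init__(self, params):
        self.params = params

    @classmethod
    def modify_param_values(cls, params):
        # Default: leave the values untouched.
        return params

    @classmethod
    def create(cls, **cli_values):
        # The hook runs on the raw values before the instance is created.
        params = cls.modify_param_values(dict(cli_values))
        return cls(params)


class LabeledTask(SketchTask):
    @classmethod
    def modify_param_values(cls, params):
        # Hypothetical rule: fill in a default label and normalize its spelling.
        params.setdefault("label", "default")
        params["label"] = params["label"].strip().lower()
        return params


task = LabeledTask.create(label="  Nightly ")
```

Because the hook is a classmethod, it can adjust values without needing an existing instance, which is what allows it to run before instantiation.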

create_branch_map()

Abstract method that must be overridden by inheriting tasks to define the branch map.
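A minimal sketch of an overriding implementation follows. It assumes that a branch map is a dictionary whose keys are consecutive integer branch numbers and whose values are the data handled by each branch. `SketchWorkflow` and `MergeChunks` are hypothetical stand-ins rather than classes of this package.

```python
class SketchWorkflow:
    """Hypothetical stand-in for a workflow base class."""

    def create_branch_map(self):
        raise NotImplementedError("inheriting tasks must define the branch map")


class MergeChunks(SketchWorkflow):
    """Hypothetical workflow with one branch per group of input files."""

    def __init__(self, files, chunk_size):
        self.files = list(files)
        self.chunk_size = chunk_size

    def create_branch_map(self):
        # Split the inputs into groups of at most chunk_size items.
        chunks = [
            self.files[i:i + self.chunk_size]
            for i in range(0, len(self.files), self.chunk_size)
        ]
        # Assumption: keys are branch numbers, values are per-branch data.
        return dict(enumerate(chunks))


branch_map = MergeChunks(["a", "b", "c", "d", "e"], chunk_size=2).create_branch_map()
# branch_map == {0: ["a", "b"], 1: ["c", "d"], 2: ["e"]}
```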

workflow_requires()

Hook to add workflow requirements. This method is expected to return a dictionary. When this method is called from a branch task, an exception is raised.
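The contract described here (return a dictionary, refuse to be called from a branch task) might look roughly like the sketch below. The stand-in classes, the convention that a negative `branch` value marks the workflow itself, and the choice of `RuntimeError` are all assumptions made for illustration.

```python
class SketchWorkflow:
    """Hypothetical stand-in for a workflow base class."""

    def __init__(self, branch=-1):
        # Assumption for this sketch: a negative branch marks the workflow itself.
        self.branch = branch

    def is_branch(self):
        return self.branch >= 0

    def workflow_requires(self):
        if self.is_branch():
            raise RuntimeError("workflow_requires() called from a branch task")
        return {}


class WithInputs(SketchWorkflow):
    def workflow_requires(self):
        # Extend the dictionary returned by the parent class.
        reqs = super().workflow_requires()
        reqs["inputs"] = "PrepareInputs"  # placeholder for a required task
        return reqs


workflow_reqs = WithInputs().workflow_requires()
```

Calling `WithInputs(branch=0).workflow_requires()` raises in this sketch, mirroring the documented behavior for branch tasks.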

requires()

The Tasks that this Task depends on.

A Task will only run if all of the Tasks that it requires are completed. If your Task does not require any other Tasks, then you don’t need to override this method. Otherwise, a subclass can override this method to return a single Task, a list of Task instances, or a dict whose values are Task instances.

See Task.requires
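The three accepted return shapes and the "run only when all requirements are complete" rule can be sketched as follows. `StubTask`, `flatten` and `can_run` are hypothetical helpers written for this illustration; the library's own scheduling logic is not shown and may work differently.

```python
def flatten(requirements):
    """Collect tasks from a single task, a list/tuple, or a dict of tasks."""
    if requirements is None:
        return []
    if isinstance(requirements, dict):
        return [task for value in requirements.values() for task in flatten(value)]
    if isinstance(requirements, (list, tuple)):
        return [task for item in requirements for task in flatten(item)]
    return [requirements]


class StubTask:
    """Hypothetical stand-in with a fixed completion state."""

    def __init__(self, done, requirements=None):
        self.done = done
        self._requirements = requirements

    def complete(self):
        return self.done

    def requires(self):
        return self._requirements


def can_run(task):
    """A task may run once every task it requires is complete."""
    return all(dep.complete() for dep in flatten(task.requires()))


finished = StubTask(done=True)
pending = StubTask(done=False)

single_ok = can_run(StubTask(False, finished))                       # single task
list_ok = can_run(StubTask(False, [finished, pending]))              # list of tasks
dict_ok = can_run(StubTask(False, {"a": finished, "b": finished}))   # dict of tasks
```

A task without requirements is trivially runnable in this sketch, which matches the note that `requires()` need not be overridden in that case.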

output()

The output that this Task produces.

The output of the Task determines whether the Task needs to be run: the task is considered finished if and only if all of its outputs exist. Subclasses should override this method to return a single Target or a list of Target instances.

Implementation note

When running multiple workers, the output must be a resource that is accessible to all workers, such as a distributed file system (DFS) or a database. Otherwise, workers might compute the same output because they cannot see the work done by other workers.

See Task.output

run()

The task run method, to be overridden in a subclass.

See Task.run