skdh.Pipeline

class skdh.Pipeline(load_kwargs=None, flatten_results=False)

A pipeline of processing steps that are run sequentially, with some output passed between steps. Results from the processing steps can be saved as local files, and are also returned in a dictionary once processing finishes.

Parameters:

load_kwargs ({None, dict}, optional): Dictionary of keyword arguments passed directly to Pipeline.load(). If None (default), no pipeline is loaded.
flatten_results (bool, optional): Flatten the results of the processing steps in the returned dictionary. By default (False), the results of each step are stored under a key of the step’s class name. If True, all results are stored at the same level, and an exception is raised if any key would be overwritten.
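
The difference between the two result layouts can be sketched with plain dictionaries. This is only an illustration of the behaviour described above, not skdh's implementation: the helper `collect_results`, the input shape, the result keys, and the choice of `KeyError` for the collision are all assumptions made for the example.

```python
def collect_results(step_results, flatten_results=False):
    """Combine per-step result dicts into one return dictionary.

    step_results is a list of (step_name, result_dict) pairs; this input
    shape is hypothetical and chosen only for the illustration.
    """
    combined = {}
    for step_name, result in step_results:
        if not flatten_results:
            # Nested layout: each step's results live under the step's name.
            combined[step_name] = result
            continue
        # Flat layout: refuse to silently overwrite an existing key.
        clashes = combined.keys() & result.keys()
        if clashes:
            # The exception type is an assumption; the text only says
            # that "an exception" is raised.
            raise KeyError(f"duplicate result keys: {sorted(clashes)}")
        combined.update(result)
    return combined


steps = [
    ("Gait", {"stride_time": [1.1, 1.2]}),
    ("Sleep", {"total_sleep_time": [412]}),
]
nested = collect_results(steps)
flat = collect_results(steps, flatten_results=True)
```

Here `nested` keeps one key per step, while `flat` holds `stride_time` and `total_sleep_time` side by side; two steps that both produced the same key would raise in the flat layout.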

Methods

`add`(process[, name, save_file, plot_file, ...])	Add a processing step to the pipeline.
`load`([file, yaml_str, process_raise, ...])	Load a previously saved pipeline from a file or YAML string.
`run`(**kwargs)	Run through the pipeline, sequentially processing steps.
`save`(file)	Save the pipeline to a file for consistent pipeline generation.

Examples

Load a pipeline saved in a file on instantiation, and raise an error instead of a warning if one of the processes cannot be loaded:

>>> from skdh import Pipeline
>>> pipe = Pipeline(
...     load_kwargs={"file": "example_pipeline.skdh", "process_raise": True})

add(process, name=None, save_file=None, plot_file=None, make_copy=True)

Add a processing step to the pipeline.

Parameters:

process (Process): Process instance that forms the step to be run.
name (str, optional): Process name, used to distinguish multiple steps of the same process if required. Output results are stored under this name. If None, results are stored under the class name, with some mangling when the same process appears more than once in the pipeline.
save_file ({None, str}, optional): Optionally formattable path for the save file. If None (default), the results are not saved anywhere.
plot_file ({None, str}, optional): Optionally formattable path for the plot output. If None (default), no plot is generated or saved.
make_copy (bool, optional): Create a shallow copy of process to add to the pipeline. This allows a single instance to be used in multiple pipelines while retaining custom save file names and other pipeline-specific attributes. Default is True.

Notes

Some of the available format parameters for the save_file and plot_file paths are:

date : the current date, expressed as YYYYMMDD.
name : the name of the process doing the analysis.
file : the name of the input file passed to the pipeline.

If no file is passed to the pipeline, file is an empty string. Even if the first step of the pipeline does not use a file keyword, you can still pass one to run; it is then ignored for everything except this format parameter.

>>> from skdh import Pipeline
>>> from skdh.gait import Gait
>>> p = Pipeline()
>>> p.add(Gait(), save_file="{file}_gait_results.csv")
>>> p.run(accel=accel, time=time)

No file was passed in, so the resulting output file would be `_gait_results.csv`.

However, if the p.run call is instead:

>>> p.run(accel=accel, time=time, file="example_file.txt")

then the output file would be `example_file_gait_results.csv`.
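
The substitution described in these notes can be reproduced with ordinary `str.format`. The sketch below is a stand-in, not skdh's code: `format_output_path` is a hypothetical helper, and using only the stem of the input file name is an assumption inferred from the `example_file.txt` case above.

```python
from datetime import date
from pathlib import Path


def format_output_path(template, process_name, input_file="", today=None):
    """Fill the {date}, {name} and {file} fields of a path template."""
    today = today or date.today()
    # Assumption: only the file stem (no directory, no extension) is used,
    # and a missing input file becomes an empty string.
    file_stem = Path(input_file).stem if input_file else ""
    return template.format(
        date=today.strftime("%Y%m%d"),  # YYYYMMDD, as listed in the notes
        name=process_name,
        file=file_stem,
    )


no_file = format_output_path("{file}_gait_results.csv", "Gait")
with_file = format_output_path(
    "{file}_gait_results.csv", "Gait", input_file="example_file.txt"
)
dated = format_output_path(
    "{date}_{name}_results.csv", "Gait", today=date(2021, 5, 18)
)
```

Fields that a template does not mention are simply unused, so the same call works for fixed file names as well as formattable ones.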

Examples

Add Gait and save the results to a fixed file name:

>>> from skdh import Pipeline
>>> from skdh.gait import Gait
>>> p = Pipeline()
>>> p.add(Gait(), save_file="gait_results.csv")

Add a binary file reader without saving its results, followed by gait processing saved to a variable file name:

>>> from skdh.io import ReadBin
>>> p = Pipeline()
>>> p.add(ReadBin(bases=0, periods=24), save_file=None)
>>> p.add(Gait(), save_file="{date}_{name}_results.csv")

If the date were, for example, May 18, 2021, the results file would be 20210518_Gait_results.csv.

load(file=None, *, yaml_str=None, process_raise=False, noversion_raise=False, old_raise=False)

Load a previously saved pipeline from a file or YAML string.

Parameters:

file ({str, path-like}): File path to load the pipeline structure from.
yaml_str (str): YAML string of the pipeline. If provided, file is ignored.
process_raise (bool, optional): Raise an error if a process in file or yaml_str cannot be added to the pipeline. Default is False, which issues a warning instead.
noversion_raise (bool): Raise an error if no version is provided in the input data. Default is False, which issues a warning instead.
old_raise (bool): Raise an error if the version used to create the pipeline is old enough to potentially cause compatibility/functionality errors. Default is False, which issues a warning instead.
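
All three `*_raise` flags follow the same warn-by-default, raise-on-request pattern. A minimal sketch of that pattern for `process_raise` is shown below; it does not reflect how skdh resolves processes. The registry `KNOWN_PROCESSES`, the helper `check_processes`, the exception type, and the choice to skip an unknown process after warning are all assumptions made for the example.

```python
import warnings

# Hypothetical registry of process names that can be loaded.
KNOWN_PROCESSES = {"Gait", "ReadBin"}


def check_processes(names, process_raise=False):
    """Return the loadable process names, warning or raising on the rest."""
    loaded = []
    for name in names:
        if name in KNOWN_PROCESSES:
            loaded.append(name)
        elif process_raise:
            # Strict mode: stop at the first process that cannot be added.
            raise ValueError(f"process {name!r} could not be loaded")
        else:
            # Default mode: report the problem and carry on without it.
            warnings.warn(f"process {name!r} could not be loaded; skipping")
    return loaded


# Capture the warning instead of printing it, to show the default mode.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    loaded = check_processes(["Gait", "Missing"])
```

With the default, `loaded` contains only `"Gait"` and one warning is recorded; with `process_raise=True` the same input raises instead.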

run(**kwargs)

Run through the pipeline, sequentially processing steps. Inputs must be provided as keyword arguments.

Parameters:

kwargs: Any keyword arguments. They are passed to the first step of the pipeline, so they must contain at least what the first process expects.

Returns:

results (dict): Dictionary of the results of any steps of the pipeline that return results.
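
The general idea of sequential processing, where keyword arguments feed the first step and some output is handed on, can be sketched with plain functions. This is a toy model under stated assumptions rather than skdh's mechanism: `run_steps`, the two step functions, and the convention that each step returns a `(passed_on, results)` pair are all hypothetical.

```python
def run_steps(steps, **kwargs):
    """Run (name, function) steps in order and collect their results."""
    results = {}
    for name, func in steps:
        # Each toy step returns data for later steps plus optional results.
        passed_on, step_result = func(**kwargs)
        kwargs.update(passed_on)
        if step_result is not None:
            # Only steps that return results appear in the output dict.
            results[name] = step_result
    return results


def read_step(file=None, **kwargs):
    # Pretend to read a file: provides data for later steps, no results.
    return {"accel": [1.0, 2.0, 3.0], "time": [0, 1, 2]}, None


def mean_step(accel=None, **kwargs):
    # Uses data supplied by an earlier step and reports one result.
    return {}, {"mean_accel": sum(accel) / len(accel)}


results = run_steps(
    [("Read", read_step), ("Mean", mean_step)], file="example_file.txt"
)
```

Only `"Mean"` appears in `results`, because the reading step hands data forward without returning results of its own.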

save(file)

Save the pipeline to a file for consistent pipeline generation.

Parameters:

file ({str, path-like}): File path to save the pipeline structure to.