skdh.io.MultiReader

class skdh.io.MultiReader(mode, reader, reader_kw=None, resample_to_lowest=True, fill_gaps=True, fill_value=None, gaps_error='raise', require_all_keys=True, ignore_file_size_check=True)

A process for reading multiple files into a single set of datastreams for processing.

Parameters:

mode ({‘combine’, ‘concatenate’, ‘leave’}): How to handle the results from multiple files. “combine” merges different data-streams from different files. “concatenate” joins the same data-stream across multiple files (Case 3 for reader_kw is not allowed in this mode). “leave” is only valid when providing Case 3 for reader_kw (see Notes); it leaves the results in separate dictionaries, with titles given by the keys of reader_kw and files as passed to the predict method.
reader (str): The name of the reader class to use. See File Reading (skdh.io).
reader_kw ({None, array-like, dict}, optional): Reader key-word arguments used to initialize the reader. See Notes for the specification options.
resample_to_lowest (bool, optional): When mode is “combine”, resample the separate datastreams to the rate of the lowest-sampled stream. Default is True. If False, all datastreams are resampled to the rate of the highest-sampled stream.
fill_gaps (bool, optional): Fill any gaps in data streams where possible (same size as time array). Default is True.
fill_value ({None, dict}, optional): Dictionary with keys and values to fill data streams with. See Notes for default values if not provided.
gaps_error ({‘raise’, ‘warn’, ‘ignore’}, optional): Behavior if there are gaps in the datastreams. Default is to raise an error.
require_all_keys (bool, optional): Require all files to provide the same keys. Default is True.

Methods

`check_handle_gaps`(data)	Check for gaps, and fill if specified.
`concat`(data)	Custom concatenation function for data streams to handle inputs that are either tuples of arrays or tuples of dictionaries
`convert_timestamps`(t)	Convert a timestamp/array of timestamps to a datetime object
`get_reader_kw`(idx)	Get the appropriate reader class key-word arguments
`handle_combine`(res)	Combine results
`handle_concatenation`(res)	Concatenate results.
`handle_results`(res)	Handle the combining of results.
`predict`(*[, files])	Read the files or files from a directory.
`save_results`(results, file_name)	Save the results of the processing pipeline to a csv file

handle_gaps_error

Notes

The combine mode should be used in the case when you have, for example, acceleration and temperature in separate files. Concatenate should be used when multiple files all contain acceleration, for example. In this case, read results will be concatenated together in order of the first timestamp.
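As a rough illustration of the concatenate ordering described above, the sketch below joins per-file result dictionaries after sorting them by their first timestamp. It is a conceptual sketch using plain NumPy, not the library's implementation; the function name `concatenate_by_first_timestamp` and the `time`/`accel` keys are assumptions made for the example.

```python
import numpy as np


def concatenate_by_first_timestamp(results):
    """Join per-file result dicts, ordered by each file's first timestamp.

    Hypothetical helper: assumes every dict has a 'time' array and that
    all dicts share the same keys.
    """
    # Order the files by the first value of their time array
    ordered = sorted(results, key=lambda r: r["time"][0])
    # Concatenate each stream along the sample axis
    return {
        key: np.concatenate([r[key] for r in ordered], axis=0)
        for key in ordered[0]
    }


# Two files that both contain acceleration, passed out of order
file_b = {"time": np.array([3.0, 4.0]), "accel": np.ones((2, 3))}
file_a = {"time": np.array([1.0, 2.0]), "accel": np.zeros((2, 3))}

merged = concatenate_by_first_timestamp([file_b, file_a])
print(merged["time"])  # [1. 2. 3. 4.]
```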

reader_kw can be specified in four ways, numbered Case 0 through Case 3 to match the Examples below:

0. None, for no key-word arguments overriding the defaults.
1. A dictionary of key-word arguments that will be the same for all files.
2. A list of dictionaries of key-word arguments. The list must have the same length as the number of files provided to the predict method, and be in the same order.
3. A dictionary of dictionaries of key-word arguments. In this case, the files argument for predict should be a dictionary with the same key-names as reader_kw, and the key-word arguments will be associated with the path in the files dictionary.

Note that if the reader returns the same keys for more than one file and mode is “combine”, the values for those keys will be overwritten.
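The overwriting of duplicate keys can be pictured with plain dictionaries. This is only a conceptual sketch: it assumes last-wins `dict.update` semantics, which is not confirmed here as the library's actual behavior, it ignores resampling, and the file contents are made up for the example.

```python
import numpy as np

# Hypothetical per-file reader results
accel_file = {"time": np.arange(4.0), "accel": np.zeros((4, 3))}
temp_file = {"time": np.arange(4.0), "temperature": np.full(4, 21.5)}
# A second file that also returns a 'temperature' key
dup_file = {"time": np.arange(4.0), "temperature": np.full(4, 30.0)}

combined = {}
for res in (accel_file, temp_file, dup_file):
    # Assumption: a later file replaces any key an earlier file provided
    combined.update(res)

print(sorted(combined))  # ['accel', 'temperature', 'time']
```

To keep both temperature streams, give them distinct names through the reader key-word arguments before combining.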

Default fill values are:

accel: numpy.array([0.0, 0.0, 1.0])
gyro: 0.0
temperature: 0.0
ecg: 0.0
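
To make the default fill values concrete, the sketch below places samples on a regular time grid and writes the fill value wherever samples are missing. It is a minimal illustration in plain NumPy under stated assumptions (a known sampling rate `fs` and gaps that are whole multiples of the sample period); `fill_gaps_sketch` and `DEFAULT_FILL` are hypothetical names, not part of the library.

```python
import numpy as np

# Defaults taken from the list above (subset used in this example)
DEFAULT_FILL = {"accel": np.array([0.0, 0.0, 1.0]), "temperature": 0.0}


def fill_gaps_sketch(time, streams, fs, fill_values=None):
    """Return a gap-free time grid and streams padded with fill values."""
    fill_values = DEFAULT_FILL if fill_values is None else fill_values
    # Position of every existing sample on the regular grid
    idx = np.round((time - time[0]) * fs).astype(int)
    n = idx[-1] + 1
    full_time = time[0] + np.arange(n) / fs
    filled = {}
    for name, values in streams.items():
        out = np.empty((n,) + values.shape[1:], dtype=float)
        out[:] = fill_values[name]  # broadcast the fill value to every row
        out[idx] = values           # then restore the measured samples
        filled[name] = out
    return full_time, filled


# 1 Hz data with two missing samples between t=1 and t=4
time = np.array([0.0, 1.0, 4.0, 5.0])
accel = np.tile([0.1, 0.2, 0.9], (4, 1))
temperature = np.array([20.0, 20.5, 21.0, 21.5])

full_time, filled = fill_gaps_sketch(
    time, {"accel": accel, "temperature": temperature}, fs=1.0
)
print(filled["accel"][2])  # [0. 0. 1.]
```

Filling acceleration with `[0.0, 0.0, 1.0]` rather than zeros keeps the vector magnitude at 1 during a gap, which matches a device at rest.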

Examples

Case 0:

>>> from skdh.io import MultiReader
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=None)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])

Case 1:

>>> kw = {'time_col_name': 'timestamp', 'read_csv_kwargs': {'skiprows': 5}}
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])

Case 2:

>>> kw = [
...     {'time_col_name': 'ts', 'column_names': {'accel': ['ax', 'ay', 'az']}},
...     {'time_col_name': 'ts', 'column_names': {'temperature': 'temperature C'}},
... ]
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])

Case 3:

>>> kw = {
...     'f1': {'time_col_name': 'ts', 'column_names': {'accel': ['ax', 'ay', 'az']}},
...     'f2': {'time_col_name': 'ts', 'column_names': {'temperature': 'temperature C'}},
... }
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files={'f1': "file1.csv", 'f2': "file2.csv"})

check_handle_gaps(data)

Check for gaps, and fill if specified.

Parameters:

data (dict): Dictionary of data-streams.

Returns:

res_dict (dict): Dictionary of data with gaps filled, as specified.

static concat(data)

Custom concatenation function for data streams to handle inputs that are either tuples of arrays or tuples of dictionaries.

Parameters:

data (tuple): Tuple of either numpy.ndarrays to concatenate, or dictionaries whose keys should be concatenated.

Returns:

data ({ndarray, dict}): Concatenated data.
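
The two input shapes described above can be handled along these lines. This is a sketch of the idea rather than the library's code: `concat_sketch` is a hypothetical stand-in, and it assumes that all dictionaries in the tuple share the same keys.

```python
import numpy as np


def concat_sketch(data):
    """Concatenate a tuple of arrays, or a tuple of dicts key by key."""
    if isinstance(data[0], dict):
        # Tuple of dictionaries: concatenate the arrays stored under each key
        return {
            key: np.concatenate([d[key] for d in data], axis=0)
            for key in data[0]
        }
    # Tuple of arrays: concatenate along the sample axis
    return np.concatenate(data, axis=0)


arrays = (np.array([1.0, 2.0]), np.array([3.0]))
dicts = ({"hr": np.array([60.0])}, {"hr": np.array([62.0, 61.0])})

joined_arrays = concat_sketch(arrays)
joined_dicts = concat_sketch(dicts)
print(joined_arrays)       # [1. 2. 3.]
print(joined_dicts["hr"])  # [60. 62. 61.]
```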

get_reader_kw(idx)

Get the appropriate reader class key-word arguments.

Parameters:

idx (str, int): Index of the file to use to retrieve reader kwargs.
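
A lookup of this kind might resolve the four reader_kw cases from the Notes roughly as follows. This is a hedged sketch, not the library's implementation: `resolve_reader_kw` is a hypothetical function, and the rule used to tell Case 1 from Case 3 (a string index whose entry is itself a dictionary) is an assumption.

```python
def resolve_reader_kw(reader_kw, idx):
    """Pick the reader key-word arguments for the file at ``idx``."""
    if reader_kw is None:
        # Case 0: no overrides, use the reader defaults
        return {}
    if isinstance(reader_kw, (list, tuple)):
        # Case 2: one dictionary per file, in file order
        return reader_kw[idx]
    if isinstance(reader_kw, dict):
        # Case 3 (assumed heuristic): a string key that maps to a dictionary
        if isinstance(idx, str) and isinstance(reader_kw.get(idx), dict):
            return reader_kw[idx]
        # Case 1: the same key-word arguments for every file
        return reader_kw
    raise TypeError("reader_kw must be None, a list of dicts, or a dict")


shared = {"time_col_name": "timestamp"}
per_file = [{"time_col_name": "ts"}, {"time_col_name": "t"}]
keyed = {"f1": {"time_col_name": "ts"}, "f2": {"time_col_name": "t"}}

print(resolve_reader_kw(None, 0))      # {}
print(resolve_reader_kw(shared, 1))    # {'time_col_name': 'timestamp'}
print(resolve_reader_kw(per_file, 1))  # {'time_col_name': 't'}
print(resolve_reader_kw(keyed, "f1"))  # {'time_col_name': 'ts'}
```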

handle_combine(res)

Combine results.

Parameters:

res ({list, dict})

Returns:

results (dict): Datastream results dictionary.

handle_concatenation(res)

Concatenate results.

Parameters:

res (list): List of results dictionaries.

Returns:

results (dict): Dictionary of results datastreams.

handle_results(res)

Handle the combining of results.

Parameters:

res ({dict, list}): Dictionary or list of results.

Returns:

results (dict): Dictionary of final results.

predict(*, files=None, **kwargs)

Read the files or files from a directory.

Parameters:

files ({array-like, dict}): Either a list-like of files to read, or a dictionary whose values are the files to read. The keys must match those provided to reader_kw when initializing the process.

Notes

Note that any additional key-word arguments passed to MultiReader.predict will be passed along to the reader.predict method.