skdh.io.MultiReader

class skdh.io.MultiReader(mode, reader, reader_kw=None, resample_to_lowest=True, fill_gaps=True, fill_value=None, gaps_error='raise', require_all_keys=True, ignore_file_size_check=True)

A process for reading in multiple files into one set of datastreams for processing.

Parameters:
mode : {‘combine’, ‘concatenate’, ‘leave’}

The mode to use when reading multiple files. “combine” merges different data-streams from different files. “concatenate” joins the same data-stream from multiple files; it cannot be used with the dictionary-of-dictionaries form of reader_kw (Case 3 in the Examples). “leave” is only valid with that Case 3 form of reader_kw (see Notes), and leaves the results in separate dictionaries whose titles are the keys of reader_kw and of files as passed to the predict method.

reader : str

The name of the reader class to use. See File Reading (skdh.io).

reader_kw : {None, array-like, dict}, optional

Keyword arguments used to initialize the reader. See Notes for the supported ways to specify them.

resample_to_lowest : bool, optional

When mode is “combine”, resample the separate datastreams to the sampling rate of the lowest-rate stream. Default is True. If False, all datastreams are resampled to the rate of the highest-rate stream.

fill_gaps : bool, optional

Fill any gaps in data streams where possible, that is, for streams with the same size as the time array. Default is True.

fill_value : {None, dict}, optional

Dictionary with keys and values to fill data streams with. See Notes for default values if not provided.

gaps_error : {‘raise’, ‘warn’, ‘ignore’}, optional

Behavior if there are gaps in the datastreams. Default is to raise an error.

require_all_keys : bool, optional

Require all files to provide the same keys. Default is True.
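The effect of resample_to_lowest can be pictured as choosing a common target rate before the streams are merged. The sketch below is an illustration under stated assumptions, not skdh's implementation: estimate_rate and target_rate are hypothetical helpers, timestamps are assumed to be evenly spaced and given in seconds, and the resampling step itself is omitted.

```python
def estimate_rate(time):
    """Estimate the sampling rate (Hz) from evenly spaced timestamps in seconds."""
    return (len(time) - 1) / (time[-1] - time[0])


def target_rate(stream_times, resample_to_lowest=True):
    """Pick the common rate that all streams would be resampled to (hypothetical helper)."""
    rates = [estimate_rate(t) for t in stream_times.values()]
    return min(rates) if resample_to_lowest else max(rates)


stream_times = {
    "accel": [i / 50 for i in range(101)],  # 50 Hz for 2 seconds
    "temperature": [0.0, 1.0, 2.0],         # 1 Hz for 2 seconds
}
low = target_rate(stream_times, resample_to_lowest=True)    # rate of the slowest stream
high = target_rate(stream_times, resample_to_lowest=False)  # rate of the fastest stream
```

Resampling to the lowest rate discards detail from the faster streams, while resampling to the highest rate has to interpolate the slower ones; which trade-off is acceptable depends on the downstream processing.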

Methods

check_handle_gaps(data)

Check for gaps, and fill if specified.

concat(data)

Custom concatenation function for data streams to handle inputs that are either tuples of arrays or tuples of dictionaries

convert_timestamps(t)

Convert a timestamp/array of timestamps to a datetime object

get_reader_kw(idx)

Get the appropriate reader class key-word arguments

handle_combine(res)

Combine results

handle_concatenation(res)

Concatenate results.

handle_results(res)

Handle the combining of results.

predict(*[, files])

Read the files or files from a directory.

save_results(results, file_name)

Save the results of the processing pipeline to a csv file

handle_gaps_error

Notes

The combine mode should be used in the case when you have, for example, acceleration and temperature in separate files. Concatenate should be used when multiple files all contain acceleration, for example. In this case, read results will be concatenated together in order of the first timestamp.
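The difference between the two modes can be sketched with plain dictionaries. This is a simplified illustration, not the library's code: combine and concatenate are hypothetical stand-ins, the "time" key is an assumed name for the timestamp stream, lists stand in for arrays, and resampling and gap handling are left out.

```python
def combine(results):
    """Merge different datastreams from several files into one dictionary."""
    merged = {}
    for res in results:
        # dict.update replaces keys that already exist, so duplicate keys
        # from a later file overwrite those from an earlier one
        merged.update(res)
    return merged


def concatenate(results):
    """Join the same datastreams from several files, ordered by first timestamp."""
    ordered = sorted(results, key=lambda res: res["time"][0])
    return {key: [v for res in ordered for v in res[key]] for key in ordered[0]}


# "combine": acceleration and temperature come from separate files
accel_file = {"time": [0.0, 1.0], "accel": [(0.0, 0.0, 1.0), (0.0, 0.1, 0.9)]}
temp_file = {"time": [0.0, 1.0], "temperature": [21.5, 21.6]}
combined = combine([accel_file, temp_file])

# "concatenate": both files contain acceleration, passed out of order
day2 = {"time": [10.0, 11.0], "accel": [3, 4]}
day1 = {"time": [0.0, 1.0], "accel": [1, 2]}
joined = concatenate([day2, day1])
```

Sorting by the first timestamp means the order in which the files are listed does not affect the concatenated result in this sketch.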

reader_kw can be specified in four different ways (Cases 0 to 3, matching the Examples):

  0. None for no keyword arguments overriding defaults.

  1. A dictionary of key-word arguments that will be the same for all files.

  2. A list of dictionaries of key-word arguments, one per file, whose length must equal the number of files provided to the predict method, in order.

  3. A dictionary of dictionaries of key-word arguments. In this case, the files argument for predict should be a dictionary with the same key-names as reader_kw, and the key-word arguments will be associated with the path in the files dictionary.
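One way to picture how these forms map to a single file is the lookup below. It is a minimal sketch under assumptions: resolve_reader_kw is a hypothetical function, and the rule for telling a shared dictionary apart from a per-file dictionary (checking whether the file's key maps to a dictionary) is an illustrative choice that may differ from what get_reader_kw actually does.

```python
def resolve_reader_kw(reader_kw, idx):
    """Return the keyword arguments for the file identified by ``idx``.

    ``idx`` is an integer position for list-like input, or a key when
    files are passed as a dictionary.
    """
    if reader_kw is None:
        return {}  # no overrides, the reader uses its defaults
    if isinstance(reader_kw, (list, tuple)):
        return reader_kw[idx]  # one dictionary per file, by position
    per_file = reader_kw.get(idx)
    if isinstance(per_file, dict):
        return per_file  # dictionary of dictionaries, looked up by file key
    return reader_kw  # a single dictionary shared by all files


shared = {"time_col_name": "timestamp", "read_csv_kwargs": {"skiprows": 5}}
per_position = [{"time_col_name": "ts"}, {"time_col_name": "t"}]
per_key = {"f1": {"time_col_name": "ts"}, "f2": {"time_col_name": "t"}}

kw_default = resolve_reader_kw(None, 0)
kw_shared = resolve_reader_kw(shared, 1)
kw_second = resolve_reader_kw(per_position, 1)
kw_f2 = resolve_reader_kw(per_key, "f2")
```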

Note that if the readers for different files return the same keys and mode is “combine”, those keys will be overwritten.

Default fill values are:

  • accel: numpy.array([0.0, 0.0, 1.0])

  • gyro: 0.0

  • temperature: 0.0

  • ecg: 0.0
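To illustrate how such fill values might be used, the sketch below pads missing samples in regularly sampled streams. It is an assumption-laden illustration, not the library's algorithm: fill_stream_gaps is a hypothetical helper, plain tuples and lists stand in for numpy arrays, and every stream is assumed to have the same length as the time array and an entry in the fill dictionary.

```python
DEFAULT_FILL = {
    "accel": (0.0, 0.0, 1.0),  # tuple stands in for numpy.array([0.0, 0.0, 1.0])
    "gyro": 0.0,
    "temperature": 0.0,
    "ecg": 0.0,
}


def fill_stream_gaps(time, data, fs, fill_value=None):
    """Insert fill values for samples missing between consecutive timestamps."""
    fill = {**DEFAULT_FILL, **(fill_value or {})}  # user values override defaults
    dt = 1.0 / fs
    new_time = [time[0]]
    new_data = {key: [values[0]] for key, values in data.items()}
    for i in range(1, len(time)):
        # number of whole samples missing between time[i - 1] and time[i]
        n_missing = round((time[i] - time[i - 1]) / dt) - 1
        for j in range(1, n_missing + 1):
            new_time.append(time[i - 1] + j * dt)
            for key in new_data:
                new_data[key].append(fill[key])
        new_time.append(time[i])
        for key in new_data:
            new_data[key].append(data[key][i])
    return new_time, new_data


time = [0.0, 1.0, 4.0, 5.0]  # 1 Hz data with two samples missing
data = {"temperature": [20.0, 20.5, 21.0, 21.5]}
filled_time, filled = fill_stream_gaps(time, data, fs=1.0)
custom_time, custom = fill_stream_gaps(time, data, fs=1.0, fill_value={"temperature": -1.0})
```

The accel default of [0.0, 0.0, 1.0] corresponds to a stationary sensor measuring 1 g on its z-axis, which is presumably why it is not all zeros.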

Examples

Case 0:

>>> from skdh.io import MultiReader
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=None)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])

Case 1:

>>> kw = {'time_col_name': 'timestamp', 'read_csv_kwargs': {'skiprows': 5}}
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])

Case 2:

>>> kw = [
...     {'time_col_name': 'ts', 'column_names': {'accel': ['ax', 'ay', 'az']}},
...     {'time_col_name': 'ts', 'column_names': {'temperature': 'temperature C'}},
... ]
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])

Case 3:

>>> kw = {
...     'f1': {'time_col_name': 'ts', 'column_names': {'accel': ['ax', 'ay', 'az']}},
...     'f2': {'time_col_name': 'ts', 'column_names': {'temperature': 'temperature C'}},
... }
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files={'f1': "file1.csv", 'f2': "file2.csv"})

check_handle_gaps(data)

Check for gaps, and fill if specified.

Parameters:
data : dict

Dictionary of data-streams

Returns:
res_dict : dict

Dictionary of data with gaps filled, as specified.
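The three gaps_error options can be pictured as a small policy applied once gaps have been detected. The sketch below is illustrative only: find_gaps and apply_gaps_policy are hypothetical names, the 1.5-sample-period tolerance is an assumed threshold, and the use of ValueError is an assumption about the exception type rather than documented behavior.

```python
import warnings


def find_gaps(time, fs, tolerance=1.5):
    """Return indices i where time[i] follows time[i - 1] by more than
    ``tolerance`` sample periods."""
    dt = 1.0 / fs
    return [i for i in range(1, len(time)) if time[i] - time[i - 1] > tolerance * dt]


def apply_gaps_policy(n_gaps, gaps_error="raise"):
    """Raise, warn, or stay silent depending on ``gaps_error``."""
    if gaps_error not in ("raise", "warn", "ignore"):
        raise ValueError(f"unknown gaps_error option: {gaps_error!r}")
    if n_gaps == 0 or gaps_error == "ignore":
        return
    message = f"{n_gaps} gap(s) found in the datastreams"
    if gaps_error == "raise":
        raise ValueError(message)
    warnings.warn(message)


gap_indices = find_gaps([0.0, 1.0, 4.0, 5.0], fs=1.0)
apply_gaps_policy(len(gap_indices), gaps_error="ignore")  # gaps are tolerated silently
```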

static concat(data)

Custom concatenation function for data streams to handle inputs that are either tuples of arrays or tuples of dictionaries.

Parameters:
data : tuple

Tuple of either numpy.ndarrays to concatenate, or dictionaries whose keys should be concatenated.

Returns:
data : {ndarray, dict}

Concatenated data.
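The two input shapes can be handled with a single recursive function, as in the sketch below. This is a simplified stand-in rather than the method's source: concat_sketch is a hypothetical name, Python lists replace numpy arrays (with ndarrays, numpy.concatenate would do the joining), and all dictionaries are assumed to share the keys of the first one.

```python
def concat_sketch(data):
    """Concatenate a tuple of sequences, or a tuple of dictionaries key by key."""
    if not data:
        return []
    if all(isinstance(item, dict) for item in data):
        # recurse per key so nested dictionaries are also handled
        return {key: concat_sketch(tuple(item[key] for item in data)) for key in data[0]}
    joined = []
    for item in data:
        joined.extend(item)
    return joined


arrays = concat_sketch(([1, 2], [3]))
streams = concat_sketch(({"x": [1], "y": [2]}, {"x": [3], "y": [4]}))
```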

get_reader_kw(idx)

Get the appropriate reader class key-word arguments.

Parameters:
idx : str, int

Index of the file to use to retrieve reader kwargs

handle_combine(res)

Combine results.

Parameters:
res : {list, dict}

Returns:
results : dict

Datastream results dictionary

handle_concatenation(res)

Concatenate results.

Parameters:
res : list

List of results dictionaries

Returns:
results : dict

Dictionary of results datastreams

handle_results(res)

Handle the combining of results.

Parameters:
res : {dict, list}

Dictionary or list of results.

Returns:
results : dict

Dictionary of final results

predict(*, files=None, **kwargs)

Read the files or files from a directory.

Parameters:
files : {array-like, dict}

Either a list-like of files to read, or a dictionary of keys corresponding to files to read. Keys match with those provided to reader_kw upon initializing the process.

Notes

Note that any additional key-word arguments passed to MultiReader.predict will be passed along to the reader.predict method.