skdh.io.MultiReader
- class skdh.io.MultiReader(mode, reader, reader_kw=None, resample_to_lowest=True, fill_gaps=True, fill_value=None, gaps_error='raise', require_all_keys=True, ignore_file_size_check=True)
A process for reading in multiple files into one set of datastreams for processing.
- Parameters:
- mode : {‘combine’, ‘concatenate’, ‘leave’}
The mode to use when reading multiple files. “combine” merges different data-streams from different files. “concatenate” joins the same data-stream from multiple files (not allowed with Case 3 of reader_kw). “leave” is only valid with Case 3 of reader_kw (see Notes); it leaves the results in separate dictionaries, keyed by the keys of reader_kw and of the files argument passed to the predict method.
- reader : str
The name of the reader class to use. See File Reading (skdh.io).
- reader_kw : {None, array-like, dict}, optional
Reader key-word arguments used to initialize the reader. See Notes for the four ways to specify them (Cases 0 to 3).
- resample_to_lowest : bool, optional
When mode is “combine”, resample the separate datastreams to the sampling rate of the lowest-rate stream. Default is True. If False, all datastreams are resampled to the highest-rate stream instead.
- fill_gaps : bool, optional
Fill any gaps in data streams where possible (streams that are the same size as the time array). Default is True.
- fill_value : {None, dict}, optional
Dictionary mapping data-stream names to the values used to fill gaps. See Notes for the default values used when not provided.
- gaps_error : {‘raise’, ‘warn’, ‘ignore’}, optional
Behavior when there are gaps in the datastreams. Default is ‘raise’, which raises an error.
- require_all_keys : bool, optional
Require all files to provide the same keys. Default is True.
Methods
check_handle_gaps(data): Check for gaps, and fill if specified.
concat(data): Custom concatenation function for data streams to handle inputs that are either tuples of arrays or tuples of dictionaries.
convert_timestamps(t): Convert a timestamp/array of timestamps to a datetime object.
get_reader_kw(idx): Get the appropriate reader class key-word arguments.
handle_combine(res): Combine results.
handle_concatenation(res): Concatenate results.
handle_results(res): Handle the combining of results.
predict(*[, files]): Read the files or files from a directory.
save_results(results, file_name): Save the results of the processing pipeline to a csv file.
handle_gaps_error
Notes
The combine mode should be used in the case when you have, for example, acceleration and temperature in separate files. Concatenate should be used when multiple files all contain acceleration, for example. In this case, read results will be concatenated together in order of the first timestamp.
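The ordering rule for concatenation can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper name `concatenate_by_first_timestamp` is hypothetical, and it assumes each per-file result is a dictionary holding a `time` array plus data arrays with a matching first axis.

```python
import numpy as np


def concatenate_by_first_timestamp(results):
    """Concatenate per-file result dictionaries in order of first timestamp.

    Hypothetical helper for illustration. Assumes every dictionary has a
    'time' array and that all arrays share the same first (sample) axis.
    """
    # Sort the per-file results by the first timestamp in each file.
    ordered = sorted(results, key=lambda r: r["time"][0])
    # Concatenate every key along the sample axis, in that order.
    return {
        key: np.concatenate([r[key] for r in ordered], axis=0)
        for key in ordered[0]
    }


# Two files holding the same stream, passed in reverse chronological order.
later = {"time": np.array([10.0, 11.0]), "accel": np.ones((2, 3))}
earlier = {"time": np.array([0.0, 1.0]), "accel": np.zeros((2, 3))}
merged = concatenate_by_first_timestamp([later, earlier])
```

Sorting on the first timestamp means the order in which the files are listed does not affect the resulting stream in this sketch.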
reader_kw can be specified in four ways, referred to as Cases 0 to 3:
- Case 0: None, so no key-word arguments override the reader defaults.
- Case 1: A dictionary of key-word arguments that will be the same for all files.
- Case 2: A list of dictionaries of key-word arguments, one per file. Its length must equal the number of files provided to the predict method, and the entries are matched to the files in order.
- Case 3: A dictionary of dictionaries of key-word arguments. In this case, the files argument for predict should be a dictionary with the same key-names as reader_kw, and the key-word arguments will be associated with the path in the files dictionary.
Note that if the reader returns the same keys, and mode is “combine”, keys will be overwritten.
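One way to picture how the different reader_kw forms map to a single file is the sketch below. The function `resolve_reader_kw` is hypothetical and only mirrors the description above; in particular, treating a string index as the signal for the dictionary-of-dictionaries form is an assumption of this sketch, not confirmed library behavior.

```python
def resolve_reader_kw(reader_kw, idx):
    """Pick the key-word arguments for one file (hypothetical helper).

    idx is an integer position when files are given as a list, or a
    string key when files are given as a dictionary.
    """
    if reader_kw is None:
        # Case 0: nothing overrides the reader defaults.
        return {}
    if isinstance(reader_kw, (list, tuple)):
        # Case 2: one dictionary per file, matched by position.
        return reader_kw[idx]
    if isinstance(idx, str):
        # Case 3: dictionary of dictionaries, matched by file key.
        return reader_kw[idx]
    # Case 1: a single dictionary shared by all files.
    return reader_kw


shared = {"time_col_name": "timestamp"}
per_file = [{"time_col_name": "ts"}, {"time_col_name": "t"}]
by_key = {"f1": {"time_col_name": "ts"}, "f2": {"time_col_name": "t"}}

kw_case0 = resolve_reader_kw(None, 0)
kw_case1 = resolve_reader_kw(shared, 1)
kw_case2 = resolve_reader_kw(per_file, 1)
kw_case3 = resolve_reader_kw(by_key, "f2")
```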
Default fill values are:
accel: numpy.array([0.0, 0.0, 1.0])
gyro: 0.0
temperature: 0.0
ecg: 0.0
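As a rough illustration of what filling with these defaults could look like, the sketch below places the observed samples on a regular time grid and writes the fill value everywhere else. The helper `fill_gaps_on_grid` and the grid-based approach are assumptions made for this example and may differ from how the library fills gaps.

```python
import numpy as np

# Default fill values as listed above.
DEFAULT_FILL = {
    "accel": np.array([0.0, 0.0, 1.0]),
    "gyro": 0.0,
    "temperature": 0.0,
    "ecg": 0.0,
}


def fill_gaps_on_grid(time, data, fs, fill_value):
    """Return (time, data) on a regular grid, with missing samples filled.

    Hypothetical helper. Assumes timestamps in seconds and a nominal
    sampling frequency fs in Hz.
    """
    # Number of samples a gap-free recording of this span would contain.
    n = int(round((time[-1] - time[0]) * fs)) + 1
    full_time = time[0] + np.arange(n) / fs
    # Position of each observed sample on the regular grid.
    idx = np.round((time - time[0]) * fs).astype(int)
    # Start from the fill value everywhere, then write the observed samples.
    filled = np.empty((n,) + data.shape[1:], dtype=float)
    filled[:] = fill_value
    filled[idx] = data
    return full_time, filled


# A 1 Hz recording with samples at t = 3 s and t = 4 s missing.
t = np.array([0.0, 1.0, 2.0, 5.0, 6.0])
accel = np.full((5, 3), 0.5)
full_t, full_accel = fill_gaps_on_grid(t, accel, fs=1.0, fill_value=DEFAULT_FILL["accel"])
```

The acceleration default of [0.0, 0.0, 1.0] has unit magnitude, unlike the scalar 0.0 used for the other streams.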
Examples
Case 0:
>>> from skdh.io import MultiReader
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=None)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])
Case 1:
>>> kw = {'time_col_name': 'timestamp', 'read_csv_kwargs': {'skiprows': 5}}
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])
Case 2:
>>> kw = [
...     {'time_col_name': 'ts', 'column_names': {'accel': ['ax', 'ay', 'az']}},
...     {'time_col_name': 'ts', 'column_names': {'temperature': 'temperature C'}},
... ]
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files=["file1.csv", "file2.csv"])
Case 3:
>>> kw = {
...     'f1': {'time_col_name': 'ts', 'column_names': {'accel': ['ax', 'ay', 'az']}},
...     'f2': {'time_col_name': 'ts', 'column_names': {'temperature': 'temperature C'}},
... }
>>> mrdr = MultiReader(mode='combine', reader='ReadCsv', reader_kw=kw)
>>> mrdr.predict(files={'f1': "file1.csv", 'f2': "file2.csv"})
- check_handle_gaps(data)
Check for gaps, and fill if specified.
- Parameters:
- data : dict
Dictionary of data-streams
- Returns:
- res_dict : dict
Dictionary of data with gaps filled, as specified.
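The three gaps_error options can be pictured as a small policy function applied after gap detection. Both helpers below are hypothetical; the 1.5-sample-period threshold used to flag a gap is an assumption of this sketch, not a documented value.

```python
import warnings

import numpy as np


def find_gaps(time, fs):
    """Return indices i where the step from sample i to i + 1 is a gap.

    Assumption: a step longer than 1.5 sample periods counts as a gap.
    """
    return np.nonzero(np.diff(time) > 1.5 / fs)[0]


def apply_gaps_policy(n_gaps, gaps_error="raise"):
    """Raise, warn, or stay silent depending on gaps_error (hypothetical)."""
    if n_gaps == 0 or gaps_error == "ignore":
        return
    msg = f"{n_gaps} gap(s) found in the datastream."
    if gaps_error == "raise":
        raise ValueError(msg)
    if gaps_error == "warn":
        warnings.warn(msg)
        return
    raise ValueError(f"Unknown gaps_error option: {gaps_error!r}")


# A 1 Hz recording with one gap between t = 2 s and t = 5 s.
gap_idx = find_gaps(np.array([0.0, 1.0, 2.0, 5.0, 6.0]), fs=1.0)
# With 'ignore' nothing happens; 'raise' would raise a ValueError here.
apply_gaps_policy(gap_idx.size, gaps_error="ignore")
```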
- static concat(data)
Custom concatenation function for data streams to handle inputs that are either tuples of arrays or tuples of dictionaries.
- Parameters:
- data : tuple
Tuple of either numpy.ndarrays to concatenate, or dictionaries whose values should be concatenated key by key.
- Returns:
- data : {ndarray, dict}
Concatenated data.
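A minimal sketch of the two input shapes described above, written with plain NumPy. The function `concat_streams` is a hypothetical stand-in rather than the method's actual source, and it assumes all dictionaries in the tuple share the same keys.

```python
import numpy as np


def concat_streams(data):
    """Concatenate a tuple of arrays, or a tuple of dictionaries of arrays.

    Hypothetical stand-in for illustration only.
    """
    if all(isinstance(d, dict) for d in data):
        # Tuple of dictionaries: concatenate the arrays stored under each key.
        return {
            key: np.concatenate([d[key] for d in data], axis=0)
            for key in data[0]
        }
    # Tuple of arrays: concatenate directly along the first axis.
    return np.concatenate(data, axis=0)


arrays_out = concat_streams((np.array([1.0, 2.0]), np.array([3.0])))
dicts_out = concat_streams(
    ({"x": np.array([1.0]), "y": np.array([10.0])},
     {"x": np.array([2.0]), "y": np.array([20.0])})
)
```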
- get_reader_kw(idx)
Get the appropriate reader class key-word arguments.
- Parameters:
- idx : str, int
Index of the file to use to retrieve reader kwargs
- handle_combine(res)
Combine results.
- Parameters:
- res : {list, dict}
- Returns:
- results : dict
Datastream results dictionary
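To make the effect of resample_to_lowest concrete, here is a sketch that combines streams recorded at different rates onto one time base. It uses plain linear interpolation (numpy.interp), which is a simplification chosen for this example and not necessarily the resampling method the library uses; the function `combine_streams` and its input layout are hypothetical.

```python
import numpy as np


def combine_streams(streams, resample_to_lowest=True):
    """Combine streams onto the time base of the lowest (or highest) rate.

    Hypothetical helper. `streams` maps a stream name to a tuple of
    (time, values, fs). Values may be 1-D or 2-D (samples x channels).
    """
    rates = {name: fs for name, (_, _, fs) in streams.items()}
    # Pick the reference stream by sampling rate.
    pick = min if resample_to_lowest else max
    ref = pick(rates, key=rates.get)
    ref_time = streams[ref][0]

    combined = {"time": ref_time, "fs": rates[ref]}
    for name, (t, x, _) in streams.items():
        x = np.asarray(x, dtype=float)
        if x.ndim == 1:
            combined[name] = np.interp(ref_time, t, x)
        else:
            # Interpolate each channel separately.
            combined[name] = np.column_stack(
                [np.interp(ref_time, t, x[:, i]) for i in range(x.shape[1])]
            )
    return combined


t_acc = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # 4 Hz
t_temp = np.array([0.0, 0.5, 1.0])             # 2 Hz
streams = {
    "accel": (t_acc, np.column_stack([t_acc, t_acc, t_acc]), 4.0),
    "temperature": (t_temp, np.array([20.0, 21.0, 22.0]), 2.0),
}
low = combine_streams(streams, resample_to_lowest=True)
high = combine_streams(streams, resample_to_lowest=False)
```

Linear interpolation does no anti-alias filtering, so a real downsampling step would usually need more care than this sketch shows.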
- handle_concatenation(res)
Concatenate results.
- Parameters:
- res : list
List of results dictionaries
- Returns:
- results : dict
Dictionary of results datastreams
- handle_results(res)
Handle the combining of results.
- Parameters:
- res : {dict, list}
Dictionary or list of results.
- Returns:
- results : dict
Dictionary of final results
- predict(*, files=None, **kwargs)
Read the files or files from a directory.
- Parameters:
- files : {array-like, dict}
Either a list-like collection of files to read, or a dictionary whose values are the files to read. The dictionary keys must match those provided to reader_kw when initializing the process.
Notes
Any additional key-word arguments passed to MultiReader.predict are passed along to the reader's predict method.