deepextractor.data.preprocessing
Module Contents
- class deepextractor.data.preprocessing.ChannelStandardScaler
Per-channel standard scaler for (N, C, T) time-series data.
Fits one mean and one standard deviation per channel, computed across all samples and time points. The mean_ and scale_ attributes (each of shape (C,)) are compatible with the HDF5Dataset input_scaler interface.
Instances can be serialised with joblib.dump or pickle.
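The per-channel scaling described above can be illustrated with plain NumPy. This is a minimal sketch of the arithmetic, not the class's actual implementation: the sample data is made up, the population standard deviation (ddof=0) is an assumption, and no handling of zero-variance channels is shown.

```python
import numpy as np

# Illustrative data only: N=8 samples, C=3 channels, T=100 time points,
# with a different offset per channel.
rng = np.random.default_rng(0)
X = rng.normal(loc=[[[1.0], [5.0], [-2.0]]], scale=2.0, size=(8, 3, 100))

# One mean and one std per channel: reduce over samples (axis 0) and time (axis 2).
mean_ = X.mean(axis=(0, 2))   # shape (C,)
scale_ = X.std(axis=(0, 2))   # shape (C,), ddof=0 assumed here

# Broadcast the (C,) statistics against (N, C, T) to standardise each channel.
X_scaled = (X - mean_[None, :, None]) / scale_[None, :, None]

# Undoing the scaling recovers the original values.
X_restored = X_scaled * scale_[None, :, None] + mean_[None, :, None]
```

After scaling, each channel has approximately zero mean and unit standard deviation when measured over all samples and time points.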
- fit(X: numpy.ndarray) -> ChannelStandardScaler
Fit on array X of shape (N, C, T).
For large datasets use fit_from_hdf5 instead to avoid loading everything into memory.
- transform(X: numpy.ndarray) -> numpy.ndarray
- inverse_transform(X: numpy.ndarray) -> numpy.ndarray
- fit_transform(X: numpy.ndarray) -> numpy.ndarray
- fit_from_hdf5(hdf5_path: str, key: str, chunk_size: int = 2048) -> ChannelStandardScaler
Fit on a dataset too large to load at once.
Computes per-channel mean and variance in two chunked passes over the HDF5 dataset: the first pass accumulates the mean, and the second accumulates the variance around that mean. Only one chunk is held in memory at a time, so memory usage is O(chunk_size * C * T) rather than O(N * C * T).
- Parameters:
hdf5_path – Path to the HDF5 file.
key – Dataset key with shape (N, C, T).
chunk_size – Number of samples to process at a time.
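The two-pass scheme described for fit_from_hdf5 can be sketched as follows. This is an illustrative approximation under stated assumptions, not the library's code: the helper name two_pass_channel_stats is hypothetical, it accepts any array-like object that exposes .shape and supports slicing along the first axis, and it uses the population variance (ddof=0).

```python
import numpy as np

def two_pass_channel_stats(dset, chunk_size=2048):
    """Per-channel mean and std of an (N, C, T) array-like, read in chunks.

    Hypothetical helper for illustration; `dset` only needs `.shape`
    and slicing along the first axis.
    """
    n, c, t = dset.shape
    count = n * t  # number of values contributing to each channel

    # Pass 1: accumulate per-channel sums to obtain the mean.
    total = np.zeros(c, dtype=np.float64)
    for start in range(0, n, chunk_size):
        chunk = np.asarray(dset[start:start + chunk_size], dtype=np.float64)
        total += chunk.sum(axis=(0, 2))
    mean = total / count

    # Pass 2: accumulate squared deviations from that mean.
    sq_dev = np.zeros(c, dtype=np.float64)
    for start in range(0, n, chunk_size):
        chunk = np.asarray(dset[start:start + chunk_size], dtype=np.float64)
        sq_dev += ((chunk - mean[None, :, None]) ** 2).sum(axis=(0, 2))
    std = np.sqrt(sq_dev / count)  # ddof=0 assumed

    return mean, std

# Small in-memory stand-in for an on-disk dataset (made-up data).
rng = np.random.default_rng(0)
data = rng.normal(size=(10, 2, 50))
mean, std = two_pass_channel_stats(data, chunk_size=4)
```

An h5py dataset, obtained for example as h5py.File(hdf5_path, "r")[key], supports the same .shape and slicing operations, so it could be passed in place of the NumPy array. Each slice then reads only chunk_size samples from disk.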