deepextractor.data.preprocessing
================================

.. py:module:: deepextractor.data.preprocessing


Module Contents
---------------

.. py:class:: ChannelStandardScaler

   Per-channel standard scaler for ``(N, C, T)`` time-series data.

   Fits one mean and std per channel across all samples and time points.
   The ``mean_`` and ``scale_`` attributes (shape ``(C,)``) are compatible with
   the ``HDF5Dataset`` ``input_scaler`` interface.

   Compatible with ``joblib.dump`` / ``pickle`` for serialisation.


   .. py:attribute:: mean_
      :value: None


   .. py:attribute:: scale_
      :value: None


   .. py:attribute:: n_channels_
      :value: None


   .. py:method:: fit(X: numpy.ndarray) -> ChannelStandardScaler

      Fit on array ``X`` of shape ``(N, C, T)``.

      For large datasets use ``fit_from_hdf5`` instead to avoid loading
      everything into memory.


   .. py:method:: transform(X: numpy.ndarray) -> numpy.ndarray


   .. py:method:: inverse_transform(X: numpy.ndarray) -> numpy.ndarray


   .. py:method:: fit_transform(X: numpy.ndarray) -> numpy.ndarray


   .. py:method:: fit_from_hdf5(hdf5_path: str, key: str, chunk_size: int = 2048) -> ChannelStandardScaler

      Fit on a dataset too large to load at once.

      Computes per-channel mean and variance in two online passes over the
      HDF5 dataset: first pass for the mean, second for the variance.
      Memory usage is ``O(chunk_size * C * T)`` rather than ``O(N * C * T)``.

      :param hdf5_path: Path to the HDF5 file.
      :param key: Dataset key with shape ``(N, C, T)``.
      :param chunk_size: Number of samples to process at a time.
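
The two-pass, chunked fitting described for ``fit_from_hdf5`` can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: the helper names ``channel_stats_in_memory`` and ``channel_stats_two_pass`` are hypothetical, a NumPy array stands in for the HDF5 dataset (any object exposing ``.shape`` and slicing along the first axis would do), and the use of the population standard deviation (``ddof=0``) is an assumption, since the documentation above does not say which convention the scaler uses.

```python
import numpy as np


def channel_stats_in_memory(X):
    """Per-channel mean and std over the sample (N) and time (T) axes."""
    return X.mean(axis=(0, 2)), X.std(axis=(0, 2))


def channel_stats_two_pass(data, chunk_size=2048):
    """Two chunked passes over `data` of shape (N, C, T).

    Only `chunk_size` samples are held in memory at a time.
    """
    n, c, t = data.shape
    count = n * t  # number of values contributing to each channel

    # Pass 1: accumulate per-channel sums to obtain the mean.
    total = np.zeros(c, dtype=np.float64)
    for start in range(0, n, chunk_size):
        chunk = np.asarray(data[start:start + chunk_size], dtype=np.float64)
        total += chunk.sum(axis=(0, 2))
    mean = total / count

    # Pass 2: accumulate squared deviations from that mean.
    sq_dev = np.zeros(c, dtype=np.float64)
    for start in range(0, n, chunk_size):
        chunk = np.asarray(data[start:start + chunk_size], dtype=np.float64)
        sq_dev += ((chunk - mean[None, :, None]) ** 2).sum(axis=(0, 2))

    # Assumption: population std (ddof=0); the real scaler may differ.
    std = np.sqrt(sq_dev / count)
    return mean, std


# Synthetic (N, C, T) data: two channels with different offsets and spreads.
rng = np.random.default_rng(0)
offset = np.array([1.0, -3.0])[None, :, None]
spread = np.array([0.5, 2.0])[None, :, None]
X = rng.normal(size=(100, 2, 64)) * spread + offset

mean_a, std_a = channel_stats_in_memory(X)
mean_b, std_b = channel_stats_two_pass(X, chunk_size=32)

# Standardise and invert by broadcasting the (C,) statistics over N and T.
X_scaled = (X - mean_b[None, :, None]) / std_b[None, :, None]
X_back = X_scaled * std_b[None, :, None] + mean_b[None, :, None]
```

Both routes produce statistics of shape ``(C,)``, matching the documented shape of ``mean_`` and ``scale_``. The chunked version reads the data twice in exchange for never holding more than ``chunk_size`` samples in memory. A production implementation would likely also guard against channels with zero variance before dividing.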