deepextractor.data.preprocessing
================================

.. py:module:: deepextractor.data.preprocessing


Module Contents
---------------

.. py:class:: ChannelStandardScaler

   Per-channel standard scaler for (N, C, T) time-series data.

   Fits one mean and one standard deviation per channel, computed across all
   samples and time points. The ``mean_`` and ``scale_`` attributes (each of
   shape ``(C,)``) are compatible with the ``HDF5Dataset`` ``input_scaler``
   interface.

   Instances can be serialised with ``joblib.dump`` or ``pickle``.
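
   The per-channel statistics described above can be sketched in plain NumPy.
   This is an illustration only, not the class's implementation: it assumes
   the population standard deviation (``ddof=0``, the scikit-learn
   convention) and the usual ``(X - mean) / scale`` transform, neither of
   which is confirmed by this page. All variable names are hypothetical.

   .. code-block:: python

      import numpy as np

      rng = np.random.default_rng(0)
      X = rng.normal(loc=5.0, scale=2.0, size=(8, 3, 16))  # (N, C, T)

      # One statistic per channel: reduce over samples (axis 0) and time (axis 2).
      mean = X.mean(axis=(0, 2))   # shape (C,)
      scale = X.std(axis=(0, 2))   # shape (C,), assumed ddof=0

      # Broadcast the (C,) statistics against the (N, C, T) array.
      X_scaled = (X - mean[None, :, None]) / scale[None, :, None]

      # Assumed inverse: undo the scaling, then the shift.
      X_restored = X_scaled * scale[None, :, None] + mean[None, :, None]

   After scaling, each channel has zero mean and unit standard deviation when
   pooled over samples and time points.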


   .. py:attribute:: mean_
      :value: None


   .. py:attribute:: scale_
      :value: None


   .. py:attribute:: n_channels_
      :value: None


   .. py:method:: fit(X: numpy.ndarray) -> ChannelStandardScaler

      Fit on an array ``X`` of shape ``(N, C, T)``.

      For large datasets, use ``fit_from_hdf5`` instead to avoid loading
      everything into memory.


   .. py:method:: transform(X: numpy.ndarray) -> numpy.ndarray


   .. py:method:: inverse_transform(X: numpy.ndarray) -> numpy.ndarray


   .. py:method:: fit_transform(X: numpy.ndarray) -> numpy.ndarray


   .. py:method:: fit_from_hdf5(hdf5_path: str, key: str, chunk_size: int = 2048) -> ChannelStandardScaler

      Fit on a dataset too large to load at once.

      Computes the per-channel mean and variance in two passes over the HDF5
      dataset, reading ``chunk_size`` samples at a time: the first pass
      accumulates the mean, the second the variance. Memory usage is
      ``O(chunk_size * C * T)`` rather than ``O(N * C * T)``.

      :param hdf5_path: Path to the HDF5 file.
      :param key: Dataset key with shape (N, C, T).
      :param chunk_size: Number of samples to process at a time.
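
      The two-pass scheme can be sketched as follows. This is a minimal
      illustration, not the method's actual code: ``two_pass_channel_stats``
      is a hypothetical helper, an in-memory NumPy array stands in for the
      HDF5 dataset (an ``h5py`` dataset supports the same slice syntax), and
      the population variance (``ddof=0``) is assumed.

      .. code-block:: python

         import numpy as np

         def two_pass_channel_stats(dataset, chunk_size=2048):
             """Return per-channel (mean, std) of an (N, C, T) array-like."""
             n_samples, n_channels, n_time = dataset.shape
             count = n_samples * n_time  # values per channel

             # Pass 1: accumulate per-channel sums to obtain the mean.
             total = np.zeros(n_channels, dtype=np.float64)
             for start in range(0, n_samples, chunk_size):
                 chunk = np.asarray(dataset[start:start + chunk_size],
                                    dtype=np.float64)
                 total += chunk.sum(axis=(0, 2))
             mean = total / count

             # Pass 2: accumulate squared deviations from the mean.
             sq_dev = np.zeros(n_channels, dtype=np.float64)
             for start in range(0, n_samples, chunk_size):
                 chunk = np.asarray(dataset[start:start + chunk_size],
                                    dtype=np.float64)
                 sq_dev += ((chunk - mean[None, :, None]) ** 2).sum(axis=(0, 2))

             return mean, np.sqrt(sq_dev / count)

         rng = np.random.default_rng(1)
         data = rng.normal(size=(10, 2, 32))  # stand-in for the HDF5 dataset
         mean, scale = two_pass_channel_stats(data, chunk_size=4)

      Only one chunk is held in memory at a time, and subtracting the mean
      before squaring avoids the cancellation error of the single-pass
      ``E[x^2] - E[x]^2`` formula.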