Commit 360af80

sharifhsn and claude committed:

PERF: Batch spectrogram calls in Welch PSD computation

Replace np.apply_along_axis (which calls scipy.signal.spectrogram once per row) with chunked 2D calls. scipy.signal.spectrogram handles multi-row input efficiently via vectorized FFT, so processing ~10 MB chunks instead of individual rows eliminates per-call Python dispatch overhead. On 320 epochs x 376 channels (120K rows), psd_array_welch goes from ~5.0 s to ~0.19 s (~26x speedup).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 parent 14d0916
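The claim in the commit message can be checked with a small sketch (not MNE code; sizes and the sampling rate are made up for illustration): scipy.signal.spectrogram accepts N-D input and transforms along the last axis, so one 2D call produces the same result as one Python-level call per row.

```python
import numpy as np
from scipy.signal import spectrogram

rng = np.random.default_rng(0)
data = rng.standard_normal((64, 2048))  # hypothetical: 64 rows of 2048 samples

# Per-row: one Python-level call per row (what np.apply_along_axis incurs).
per_row = np.stack([spectrogram(row, fs=1000.0, nperseg=256)[2] for row in data])

# Batched: a single call on the 2D array, transformed along the last axis.
f, t, batched = spectrogram(data, fs=1000.0, nperseg=256)

# Same numbers, far fewer Python-level dispatches.
assert np.allclose(per_row, batched)
```

The batched result has shape `(n_rows, len(f), len(t))`, i.e. the row axis is simply carried through, which is what makes the chunked rewrite in this commit a drop-in replacement.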

1 file changed: mne/time_frequency/psd.py (15 additions & 6 deletions)
@@ -62,14 +62,23 @@ def _decomp_aggregate_mask(epoch, func, average, freq_sl):


 def _spect_func(epoch, func, freq_sl, average, *, output="power"):
     """Aux function."""
-    # Decide if we should split this to save memory or not, since doing
-    # multiple calls will incur some performance overhead. Eventually we might
-    # want to write (really, go back to) our own spectrogram implementation
-    # that, if possible, averages after each transform, but this will incur
-    # a lot of overhead because of the many Python calls required.
+    # Process in chunks to balance vectorization (scipy.signal.spectrogram
+    # handles multi-row input efficiently) against memory usage.
     kwargs = dict(func=func, average=average, freq_sl=freq_sl)
     if epoch.nbytes > 10e6:
-        spect = np.apply_along_axis(_decomp_aggregate_mask, -1, epoch, **kwargs)
+        # Process in chunks of rows instead of one-by-one. Each chunk is
+        # passed to spectrogram as a 2D array, which is much faster than
+        # calling spectrogram per-row via np.apply_along_axis.
+        n_rows = epoch.shape[0]
+        # Target ~10 MB per chunk (same threshold as the original code)
+        row_bytes = epoch[0].nbytes
+        chunk_size = max(1, int(10e6 / row_bytes))
+        parts = []
+        for start in range(0, n_rows, chunk_size):
+            parts.append(
+                _decomp_aggregate_mask(epoch[start : start + chunk_size], **kwargs)
+            )
+        spect = np.concatenate(parts, axis=0)
     else:
         spect = _decomp_aggregate_mask(epoch, **kwargs)
     return spect
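As a self-contained illustration of the new chunked path, here is a runnable sketch of the same pattern with a stand-in reducer in place of _decomp_aggregate_mask (a hypothetical Welch-style mean over spectrogram segments; the array shape, sampling rate, and nperseg are assumptions, not values from MNE):

```python
import numpy as np
from scipy.signal import spectrogram

def psd_rows(x):
    # Stand-in for _decomp_aggregate_mask: mean power per frequency bin,
    # computed over spectrogram segments along the last axis.
    return spectrogram(x, fs=1000.0, nperseg=256)[2].mean(axis=-1)

epoch = np.random.default_rng(1).standard_normal((500, 4096))

# Target ~10 MB per chunk, as in the committed code.
chunk_size = max(1, int(10e6 / epoch[0].nbytes))
parts = [
    psd_rows(epoch[start : start + chunk_size])
    for start in range(0, len(epoch), chunk_size)
]
spect = np.concatenate(parts, axis=0)

# Chunking changes only how the work is batched, not the result.
assert np.allclose(spect, psd_rows(epoch))
```

Because each chunk is a 2D slice transformed in one vectorized call, the number of Python-level spectrogram calls drops from one per row to one per ~10 MB chunk, which is where the reported speedup comes from.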
