Skip the redundant permute in matricize, returning a view for dense arrays #183

Merged
mtfishman merged 1 commit into main from mf/matricize-maybe-view on Jun 26, 2026

@mtfishman mtfishman commented Jun 26, 2026

Member

Summary

`matricize`/`matricizeop` now skip the permuted copy when the requested row and column grouping is already in storage order, instead of always permuting first. A new `matricizekind` classifier, dispatched on `FusionStyle`, decides this per bipermutation: the generic classifier recognizes the always-safe aligned case (skipping a no-op permute is valid for any style), and `ReshapeFusion` (dense) additionally recognizes a pure codomain/domain swap, which it realizes as a lazy `transpose`. For a dense array the aligned and swapped cases return a `reshape`/`transpose` view of the input. For a graded array `matricize` still gathers blocks into a new matrix, but the redundant permute copy beforehand is skipped. The fast paths require `op === identity`, since a plain view cannot carry a fused `op` like `conj`. The result may alias the input and is read-only, which matches the `matricizeop` docstring's existing contract.
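
For readers who want the three cases spelled out, here is a minimal sketch for plain dense arrays. The names `classify` and `matricize_sketch`, the returned symbols, and the way `op` is applied in the fallback are all assumptions made for illustration; the PR's `matricizekind` dispatches on `FusionStyle` and may be structured quite differently.

```julia
using LinearAlgebra  # for the `Transpose` type checked at the end

# Hypothetical names throughout: `classify` and `matricize_sketch` illustrate
# the idea and are not the package's `matricizekind`/`matricize` API.

# Classify a row/column grouping of the dimensions 1:n of a dense array.
function classify(codomain::Tuple, domain::Tuple)
    storage_order = ntuple(identity, length(codomain) + length(domain))
    # Aligned: rows followed by columns already enumerate 1:n.
    (codomain..., domain...) == storage_order && return :aligned
    # Swapped: columns followed by rows enumerate 1:n.
    (domain..., codomain...) == storage_order && return :swapped
    return :general
end

function matricize_sketch(op, a::AbstractArray, codomain::Tuple, domain::Tuple)
    rows = prod(d -> size(a, d), codomain; init = 1)
    cols = prod(d -> size(a, d), domain; init = 1)
    # A plain view cannot apply `op`, so any non-identity `op` takes the copying path.
    kind = op === identity ? classify(codomain, domain) : :general
    if kind === :aligned
        return reshape(a, rows, cols)             # shares memory with `a`
    elseif kind === :swapped
        return transpose(reshape(a, cols, rows))  # lazy wrapper over shared memory
    else
        # Permute into (rows..., cols...) order, apply `op`, then reshape the copy.
        return reshape(op.(permutedims(a, (codomain..., domain...))), rows, cols)
    end
end

a = reshape(collect(1.0:24.0), 2, 3, 4)
aligned = matricize_sketch(identity, a, (1, 2), (3,))
swapped = matricize_sketch(identity, a, (3,), (1, 2))
general = matricize_sketch(identity, a, (2,), (1, 3))
@assert swapped isa Transpose
```

In this sketch `aligned` and `swapped` alias `a`, so writing to `a` afterwards is visible through both, while `general` is an independent copy. That aliasing is why the result has to be treated as read-only.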

This removes input-copy allocations from aligned contractions. At bond dimension 64 the dense matmul runs about 35% faster and a memory-bound rank-3 contraction about 50% faster, closing most of the gap to an optimized reference. An aligned `AbelianGradedArray` contraction also drops one operand's permute copy.
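
The allocation difference can be seen on a plain `Array` with Base functions alone. This is a rough illustration, not the PR's benchmark: the helper names are hypothetical, and the 64×64×64 shape is an assumption chosen only to echo the bond dimension mentioned above.

```julia
# Hypothetical helpers contrasting the old and new behaviour for an operand
# whose grouping (1, 2) | (3,) is already in storage order.
aligned_view(a) = reshape(a, size(a, 1) * size(a, 2), size(a, 3))
permute_then_reshape(a) =
    reshape(permutedims(a, (1, 2, 3)), size(a, 1) * size(a, 2), size(a, 3))

a = rand(64, 64, 64)

# Warm up so compilation is not counted in the measurements below.
aligned_view(a)
permute_then_reshape(a)

view_bytes = @allocated aligned_view(a)
copy_bytes = @allocated permute_then_reshape(a)

# The identity permutation still copies the whole operand; the reshape does not.
println("view: ", view_bytes, " bytes, permute + reshape: ", copy_bytes, " bytes")
```

Exact byte counts depend on the Julia version, but the permuting path allocates at least `sizeof(a)` bytes for the copy, while the reshaped result points at the same memory as `a`.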

The contraction driver still allocates the product matrix between the matrix multiply and the destination, and the factorizations still copy their input. Removing the product temporary and adding an ownership-preserving `matricize_copy` will land in follow-up PRs.

codecov Bot commented Jun 26, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.07%. Comparing base (289b6c7) to head (0fc13e3).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #183      +/-   ##
==========================================
+ Coverage   78.84%   79.07%   +0.23%     
==========================================
  Files          20       20              
  Lines         657      669      +12     
==========================================
+ Hits          518      529      +11     
- Misses        139      140       +1     
| Flag | Coverage Δ |
|---|---|
| docs | 30.20% <63.63%> (+0.37%) ⬆️ |

@mtfishman mtfishman force-pushed the mf/matricize-maybe-view branch from 4ba84bf to ec2331e Compare June 26, 2026 18:24
@mtfishman mtfishman changed the title from "[WIP] Return a view from dense matricize when the layout allows" to "[WIP] Skip the redundant permute in matricize, returning a view for dense arrays" Jun 26, 2026
…rrays

`matricize`/`matricizeop` now skip the permuted copy when the requested row and column grouping is already in storage order, instead of always permuting first. A new `matricizekind` classifier, dispatched on `FusionStyle`, decides this per bipermutation: the generic classifier recognizes the always-safe aligned case (skipping a no-op permute is valid for any style), and `ReshapeFusion` (dense) additionally recognizes a pure codomain/domain swap, which it realizes as a lazy `transpose`. For a dense array the aligned and swapped cases return a `reshape`/`transpose` view of the input. For a graded array `matricize` still gathers blocks into a new matrix, but the redundant permute copy beforehand is skipped. The fast paths require `op === identity`, since a plain view cannot carry a fused `op` like `conj`. The result may alias the input and is read-only, which matches the `matricizeop` docstring's existing contract.

This removes input-copy allocations from aligned contractions. At bond dimension 64 the dense matmul runs about 35% faster and a memory-bound rank-3 contraction about 50% faster, closing most of the gap to an optimized reference, and an aligned `AbelianGradedArray` contraction drops one operand's permute copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mtfishman mtfishman force-pushed the mf/matricize-maybe-view branch from ec2331e to 0fc13e3 Compare June 26, 2026 18:39
@mtfishman mtfishman changed the title from "[WIP] Skip the redundant permute in matricize, returning a view for dense arrays" to "Skip the redundant permute in matricize, returning a view for dense arrays" Jun 26, 2026
@mtfishman mtfishman marked this pull request as ready for review June 26, 2026 19:00
@mtfishman mtfishman enabled auto-merge (squash) June 26, 2026 19:00
@mtfishman mtfishman merged commit f558047 into main Jun 26, 2026
36 checks passed
@mtfishman mtfishman deleted the mf/matricize-maybe-view branch June 26, 2026 19:22