Conversation
|
Stack from ghstack (oldest at bottom): |
Summary: Adding a diagram of the GPTQ update loop to simplify future work on this code. Also, var renames to simplify: * W_t_quantize_block -> B_cur * k_block_start -> B_cur_k_start * k_block_end -> B_cur_k_end * Err1 -> B_cur_Err1 * Hinv_quantize_block -> Hinv_cur * k_group_start -> G_k_start * k_group_end -> G_k_end Test Plan: CI, this is a comment-only diff ghstack-source-id: 24d0909 ghstack-comment-id: 4184643773 Pull-Request: #4237
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4237
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 4807cac with merge base 6e7a6e9 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
@claude verify that all var renames are correct, that the diagram matches the code, and that this is a pure refactor (no logic change) |
|
Claude finished @vkuzo's task in 8m 58s —— View job Verification Results
1. Variable Renames: PASSAll 7 renames were applied completely and consistently. Grep confirms zero occurrences of any old variable name remain in the file:
2. Diagram vs. Code: Minor inaccuracy in pseudocodeThe diagram structure (triple for loop, shapes, batch propagation) correctly describes the code at a high level. However, the inner loop pseudocode on lines 294-301 has an indexing issue: # for k in range(group_size):
# k2 = B_cur_k_start + k
# w_t = B_cur[:, G_cur_start + k2]In the actual code ( The same issue propagates to the error computation lines, where the diagram uses Suggested fix for the pseudocode — something like: # for k in range(G_start_local, G_end_local):
# w_t = B_cur[:, k]
# w_t_qdq = quant_dequant(w_t, base_config, qparams)
# err1 = (w_t - w_t_qdq) / Hinv_cur[k, k]
# B_cur[:, k:] -= err1.matmul(Hinv_cur[k, k:])
# B_cur_Err1[:, k] = err1.flatten()This is documentation-only; the executable code is correct. 3. Pure Refactor: PASSEvery changed line of executable code is a pure variable rename. Specifically verified:
SummaryThe variable renames and refactor are correct — this is a clean, logic-preserving change. The one item to consider fixing is the inner loop pseudocode in the diagram comment, where the indexing doesn't quite match the actual code's block-local |
Summary: Adding a diagram of the GPTQ update loop to simplify future work on this code. Also, var renames to simplify: * W_t_quantize_block -> B_cur * k_block_start -> B_cur_k_start * k_block_end -> B_cur_k_end * Err1 -> B_cur_Err1 * Hinv_quantize_block -> Hinv_cur * k_group_start -> G_k_start * k_group_end -> G_k_end Test Plan: CI, this is a comment-only diff ghstack-source-id: 5ae51c8 ghstack-comment-id: 4184643773 Pull-Request: #4237
| # * B1, ..., BN are chunks of size [N, B], where B is a hyperparameter of GPTQ | ||
| # * G1, ..., GN are chunks of size [N, G], where G is group_size of the quantization recipe | ||
| # | ||
| # 2. triple for loop, with every loop chunking along the K dimension: |
There was a problem hiding this comment.
is this the same thing the code is doing? if so, would it better to just do inline comment for code (including refactor code itself to be easier to read by adding helper functions etc.)?
|
I think the diagram is really useful. What you suggested is good as well, I
think this PR should land and we could implement your suggestion in a
future PR.
…On Fri, Apr 3, 2026 at 3:58 PM Jerry Zhang ***@***.***> wrote:
@ jerryzh168 commented on this pull request. In
torchao/prototype/gptq/api. py: > + # + # W_t (below) + # + #
|------------------ K --------------------| + # |---B1----|---B2----| .. .
+ # |-G1-|-G2-|-G1-|-G2-| .. . + #
|-----------------------------------------|
***@***.**** commented on this pull request.
------------------------------
In torchao/prototype/gptq/api.py
<#4237?email_source=notifications&email_token=AAMMEIMONABDDB2AQP4CM2T4UAJVFA5CNFSNUABKM5UWIORPF5TWS5BNNB2WEL2QOVWGYUTFOF2WK43UKJSXM2LFO4XTIMBVG4YTAMBZG44KM4TFMFZW63VHNVSW45DJN5XKKZLWMVXHJL3QOJPXEZLWNFSXOX3DNRUWG2Y#discussion_r3034189810>
:
> + #
+ # W_t (below)
+ #
+ # |------------------ K --------------------|
+ # |---B1----|---B2----| ...
+ # |-G1-|-G2-|-G1-|-G2-| ...
+ # |-----------------------------------------|
+ # N | 0 1 2 3 | 4 5 6 7 | ...
+ # | ...
+ # |-----------------------------------------|
+ #
+ # 1. start with W_t, with shape [N, K]
+ # * B1, ..., BN are chunks of size [N, B], where B is a hyperparameter of GPTQ
+ # * G1, ..., GN are chunks of size [N, G], where G is group_size of the quantization recipe
+ #
+ # 2. triple for loop, with every loop chunking along the K dimension:
is this the same thing the code is doing? if so, would it better to just
do inline comment for code (including refactor code itself to be easier to
read by adding helper functions etc.)?
—
Reply to this email directly, view it on GitHub
<#4237?email_source=notifications&email_token=AAMMEIKPP6KAHESWMWBI4T34UAJVFA5CNFSNUABKM5UWIORPF5TWS5BNNB2WEL2QOVWGYUTFOF2WK43UKJSXM2LFO4XTIMBVG4YTAMBZG44KM4TFMFZW63VHNVSW45DJN5XKKZLWMVXHJPLQOJPXEZLWNFSXOX3ON52GSZTJMNQXI2LPNZZV6Y3MNFRWW#pullrequestreview-4057100978>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAMMEIONVMSEW6DKBXH2S2D4UAJVFAVCNFSM6AAAAACXMA74Y6VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHM2DANJXGEYDAOJXHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Summary:
Adding a diagram of the GPTQ update loop to simplify future work on this
code.
Also, var renames to simplify:
Test Plan: CI, GPTQ output did not change locally before+after