This repository was archived by the owner on Oct 26, 2022. It is now read-only.

The gradient (Tensor.grad) of decoder weights is None #143

@NonvolatileMemory

Hello,
I want to get the gradients with respect to the decoder's parameters, such as the embedding layer's weights and the FFN layers' weights.
However, when I run the following command, the result is always None:

print(model.decoder.layers[0].fc1.weight.grad)

and the following command always returns True, even for the FFN weights:

model.decoder.layers[0].fc1.weight.is_leaf

I don't know what is going wrong. Thank you.
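
For reference, a minimal sketch of how Tensor.grad and is_leaf behave in plain PyTorch. It uses a hypothetical stand-in layer (a single nn.Linear named fc1), not this repository's decoder, so it only illustrates the general behaviour: module parameters are leaf tensors, so is_leaf being True is expected, and .grad stays None until backward() has been called on a loss that depends on the parameter. Whether this is the cause here is an assumption; other possible causes include running under torch.no_grad() or the parameter having requires_grad=False.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for one decoder feed-forward layer
# (not the repository's actual model).
fc1 = nn.Linear(4, 8)

# Module parameters are leaf tensors, so is_leaf is True for any weight.
is_leaf = fc1.weight.is_leaf

# Before any backward pass, .grad has not been populated.
grad_before = fc1.weight.grad

# Forward pass, reduce to a scalar loss, then backpropagate.
x = torch.randn(2, 4)
loss = fc1(x).sum()
loss.backward()

# After backward(), .grad is a tensor with the same shape as the weight.
grad_after = fc1.weight.grad

print(is_leaf, grad_before is None, tuple(grad_after.shape))
```

If the gradient is read only after the training step has finished, it may also have been cleared again by the optimizer's zero_grad() call, so printing it directly after backward() is the safer place to check.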
