This repository was archived by the owner on Oct 26, 2022. It is now read-only.

Data preparation #130

@zwx8981

Hi, thank you for your great work. I have a question about data preparation. Specifically, if I want to use the CNN-based sequence encoder and decoder as standalone modules that can be inserted into other translation models, how should I prepare a source dictionary file that can be loaded successfully by the fairseq.data.Dictionary.load() method? I read the source code and found this comment in the Dictionary.load() method:

    """Loads the dictionary from a text file with the format:

    ```
    <symbol0> <count0>
    <symbol1> <count1>
    ...
    ```
    """

What does count0 mean?
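To make the question concrete, here is a minimal sketch of how I would generate such a file, assuming each count is simply the number of times the symbol occurs in the training corpus (that is my guess, not something the docstring states). The helper name `build_dict_file`, the toy corpus, and the file name `dict.src.txt` are all made up for illustration, and the sketch does not import fairseq.

```python
from collections import Counter
import os
import tempfile


def build_dict_file(sentences, path):
    """Write one '<symbol> <count>' line per token, most frequent first.

    Assumption: <count> is the token's frequency in the corpus.
    """
    # Count whitespace-separated tokens across all sentences.
    counts = Counter(tok for line in sentences for tok in line.split())
    with open(path, "w", encoding="utf-8") as f:
        # Sort by descending count, then alphabetically for a stable order.
        for symbol, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
            f.write(f"{symbol} {count}\n")
    return counts


# Hypothetical toy corpus, only to show the resulting file layout.
corpus = ["hello world", "hello there"]
dict_path = os.path.join(tempfile.mkdtemp(), "dict.src.txt")
counts = build_dict_file(corpus, dict_path)

# Read the file back to inspect the '<symbol> <count>' lines.
with open(dict_path, encoding="utf-8") as f:
    dict_lines = f.read().splitlines()
print(dict_lines)
```

If that assumption is right, the resulting path is what I would then pass to fairseq.data.Dictionary.load().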
