Constraints for table concept_relationship (primary key and maybe coherence of start and end dates) #771

@lgautier

The CDM definitions of fields for the table concept_relationship (in the CSV files) are likely missing useful constraints:

There is no primary key, although one is generally recommended to help with data integrity. A composite primary key could be the easiest and least disruptive way to address this, even though it would not completely ensure the consistency of the data in that table (more on this below). I did not find answers to the following questions in the documentation:

  • Can different relationship_ids exist between a given pair of concept IDs in concept_id_1 and concept_id_2?
  • Can a given pair of concept IDs in concept_id_1 and concept_id_2 exist during different periods of time defined by valid_start_date and valid_end_date?
  • Can a given pair of concept IDs in concept_id_1 and concept_id_2 (and with given relationship_id, valid_start_date, and valid_end_date depending on the answers to the question above) be invalidated for different reasons?

If the answer to all three questions is "No", a composite primary key could be concept_id_1 and concept_id_2. If some answers are "Yes", the composite key would need to add columns accordingly.
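One way to answer these questions empirically is to test candidate keys against the data itself. The following is a minimal sketch, not a definitive method: it assumes the table has been loaded into SQLite, the helper names (`count_duplicates`, `narrowest_unique_key`, `build_demo`) are hypothetical, and the demo rows are made up for illustration rather than taken from a real vocabulary release.

```python
import sqlite3

# Candidate composite keys, from narrowest to widest.
# Column names come from the concept_relationship table; the ordering is an assumption.
CANDIDATE_KEYS = [
    ("concept_id_1", "concept_id_2"),
    ("concept_id_1", "concept_id_2", "relationship_id"),
    ("concept_id_1", "concept_id_2", "relationship_id", "valid_start_date"),
]


def count_duplicates(conn, columns):
    """Return how many distinct key values appear on more than one row."""
    cols = ", ".join(columns)
    query = (
        "SELECT COUNT(*) FROM ("
        f"SELECT {cols} FROM concept_relationship "
        f"GROUP BY {cols} HAVING COUNT(*) > 1)"
    )
    return conn.execute(query).fetchone()[0]


def narrowest_unique_key(conn):
    """Return the first candidate key with no duplicates, or None if all fail."""
    for columns in CANDIDATE_KEYS:
        if count_duplicates(conn, columns) == 0:
            return columns
    return None


def build_demo():
    """Create an in-memory table with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE concept_relationship (
               concept_id_1     INTEGER NOT NULL,
               concept_id_2     INTEGER NOT NULL,
               relationship_id  TEXT    NOT NULL,
               valid_start_date TEXT    NOT NULL,
               valid_end_date   TEXT    NOT NULL,
               invalid_reason   TEXT
           )"""
    )
    rows = [
        (1, 2, "Maps to", "2020-01-01", "2021-01-01", "D"),
        (1, 2, "Is a", "2020-01-01", "2021-01-01", None),
        (2, 1, "Mapped from", "2020-01-01", "2021-01-01", "D"),
    ]
    conn.executemany(
        "INSERT INTO concept_relationship VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    return conn
```

In this made-up data the pair (1, 2) carries two different relationship_ids, so the two-column key fails and the key has to be widened to include relationship_id. Running the same check on an actual vocabulary download would show which of the three questions is answered "Yes" in practice.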

Also, valid_end_date is required (it must have a value) and is meant to capture "the date when the relationship is invalidated". Shouldn't valid_end_date be optional (nullable)? Otherwise, how are concept_relationship rows that are still active (not invalidated) recorded?
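To make the proposal concrete, here is a rough sketch of how active relationships could be selected if valid_end_date were nullable. This illustrates the suggestion only, not how the CDM currently works: the function names are hypothetical, the rows are made up, and dates are assumed to be stored as ISO 8601 strings (YYYY-MM-DD) so that they compare correctly as text in SQLite.

```python
import sqlite3


def active_relationships(conn, as_of):
    """Return (concept_id_1, concept_id_2, relationship_id) rows active on as_of.

    Assumption: valid_end_date is NULL for relationships never invalidated.
    """
    query = """
        SELECT concept_id_1, concept_id_2, relationship_id
        FROM concept_relationship
        WHERE valid_start_date <= ?
          AND (valid_end_date IS NULL OR valid_end_date > ?)
        ORDER BY concept_id_1, concept_id_2, relationship_id
    """
    return conn.execute(query, (as_of, as_of)).fetchall()


def build_nullable_demo():
    """Create an in-memory table where valid_end_date may be NULL (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE concept_relationship (
               concept_id_1     INTEGER NOT NULL,
               concept_id_2     INTEGER NOT NULL,
               relationship_id  TEXT    NOT NULL,
               valid_start_date TEXT    NOT NULL,
               valid_end_date   TEXT,            -- nullable in this sketch
               invalid_reason   TEXT
           )"""
    )
    rows = [
        (1, 2, "Maps to", "2020-01-01", None, None),          # still active
        (1, 3, "Maps to", "2020-01-01", "2021-01-01", "D"),   # invalidated
    ]
    conn.executemany(
        "INSERT INTO concept_relationship VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    return conn
```

With a nullable column, "still active" is expressed directly as `valid_end_date IS NULL`, at the cost of every date-range query needing the extra NULL check.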
