Commit 6fe93fc

Merge pull request #854 from OpenKnowledgeMaps/refactor/metadata-clean-up-indexes
refactor: metadata clean up indexes
2 parents a440fb6 + 22b8479 commit 6fe93fc

1 file changed

Lines changed: 1 addition & 0 deletions

server/preprocessing/other-scripts/base.R

@@ -248,6 +248,7 @@ etl <- function(res, repo, non_public) {
   subject_cleaned = gsub("(wikidata)?\\.org/entity/[qQ]([\\d]+)?", "", subject_cleaned) # remove wikidata classification
   subject_cleaned = gsub("</keyword><keyword>", "", subject_cleaned) # remove </keyword><keyword>
   subject_cleaned = gsub("\\[No keyword\\]", "", subject_cleaned)
+  subject_cleaned = gsub("\\[[^]]*\\]", "", subject_cleaned) # remove any text inside square brackets
   subject_cleaned = gsub("\\[[^\\[]+\\][^\\;]+(;|$)?", "", subject_cleaned) # remove classification
   subject_cleaned = gsub("[0-9]{2,} [A-Z]+[^;]*(;|$)?", "", subject_cleaned) #remove classification
   subject_cleaned = gsub(" -- ", "; ", subject_cleaned) #replace inconsistent keyword separation
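To illustrate what the added line does, here is a minimal sketch that applies the same pattern to a few made-up subject strings. The helper name `strip_bracketed` and the sample inputs are assumptions for illustration only; they are not taken from `base.R`. In the pattern, `[^]]*` is a bracket expression in which the leading `]` is literal, so it matches any run of characters other than `]`.

```r
# Hypothetical helper (not part of base.R): wraps the pattern added in this commit.
strip_bracketed <- function(x) {
  # "\\[" matches a literal "[", "[^]]*" matches any run of non-"]" characters,
  # and "\\]" matches the closing "]". gsub() replaces every such match.
  gsub("\\[[^]]*\\]", "", x)
}

# Made-up sample subject strings (assumptions, not real metadata).
samples <- c(
  "machine learning [cs.LG]; biology",
  "[No keyword]",
  "plain keyword"
)

cleaned <- strip_bracketed(samples)
print(cleaned)

# Nested brackets are only removed up to the first closing bracket,
# so a stray "]" can remain in the result.
nested <- strip_bracketed("a [b [c] d] e")
print(nested)
```

Only the bracketed text itself is removed; surrounding whitespace and separators are left in place, so a space can remain before a `;`.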
