Commands to download the DBpedia Knowledge Graphs generated by Live Fusion. DBpedia Live Fusion publishes two different kinds of KGs:
- Open Core Knowledge Graphs under the CC-BY-SA license: open with copyleft/share-alike, no registration needed
- Industry Knowledge Graphs under the BUSL 1.1 license: unrestricted for research and experimentation, commercial license required for production use, free registration needed.
- If you do not have a DBpedia Account yet (Forum/Databus), please register at https://account.dbpedia.org
- Login at https://account.dbpedia.org and create your token.
- Save the token to a file named vault-token.dat.
The databus-python-client is available as a Docker image and as a Python package; both follow the same usage patterns.
$DOWNLOADTARGET can be any Databus URI, including collections, or a SPARQL query (or several of either). Details are documented below.
# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download $DOWNLOADTARGET --token vault-token.dat
# Python
python3 -m pip install databusclient
databusclient download $DOWNLOADTARGET --token vault-token.dat

TODO One slogan sentence. More information

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-kg-snapshot --token vault-token.dat

DBpedia Wikipedia Extraction Enriched
TODO One slogan sentence and link. Currently EN DBpedia only.

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikipedia-kg-enriched-snapshot --token vault-token.dat

DBpedia Wikidata Extraction Enriched
TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikidata-kg-enriched-snapshot --token vault-token.dat

TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikipedia-kg-snapshot

TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikidata-kg-snapshot

A docker image is available at dbpedia/databus-python-client. See the download section for details.
Installation

python3 -m pip install databusclient

Running

databusclient --help

Usage: databusclient [OPTIONS] COMMAND [ARGS]...
Options:
--install-completion [bash|zsh|fish|powershell|pwsh]
Install completion for the specified shell.
--show-completion [bash|zsh|fish|powershell|pwsh]
Show completion for the specified shell, to
copy it or customize the installation.
--help Show this message and exit.
Commands:
deploy
download

databusclient download --help

Usage: databusclient download [OPTIONS] DATABUSURIS...
Arguments:
DATABUSURIS... databus uris to download from https://databus.dbpedia.org,
or a query statement that returns databus uris from https://databus.dbpedia.org/sparql
to be downloaded [required]
Download datasets from databus, optionally using vault access if vault
options are provided.
Options:
--localdir TEXT Local databus folder (if not given, databus folder
structure is created in current working directory)
--databus TEXT Databus URL (if not given, inferred from databusuri, e.g.
https://databus.dbpedia.org/sparql)
--token TEXT Path to Vault refresh token file
--authurl TEXT Keycloak token endpoint URL [default:
https://auth.dbpedia.org/realms/dbpedia/protocol/openid-
connect/token]
--clientid TEXT Client ID for token exchange [default: vault-token-
exchange]
--help Show this message and exit.
Examples of using download command
File: download of a single file
databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2
Version: download of all files of a specific version
databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
Artifact: download of all files of the latest version of an artifact
databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals
Group: download of all files of the latest version of all artifacts of a group
databusclient download https://databus.dbpedia.org/dbpedia/mappings
If no --localdir is provided, the current working directory is used as base directory. The downloaded files will be stored in the working directory in a folder structure according to the databus structure, i.e. ./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/.
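The URI-to-folder mapping can be illustrated with a short sketch (for explanation only; the client handles this internally):

```python
from urllib.parse import urlparse

def local_path_for(databus_file_uri: str) -> str:
    """Map a Databus file URI to the local ./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/ layout."""
    # e.g. https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/<file>
    account, group, artifact, version, filename = (
        urlparse(databus_file_uri).path.strip("/").split("/")
    )
    return f"./{account}/{group}/{artifact}/{version}/{filename}"

print(local_path_for(
    "https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/"
    "mappingbased-literals_lang=az.ttl.bz2"
))
# → ./dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2
```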
Collection: download of all files within a collection
databusclient download https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-12
Query: download of all files returned by a query (sparql endpoint must be provided with --databus)
databusclient download 'PREFIX dcat: <http://www.w3.org/ns/dcat#> SELECT ?x WHERE { ?sub dcat:downloadURL ?x . } LIMIT 10' --databus https://databus.dbpedia.org/sparql
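Since the same positional argument accepts both Databus URIs and SPARQL queries, one simple way to tell them apart is to check for an http(s) scheme (a hypothetical heuristic for illustration; the actual client may decide differently):

```python
def looks_like_sparql(target: str) -> bool:
    """Heuristic: treat anything that is not an http(s) URL as a SPARQL query."""
    return not target.lstrip().lower().startswith(("http://", "https://"))

assert not looks_like_sparql("https://databus.dbpedia.org/dbpedia/mappings")
assert looks_like_sparql(
    "PREFIX dcat: <http://www.w3.org/ns/dcat#> "
    "SELECT ?x WHERE { ?sub dcat:downloadURL ?x . } LIMIT 10"
)
```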
databusclient deploy --help
Usage: databusclient deploy [OPTIONS] DISTRIBUTIONS...
Arguments:
DISTRIBUTIONS... distributions in the form of List[URL|CV|fileext|compression|sha256sum:contentlength],
where URL is the download URL and CV the key=value pairs (underscore-separated) content variants
of a distribution; fileExt and compression can be set, if not they are inferred from the path [required]
Options:
--version-id TEXT Target databus version/dataset identifier of the form <h
ttps://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VE
RSION> [required]
--title TEXT Dataset title [required]
--abstract TEXT Dataset abstract max 200 chars [required]
--description TEXT Dataset description [required]
--license TEXT License (see dalicc.net) [required]
--apikey TEXT API key [required]
--help Show this message and exit.
Examples of using deploy command
databusclient deploy --version-id https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 --title title1 --abstract abstract1 --description description1 --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
databusclient deploy --version-id https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18 --title "Client Testing" --abstract "Testing the client...." --description "Testing the client...." --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
A few more notes for CLI usage:
- The content variants can be left out ONLY IF there is just one distribution
- For complete inference, just use the plain URL:
  https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml
- If other parameters are used, you need to leave the unused fields empty, like:
  https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml||yml|7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653:367116
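The sha256sum:contentlength field can be computed locally; a minimal sketch using only the Python standard library:

```python
import hashlib

def checksum_field(path: str) -> str:
    """Compute the sha256sum:contentlength field for a local file."""
    h = hashlib.sha256()
    length = 0
    with open(path, "rb") as f:
        # hash the file in chunks so large dumps do not need to fit in memory
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
            length += len(chunk)
    return f"{h.hexdigest()}:{length}"
```

For example, `checksum_field("swagger.yml")` yields a string like the one shown in the bullet above.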
For downloading files from the vault, you need to provide a vault token. See getting-the-access-refresh-token for details. You can come back here once you have a vault-token.dat file. To use it, just provide the path to the file with --token /path/to/vault-token.dat.
Example:
databusclient download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23 --token vault-token.dat
If vault authentication is required for downloading a file, the client will use the token. If no vault authentication is required, the token will not be used.
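That conditional behavior can be pictured as follows (an illustrative sketch, not the client's actual code; `auth_headers` is a hypothetical helper):

```python
def auth_headers(needs_vault, access_token=None):
    """Attach a Bearer token only when the file is vault-protected."""
    if not needs_vault:
        return {}  # token is simply ignored for openly accessible files
    if access_token is None:
        raise RuntimeError("vault-protected file, but no --token was provided")
    return {"Authorization": f"Bearer {access_token}"}

assert auth_headers(False, "abc") == {}
assert auth_headers(True, "abc") == {"Authorization": "Bearer abc"}
```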
A docker image is available at dbpedia/databus-python-client. You can use it like this:
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
If using vault authentication, make sure the token file is available in the container, e.g. by placing it in the current working directory.
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23/fusion_props=all_subjectns=commons-wikimedia-org_vocab=all.ttl.gz --token vault-token.dat
databusclient upload-and-deploy --help

Usage: databusclient upload-and-deploy [OPTIONS] [FILES]...
Upload files to Nextcloud and deploy to DBpedia Databus.
Arguments:
FILES... files in the form of List[path], where every path must exist locally, which will be uploaded and deployed
Options:
--webdav-url TEXT WebDAV URL (e.g.,
https://cloud.example.com/remote.php/webdav)
--remote TEXT rclone remote name (e.g., 'nextcloud')
--path TEXT Remote path on Nextcloud (e.g., 'datasets/mydataset')
--no-upload Skip file upload and use existing metadata
--metadata PATH Path to metadata JSON file (required if --no-upload is
used)
--version-id TEXT Target databus version/dataset identifier of the form <h
ttps://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VE
RSION> [required]
--title TEXT Dataset title [required]
--abstract TEXT Dataset abstract max 200 chars [required]
--description TEXT Dataset description [required]
--license TEXT License (see dalicc.net) [required]
--apikey TEXT API key [required]
--help Show this message and exit.
The script uploads all given files, and all files in the given folders, to the given remote and then registers them on the Databus.
databusclient upload-and-deploy \
--webdav-url https://cloud.scadsai.uni-leipzig.de/remote.php/webdav \
--remote scads-nextcloud \
--path test \
--version-id https://databus.org/user/dataset/version/1.0 \
--title "Test Dataset" \
--abstract "This is a short abstract of the test dataset." \
--description "This dataset was uploaded for testing the Nextcloud → Databus deployment pipeline." \
--license https://dalicc.net/licenselibrary/Apache-2.0 \
--apikey "API-KEY" \
/home/test \
/home/test_folder/test

databusclient deploy-with-metadata --help

Usage: databusclient deploy-with-metadata [OPTIONS]
Deploy to DBpedia Databus using metadata json file.
Options:
--metadata PATH Path to metadata JSON file [required]
--version-id TEXT Target databus version/dataset identifier of the form <h
ttps://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VE
RSION> [required]
--title TEXT Dataset title [required]
--abstract TEXT Dataset abstract max 200 chars [required]
--description TEXT Dataset description [required]
--license TEXT License (see dalicc.net) [required]
--apikey TEXT API key [required]
--help Show this message and exit.
Use the metadata.json file (see databusclient/metadata.json) to list all files which should be added to the databus. The script registers all files on the databus.
databusclient deploy-with-metadata \
--metadata /home/metadata.json \
--version-id https://databus.org/user/dataset/version/1.0 \
--title "Test Dataset" \
--abstract "This is a short abstract of the test dataset." \
--description "This dataset was uploaded for testing the Nextcloud → Databus deployment pipeline." \
--license https://dalicc.net/licenselibrary/Apache-2.0 \
--apikey "API-KEY"

from databusclient import create_distribution
# create a list
distributions = []
# minimal requirements
# compression and filetype will be inferred from the path
# this will trigger the download of the file to evaluate the shasum and content length
distributions.append(
create_distribution(url="https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml", cvs={"type": "swagger"})
)
# full parameters
# will just place parameters correctly, nothing will be downloaded or inferred
distributions.append(
create_distribution(
url="https://example.org/some/random/file.csv.bz2",
cvs={"type": "example", "realfile": "false"},
file_format="csv",
compression="bz2",
sha256_length_tuple=("7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653", 367116)
)
)

A few notes:
- The dict for content variants can be empty ONLY IF there is just one distribution
- There can be no compression if there is no file format
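The two constraints above can be sketched as a small validator (illustrative only; the dicts here are a hypothetical representation, not the databusclient API):

```python
def validate_distributions(distributions):
    """Sanity-check the two notes above.

    Each distribution is modeled as a dict with the keys
    'cvs', 'file_format', and 'compression' (hypothetical shape).
    """
    # empty content variants are only allowed if there is a single distribution
    if len(distributions) > 1 and any(not d.get("cvs") for d in distributions):
        raise ValueError("empty content variants are only allowed "
                         "for a single distribution")
    # compression without a file format is not allowed
    for d in distributions:
        if d.get("compression") and not d.get("file_format"):
            raise ValueError("compression without a file format is not allowed")

# valid: a single distribution may have empty content variants
validate_distributions([{"cvs": {}, "file_format": "yml", "compression": None}])
```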
from databusclient import create_dataset
# minimal way
dataset = create_dataset(
version_id="https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18",
title="Client Testing",
abstract="Testing the client....",
description="Testing the client....",
license_url="http://dalicc.net/licenselibrary/AdaptivePublicLicense10",
distributions=distributions,
)
# with group metadata
dataset = create_dataset(
version_id="https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18",
title="Client Testing",
abstract="Testing the client....",
description="Testing the client....",
license_url="http://dalicc.net/licenselibrary/AdaptivePublicLicense10",
distributions=distributions,
group_title="Title of group1",
group_abstract="Abstract of group1",
group_description="Description of group1"
)

NOTE: group metadata is only used if you set all group parameters; otherwise it is ignored.
from databusclient import deploy
# to deploy something you just need the dataset from the previous step and an API key
# the API key can be found (or generated) at https://$$DATABUS_BASE$$/$$USER$$#settings
deploy(dataset, "mysterious api key")