- [Download DBpedia Wikipedia Knowledge Graphs (CC-BY-SA, no registration needed)](#download-dbpedia-wikipedia-knowledge-graphs-cc-by-sa-no-registration-needed)
- [Download DBpedia Wikidata Knowledge Graphs (CC-BY-SA, no registration needed)](#download-dbpedia-wikidata-knowledge-graphs-cc-by-sa-no-registration-needed)
Command-line and Python client for downloading and deploying datasets on the DBpedia Databus.

You can use either **Python** or **Docker**. Both methods support all client features.
### Python
Requirements: [Python](https://www.python.org/downloads/) and [pip](https://pip.pypa.io/en/stable/installation/)
Before using the client, install it via pip:
```bash
python3 -m pip install databusclient
```
Note: the PyPI release was updated and this repository prepares version `0.15`. If you previously installed `databusclient` via pip and observe different CLI behavior, upgrade to the latest release with `python3 -m pip install --upgrade databusclient`.
```bash
# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client --help

# Output:
Usage: databusclient [OPTIONS] COMMAND [ARGS]...

  Databus Client CLI

Options:
  --help  Show this message and exit.

Commands:
  deploy    Flexible deploy to Databus command supporting three modes:
  download  Download datasets from databus, optionally using vault access...
```
<a id="cli-download"></a>

### Download

With the download command, you can download datasets or parts thereof from the Databus. The download command expects one or more Databus URIs or a SPARQL query as arguments. The URIs can point to files, versions, artifacts, groups, or collections. If a SPARQL query is provided, it must return download URLs from the Databus, which are then downloaded.
```bash
# Python
databusclient download $DOWNLOADTARGET

# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download $DOWNLOADTARGET
```
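The supported target types differ only in their URI path depth. As an illustrative sketch (not part of the client; it assumes the usual `$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/$FILE` URI layout and ignores collections and SPARQL targets):

```python
from urllib.parse import urlparse

# Illustrative only: classify a Databus URI by its path depth, assuming the
# usual $ACCOUNT/$GROUP/$ARTIFACT/$VERSION/$FILE layout. Collections and
# SPARQL query targets are not handled here.
def classify_databus_uri(uri: str) -> str:
    segments = [s for s in urlparse(uri).path.split("/") if s]
    kinds = {1: "account", 2: "group", 3: "artifact", 4: "version", 5: "file"}
    return kinds.get(len(segments), "unknown")

print(classify_databus_uri("https://databus.dbpedia.org/dbpedia/dbpedia-wikidata-kg-dump"))
# group
```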
`$DOWNLOADTARGET`

Can be any Databus URI, including collections, or a SPARQL query (or several thereof).

`--localdir`

If no `--localdir` is provided, the current working directory is used as the base directory. The downloaded files are stored in a folder structure according to the Databus layout, i.e. `./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/`.

`--vault-token`

If the dataset or files to be downloaded require vault authentication, you need to provide a vault token with `--vault-token /path/to/vault-token.dat`. See [Registration (Access Token)](#registration-access-token) for details on how to get a vault token.

`--databus-key`

If the Databus is protected and needs API key authentication, you can provide the API key with `--databus-key YOUR_API_KEY`.
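The default layout used when `--localdir` is omitted amounts to a simple path computation. A minimal sketch (illustrative only, not the client's actual code; the coordinate values are made up):

```python
import os

# Illustrative only: map Databus coordinates onto the local download layout
# ./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/ described above.
def local_target_dir(base, account, group, artifact, version):
    return os.path.join(base, account, group, artifact, version)

print(local_target_dir(".", "dbpedia", "wikidata", "instance-types", "2024.09.01"))
# ./dbpedia/wikidata/instance-types/2024.09.01
```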
**Help and further information on the deploy and download commands:**

```bash
# Python
databusclient deploy --help
databusclient download --help

# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download --help
```
### Docker
To download BUSL 1.1 licensed datasets, you need to register and get an access token.
### DBpedia Knowledge Graphs
#### Download Live Fusion KG Snapshot (BUSL 1.1, registration needed)

High-frequency, conflict-resolved knowledge graph that merges Live Wikipedia and Wikidata signals into a single, queryable snapshot for enterprise consumption. [More information](https://databus.dev.dbpedia.link/fhofer/live-fusion-kg-dump)

DBpedia-based enrichment of structured Wikipedia extractions (currently EN DBpedia only). [More information](https://databus.dev.dbpedia.link/fhofer/dbpedia-wikipedia-kg-enriched-dump)

```bash
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dev.dbpedia.link/fhofer/dbpedia-wikipedia-kg-enriched-dump --vault-token vault-token.dat
```
#### Download DBpedia Wikipedia Knowledge Graphs (CC-BY-SA, no registration needed)
Original extraction of structured Wikipedia data before enrichment. [More information](https://databus.dev.dbpedia.link/fhofer/dbpedia-wikipedia-kg-dump)

```bash
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dev.dbpedia.link/fhofer/dbpedia-wikipedia-kg-dump
```
#### Download DBpedia Wikidata Knowledge Graphs (CC-BY-SA, no registration needed)
Original extraction of structured Wikidata data before enrichment. [More information](https://databus.dev.dbpedia.link/fhofer/dbpedia-wikidata-kg-dump)
`--localdir`

If no `--localdir` is provided, the current working directory is used as the base directory. The downloaded files are stored in the working directory in a folder structure according to the Databus layout, i.e. `./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/`.

`--vault-token`

If the dataset or files to be downloaded require vault authentication, you need to provide a vault token with `--vault-token /path/to/vault-token.dat`. See [Registration (Access Token)](#registration-access-token) for details on how to get a vault token.

Note: vault tokens are only required for certain protected Databus hosts (for example `data.dbpedia.io` and `data.dev.dbpedia.link`). The client detects those hosts and fails early with a clear message if a token is required but not provided. Do not pass `--vault-token` for public downloads.

`--databus-key`

If the Databus is protected and needs API key authentication, you can provide the API key with `--databus-key YOUR_API_KEY`.
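The early failure described in the note above can be sketched as a plain host check. The host names come from the note; the client's actual detection logic may differ:

```python
from urllib.parse import urlparse

# Hosts that require a vault token, per the note above (illustrative list).
VAULT_PROTECTED_HOSTS = {"data.dbpedia.io", "data.dev.dbpedia.link"}

def check_vault_token(download_url, vault_token_path=None):
    """Fail early if the host requires a vault token and none was provided."""
    host = urlparse(download_url).netloc
    if host in VAULT_PROTECTED_HOSTS and not vault_token_path:
        raise ValueError(f"{host} requires --vault-token, but none was provided")

check_vault_token("https://databus.dbpedia.org/dbpedia/some-group")  # public host: passes
```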
    Options:
                          e.g. https://databus.dbpedia.org/sparql)
      --vault-token TEXT  Path to Vault refresh token file
      --databus-key TEXT  Databus API key to download from protected databus
      --authurl TEXT      Keycloak token endpoint URL [default:
```python
from databusclient import deploy

# API key can be found (or generated) at https://$$DATABUS_BASE$$/$$USER$$#settings
deploy(dataset, "mysterious API key")
```
## Development & Contributing

Install development dependencies yourself or via [Poetry](https://python-poetry.org/):

```bash
poetry install --with dev
```

### Linting

The linter used is [Ruff](https://ruff.rs/). Ruff is configured in `pyproject.toml` and enforced in CI (`.github/workflows/ruff.yml`).

For development, you can run linting locally with `ruff check .` and optionally auto-format with `ruff format .`.

To ensure compatibility with the dependencies configured in `pyproject.toml`, run Ruff via Poetry:

```bash
# Check for linting issues:
poetry run ruff check .

# Auto-format code:
poetry run ruff format .
```

### Testing

When developing new features, please add appropriate tests and make sure all tests pass. Tests live under `tests/` and use [pytest](https://docs.pytest.org/en/7.4.x/) as the test framework.

When fixing bugs or refactoring existing code, please add tests that cover the affected functionality. The current test coverage is very low, so any additional tests are highly appreciated.
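A new test then follows the usual pytest conventions, for example (the helper under test here is a stand-in for whatever function you are covering, not actual client code):

```python
# tests/test_layout.py -- illustrative pytest example; the helper below is a
# stand-in, not part of databusclient.
def version_folder(account, group, artifact, version):
    """Build the relative folder path for a dataset version."""
    return f"{account}/{group}/{artifact}/{version}/"

def test_version_folder():
    assert version_folder("dbpedia", "wikidata", "kg-dump", "2024.01.01") == (
        "dbpedia/wikidata/kg-dump/2024.01.01/"
    )
```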
To run tests locally, use:

```bash
pytest tests/
```

Or, to ensure compatibility with the dependencies configured in `pyproject.toml`, run pytest via Poetry: