Commit 0d7ea9f

Merge branch 'atrisovic-add-img'
2 parents 28c8f79 + 58c3bb1 commit 0d7ea9f

2 files changed

Lines changed: 59 additions & 3 deletions

src/pyDataverse/docs/source/user/basic-usage.rst

Lines changed: 59 additions & 3 deletions
@@ -72,10 +72,21 @@ can then be used (e. g. :meth:`json() <requests.Response.json>`).
 Create Dataverse Collection
 -----------------------------
 
-The top-level data-type in the Dataverse software is called a Dataverse collection, so we will
-start with that.
+The top-level data-type in the Dataverse software is called a Dataverse collection, so we will start with that.
+Take a look at the figure below to better understand the relationship between a Dataverse collection, a dataset, and a datafile.
 
-First, instantiate a :class:`Dataverse <pyDataverse.models.Dataverse>`
+.. figure:: ../_images/collection_dataset.png
+   :align: center
+   :alt: collection dataset datafile
+
+A Dataverse collection (also known as a :class:`Dataverse <pyDataverse.models.Dataverse>`) acts as a container for your :class:`Datasets <pyDataverse.models.Dataset>`.
+It can also store other collections (:class:`Dataverses <pyDataverse.models.Dataverse>`).
+You can create your own Dataverse collections, but it is not a requirement.
+A Dataset is a container for :class:`Datafiles <pyDataverse.models.Datafile>`, such as data, documentation, code, metadata, etc.
+You need to create a Dataset to deposit your files. All Datasets are uniquely identified with a DOI at Dataverse.
+For more detailed explanations, check out `the Dataverse User Guide <https://guides.dataverse.org/en/latest/user/dataset-management.html>`_.
+
+Going back to the example, first, instantiate a :class:`Dataverse <pyDataverse.models.Dataverse>`
 object and import the metadata from the Dataverse Software's own JSON format with
 :meth:`from_json() <pyDataverse.models.Dataverse.from_json>`:
 
@@ -287,6 +298,51 @@ always leads to a major version change:
     Dataset doi:10.5072/FK2/EO7BNB published
 
 
+.. _user_basic-usage_download-data:
+
+Download and save a dataset to disk
+----------------------------------------
+
+You may want to download and explore an existing dataset from Dataverse. The following code snippet shows how to retrieve a dataset and save it to your machine.
+
+Note that if the dataset is public, you don't need an API token, or even a Dataverse account, to use this functionality. The setup therefore looks as follows:
+
+::
+
+    >>> from pyDataverse.api import NativeApi, DataAccessApi
+    >>> from pyDataverse.models import Dataverse
+
+    >>> base_url = 'https://dataverse.harvard.edu/'
+
+    >>> api = NativeApi(base_url)
+    >>> data_api = DataAccessApi(base_url)
+
+You do need to know the DOI of the dataset that you want to download. In this example, we use ``doi:10.7910/DVN/KBHLOD``, which is hosted on the Harvard Dataverse instance that we specified as ``base_url``:
+
+::
+
+    >>> DOI = "doi:10.7910/DVN/KBHLOD"
+    >>> dataset = api.get_dataset(DOI)
+
+As previously mentioned, every dataset consists of datafiles. We therefore need to get the list of datafiles, then download each one by its ID and save it to disk:
+
+::
+
+    >>> files_list = dataset.json()['data']['latestVersion']['files']
+
+    >>> for file in files_list:
+    ...     filename = file["dataFile"]["filename"]
+    ...     file_id = file["dataFile"]["id"]
+    ...     print("File name {}, id {}".format(filename, file_id))
+    ...
+    ...     response = data_api.get_datafile(file_id)
+    ...     with open(filename, "wb") as f:
+    ...         f.write(response.content)
+    File name cat.jpg, id 2456195
+
+Please note that in this example, the files are saved in the current working directory. You can change that by passing the desired path to the ``open()`` call above.
+
+
 .. _user_basic-usage_get-data-tree:
 
 Retrieve all created data as a Dataverse tree
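
The download section added in this commit reads file names and IDs from the dataset JSON and writes each file next to the script. As a minimal sketch of that step, runnable without network access, the snippet below uses a hand-written payload that mirrors the ``['data']['latestVersion']['files']`` structure shown in the diff. The helper names ``list_datafiles`` and ``target_path`` and the ``downloads`` directory are assumptions made for illustration and are not part of pyDataverse.

```python
from pathlib import Path


def list_datafiles(dataset_json):
    """Return (filename, file_id) pairs from a dataset JSON payload."""
    files = dataset_json["data"]["latestVersion"]["files"]
    return [(f["dataFile"]["filename"], f["dataFile"]["id"]) for f in files]


def target_path(output_dir, filename):
    """Build a save path inside output_dir, dropping any directory part of filename."""
    return Path(output_dir) / Path(filename).name


# Hypothetical payload shaped like the response used in the documentation example.
sample = {
    "data": {
        "latestVersion": {
            "files": [{"dataFile": {"filename": "cat.jpg", "id": 2456195}}]
        }
    }
}

for filename, file_id in list_datafiles(sample):
    print("File name {}, id {}".format(filename, file_id))
    # Where the file would be written if "downloads" were the chosen directory.
    print(target_path("downloads", filename))
```

In the loop from the diff, the path returned by ``target_path`` would take the place of ``filename`` in the ``open()`` call. The chosen directory would have to exist first, for example via ``Path("downloads").mkdir(parents=True, exist_ok=True)``.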
