Commit e3a2887

add readme to the getting started folder
1 parent 1426ae8 commit e3a2887

2 files changed

Lines changed: 60 additions & 0 deletions

@@ -0,0 +1,60 @@
# NCI Imaging Data Commons: Getting started tutorial series

## Background

**What is Imaging Data Commons (IDC)?**

![whatis](https://raw.githubusercontent.com/ImagingDataCommons/IDC-Examples/master/notebooks/getting_started/what_is_idc.png)

* [NCI Imaging Data Commons (IDC)](https://datacommons.cancer.gov/repository/imaging-data-commons) is a cloud-based repository of publicly available cancer imaging data, co-located with analysis and exploration tools and resources.
* IDC is a node within the broader NCI Cancer Research Data Commons (CRDC) infrastructure, which provides secure access to a large, comprehensive, and expanding collection of cancer research data.
* IDC provides unmatched capabilities in search, visualization and subsetting of public cancer imaging, image-derived and image-related (i.e., clinical) data (45 TB and counting!): radiology and digital pathology, clinical and preclinical, images and segmentations.
* Our mission is to make it easier for you to discover relevant cancer imaging data and analyze it using state-of-the-art tools in the cloud.

**Programmatic access to IDC content**

The [IDC Portal](https://imaging.datacommons.cancer.gov/) is an interactive interface for exploring data available in IDC using a small subset of the metadata attributes accompanying IDC data, visualizing radiology and microscopy images and annotations, and saving cohorts (subsets of data) defined by the available metadata filters under your user account.

The real power of IDC, however, comes from the programmatic interfaces available for working with IDC data. Most of the capabilities available through those interfaces and APIs are not available in the portal. A few examples:
* download of the image files available in IDC (the portal allows you to download a manifest, but downloading the files referenced from the manifest is currently not easy)
* flexible definition of selection filters at the level of series, studies or patients (the portal exposes only a very small set of metadata attributes and does not offer the flexibility in defining filters that analysis tasks need)
* access to the clinical metadata available for imaging collections (the portal does not expose it)

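
To make the idea of filters defined at the level of series, studies or patients more concrete, here is a minimal sketch of how such a selection could be expressed in SQL and run from Python. It is an illustration under stated assumptions, not the method used in the tutorial notebooks: it assumes the public BigQuery table `bigquery-public-data.idc_current.dicom_all` with the columns `PatientID`, `StudyInstanceUID`, `SeriesInstanceUID`, `Modality` and `collection_id`, and the helper names `build_cohort_query` and `run_cohort_query` are hypothetical. Running the query additionally assumes the `google-cloud-bigquery` package, pandas, and a Google Cloud project of your own.

```python
# Assumed name of the public IDC metadata table in BigQuery.
IDC_TABLE = "bigquery-public-data.idc_current.dicom_all"

# Identifier columns that define each level of the selection.
# Column names are assumed to match the DICOM attribute names.
LEVEL_COLUMNS = {
    "patient": ["PatientID"],
    "study": ["PatientID", "StudyInstanceUID"],
    "series": ["PatientID", "StudyInstanceUID", "SeriesInstanceUID"],
}


def build_cohort_query(level: str) -> str:
    """Build SQL that lists the distinct patients, studies or series
    matching a collection and a modality.

    The filter values are left as named parameters (@collection_id,
    @modality) so they are supplied separately from the SQL text.
    """
    if level not in LEVEL_COLUMNS:
        raise ValueError(f"level must be one of {sorted(LEVEL_COLUMNS)}")
    columns = ", ".join(LEVEL_COLUMNS[level])
    return (
        f"SELECT DISTINCT {columns}\n"
        f"FROM `{IDC_TABLE}`\n"
        "WHERE collection_id = @collection_id\n"
        "  AND Modality = @modality"
    )


def run_cohort_query(level, collection_id, modality, project_id):
    """Run the query and return a pandas DataFrame.

    Needs google-cloud-bigquery, pandas, and credentials for project_id.
    The import is local so the query builder works without the package.
    """
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("collection_id", "STRING", collection_id),
            bigquery.ScalarQueryParameter("modality", "STRING", modality),
        ]
    )
    return client.query(build_cohort_query(level), job_config=job_config).to_dataframe()
```

For example, `run_cohort_query("series", "nsclc_radiomics", "CT", "my-project")` would list the CT series of one collection; the collection identifier and project name here are placeholders. Switching the first argument to `"study"` or `"patient"` changes the level at which the cohort is defined without touching the filter.
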
## Learning objectives

In this tutorial you will:
* set up the (very minimal!) prerequisites to start exploring the programmatic interfaces to IDC data
* understand the basics of IDC metadata organization
* learn how to use the SQL interface to query IDC data and define subsets relevant to specific tasks
* practice visualizing images from your cohort, downloading files, and checking the licenses that cover the data in your cohort
* identify pointers to further learning materials that demonstrate how you can develop reproducible analysis workflows using Colab applied to IDC data

## Components of the tutorial

This tutorial will leverage learning materials available in the [IDC notebooks repository](https://github.com/ImagingDataCommons/IDC-Examples/blob/master/notebooks):

1. [Setting up the prerequisites](https://github.com/ImagingDataCommons/IDC-Examples/blob/master/notebooks/getting_started/part1_prerequisites.ipynb)
2. [Basics of searching IDC data](https://github.com/ImagingDataCommons/IDC-Examples/blob/master/notebooks/getting_started/part2_searching_basics.ipynb)
3. [Working with the cohorts: visualization, download, licenses](https://github.com/ImagingDataCommons/IDC-Examples/blob/master/notebooks/getting_started/part3_exploring_cohorts.ipynb)

## Notebooks to explore on your own

* [IDC Segmentation primer: using nnU-Net for segmenting abdominal organs in chest CT](https://github.com/ImagingDataCommons/IDC-Examples/blob/master/notebooks/IDC_segmentation_primer.ipynb)
* [Introduction to exploring clinical data in IDC](https://github.com/ImagingDataCommons/IDC-Examples/blob/master/notebooks/clinical_data_intro.ipynb)

A growing number of AI models, available as Colab notebooks with demonstrations of applying them to data in IDC, can be found on the [ModelHub.AI](http://app.modelhub.ai/) platform.

## Support and additional resources

* You can contact IDC support by sending an email to support@canceridc.dev or by posting your question on the [IDC User forum](https://discourse.canceridc.dev).
* Please drop by IDC Office Hours to ask any questions about IDC: every Tuesday 16:30–17:30 (New York) and Wednesday 10:30–11:30 (New York) via Google Meet at [https://meet.google.com/xyt-vody-tvb](https://imaging.datacommons.cancer.gov/).
* Free cloud credits are available for those who want to explore features of Google Cloud not included in the free tier (e.g., Cloud Compute Engine, Vertex AI, using Healthcare API for your data): apply [here](https://docs.google.com/forms/d/e/1FAIpQLSfXvXqficGaVEalJI3ym6rKqarmW_YUUWG6A4U8pclvR8MmRQ/viewform).

## Acknowledgments

Imaging Data Commons has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

If you use IDC in your research, please cite the following publication:

> Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S., Aerts, H. J. W. L., Homeyer, A., Lewis, R., Akbarzadeh, A., Bontempi, D., Clifford, W., Herrmann, M. D., Höfener, H., Octaviano, I., Osborne, C., Paquette, S., Petts, J., Punzo, D., Reyes, M., Schacherer, D. P., Tian, M., White, G., Ziegler, E., Shmulevich, I., Pihl, T., Wagner, U., Farahani, K. & Kikinis, R. NCI Imaging Data Commons. Cancer Res. 81, 4188–4193 (2021). http://dx.doi.org/10.1158/0008-5472.CAN-21-0950