Commit f826074

various typo fixes and clearer language
1 parent ad08bae commit f826074

5 files changed

Lines changed: 6 additions & 73 deletions

mimic-iii/buildmimic/duckdb/README.md

Lines changed: 3 additions & 69 deletions
@@ -4,11 +4,6 @@ The scripts in this folder create the schema for MIMIC-III and
 loads the data into the appropriate tables for
 [DuckDB](https://duckdb.org/).

-The Python script (`import_duckdb.py`) also includes the option to
-add the [concepts views](../../concepts/README.md) to the database.
-This makes it much easier to use the concepts views as you do not
-have to install and setup PostgreSQL or use BigQuery.
-
 DuckDB, like SQLite, is serverless and
 stores all information in a single file.
 Unlike SQLite, an OLTP database,
@@ -34,8 +29,6 @@ wget -r -N -c -np -nH --cut-dirs=1 --user YOURUSERNAME --ask-password https://ph

 Replace `YOURUSERNAME` with your physionet username.

-This will make you `mimic_data_dir` be `mimiciii/1.4`.
-
 The rest of these intructions assume the CSV files are in the folder structure as follows:

 ```
@@ -45,7 +38,7 @@ mimic_data_dir/
 ...
 ```

-The CSV files can be uncompressed (end in `.csv`) or compressed (end in `.csv.gz`).
+By default, the above `wget` downloads the data into `mimiciii/1.4` (as we used `--cut-dirs=1` to remove the base folder). Thus, by default, `mimic_data_dir` is `mimiciii/1.4` (relative to the current folder). The CSV files can be uncompressed (end in `.csv`) or compressed (end in `.csv.gz`).


 ## Shell script method (`import_duckdb.sh`)
@@ -74,7 +67,7 @@ e.g. `/usr/local/bin`.

 ### Create DuckDB database and load data

-You can do all of this will one shell script, `import_duckdb.sh`,
+You can do all of this with one shell script, `import_duckdb.sh`,
 located in this repository.

 See the help for it below:
@@ -105,66 +98,7 @@ The script will print out progress as it goes.
 Be patient, this can take minutes to hours to load
 depending on your computer's configuration.

-## Python script method (`import_duckdb.py`)
-
-This method does not require the DuckDB executable, it only requires the DuckDB Python
-module and the [SQLGlot](https://github.com/tobymao/sqlglot) Python module, both of which can be
-easily installed with `pip`.
-
-### Install dependencies
-
-Install the dependencies by using the included `requirements.txt` file:
-
-```sh
-python3 -m pip install -r ./requirements.txt
-```
-
-### Create DuckDB database and load data
-
-Create the MIMIC-III database with `import_duckdb.py` like so:
-
-```sh
-python ./import_duckdb.py /path/to/mimic_data_dir ./mimic3.db
-```
-
-...where `/path/to/mimic_data_dir` is the path containing the .csv or .csv.gz
-data files downloaded above.
-
-This command will create the `mimic3.db` file in the current directory. Be aware that
-for the full MIMIC-III v1.4 dataset the resulting file will be about 34GB in size.
-This process will take some time, as with the shell script version.
-
-The default options will create only the tables and load the data, and assume
-that you are running the script from the same directory where this README.md
-is located. See the full options below if the defaults are insufficient.
-
-### Create the concepts views
-
-In most cases you will want to create the concepts views at the same time as
-the database. To do this, add the `--make-concepts` option:
-
-```sh
-python ./import_duckdb.py /path/to/mimic_data_dir ./mimic3.db --make-concepts
-```
-
-If you want to add the concepts to a database already created without this
-option (or created with the shell script version), you can add the
-`--skip-tables` option as well:
-
-```sh
-python ./import_duckdb.py /path/to/mimic_data_dir ./mimic3.db --make-concepts --skip-tables
-```
-
-### Additional options
-
-There are a few additional options for special situations:
-
-| Option | Description
-| - | -
-| `--skip-indexes` | Don't create additional indexes when creating tables and loading data. This may be useful in memory-constrained systems or to save a little time.
-| `--mimic-code-root [path]` | This argument specifies the location of the mimic-code repository files. This is needed to find the concepts SQL files. This is useful if you are running the script from a different directory than the one where this README.md file is located (the default is `../../../`)
-| `--schema-name [name]` | This puts the tables and concepts views into a named schema in the database. This is mainly useful to mirror the behavior of the PostgreSQL version of the database, which places objects in a schema named `mimiciii` by default--if you have existing code designed for the PostgreSQL version, this may make migration easier. Note that--like the PostgreSQL version--the `ccs_dx` view is *not* placed in the specified schema, but in the default schema (which is `main` in DuckDB, not `public` as in PostgreSQL).

 # Help

-Please see the [issues page](https://github.com/MIT-LCP/mimic-iii/issues) to discuss other issues you may be having.
+Please see the [issues page](https://github.com/MIT-LCP/mimic-code/issues) to discuss other issues you may be having.
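
Note on the added `--cut-dirs=1` sentence in this README: the sketch below approximates how `wget -r` with `-nH` and `--cut-dirs=N` maps a remote URL to a local path, which is why `mimic_data_dir` ends up as `mimiciii/1.4`. The full download URL is truncated in the hunk header above, so the host, the `files/` base folder, and the file name used here are assumptions for illustration only, and `wget_local_path` is a hypothetical helper, not part of the repository.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def wget_local_path(url: str, cut_dirs: int = 0, no_host: bool = False) -> PurePosixPath:
    """Approximate the local path `wget -r` would save `url` to.

    no_host mirrors -nH (drop the host-name directory) and cut_dirs
    mirrors --cut-dirs=N (drop the first N remote directory components).
    Assumes the URL ends in a file name, not a directory.
    """
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    *dirs, filename = parts
    dirs = dirs[cut_dirs:]  # --cut-dirs removes leading directories only
    prefix = [] if no_host else [parsed.netloc]
    return PurePosixPath(*prefix, *dirs, filename)


# Hypothetical URL: the real one is truncated in the diff above.
url = "https://example.org/files/mimiciii/1.4/TRANSFERS.csv.gz"
print(wget_local_path(url, cut_dirs=1, no_host=True).parent)  # prints mimiciii/1.4
```

Without `-nH` and `--cut-dirs`, the same call would keep the host name and the base folder as leading directories, so the data directory would sit two levels deeper.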

mimic-iii/buildmimic/duckdb/duckdb_add_indexes.sql

Lines changed: 0 additions & 1 deletion
@@ -551,4 +551,3 @@ CREATE INDEX TRANSFERS_idx03

 -- FIXME: Remove this index when the PK can be re-added...
 CREATE UNIQUE INDEX chartevents_rowid_pk ON CHARTEVENTS (ROW_ID);
-

mimic-iii/buildmimic/duckdb/duckdb_add_tables.sql

Lines changed: 1 addition & 1 deletion
@@ -487,4 +487,4 @@ CREATE TABLE TRANSFERS
 OUTTIME TIMESTAMP,
 LOS DOUBLE PRECISION,
 CONSTRAINT transfers_rowid_pk PRIMARY KEY (ROW_ID)
-) ;
+) ;

mimic-iv/buildmimic/duckdb/README.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ e.g. `/usr/local/bin`.

 Download the CSV files for [MIMIC-IV](https://physionet.org/content/mimiciv/)
 by any method you wish.
-These instructionds were tested with MIMIC-IV v2.2.
+These instructions were tested with MIMIC-IV v2.2.

 The CSV files should be a folder structure as follows:

mimic-iv/concepts_duckdb/README.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 This folder has SQL compatible with [DuckDB](https://duckdb.org/).
 These concepts were generated automatically from the BigQuery SQL dialect using the [sqlglot](https://sqlglot.com/) package.
-If you would like to contribute a correction, it should be for the corresponding file in the concepts folder.
+If you would like to contribute a correction, do not make it here. Instead, make your correction in the [concepts folder](/mimic-iv/concepts/) using the BigQuery SQL syntax.

 See the [README](/mimic-iv/README.md) in the parent folder for more information.
