Commit 71f205e

address pr feedback
- Fix intro sentence to use active voice
- Remove "on Linux" qualifier from curl example
- Make section headings consistent with numbered list
- Add section for optional SHA-256 checksum verification
1 parent 67e1f31 commit 71f205e

1 file changed

Lines changed: 10 additions & 4 deletions

download-data/stac-api/large-assets.md

@@ -4,13 +4,13 @@ Assets larger than **50 GB** cannot be downloaded with a regular HTTP `GET` or `
 
 The workaround is to use **HTTP range requests**, which bypass the CloudFront limit by fetching the file in sequential chunks directly from the S3 origin.
 
-The actual download is completed in this steps:
+Downloading a large asset involves three steps that we detail in the following subsections:
 
 1. Probe the asset
 2. Download the file in chunks
 3. Optional: Verify SHA‑256 checksum
 
-## Probe the asset
+## 1. Probe the asset
 
 Send a `GET` request with the header `Range: bytes=0-0` to probe the asset.
 The S3 origin responds with `HTTP 206 Partial Content` and includes two useful headers:
@@ -20,7 +20,7 @@ The S3 origin responds with `HTTP 206 Partial Content` and includes two useful h
 | `Content-Range` | `bytes 0-0/<total_size>` — the total size of the object |
 | `x-amz-meta-sha256` | SHA-256 hex digest of the full object (when set by the publisher) |
 
-Example to probe an asset manually with `curl` on Linux:
+Example to probe an asset manually with `curl`:
 
 ```bash
 curl --silent --show-error --location \
@@ -43,7 +43,7 @@ x-amz-meta-sha256: <hex>
 `HEAD` requests are **also blocked** by CloudFront for objects > 50 GB. Always use `GET` with a `Range` header to probe asset metadata.
 :::
 
-## Download
+## 2. Download the file in chunks
 
 The script below requires **Python 3.6+ and no third-party packages** (stdlib only). It works on Linux, macOS, and Windows.
 
@@ -301,6 +301,12 @@ if __name__ == '__main__':
     main()
 ```
 
+## 3. Optional: Verify SHA‑256 checksum
+
+If the asset publisher provided a checksum, the download script automatically verifies it after the download completes. The expected SHA‑256 is read from the `x-amz-meta-sha256` response header during the probe step and compared against a hash of the downloaded file.
+
+If the values do not match, the script exits with an error so you can detect a corrupted or incomplete download before using the file.
+
 ::: tip Parallel Downloads
 The script above downloads chunks sequentially, which is simple and reliable. For faster downloads on high-bandwidth connections, you can parallelize by downloading multiple chunks simultaneously using threads or asyncio.
 
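For readers of this diff, the probe and chunking steps it describes can be sketched roughly as follows. This is a minimal illustration, not the script from `large-assets.md` (most of which the diff does not show). The helper names `parse_total_size`, `chunk_ranges`, and `range_header` are hypothetical, and the chunk size is left to the caller as an assumption.

```python
import re


def parse_total_size(content_range):
    """Extract the total object size from a Content-Range value.

    Expects the form 'bytes 0-0/<total_size>' returned by the probe request.
    Raises ValueError for anything else, including an unknown size ('*').
    """
    match = re.fullmatch(r"bytes \d+-\d+/(\d+)", content_range.strip())
    if match is None:
        raise ValueError(f"unexpected Content-Range value: {content_range!r}")
    return int(match.group(1))


def chunk_ranges(total_size, chunk_size):
    """Yield inclusive (start, end) byte offsets that cover the whole object."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_size, chunk_size):
        # HTTP byte ranges are inclusive, so the last byte is start + size - 1,
        # clamped to the final byte of the object.
        end = min(start + chunk_size, total_size) - 1
        yield start, end


def range_header(start, end):
    """Format the value of a Range request header for one chunk."""
    return f"bytes={start}-{end}"
```

Each `(start, end)` pair maps to one `GET` request whose `Range` header is `range_header(start, end)`; issuing them in order gives the sequential download the documentation describes.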
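The checksum verification added in section 3 could look something like the sketch below. It assumes the expected digest has already been read from the `x-amz-meta-sha256` header during the probe. The function names `sha256_of_file` and `checksum_matches` are hypothetical, and the actual script in `large-assets.md` may implement this step differently.

```python
import hashlib


def sha256_of_file(path, block_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size blocks.

    Reading in blocks keeps memory use constant, which matters for
    assets that are tens of gigabytes in size.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's digest against the publisher's hex digest.

    The expected value is normalized (whitespace stripped, lowercased)
    before comparison, since header values may differ in case.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A caller would typically exit with a non-zero status when `checksum_matches` returns `False`, matching the behavior described in the added section.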