Commit 0496d5e

Skip transient read errors spherex_cutouts.md
1 parent 9658eab commit 0496d5e

1 file changed

Lines changed: 40 additions & 2 deletions

tutorials/spherex/spherex_cutouts.md

@@ -53,7 +53,9 @@ The following packages must be installed to run this notebook.
 
 ```{code-cell} ipython3
 import concurrent.futures
+import http.client
 import time
+import urllib.error
 
 import astropy.units as u
 import matplotlib.pyplot as plt
@@ -187,6 +189,35 @@ def process_cutout(row, ra, dec, cache):
     row["hdus"] = hdus
 ```
 
+We provide a small convenience wrapper around `process_cutout` that is used in the rest of this notebook.
+It catches transient read errors and simply skips those cutouts, which is sufficient for this demonstration.
+For science use cases, users may instead want to implement their own retry logic or error handling strategy.
+
+```{code-cell} ipython3
+def process_cutout_with_error_handling(row, ra, dec, cache):
+    '''
+    Call `process_cutout` while catching transient read errors.
+
+    Parameters:
+    ===========
+
+    row : astropy.table row
+        Row of a table that will be changed in place by this function. The table
+        is created by the SQL TAP query.
+    ra,dec : coordinates (astropy units)
+        Ra and Dec coordinates (same as used for the TAP query) with attached astropy units
+    cache : bool
+        If set to `True`, the output is cached and the cutout processing will run faster next time.
+        Turn this feature off by setting `cache = False`.
+    '''
+    try:
+        process_cutout(row, ra, dec, cache=cache)
+    # IncompleteRead: https://github.com/Caltech-IPAC/irsa-tutorials/issues/165#issuecomment-3821504954
+    except (urllib.error.HTTPError, http.client.IncompleteRead):
+        # Transient read errors. Skip this cutout.
+        row["hdus"] = None
+```
+
 ## 7. Download the Cutouts
 
 This process can take a while.
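
The commit's prose notes that science users may want their own retry logic instead of skipping failed cutouts. One possible shape for that is sketched below, using only the standard library. The helper name `call_with_retries` and its parameters (`max_attempts`, `base_delay`, `sleep`) are hypothetical and not part of the notebook. Treating every `HTTPError` as transient is also an assumption carried over from the wrapper in this diff; a real pipeline might inspect the status code first.

```python
import http.client
import time
import urllib.error

# Assumption: the same exception types the commit treats as transient.
TRANSIENT_ERRORS = (urllib.error.HTTPError, http.client.IncompleteRead)


def call_with_retries(func, *args, max_attempts=3, base_delay=1.0,
                      sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying when a transient error is raised.

    Hypothetical helper for illustration only. The last failure is re-raised
    once max_attempts calls have been made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except TRANSIENT_ERRORS:
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

In the notebook this could wrap the existing function, for example `call_with_retries(process_cutout, row, ra, dec, cache=False)`, with the skip-on-failure behaviour kept as a fallback once the retries are exhausted.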
@@ -223,8 +254,11 @@ results_table_serial["hdus"] = np.full(len(results_table_serial), None)
 
 t1 = time.time()
 for row in results_table_serial:
-    process_cutout(row, ra, dec, cache=False)
+    process_cutout_with_error_handling(row, ra, dec, cache=False)
 print("Time to create cutouts in serial mode: {:2.2f} minutes.".format((time.time() - t1) / 60))
+
+# Drop rows that failed to download.
+results_table_serial = results_table_serial[[r["hdus"] is not None for r in results_table_serial]]
 ```
 
 ### 7.2 Parallel Approach
### 7.2 Parallel Approach
@@ -257,9 +291,13 @@ results_table_parallel["hdus"] = np.full(len(results_table_parallel), None)
 
 t1 = time.time()
 with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
-    futures = [executor.submit(process_cutout, row, ra, dec, False) for row in results_table_parallel]
+    futures = [executor.submit(process_cutout_with_error_handling, row, ra, dec, False)
+               for row in results_table_parallel]
     concurrent.futures.wait(futures)
 print("Time to create cutouts in parallel mode: {:2.2f} minutes.".format((time.time() - t1) / 60))
+
+# Drop rows that failed to download.
+results_table_parallel = results_table_parallel[[r["hdus"] is not None for r in results_table_parallel]]
 ```
 
 ## 8. Create a summary table HDU with renamed columns
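
A note on the parallel hunk: `concurrent.futures.wait` only waits for the futures to finish and does not re-raise exceptions from the worker threads. Any exception other than the two caught by the wrapper would therefore stay stored on its future and go unnoticed. The sketch below shows one way to surface such failures. The helper `run_all` is hypothetical and takes a single-argument function for brevity, whereas the notebook passes `row, ra, dec, cache`.

```python
import concurrent.futures


def run_all(func, items, max_workers=10):
    """Run func(item) for each item in a thread pool and collect failures.

    Hypothetical helper for illustration only. Returns a list of
    (item, exception) pairs for every call that raised.
    """
    failures = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the item it was submitted for.
        future_to_item = {executor.submit(func, item): item for item in items}
        for future in concurrent.futures.as_completed(future_to_item):
            # exception() returns the stored exception, or None on success.
            exc = future.exception()
            if exc is not None:
                failures.append((future_to_item[future], exc))
    return failures
```

Inspecting the returned list after the pool finishes makes unexpected errors visible without changing how the transient ones are skipped.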
