|
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | 7 | "# Part 1: Geostrophic balance\n", |
8 | | - "Andrew Delman, updated 2023-12-22.\n", |
| 8 | + "Andrew Delman, updated 2024-04-04.\n", |
9 | 9 | "\n", |
10 | 10 | "## Objectives\n", |
11 | 11 | "\n", |
|
182 | 182 | } |
183 | 183 | ], |
184 | 184 | "source": [ |
185 | | - "# download file (granule) containing Jan 2000 velocities,\n", |
186 | | - "download_root_dir = join(user_home_dir,'Downloads','ECCO_V4r4_PODAAC') # this is the default path used for download_root_dir\n", |
| 185 | + "# specify location to store downloaded files\n", |
| 186 | + "from os.path import join,expanduser\n", |
| 187 | + "user_home_dir = expanduser('~')\n", |
| 188 | + "download_root_dir = join(user_home_dir,'Downloads','ECCO_V4r4_PODAAC') # this is also the default path used for download_root_dir\n", |
| 189 | + "\n", |
| 190 | + "# download file (granule) containing Jan 2000 velocities\n", |
187 | 191 | "vel_monthly_shortname = \"ECCO_L4_OCEAN_VEL_LLC0090GRID_MONTHLY_V4R4\"\n", |
188 | 192 | "ecco_podaac_download(ShortName=vel_monthly_shortname,\\\n", |
189 | 193 | " StartDate=\"2000-01\",EndDate=\"2000-01\",download_root_dir=download_root_dir,\\\n", |
|
276 | 280 | "cell_type": "markdown", |
277 | 281 | "metadata": {}, |
278 | 282 | "source": [ |
279 | | - "Last but not least, we will need certain parameters from the model grid in order to compute gradients (derivatives with respect to $x$ and $y$). These are in a dataset with ShortName **ECCO_L4_GEOMETRY_LLC0090GRID_V4R4**; they have no time dimension but setting `StartDate` and `EndDate` for any date between 1992-01-01 and 2018-01-01 should download it." |
| 283 | + "Last but not least, we will need certain parameters from the model grid in order to compute gradients (derivatives with respect to $x$ and $y$). These are in a dataset with ShortName **ECCO_L4_GEOMETRY_LLC0090GRID_V4R4**; they have no time dimension, but setting `StartDate` and `EndDate` to any date, month, or year from 1992 to 2017 should download it." |
280 | 284 | ] |
281 | 285 | }, |
282 | 286 | { |
|
303 | 307 | "source": [ |
304 | 308 | "grid_params_shortname = \"ECCO_L4_GEOMETRY_LLC0090GRID_V4R4\"\n", |
305 | 309 | "ecco_podaac_download(ShortName=grid_params_shortname,\\\n", |
306 | | - " StartDate=\"2000-01-01\",EndDate=\"2000-01-01\",download_root_dir=None,\\\n", |
| 310 | + " StartDate=\"1992\",EndDate=\"2017\",download_root_dir=None,\\\n", |
307 | 311 | " n_workers=6,force_redownload=False)" |
308 | 312 | ] |
309 | 313 | }, |
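The date strings passed to `StartDate` and `EndDate` above can be a year, a month, or a full date, and the end of the range is inclusive. As a rough illustration of that convention only, the sketch below expands such strings into first and last calendar days. The helper `expand_date_range` is hypothetical and is not part of the ECCO download modules, which may adjust the range further for particular dataset types.

```python
import calendar
from datetime import date


def expand_date_range(start, end):
    """Expand 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' strings to an inclusive date range.

    Hypothetical helper for illustration; not the ECCO modules' own code.
    """
    s = [int(part) for part in start.split('-')]
    e = [int(part) for part in end.split('-')]

    # a missing month or day in the start string means "earliest possible"
    s_month = s[1] if len(s) > 1 else 1
    s_day = s[2] if len(s) > 2 else 1

    # a missing month or day in the end string means "latest possible"
    e_month = e[1] if len(e) > 1 else 12
    e_day = e[2] if len(e) > 2 else calendar.monthrange(e[0], e_month)[1]

    return date(s[0], s_month, s_day), date(e[0], e_month, e_day)


first_day, last_day = expand_date_range('1992', '2017')
print(first_day.isoformat(), last_day.isoformat())
```

Under this convention, passing only years covers every day from the start of the first year through the end of the last year, which is why a time-independent dataset such as the grid geometry is found by any range within the ECCOv4r4 period.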
|
316 | 320 | "\n", |
317 | 321 | "### Download to your AWS instance\n", |
318 | 322 | "\n", |
319 | | - "The cell below uses the `ecco_s3_retrieve.py` module to download to your instance (or open remotely) the datasets you need for this tutorial. If you followed the setup instructions in the [AWS Cloud setup](https://ecco-v4-python-tutorial.readthedocs.io/AWS_Cloud_getting_started.html) tutorial you already have access to this module, or you can download it [here](https://raw.githubusercontent.com/ECCO-GROUP/ECCO-v4-Python-Tutorial/master/ECCO-ACCESS/ecco_s3_retrieve.py).\n", |
| 323 | + "This sub-section uses the `ecco_s3_retrieve.py` module to download to your instance (or open remotely) the datasets you need for this tutorial. If you followed the setup instructions in the [AWS Cloud setup](https://ecco-v4-python-tutorial.readthedocs.io/AWS_Cloud_getting_started.html) tutorial, you already have access to this module, or you can download it [here](https://raw.githubusercontent.com/ECCO-GROUP/ECCO-v4-Python-Tutorial/master/ECCO-ACCESS/ecco_s3_retrieve.py). Let's query the syntax of the `ecco_podaac_s3_get_diskaware` function that we will use to download or access the files. This function first assesses the disk space available on your instance, then downloads the files to your instance if there is sufficient space, or opens them remotely on S3 otherwise.\n", |
320 | 324 | "\n", |
321 | 325 | "> Tip: In future tutorials, you will see there is a boolean variable `incloud_access` that is usually set to `False` by default. If you set this variable to `True`, the `ecco_s3_retrieve.py` module will access the datasets on the cloud properly from your instance." |
322 | 326 | ] |
323 | 327 | }, |
324 | 328 | { |
325 | 329 | "cell_type": "code", |
326 | | - "execution_count": null, |
| 330 | + "execution_count": 1, |
327 | 331 | "metadata": {}, |
328 | | - "outputs": [], |
| 332 | + "outputs": [ |
| 333 | + { |
| 334 | + "name": "stdout", |
| 335 | + "output_type": "stream", |
| 336 | + "text": [ |
| 337 | + "Help on function ecco_podaac_s3_get_diskaware in module ecco_s3_retrieve:\n", |
| 338 | + "\n", |
| 339 | + "ecco_podaac_s3_get_diskaware(ShortNames, StartDate, EndDate, max_avail_frac=0.5, snapshot_interval=None, download_root_dir=None, n_workers=6, force_redownload=False)\n", |
| 340 | + " This function estimates the storage footprint of ECCO datasets, given ShortName(s), a date range, and which \n", |
| 341 | + " files (if any) are already present.\n", |
| 342 | + " If the footprint of the files to be downloaded (not including files already on the instance or re-downloads) \n", |
| 343 | + " is <= the max_avail_frac specified of the instance's available storage, they are downloaded and stored locally \n", |
| 344 | + " on the instance (hosting files locally typically speeds up loading and computation).\n", |
| 345 | + " Otherwise, the files are \"opened\" using ecco_podaac_s3_open so that they can be accessed directly \n", |
| 346 | + " on S3 without occupying local storage.\n", |
| 347 | + " \n", |
| 348 | + " Parameters\n", |
| 349 | + " ----------\n", |
| 350 | + " \n", |
| 351 | + " ShortNames: str or list, the ShortName(s) that identify the dataset on PO.DAAC.\n", |
| 352 | + " \n", |
| 353 | + " StartDate,EndDate: str, in 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' format, \n", |
| 354 | + " define date range [StartDate,EndDate] for download.\n", |
| 355 | + " EndDate is included in the time range (unlike typical Python ranges).\n", |
| 356 | + " ECCOv4r4 date range is '1992-01-01' to '2017-12-31'.\n", |
| 357 | + " For 'SNAPSHOT' datasets, an additional day is added to EndDate to enable closed budgets\n", |
| 358 | + " within the specified date range.\n", |
| 359 | + " \n", |
| 360 | + " max_avail_frac: float, maximum fraction of remaining available disk space to use in storing current ECCO datasets.\n", |
| 361 | + " This determines whether the dataset files are stored on the current instance, or opened on S3.\n", |
| 362 | + " Valid range is [0,0.9]. If number provided is outside this range, it is replaced by the closer \n", |
| 363 | + " endpoint of the range.\n", |
| 364 | + " \n", |
| 365 | + " snapshot_interval: ('monthly', 'daily', or None), if snapshot datasets are included in ShortNames, \n", |
| 366 | + " this determines whether snapshots are included for only the beginning/end of each month \n", |
| 367 | + " ('monthly'), or for every day ('daily').\n", |
| 368 | + " If None or not specified, defaults to 'daily' if any daily mean ShortNames are included \n", |
| 369 | + " and 'monthly' otherwise.\n", |
| 370 | + " \n", |
| 371 | + " download_root_dir: str, defines parent directory to download files to.\n", |
| 372 | + " Files will be downloaded to directory download_root_dir/ShortName/.\n", |
| 373 | + " If not specified, parent directory defaults to '~/Downloads/ECCO_V4r4_PODAAC/'.\n", |
| 374 | + " \n", |
| 375 | + " n_workers: int, number of workers to use in concurrent downloads. Benefits typically taper off above 5-6.\n", |
| 376 | + " Applies only if files are downloaded.\n", |
| 377 | + " \n", |
| 378 | + " force_redownload: bool, if True, existing files will be redownloaded and replaced;\n", |
| 379 | + " if False, existing files will not be replaced.\n", |
| 380 | + " Applies only if files are downloaded.\n", |
| 381 | + " \n", |
| 382 | + " \n", |
| 383 | + " Returns\n", |
| 384 | + " -------\n", |
| 385 | + " retrieved_files: dict, with keys: ShortNames and values: downloaded or opened file(s) with path on local instance \n", |
| 386 | + " or on S3, that can be passed directly to xarray (open_dataset or open_mfdataset).\n", |
| 387 | + "\n" |
| 388 | + ] |
| 389 | + } |
| 390 | + ], |
329 | 391 | "source": [ |
330 | 392 | "# # only need this if ecco_s3_retrieve.py is in a different directory than the tutorial notebook\n", |
331 | 393 | "# import sys\n", |
|
335 | 397 | "from ecco_s3_retrieve import *\n", |
336 | 398 | "# query to see the syntax needed for the ecco_podaac_s3_get_diskaware function,\n", |
337 | 399 | "# which assesses available disk space to make sure that there is sufficient storage to download the ECCO datasets,\n", |
338 | | - "# and opens them \"remotely\" on S3 otherwise\n", |
| 400 | + "# or opens them \"remotely\" on S3 otherwise\n", |
339 | 401 | "help(ecco_podaac_s3_get_diskaware)" |
340 | 402 | ] |
341 | 403 | }, |
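The help text above describes a simple rule: download when the estimated footprint is at most `max_avail_frac` of the available storage, with `max_avail_frac` limited to the range [0, 0.9]. The sketch below illustrates that rule in isolation. The helper names `clamp_frac` and `fits_on_disk` are hypothetical, and this is not the code inside `ecco_s3_retrieve.py`; the footprint value is taken from the example log output and is only a placeholder.

```python
import shutil


def clamp_frac(max_avail_frac, lo=0.0, hi=0.9):
    """Replace an out-of-range fraction with the closer endpoint of [lo, hi]."""
    return min(max(max_avail_frac, lo), hi)


def fits_on_disk(footprint_bytes, avail_bytes, max_avail_frac=0.5):
    """True if the files may be stored locally under the stated rule."""
    return footprint_bytes <= clamp_frac(max_avail_frac) * avail_bytes


# free space on the disk holding the current directory
avail_bytes = shutil.disk_usage('.').free

# placeholder footprint (about 0.057 GB), an assumption for this example
footprint_bytes = int(0.057e9)

if fits_on_disk(footprint_bytes, avail_bytes, max_avail_frac=0.5):
    print("footprint fits: files would be downloaded to the instance")
else:
    print("footprint too large: files would be opened remotely on S3")
```

Lowering `max_avail_frac` makes remote opening more likely, which keeps more local disk free at the cost of slower loading.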
| 404 | + { |
| 405 | + "cell_type": "markdown", |
| 406 | + "metadata": {}, |
| 407 | + "source": [ |
| 408 | + "Notice that this function returns a Python *dictionary* whose keys are the ShortNames and whose values are the downloaded or opened files (with their paths on the instance or on S3). These values can be passed to xarray to open the files in your notebook workspace. Alternatively, the files can be identified using the `glob` package, as you will see in the next section." |
| 409 | + ] |
| 410 | + }, |
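As a minimal sketch of how the returned dictionary might be used, the example below builds a stand-in dictionary with the shape the help text describes (ShortName keys, file path values) and pulls out the files for one dataset. The helper `files_for` and the stand-in `example_files_dict` are hypothetical; in the notebook you would use the dictionary returned by `ecco_podaac_s3_get_diskaware` instead. The xarray call is left as a comment so that the sketch runs without the data files present.

```python
from os.path import join, expanduser


def files_for(retrieved_files, shortname):
    """Return the file(s) stored under one ShortName, always as a list."""
    files = retrieved_files[shortname]
    if isinstance(files, (list, tuple)):
        return list(files)
    return [files]


# stand-in for the dictionary returned by the retrieval function (assumption)
download_root_dir = join(expanduser('~'), 'Downloads', 'ECCO_V4r4_PODAAC')
vel_shortname = "ECCO_L4_OCEAN_VEL_LLC0090GRID_MONTHLY_V4R4"
vel_filename = "OCEAN_VELOCITY_mon_mean_2000-01_ECCO_V4r4_native_llc0090.nc"
example_files_dict = {
    vel_shortname: [join(download_root_dir, vel_shortname, vel_filename)],
}

vel_files = files_for(example_files_dict, vel_shortname)
print(vel_files[0])

# With the files present, they could then be opened with xarray, e.g.:
# import xarray as xr
# ds_vel = xr.open_mfdataset(vel_files)
```

Note that calling the retrieval function several times and assigning each result to the same variable keeps only the last dictionary, so either store each result under its own name or rely on `glob` to locate the files.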
342 | 411 | { |
343 | 412 | "cell_type": "code", |
344 | | - "execution_count": null, |
| 413 | + "execution_count": 2, |
345 | 414 | "metadata": {}, |
346 | | - "outputs": [], |
| 415 | + "outputs": [ |
| 416 | + { |
| 417 | + "name": "stdout", |
| 418 | + "output_type": "stream", |
| 419 | + "text": [ |
| 420 | + "{'ShortName': 'ECCO_L4_OCEAN_VEL_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2000-01-02,2000-01-31'}\n", |
| 421 | + "{'ShortName': 'ECCO_L4_DENS_STRAT_PRESS_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2000-01-02,2000-01-31'}\n", |
| 422 | + "Size of files to be downloaded to instance is 0.057 GB,\n", |
| 423 | + "which is 0.07% of the 82.141 GB available storage.\n", |
| 424 | + "Proceeding with file downloads from S3\n", |
| 425 | + "created download directory /home/jpluser/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_OCEAN_VEL_LLC0090GRID_MONTHLY_V4R4\n", |
| 426 | + "downloading OCEAN_VELOCITY_mon_mean_2000-01_ECCO_V4r4_native_llc0090.nc\n", |
| 427 | + "\n", |
| 428 | + "=====================================\n", |
| 429 | + "Time spent = 0.5400779247283936 seconds\n", |
| 430 | + "\n", |
| 431 | + "\n", |
| 432 | + "created download directory /home/jpluser/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_DENS_STRAT_PRESS_LLC0090GRID_MONTHLY_V4R4\n", |
| 433 | + "downloading OCEAN_DENS_STRAT_PRESS_mon_mean_2000-01_ECCO_V4r4_native_llc0090.nc\n", |
| 434 | + "\n", |
| 435 | + "=====================================\n", |
| 436 | + "Time spent = 0.6094639301300049 seconds\n", |
| 437 | + "\n", |
| 438 | + "\n", |
| 439 | + "{'ShortName': 'ECCO_L4_OCEAN_VEL_LLC0090GRID_DAILY_V4R4', 'temporal': '2000-01-01,2000-01-01'}\n", |
| 440 | + "{'ShortName': 'ECCO_L4_DENS_STRAT_PRESS_LLC0090GRID_DAILY_V4R4', 'temporal': '2000-01-01,2000-01-01'}\n", |
| 441 | + "Size of files to be downloaded to instance is 0.058 GB,\n", |
| 442 | + "which is 0.07% of the 82.084 GB available storage.\n", |
| 443 | + "Proceeding with file downloads from S3\n", |
| 444 | + "created download directory /home/jpluser/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_OCEAN_VEL_LLC0090GRID_DAILY_V4R4\n", |
| 445 | + "downloading OCEAN_VELOCITY_day_mean_2000-01-01_ECCO_V4r4_native_llc0090.nc\n", |
| 446 | + "\n", |
| 447 | + "=====================================\n", |
| 448 | + "Time spent = 0.5322608947753906 seconds\n", |
| 449 | + "\n", |
| 450 | + "\n", |
| 451 | + "created download directory /home/jpluser/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_DENS_STRAT_PRESS_LLC0090GRID_DAILY_V4R4\n", |
| 452 | + "downloading OCEAN_DENS_STRAT_PRESS_day_mean_2000-01-01_ECCO_V4r4_native_llc0090.nc\n", |
| 453 | + "\n", |
| 454 | + "=====================================\n", |
| 455 | + "Time spent = 0.31609535217285156 seconds\n", |
| 456 | + "\n", |
| 457 | + "\n", |
| 458 | + "{'ShortName': 'ECCO_L4_GEOMETRY_LLC0090GRID_V4R4', 'temporal': '1992-01-01,2017-12-31'}\n", |
| 459 | + "Size of files to be downloaded to instance is 0.008 GB,\n", |
| 460 | + "which is 0.01% of the 82.026 GB available storage.\n", |
| 461 | + "Proceeding with file downloads from S3\n", |
| 462 | + "created download directory /home/jpluser/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_GEOMETRY_LLC0090GRID_V4R4\n", |
| 463 | + "downloading GRID_GEOMETRY_ECCO_V4r4_native_llc0090.nc\n", |
| 464 | + "\n", |
| 465 | + "=====================================\n", |
| 466 | + "Time spent = 0.2874178886413574 seconds\n", |
| 467 | + "\n", |
| 468 | + "\n" |
| 469 | + ] |
| 470 | + } |
| 471 | + ], |
347 | 472 | "source": [ |
| 473 | + "# specify location to store downloaded files\n", |
| 474 | + "from os.path import join,expanduser\n", |
| 475 | + "user_home_dir = expanduser('~')\n", |
| 476 | + "download_root_dir = join(user_home_dir,'Downloads','ECCO_V4r4_PODAAC')\n", |
| 477 | + "\n", |
| 478 | + "\n", |
348 | 479 | "incloud_access = True\n", |
349 | 480 | "\n", |
| 481 | + "# download (or open remotely) the datasets needed for this tutorial\n", |
350 | 482 | "ShortNames_monthly_list = [\"ECCO_L4_OCEAN_VEL_LLC0090GRID_MONTHLY_V4R4\",\\\n", |
351 | 483 | " \"ECCO_L4_DENS_STRAT_PRESS_LLC0090GRID_MONTHLY_V4R4\"]\n", |
352 | 484 | "ShortNames_daily_list = [\"ECCO_L4_OCEAN_VEL_LLC0090GRID_DAILY_V4R4\",\\\n", |
353 | 485 | " \"ECCO_L4_DENS_STRAT_PRESS_LLC0090GRID_DAILY_V4R4\"]\n", |
| 486 | + "ShortNames_grid_list = [\"ECCO_L4_GEOMETRY_LLC0090GRID_V4R4\"]\n", |
 354 | 487 | "if incloud_access:\n", |
355 | | - " download_root_dir = join(user_home_dir,'Downloads','ECCO_V4r4_PODAAC')\n", |
356 | | - " files_nested_list = ecco_podaac_s3_get_diskaware(ShortNames=ShortNames_monthly_list,\\\n", |
357 | | - " StartDate='2000-01',EndDate='2000-01',\\\n", |
358 | | - " max_avail_frac=0.5,\\\n", |
359 | | - " download_root_dir=download_root_dir)\n", |
360 | | - " files_nested_list = ecco_podaac_s3_get_diskaware(ShortNames=ShortNames_daily_list,\\\n", |
361 | | - " StartDate='2000-01-01',EndDate='2000-01-01',\\\n", |
362 | | - " max_avail_frac=0.5,\\\n", |
363 | | - " download_root_dir=download_root_dir)" |
| 488 | + " files_dict = ecco_podaac_s3_get_diskaware(ShortNames=ShortNames_monthly_list,\\\n", |
| 489 | + " StartDate='2000-01',EndDate='2000-01',\\\n", |
| 490 | + " max_avail_frac=0.5,\\\n", |
| 491 | + " download_root_dir=download_root_dir)\n", |
| 492 | + " files_dict = ecco_podaac_s3_get_diskaware(ShortNames=ShortNames_daily_list,\\\n", |
| 493 | + " StartDate='2000-01-01',EndDate='2000-01-01',\\\n", |
| 494 | + " max_avail_frac=0.5,\\\n", |
| 495 | + " download_root_dir=download_root_dir)\n", |
| 496 | + " files_dict = ecco_podaac_s3_get_diskaware(ShortNames=ShortNames_grid_list,\\\n", |
| 497 | + " StartDate='1992',EndDate='2017',\\\n", |
| 498 | + " max_avail_frac=0.5,\\\n", |
| 499 | + " download_root_dir=download_root_dir)" |
| 500 | + ] |
| 501 | + }, |
| 502 | + { |
| 503 | + "cell_type": "markdown", |
| 504 | + "metadata": {}, |
| 505 | + "source": [ |
| 506 | + "One advantage of working on an instance in the AWS cloud is how quickly ECCO and other PO.DAAC datasets can be accessed and downloaded, since they are already hosted in the cloud." |
364 | 507 | ] |
365 | 508 | }, |
366 | 509 | { |
|
397 | 540 | "import glob\n", |
398 | 541 | "from os.path import expanduser,join\n", |
399 | 542 | "import sys\n", |
400 | | - "user_home_dir = expanduser('~')\n", |
401 | 543 | "sys.path.append(join(user_home_dir,'ECCOv4-py'))\n", |
402 | 544 | "import ecco_v4_py as ecco\n", |
403 | 545 | "import matplotlib.pyplot as plt\n", |
|