Do not silently read a different file for a non-existent path with a suffix (fixes #328) #334

Open
gaoflow wants to merge 1 commit into lasp:main from gaoflow:fix-328-illegal-path-suffix-clobber

@gaoflow gaoflow commented Jun 2, 2026

Problem (#328)

cdflib.cdfread.CDF() can read a CDF from an illegal path built from a legal path plus extra characters. For example, passing mydata.cdfINVALID (which does not exist) silently reads mydata.cdf instead of raising:

>>> import cdflib
>>> cdflib.cdfread.CDF('tests/testfiles/psp_fld_l2_mag_rtn_1min_20200104_v02.cdfINVALID').cdf_info().CDF
PosixPath('.../psp_fld_l2_mag_rtn_1min_20200104_v02.cdf')   # wrong file, no error

This is a data-integrity hazard: a user can get data from a file other than the one they asked for, with no indication anything went wrong. (Reported by @ErikPGJ for JUICE/RPWI files; the upper character limit they observed before an error simply corresponds to the OS path-length limit hit in Path.is_file().)

Cause

In the file branch of CDF.__init__:

if not path.is_file():
    path = path.with_suffix(".cdf")
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found")

Path.with_suffix('.cdf') replaces the last suffix rather than appending one, so mydata.cdfINVALID → mydata.cdf. The fallback was presumably meant for the "user passed mydata without the .cdf extension" case, but it also rewrites any wrong-but-present extension to .cdf.
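The replace-versus-append behavior can be checked with pathlib alone. This is a minimal sketch that uses PurePosixPath so nothing touches the filesystem; the file names are the illustrative ones from this description, not real files.

```python
from pathlib import PurePosixPath

# A path whose last suffix is wrong: with_suffix swaps the whole suffix out.
bad = PurePosixPath("mydata.cdfINVALID")
print(bad.suffix)                # .cdfINVALID
print(bad.with_suffix(".cdf"))   # mydata.cdf

# A path with no suffix: with_suffix appends, which is the case the
# fallback was presumably written for.
bare = PurePosixPath("mydata")
print(repr(bare.suffix))         # ''
print(bare.with_suffix(".cdf"))  # mydata.cdf
```

Both inputs end up at mydata.cdf, which is why the old fallback could not tell a missing extension from a wrong one.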

Fix

Only apply the .cdf fallback when the path has no suffix:

if not path.is_file():
    if not path.suffix:
        path = path.with_suffix(".cdf")
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found")

  • mydata (no suffix) → still falls back to mydata.cdf (behavior preserved).
  • mydata.cdfINVALID (has a suffix) → no longer clobbered → FileNotFoundError.
  • An existing mydata.cdf is found before the fallback is ever reached, so valid reads are unaffected.
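
The three cases above can be exercised without cdflib. This sketch copies the fixed logic into a hypothetical standalone helper, resolve_cdf_path (not a cdflib function), and runs it against an empty placeholder file in a temporary directory. It only illustrates the path resolution and does not read CDF data.

```python
import tempfile
from pathlib import Path


def resolve_cdf_path(path: Path) -> Path:
    """Hypothetical stand-in for the file branch of CDF.__init__."""
    if not path.is_file():
        # Append ".cdf" only when the caller omitted the extension.
        if not path.suffix:
            path = path.with_suffix(".cdf")
        if not path.is_file():
            raise FileNotFoundError(f"{path} not found")
    return path


def check_cases() -> dict:
    """Run the three cases against a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        real = Path(tmp) / "mydata.cdf"
        real.touch()  # empty placeholder, not a valid CDF file
        results = {
            # Existing file: returned before the fallback is reached.
            "exact": resolve_cdf_path(real) == real,
            # No suffix: falls back to mydata.cdf.
            "no_suffix": resolve_cdf_path(Path(tmp) / "mydata") == real,
        }
        try:
            resolve_cdf_path(Path(tmp) / "mydata.cdfINVALID")
            results["extra_suffix_raises"] = False
        except FileNotFoundError:
            # Wrong suffix: no rewrite, so the lookup fails loudly.
            results["extra_suffix_raises"] = True
    return results


print(check_cases())
```

All three entries should come back True with the guarded fallback. Removing the `if not path.suffix:` check makes the last case resolve to mydata.cdf instead of raising.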

(Also drops a duplicated self.file = path line in the same block.)

Tests

Adds test_nonexist_path_with_extra_suffix_errors, which asserts that a real test-file path with extra characters appended raises FileNotFoundError. It fails on main (DID NOT RAISE) and passes with the fix; tests/test_cdfread.py passes (5 passed). Verified the no-suffix fallback still resolves <name> → <name>.cdf.

Fixes #328.

The 'file' branch of CDF.__init__ fell back to path.with_suffix('.cdf')
when the given path was not a file. with_suffix *replaces* the last
suffix, so a non-existent path that already has one (e.g.
'mydata.cdfINVALID') silently resolved to 'mydata.cdf' and read the
wrong file instead of raising. Only apply the '.cdf' fallback when the
path has no suffix (the 'user omitted the extension' case it was meant
for). Adds a regression test. Fixes lasp#328.

Development

Successfully merging this pull request may close these issues.

cdflib.cdfread.CDF() reads illegal paths built from legal paths with characters appended to it
