Write selected channels during inference#2120
Conversation
iluise
left a comment
There was a problem hiding this comment.
Thanks for the heads up! Minor suggestion to avoid retrieving the channels twice
| A list of channel names. | ||
| """ | ||
| _logger.debug(f"Getting channels for stream {stream}...") | ||
| all_channels = self.get_inference_stream_attr(stream, "val_target_channels") |
There was a problem hiding this comment.
I would just move it at line 152 (so before self.get_inference_stream_attr(stream, "val_target_channels")):
write_output = self.get_inference_stream_attr(stream, "write_output")
if write_output is not None:
all_channels = [ch for ch in all_channels if ch in write_output]
else:
all_channels = self.get_inference_stream_attr(stream, "val_target_channels")
_logger.debug(f"Channels found in config: {all_channels}")
There was a problem hiding this comment.
Why do we need to change the reader in general here? all_channels should have anyway whatever is written in the output from write_output
There was a problem hiding this comment.
So all_channels should be 2t in our example here.
There was a problem hiding this comment.
@SavvasMel source is kept as it is. Only target and prediction channels are being filtered.
| stream_id : 0 | ||
| source_exclude : ['w_', 'skt', 'tcw', 'cp', 'tp'] | ||
| target_exclude : ['w_', 'slor', 'sdor', 'tcw', 'cp', 'tp'] | ||
| write_output: ['2t'] |
There was a problem hiding this comment.
We need to be careful this to be removed before merge. Or give an empty list.
There was a problem hiding this comment.
I am adding another param in default_config that controls whether channels are being filtered or not.
| _logger.debug(f"Channels found in config: {all_channels}") | ||
|
|
||
| # filter to write_output subset if specified | ||
| write_output = self.get_inference_stream_attr(stream, "write_output") |
There was a problem hiding this comment.
It seems that this respects backward compatibility but please double check
There was a problem hiding this comment.
I mean what happens if the stream does not have "write_output"?
There was a problem hiding this comment.
I am working on checking the backward compatibility. To answer your question, get_inference_stream_attr returns an empty list if it doesn't find write_output
There was a problem hiding this comment.
Backward compatibility checks
parse validation output filter_output_channels from default config format support allow or deny channel filtering per stream introduce output_channels for written targets and predictions read channels from zarr metadata in WeatherGenZarrReader
|
Hi @SavvasMel and @iluise I had another idea how to implement this issue, and I would really appreciate your input. Among the changes, I decided to remove the filtering of the channels from the stream config, and add it in the default config instead so that it can be overwritten with |
Description (Edited)
Changes in
config/default_config.yml:Added a new filed in
outputto filter channels.Changes in
packages/common/src/weathergen/common/io.py:Added a field in the zarr output for channels being written
Changes in
src/weathergen/utils/validation_io.py:Now trims
targets_allandpreds_alltensors to a per-stream allowlist or denylist before writing to the output zarr, via a newoutput_channelslist derived from a deep copy oftarget_channels.Changes in
packages/evaluate/src/weathergen/evaluate/io/wegen_reader.py:Added
get_channels()that reads channel names directly from the Zarr output metadataTo run with:
The output zarr file has the tree:
Issue Number
Closes #1705
Checklist before asking for review
./scripts/actions.sh lint./scripts/actions.sh unit-test./scripts/actions.sh integration-testlaunch-slurm.py --time 60