Describe the bug, including details regarding any error messages, version, and platform.
I noticed this in parquet-cli while checking the schema of a shredded Parquet file.
Essentially, given this Parquet schema:
message table {
  required group data (VARIANT(1)) {
    required binary metadata;
    optional binary value;
    optional group typed_value {
      required group name {
        optional binary value;
        optional binary typed_value (STRING);
      }
      required group age {
        optional binary value;
        optional int32 typed_value (INTEGER(8,true));
      }
    }
  }
}
AvroSchemaConverter.convert() produces:
{
  "type": "record",
  "name": "data",
  "fields": [
    {"name": "metadata", "type": "bytes"},
    {"name": "value", "type": "bytes"}
  ]
}
The typed_value group is missing from the converted schema.
Expected: The converter should convert all children of the VARIANT group (including typed_value when present)
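To make the expected shape concrete, here is a minimal, self-contained sketch of a conversion that visits every child of a group. It is a toy model, not the parquet-java implementation: `VariantConversionSketch`, `Node`, `convert`, and `shreddedVariant` are hypothetical names invented for illustration, the output is only Avro-like JSON, and optionality and logical-type annotations are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical toy model; none of these types are parquet-java or Avro classes.
final class VariantConversionSketch {

    // A schema node: a group when primitiveType is null, otherwise a primitive.
    static final class Node {
        final String name;
        final String primitiveType; // Avro-like type name, e.g. "bytes"
        final List<Node> children;

        private Node(String name, String primitiveType, List<Node> children) {
            this.name = name;
            this.primitiveType = primitiveType;
            this.children = children;
        }

        static Node primitive(String name, String type) {
            return new Node(name, type, Collections.<Node>emptyList());
        }

        static Node group(String name, Node... children) {
            return new Node(name, null, Arrays.asList(children));
        }
    }

    // Convert a node to an Avro-like JSON string, recursing into every child
    // so that nested groups such as typed_value are not dropped.
    static String convert(Node node) {
        if (node.primitiveType != null) {
            return "\"" + node.primitiveType + "\"";
        }
        List<String> fields = new ArrayList<>();
        for (Node child : node.children) {
            fields.add("{\"name\":\"" + child.name + "\",\"type\":" + convert(child) + "}");
        }
        return "{\"type\":\"record\",\"name\":\"" + node.name
                + "\",\"fields\":[" + String.join(",", fields) + "]}";
    }

    // Mirrors the shape of the schema in this report (optionality omitted).
    static Node shreddedVariant() {
        return Node.group("data",
                Node.primitive("metadata", "bytes"),
                Node.primitive("value", "bytes"),
                Node.group("typed_value",
                        Node.group("name",
                                Node.primitive("value", "bytes"),
                                Node.primitive("typed_value", "string")),
                        Node.group("age",
                                Node.primitive("value", "bytes"),
                                Node.primitive("typed_value", "int"))));
    }

    public static void main(String[] args) {
        System.out.println(convert(shreddedVariant()));
    }
}
```

In this sketch the record for data has three fields (metadata, value, and a nested typed_value record), which is the shape I would expect the real converter to produce.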
Version: 1.18.0-SNAPSHOT
I can submit a fix for this myself if there are no objections. If we want to keep the current behavior, we should document it.
Component(s)
Avro