Thrift-derived struct containing statistics used to estimate page and column chunk sizes.
#include <parquet_schema.hpp>
Public Attributes
std::optional< int64_t > | unencoded_byte_array_data_bytes
std::optional< std::vector< int64_t > > | repetition_level_histogram
std::optional< std::vector< int64_t > > | definition_level_histogram
Definition at line 574 of file parquet_schema.hpp.
std::optional<std::vector<int64_t> > cudf::io::parquet::SizeStatistics::definition_level_histogram
Same as repetition_level_histogram, but for definition levels.
This value should not be written if max_definition_level is 0 or 1.
Definition at line 593 of file parquet_schema.hpp.
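As a rough illustration of how a reader might consume this histogram, the sketch below assumes that element i counts the occurrences of definition level i, so the last element corresponds to max_definition_level. In the Parquet format a leaf value is physically stored only when its definition level equals the maximum, so that last element gives the number of non-null leaf values. The helper names `total_level_count` and `non_null_leaf_count` are hypothetical and not part of cudf.

```cpp
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical helper (not a cudf API): total number of levels recorded,
// i.e. the sum of all histogram buckets.
int64_t total_level_count(std::vector<int64_t> const& hist)
{
  return std::accumulate(hist.begin(), hist.end(), int64_t{0});
}

// Hypothetical helper (not a cudf API): number of leaf values that are
// actually present. Assumes the histogram has max_definition_level + 1
// elements, so the last bucket is the count at the maximum level.
// Returns std::nullopt when the histogram was not written.
std::optional<int64_t> non_null_leaf_count(
  std::optional<std::vector<int64_t>> const& def_hist)
{
  if (!def_hist.has_value() || def_hist->empty()) { return std::nullopt; }
  return def_hist->back();
}
```

For a histogram of {1, 2, 0, 5} (max_definition_level of 3), `total_level_count` yields 8 and `non_null_leaf_count` yields 5, leaving 3 entries that are null or empty at some nesting level.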
std::optional<std::vector<int64_t> > cudf::io::parquet::SizeStatistics::repetition_level_histogram
When present, this vector is expected to hold one element per repetition level (i.e. size = max_repetition_level + 1), where each element is the number of times that repetition level was observed in the data.
This value should not be written if max_repetition_level is 0.
Definition at line 586 of file parquet_schema.hpp.
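The shape described above can be sketched as a simple counting loop. This is a minimal illustration, not cudf's writer code: the helper names `level_histogram` and `repetition_histogram_or_nullopt` are hypothetical, and the use of `int16_t` for levels is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not a cudf API): counts how often each level occurs.
// The result has max_level + 1 elements, one per level in [0, max_level].
std::vector<int64_t> level_histogram(std::vector<int16_t> const& levels,
                                     int16_t max_level)
{
  std::vector<int64_t> hist(static_cast<std::size_t>(max_level) + 1, 0);
  for (auto const level : levels) {
    // Every observed level must fall inside the declared range.
    assert(level >= 0 && level <= max_level);
    ++hist[static_cast<std::size_t>(level)];
  }
  return hist;
}

// Hypothetical helper (not a cudf API): produces the optional field,
// leaving it unset when max_repetition_level is 0, as the text requires.
std::optional<std::vector<int64_t>> repetition_histogram_or_nullopt(
  std::vector<int16_t> const& rep_levels, int16_t max_repetition_level)
{
  if (max_repetition_level == 0) { return std::nullopt; }
  return level_histogram(rep_levels, max_repetition_level);
}
```

For a list column holding the two rows [[a, b, c], [d, e]], the repetition levels are 0, 1, 1, 0, 1, and the resulting histogram is {2, 3}. Bucket 0 counts the entries that start a new row, so it also equals the number of rows.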
std::optional<int64_t> cudf::io::parquet::SizeStatistics::unencoded_byte_array_data_bytes
Number of variable-width bytes stored for the page or column chunk. Set only for columns of the BYTE_ARRAY physical type.
Definition at line 577 of file parquet_schema.hpp.
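To make the quantity concrete, the sketch below sums the payload lengths of a set of byte-array values. It assumes, following the Parquet format's description of this field, that the count excludes the 4-byte length prefix that PLAIN encoding writes before each BYTE_ARRAY value. Both function names are hypothetical, and `std::string` merely stands in for a byte array.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a cudf API): total payload bytes of the values,
// excluding any per-value length prefix.
int64_t unencoded_byte_array_bytes(std::vector<std::string> const& values)
{
  int64_t total = 0;
  for (auto const& v : values) { total += static_cast<int64_t>(v.size()); }
  return total;
}

// Hypothetical helper (not a cudf API): size of the same values under PLAIN
// encoding, which stores a 4-byte length before each value's bytes.
int64_t plain_encoded_bytes(std::vector<std::string> const& values)
{
  return unencoded_byte_array_bytes(values) +
         4 * static_cast<int64_t>(values.size());
}
```

For the values "a", "bcd", and the empty string, the unencoded size is 4 bytes and the PLAIN-encoded size is 16 bytes. Knowing the unencoded size lets a reader estimate how much memory the decoded string data will need without decoding the page.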