cudf::io::parquet::SizeStatistics Struct Reference

Thrift-derived struct containing statistics used to estimate page and column chunk sizes.

#include <parquet_schema.hpp>

Public Attributes

std::optional< int64_t > unencoded_byte_array_data_bytes
 
std::optional< std::vector< int64_t > > repetition_level_histogram
 
std::optional< std::vector< int64_t > > definition_level_histogram
 

Detailed Description

Thrift-derived struct containing statistics used to estimate page and column chunk sizes.

Definition at line 574 of file parquet_schema.hpp.

Member Data Documentation

◆ definition_level_histogram

std::optional<std::vector<int64_t> > cudf::io::parquet::SizeStatistics::definition_level_histogram

Same as repetition_level_histogram, but for definition levels.

This value should not be written if max_definition_level is 0 or 1.

Definition at line 593 of file parquet_schema.hpp.
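As an illustration of the rule above, the following minimal sketch shows when a writer would emit the definition level histogram and one way a reader might use it. Both helper functions are hypothetical and are not part of cudf. The sketch assumes the usual Parquet convention that a value whose definition level equals the maximum is a fully defined (non-null) leaf value.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of cudf: the histogram should only be
// written when max_definition_level is greater than 1.
bool should_write_definition_histogram(int16_t max_definition_level)
{
  return max_definition_level > 1;
}

// Hypothetical helper, not part of cudf: the last histogram element counts
// the values observed at the maximum definition level, which (by assumption)
// are the non-null leaf values. Returns std::nullopt if no histogram exists.
std::optional<int64_t> count_non_null_leaves(
  std::optional<std::vector<int64_t>> const& histogram)
{
  if (!histogram.has_value() || histogram->empty()) { return std::nullopt; }
  return histogram->back();
}
```

Because the field is a std::optional, callers should check for its presence before indexing into it, as count_non_null_leaves does.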

◆ repetition_level_histogram

std::optional<std::vector<int64_t> > cudf::io::parquet::SizeStatistics::repetition_level_histogram

When present, there is expected to be one element for each repetition level (i.e. size = max_repetition_level + 1), where each element is the number of times that repetition level was observed in the data.

This value should not be written if max_repetition_level is 0.

Definition at line 586 of file parquet_schema.hpp.
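To make the histogram layout concrete, here is a minimal sketch of how such a histogram could be built from a list of repetition levels. The function make_level_histogram is hypothetical and is not part of cudf. It assumes levels are already decoded into a host vector, and it returns std::nullopt when the maximum level is 0 to mirror the rule that the value should not be written in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of cudf: counts how often each level occurs.
// The result has max_level + 1 elements; element i is the number of times
// level i was observed. Returns std::nullopt when max_level is 0.
std::optional<std::vector<int64_t>> make_level_histogram(
  std::vector<int16_t> const& levels, int16_t max_level)
{
  assert(max_level >= 0);
  if (max_level == 0) { return std::nullopt; }

  std::vector<int64_t> histogram(static_cast<std::size_t>(max_level) + 1, 0);
  for (auto const level : levels) {
    assert(level >= 0 && level <= max_level);  // levels must be in range
    ++histogram[static_cast<std::size_t>(level)];
  }
  return histogram;
}
```

For example, the levels {0, 1, 1, 2, 0, 1} with a maximum level of 2 give the histogram {2, 3, 1}.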

◆ unencoded_byte_array_data_bytes

std::optional<int64_t> cudf::io::parquet::SizeStatistics::unencoded_byte_array_data_bytes

Number of variable-width bytes stored for the page or column chunk. Should only be set for the BYTE_ARRAY physical type.

Definition at line 577 of file parquet_schema.hpp.
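A minimal sketch of how this value could be computed for a set of values follows. The PhysicalType enum and the unencoded_bytes function are hypothetical stand-ins and are not part of cudf. The sketch assumes, following the Parquet format definition, that the count covers only the value bytes themselves and excludes the per-value length prefixes used by PLAIN encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the Parquet physical types relevant here.
enum class PhysicalType { INT32, BYTE_ARRAY };

// Hypothetical helper, not part of cudf: sums the raw byte lengths of the
// values. Assumption: length prefixes are not counted. Returns std::nullopt
// for any type other than BYTE_ARRAY, since the field should not be set.
std::optional<int64_t> unencoded_bytes(std::vector<std::string> const& values,
                                       PhysicalType type)
{
  if (type != PhysicalType::BYTE_ARRAY) { return std::nullopt; }

  int64_t total = 0;
  for (auto const& value : values) {
    total += static_cast<int64_t>(value.size());
  }
  return total;
}
```

For the values "ab", "" and "cde", the result is 5 bytes, regardless of how the page is later encoded or compressed.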


The documentation for this struct was generated from the following file: