Thrift-derived struct describing file-level metadata.
#include <parquet_schema.hpp>
Public Attributes | |
int32_t | version = 0 |
Version of this file. | |
std::vector< SchemaElement > | schema |
int64_t | num_rows = 0 |
Number of rows in this file. | |
std::vector< RowGroup > | row_groups |
Row groups in this file. | |
std::vector< KeyValue > | key_value_metadata |
Optional key/value metadata. | |
std::string | created_by = "" |
String for application that wrote this file. | |
std::optional< std::vector< ColumnOrder > > | column_orders |
The additional information stored in key_value_metadata can be used during reading to reconstruct the original dataset exactly as it was before conversion to Parquet.
Definition at line 834 of file parquet_schema.hpp.
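As an illustration of how a reader might consult key_value_metadata, the following is a minimal sketch of a lookup by key. The `KeyValue` struct shown here is a simplified stand-in with assumed `key` and `value` string fields, and `find_metadata` is a hypothetical helper, not part of the library.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Simplified stand-in for the library's KeyValue struct (field names are assumed).
struct KeyValue {
  std::string key;
  std::string value;
};

// Hypothetical helper: returns the value stored under `key`,
// or std::nullopt if no entry with that key exists.
std::optional<std::string> find_metadata(std::vector<KeyValue> const& kv, std::string_view key)
{
  for (auto const& entry : kv) {
    if (entry.key == key) { return entry.value; }
  }
  return std::nullopt;
}
```

A caller would pass the key_value_metadata vector and check whether the returned optional holds a value before using it, since the metadata is optional and any given key may be absent.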
std::optional<std::vector<ColumnOrder> > cudf::io::parquet::FileMetaData::column_orders |
Sort order used for the min_value and max_value fields in the Statistics objects and the min_values and max_values fields in the ColumnIndex objects of each column in this file.
Definition at line 852 of file parquet_schema.hpp.
std::vector<SchemaElement> cudf::io::parquet::FileMetaData::schema |
Parquet schema for this file. This schema contains metadata for all the columns. The schema is represented as a tree with a single root. The nodes of the tree are flattened to a list by a depth-first traversal, so the first element is the root. The column metadata contains the path in the schema for each column, which can be used to map columns to nodes in the schema.
Definition at line 841 of file parquet_schema.hpp.
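The depth-first flattening of the schema tree can be sketched as follows. This is only an illustration under stated assumptions: `Node` is a simplified stand-in for SchemaElement that keeps just a name and a child count (assumed here to be called `num_children`), and `collect_leaves` and `leaf_paths` are hypothetical helpers. The sketch walks the flattened list and rebuilds the dotted path of every leaf column, leaving the root's name out of the paths.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for SchemaElement: only the name and child count are kept.
struct Node {
  std::string name;
  int num_children = 0;
};

// Visits the subtree rooted at flat[idx], appending the dotted path of each leaf
// to `out`. Returns the index one past the last node of this subtree.
std::size_t collect_leaves(std::vector<Node> const& flat,
                           std::size_t idx,
                           std::string const& prefix,
                           std::vector<std::string>& out)
{
  Node const& node       = flat.at(idx);  // throws if the list is truncated
  std::string const path = prefix.empty() ? node.name : prefix + "." + node.name;
  std::size_t next       = idx + 1;
  if (node.num_children == 0) {
    out.push_back(path);  // a node without children is a leaf column
    return next;
  }
  // Children follow their parent directly, each occupying a contiguous run.
  for (int i = 0; i < node.num_children; ++i) {
    next = collect_leaves(flat, next, path, out);
  }
  return next;
}

// Returns the paths of all leaf columns in a depth-first flattened schema.
std::vector<std::string> leaf_paths(std::vector<Node> const& flat)
{
  std::vector<std::string> out;
  if (flat.empty()) { return out; }
  // flat[0] is the root; its name is not included in the column paths.
  std::size_t next = 1;
  for (int i = 0; i < flat[0].num_children; ++i) {
    next = collect_leaves(flat, next, "", out);
  }
  return out;
}
```

For a flattened list holding a root with two children, `id` and a group `point` that itself has children `x` and `y`, `leaf_paths` yields `id`, `point.x`, and `point.y`, in that order.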