Parquet I/O interfaces.
Classes | |
struct | file_header_s |
Struct that describes the Parquet file data header. | |
struct | file_ender_s |
Struct that describes the Parquet file data postscript. | |
struct | DecimalType |
Struct that describes the decimal logical type annotation. | |
struct | TimeUnit |
Time units for temporal logical types. | |
struct | TimeType |
Struct that describes the time logical type annotation. | |
struct | TimestampType |
Struct that describes the timestamp logical type annotation. | |
struct | IntType |
Struct that describes the integer logical type annotation. | |
struct | LogicalType |
Struct that describes the logical type annotation. | |
struct | ColumnOrder |
Union to specify the order used for the min_value and max_value fields for a column. | |
struct | SchemaElement |
Struct for describing an element/field in the Parquet format schema. | |
struct | Statistics |
Thrift-derived struct describing column chunk statistics. | |
struct | SizeStatistics |
Thrift-derived struct containing statistics used to estimate page and column chunk sizes. | |
struct | PageLocation |
Thrift-derived struct describing page location information stored in the offset index. | |
struct | OffsetIndex |
Thrift-derived struct describing the offset index. | |
struct | ColumnIndex |
Thrift-derived struct describing the column index. | |
struct | PageEncodingStats |
Thrift-derived struct describing page encoding statistics. | |
struct | SortingColumn |
Thrift-derived struct describing column sort order. | |
struct | ColumnChunkMetaData |
Thrift-derived struct describing a column chunk. | |
struct | BloomFilterAlgorithm |
The algorithm used in the Bloom filter. | |
struct | BloomFilterHash |
The hash function used in the Bloom filter. | |
struct | BloomFilterCompression |
The compression used in the Bloom filter. | |
struct | BloomFilterHeader |
Bloom filter header struct. | |
struct | ColumnChunk |
Thrift-derived struct describing a chunk of data for a particular column. | |
struct | RowGroup |
Thrift-derived struct describing a group of row data. | |
struct | KeyValue |
Thrift-derived struct describing a key-value pair, for user metadata. | |
struct | FileMetaData |
Thrift-derived struct describing file-level metadata. | |
struct | DataPageHeader |
Thrift-derived struct describing the header for a data page. | |
struct | DataPageHeaderV2 |
Thrift-derived struct describing the header for a V2 data page. | |
struct | DictionaryPageHeader |
Thrift-derived struct describing the header for a dictionary page. | |
struct | PageHeader |
Thrift-derived struct describing the page header. | |
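As an illustration of what the file header and postscript structs describe, the sketch below checks the framing defined by the Parquet format: a file starts with the 4-byte magic `PAR1` and ends with a 4-byte little-endian footer length followed by the same magic. The names `header_sketch`, `ender_sketch`, `load_le32`, and `footer_length` are hypothetical and introduced only for this example; the field layout is an assumption and may not match the actual definitions of `file_header_s` and `file_ender_s`. It requires C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-ins for the header/postscript structs listed above.
// The field names and layout are assumptions made for this sketch.
struct header_sketch {
  uint32_t magic;  // expected to hold the bytes "PAR1"
};
struct ender_sketch {
  uint32_t footer_len;  // size in bytes of the serialized file metadata
  uint32_t magic;       // expected to hold the bytes "PAR1"
};

// "PAR1" interpreted as a little-endian 32-bit integer.
constexpr uint32_t parquet_magic =
  uint32_t{'P'} | (uint32_t{'A'} << 8) | (uint32_t{'R'} << 16) | (uint32_t{'1'} << 24);

// Reads four bytes as a little-endian 32-bit integer, independent of host endianness.
inline uint32_t load_le32(const uint8_t* p)
{
  assert(p != nullptr);
  return uint32_t{p[0]} | (uint32_t{p[1]} << 8) | (uint32_t{p[2]} << 16) |
         (uint32_t{p[3]} << 24);
}

// Returns the footer length when the buffer begins and ends with the magic
// bytes and the footer fits between header and postscript; otherwise nullopt.
inline std::optional<uint32_t> footer_length(const std::vector<uint8_t>& file)
{
  constexpr std::size_t header_size = 4;  // magic only
  constexpr std::size_t ender_size  = 8;  // footer length + magic
  if (file.size() < header_size + ender_size) { return std::nullopt; }

  header_sketch header{load_le32(file.data())};
  const uint8_t* tail = file.data() + file.size() - ender_size;
  ender_sketch ender{load_le32(tail), load_le32(tail + 4)};

  if (header.magic != parquet_magic || ender.magic != parquet_magic) { return std::nullopt; }
  if (ender.footer_len > file.size() - header_size - ender_size) { return std::nullopt; }
  return ender.footer_len;
}
```

A reader would typically use the returned length to locate the serialized file metadata immediately before the 8-byte postscript.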
Enumerations | |
enum class | TypeKind : int8_t { UNDEFINED_TYPE = -1 , BOOLEAN = 0 , INT32 = 1 , INT64 = 2 , INT96 = 3 , FLOAT = 4 , DOUBLE = 5 , BYTE_ARRAY = 6 , FIXED_LEN_BYTE_ARRAY = 7 } |
Basic data types in Parquet, which determine how data is physically stored. | |
enum class | Type : int8_t { UNDEFINED = -1 , BOOLEAN = 0 , INT32 = 1 , INT64 = 2 , INT96 = 3 , FLOAT = 4 , DOUBLE = 5 , BYTE_ARRAY = 6 , FIXED_LEN_BYTE_ARRAY = 7 } |
Basic data types in Parquet, which determine how data is physically stored. | |
enum class | ConvertedType : int8_t { UNKNOWN = -1 , UTF8 = 0 , MAP = 1 , MAP_KEY_VALUE = 2 , LIST = 3 , ENUM = 4 , DECIMAL = 5 , DATE = 6 , TIME_MILLIS = 7 , TIME_MICROS = 8 , TIMESTAMP_MILLIS = 9 , TIMESTAMP_MICROS = 10 , UINT_8 = 11 , UINT_16 = 12 , UINT_32 = 13 , UINT_64 = 14 , INT_8 = 15 , INT_16 = 16 , INT_32 = 17 , INT_64 = 18 , JSON = 19 , BSON = 20 , INTERVAL = 21 , NA = 25 } |
High-level data types in Parquet, which determine how data is logically interpreted. | |
enum class | Encoding : uint8_t { PLAIN = 0 , GROUP_VAR_INT = 1 , PLAIN_DICTIONARY = 2 , RLE = 3 , BIT_PACKED = 4 , DELTA_BINARY_PACKED = 5 , DELTA_LENGTH_BYTE_ARRAY = 6 , DELTA_BYTE_ARRAY = 7 , RLE_DICTIONARY = 8 , BYTE_STREAM_SPLIT = 9 , NUM_ENCODINGS = 10 } |
Encoding types for the actual data stream. | |
enum class | Compression : uint8_t { UNCOMPRESSED = 0 , SNAPPY = 1 , GZIP = 2 , LZO = 3 , BROTLI = 4 , LZ4 = 5 , ZSTD = 6 , LZ4_RAW = 7 } |
Compression codec used for compressed data pages. | |
enum class | FieldRepetitionType : int8_t { UNSPECIFIED = -1 , REQUIRED = 0 , OPTIONAL = 1 , REPEATED = 2 } |
Repetition type of a schema field: required, optional, or repeated. | |
enum class | PageType : uint8_t { DATA_PAGE = 0 , INDEX_PAGE = 1 , DICTIONARY_PAGE = 2 , DATA_PAGE_V2 = 3 } |
Types of pages. | |
enum class | BoundaryOrder : uint8_t { UNORDERED = 0 , ASCENDING = 1 , DESCENDING = 2 } |
Enum to annotate whether lists of min/max elements inside ColumnIndex are ordered and if so, in which direction. | |
enum class | FieldType : uint8_t { BOOLEAN_TRUE = 1 , BOOLEAN_FALSE = 2 , I8 = 3 , I16 = 4 , I32 = 5 , I64 = 6 , DOUBLE = 7 , BINARY = 8 , LIST = 9 , SET = 10 , MAP = 11 , STRUCT = 12 , UUID = 13 } |
Thrift compact protocol struct field types. | |
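The `FieldType` values above are the type codes that the Thrift compact protocol stores in the low four bits of a field header byte; the high four bits carry the field id delta, and a delta of zero means the field id follows separately as a zigzag varint. A zero byte marks the end of a struct. The sketch below decodes only the single-byte short form. The names `FieldTypeSketch`, `field_header_sketch`, `decode_short_field_header`, and `is_stop` are hypothetical and are not part of the interfaces listed on this page.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of the enumerator values listed above, renamed to make clear
// that this is an illustration and not the library's own enum.
enum class FieldTypeSketch : uint8_t {
  BOOLEAN_TRUE  = 1,
  BOOLEAN_FALSE = 2,
  I8            = 3,
  I16           = 4,
  I32           = 5,
  I64           = 6,
  DOUBLE        = 7,
  BINARY        = 8,
  LIST          = 9,
  SET           = 10,
  MAP           = 11,
  STRUCT        = 12,
  UUID          = 13
};

// Hypothetical result type for the decoded header byte.
struct field_header_sketch {
  int field_id_delta;    // 1..15 in short form; 0 means the field id follows as a zigzag varint
  FieldTypeSketch type;  // type code taken from the low nibble
};

// A zero byte terminates the field list of a struct.
inline bool is_stop(uint8_t byte) { return byte == 0; }

// Splits a field header byte into its delta (high nibble) and type (low nibble).
// The caller is expected to check is_stop() first.
inline field_header_sketch decode_short_field_header(uint8_t byte)
{
  assert(!is_stop(byte));
  return {byte >> 4, static_cast<FieldTypeSketch>(byte & 0x0f)};
}
```

For example, the byte `0x15` decodes to a delta of 1 with type `I32`, so the field id is the previous field id plus one.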