File format
===========

  <beginning_of_file>
  [data block 1]
  [data block 2]
  ...
  [data block N]
  [meta block 1]
  ...
  [meta block K]
  [metaindex block]
  [index block]
  [Footer]        (fixed size; starts at file_size - sizeof(Footer))
  <end_of_file>

The file contains internal pointers. Each such pointer is called
a BlockHandle and contains the following information:
  offset:   varint64
  size:     varint64
See https://developers.google.com/protocol-buffers/docs/encoding#varints
for an explanation of varint64 format.

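As an illustration of the BlockHandle encoding described above, here is a
minimal C++ sketch. The names BlockHandle (as a plain struct),
AppendVarint64, ReadVarint64, EncodeHandle and DecodeHandle are chosen for
this example only and are not taken from the library's sources. The sketch
assumes the standard varint rule from the linked page: 7 payload bits per
byte, least-significant group first, high bit set on all bytes but the last.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative stand-in for a BlockHandle; field names follow the text.
struct BlockHandle {
  uint64_t offset = 0;
  uint64_t size = 0;
};

// Appends v to dst as a varint64 (1 to 10 bytes).
void AppendVarint64(std::string* dst, uint64_t v) {
  while (v >= 0x80) {
    dst->push_back(static_cast<char>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  dst->push_back(static_cast<char>(v));
}

// Reads one varint64 starting at data[*pos] and advances *pos past it.
// Returns false if the input is truncated or runs past 10 bytes.
bool ReadVarint64(const std::string& data, size_t* pos, uint64_t* out) {
  uint64_t result = 0;
  for (unsigned shift = 0; shift <= 63 && *pos < data.size(); shift += 7) {
    const uint64_t byte = static_cast<unsigned char>(data[*pos]);
    ++*pos;
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      *out = result;
      return true;
    }
  }
  return false;
}

// Encodes offset followed by size.
std::string EncodeHandle(const BlockHandle& h) {
  std::string s;
  AppendVarint64(&s, h.offset);
  AppendVarint64(&s, h.size);
  assert(s.size() <= 20);  // two varint64 values, at most 10 bytes each
  return s;
}

// Decodes a handle starting at data[*pos]; advances *pos on success.
bool DecodeHandle(const std::string& data, size_t* pos, BlockHandle* h) {
  return ReadVarint64(data, pos, &h->offset) &&
         ReadVarint64(data, pos, &h->size);
}
```

Small offsets and sizes take only a few bytes this way, which is why the
footer described later needs padding to reach a fixed length.
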
(1) The sequence of key/value pairs in the file is stored in sorted
order and partitioned into a sequence of data blocks. These blocks
come one after another at the beginning of the file. Each data block
is formatted according to the code in block_builder.cc, and then
optionally compressed.

(2) After the data blocks we store a bunch of meta blocks. The
supported meta block types are described below. More meta block types
may be added in the future. Each meta block is again formatted using
block_builder.cc and then optionally compressed.

(3) A "metaindex" block. It contains one entry for every other meta
block where the key is the name of the meta block and the value is a
BlockHandle pointing to that meta block.

(4) An "index" block. This block contains one entry per data block,
where the key is a string >= the last key in that data block and < the
first key in the next data block. The value is the BlockHandle for the
data block.

(5) At the very end of the file is a fixed-length footer that contains
the BlockHandles of the metaindex and index blocks as well as a magic number.
    metaindex_handle: char[p];      // Block handle for metaindex
    index_handle:     char[q];      // Block handle for index
    padding:          char[40-p-q]; // zeroed bytes to make fixed length
                                    // (40==2*BlockHandle::kMaxEncodedLength)
    magic:            fixed64;      // == 0xdb4775248b80fb57 (little-endian)

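The footer layout above can be sketched in C++ as follows. This is an
illustration under stated assumptions, not the library's code: the names
Handle, EncodeFooter, DecodeFooter and the varint helpers are made up for
the example, and the footer is taken to be 48 bytes (40 bytes of handles
plus padding, then the 8-byte magic), as the field list implies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr size_t kHandleAreaSize = 40;  // 2 * max encoded handle length
constexpr size_t kFooterSize = kHandleAreaSize + 8;
constexpr uint64_t kMagic = 0xdb4775248b80fb57ull;

// Illustrative handle type: offset and size, each stored as a varint64.
struct Handle {
  uint64_t offset = 0;
  uint64_t size = 0;
};

static void AppendVarint64(std::string* dst, uint64_t v) {
  while (v >= 0x80) {
    dst->push_back(static_cast<char>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  dst->push_back(static_cast<char>(v));
}

// Reads a varint64 from src[*pos], never reading at or beyond `limit`.
static bool ReadVarint64(const std::string& src, size_t* pos, size_t limit,
                         uint64_t* out) {
  uint64_t result = 0;
  for (unsigned shift = 0; shift <= 63 && *pos < limit; shift += 7) {
    const uint64_t byte = static_cast<unsigned char>(src[(*pos)++]);
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      *out = result;
      return true;
    }
  }
  return false;
}

// Builds the footer: two handles, zero padding up to 40 bytes, then magic.
std::string EncodeFooter(const Handle& metaindex, const Handle& index) {
  std::string out;
  AppendVarint64(&out, metaindex.offset);
  AppendVarint64(&out, metaindex.size);
  AppendVarint64(&out, index.offset);
  AppendVarint64(&out, index.size);
  out.resize(kHandleAreaSize, '\0');  // padding: char[40-p-q]
  for (int i = 0; i < 8; ++i) {       // magic, least-significant byte first
    out.push_back(static_cast<char>((kMagic >> (8 * i)) & 0xff));
  }
  assert(out.size() == kFooterSize);
  return out;
}

// Parses a footer; returns false on wrong length, bad magic or bad handles.
bool DecodeFooter(const std::string& footer, Handle* metaindex,
                  Handle* index) {
  if (footer.size() != kFooterSize) return false;
  uint64_t magic = 0;
  for (int i = 0; i < 8; ++i) {
    const uint64_t b = static_cast<unsigned char>(footer[kHandleAreaSize + i]);
    magic |= b << (8 * i);
  }
  if (magic != kMagic) return false;
  size_t pos = 0;
  return ReadVarint64(footer, &pos, kHandleAreaSize, &metaindex->offset) &&
         ReadVarint64(footer, &pos, kHandleAreaSize, &metaindex->size) &&
         ReadVarint64(footer, &pos, kHandleAreaSize, &index->offset) &&
         ReadVarint64(footer, &pos, kHandleAreaSize, &index->size);
}
```

A reader would typically read the last 48 bytes of the file, check the
magic first, and only then follow the two handles.
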
"filter" Meta Block
-------------------

If a "FilterPolicy" was specified when the database was opened, a
filter block is stored in each table. The "metaindex" block contains
an entry that maps from "filter.<N>" to the BlockHandle for the filter
block where "<N>" is the string returned by the filter policy's
"Name()" method.

The filter block stores a sequence of filters, where filter i contains
the output of FilterPolicy::CreateFilter() on all keys that are stored
in a block whose file offset falls within the range

    [ i*base ... (i+1)*base-1 ]

Currently, "base" is 2KB. So for example, if blocks X and Y start in
the range [ 0KB .. 2KB-1 ], all of the keys in X and Y will be
converted to a filter by calling FilterPolicy::CreateFilter(), and the
resulting filter will be stored as the first filter in the filter
block.

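The range rule above amounts to dividing the block's file offset by
"base". A small C++ sketch, assuming base is a power of two (2KB here) so
the division becomes a right shift; the names kFilterBaseLg, kFilterBase
and FilterIndexForBlock are illustrative only:

```cpp
#include <cstdint>

constexpr unsigned kFilterBaseLg = 11;  // lg(2KB), per the text
constexpr uint64_t kFilterBase = uint64_t{1} << kFilterBaseLg;

// Returns the index i of the filter covering a block that starts at
// block_offset, i.e. the i with i*base <= block_offset <= (i+1)*base-1.
constexpr uint64_t FilterIndexForBlock(uint64_t block_offset) {
  return block_offset >> kFilterBaseLg;
}
```

With base = 2KB, blocks starting at offsets 0 and 2047 both map to filter
0, and a block starting at offset 2048 maps to filter 1.
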
The filter block is formatted as follows:

    [filter 0]
    [filter 1]
    [filter 2]
    ...
    [filter N-1]

    [offset of filter 0]                  : 4 bytes
    [offset of filter 1]                  : 4 bytes
    [offset of filter 2]                  : 4 bytes
    ...
    [offset of filter N-1]                : 4 bytes

    [offset of beginning of offset array] : 4 bytes
    lg(base)                              : 1 byte

The offset array at the end of the filter block allows efficient
mapping from a data block offset to the corresponding filter.

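To make the lookup concrete, here is a hedged C++ sketch of locating the
filter for a data block inside a filter block with the layout above. The
function names ReadFixed32 and FindFilter are invented for this example.
It assumes the 4-byte fields are little-endian (the text above does not
say so explicitly for these fields) and that filters are stored back to
back, so filter i ends where filter i+1 begins.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: 4-byte fields are stored least-significant byte first.
static uint32_t ReadFixed32(const std::string& s, size_t pos) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) {
    v |= static_cast<uint32_t>(static_cast<unsigned char>(s[pos + i]))
         << (8 * i);
  }
  return v;
}

// Finds the filter covering the data block at block_offset inside the
// filter block `contents`. On success stores the filter bytes in *filter
// and returns true; returns false if the block is malformed or no filter
// exists for that offset.
bool FindFilter(const std::string& contents, uint64_t block_offset,
                std::string* filter) {
  const size_t n = contents.size();
  if (n < 5) return false;  // need the array-offset field and lg(base)
  const unsigned base_lg = static_cast<unsigned char>(contents[n - 1]);
  if (base_lg >= 64) return false;
  const size_t array_start = ReadFixed32(contents, n - 5);
  if (array_start > n - 5) return false;
  const size_t num_filters = (n - 5 - array_start) / 4;

  const uint64_t index = block_offset >> base_lg;
  if (index >= num_filters) return false;

  const size_t entry = array_start + 4 * static_cast<size_t>(index);
  const size_t start = ReadFixed32(contents, entry);
  // For the last filter, the next 4-byte word is the array-offset field,
  // whose value is exactly where the filter data ends.
  const size_t limit = ReadFixed32(contents, entry + 4);
  if (start > limit || limit > array_start) return false;

  filter->assign(contents, start, limit - start);
  return true;
}
```

Because the trailing fields sit at fixed positions from the end of the
block, the lookup needs no scan over the filters themselves: two 4-byte
reads from the offset array are enough.
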
"stats" Meta Block
------------------

This meta block contains a bunch of stats. The key is the name
of the statistic. The value contains the statistic.
TODO(postrelease): record following stats.
  data size
  index size
  key size (uncompressed)
  value size (uncompressed)
  number of entries
  number of data blocks