journal: Add support for not deduplicating specific fields in compact mode
For fields that are almost always unique, allocating a separate
Data object ends up adding a noticeable amount of overhead. By
adding support for storing these fields inline in the entry object,
we reduce the space required to store them.
If a field is marked as unique, we don't allocate a Data object for
it but store it inline in the Entry instead. The entry payload for
a single unique field has the following format:
- 1 byte for flags (compressed, etc)
- 4 bytes for size
- data (optionally compressed)
Each unique field is serialized to this format and the concatenation
of all the serialized unique fields becomes the entry payload.
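Below is a minimal sketch of a serializer for this layout. It is an
illustration only: serialize_inline_field() and INLINE_FIELD_COMPRESSED
are invented names, the size is copied in host byte order for brevity,
and the actual on-disk encoding may differ.

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    enum {
            INLINE_FIELD_COMPRESSED = 1 << 0,  /* hypothetical flag bit */
    };

    /* Append one unique field ("FIELD=value") to the entry payload buffer:
     * 1 byte of flags, 4 bytes of size, then the (optionally compressed) data. */
    static int serialize_inline_field(
                    uint8_t **buf, size_t *size,
                    const void *data, uint32_t data_size,
                    uint8_t flags) {

            uint8_t *p = realloc(*buf, *size + 1 + sizeof(uint32_t) + data_size);
            if (!p)
                    return -ENOMEM;

            p[*size] = flags;                                           /* 1 byte: flags */
            memcpy(p + *size + 1, &data_size, sizeof(uint32_t));        /* 4 bytes: size */
            memcpy(p + *size + 1 + sizeof(uint32_t), data, data_size);  /* payload data  */

            *buf = p;
            *size += 1 + sizeof(uint32_t) + data_size;
            return 0;
    }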
When iterating over entries, we first iterate over all the deduplicated
fields via the trie. Once those are done, we continue with the inlined
fields. journal_file_entry_next()'s implementation is extended to support
this. One change in its API is that ret_offset, if provided, is set to
zero for inline fields.
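As a counterpart to the serializer above, a reader could walk the
concatenated inline fields of an entry as sketched below. This is purely
illustrative and not the actual journal_file_entry_next() code;
for_each_inline_field() is a made-up helper.

    #include <errno.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Walk the concatenated inline payload and visit each unique field. */
    static int for_each_inline_field(const uint8_t *payload, size_t payload_size) {
            size_t off = 0;

            while (off < payload_size) {
                    uint32_t size;
                    uint8_t flags;

                    /* Each field starts with 1 byte of flags and a 4-byte size. */
                    if (payload_size - off < 1 + sizeof(uint32_t))
                            return -EBADMSG;

                    flags = payload[off];
                    memcpy(&size, payload + off + 1, sizeof(uint32_t));
                    off += 1 + sizeof(uint32_t);

                    if (payload_size - off < size)
                            return -EBADMSG;

                    /* (Decompression would happen here if the compressed flag is set.) */
                    printf("inline field (flags=0x%02x): %.*s\n",
                           flags, (int) size, (const char *) (payload + off));

                    off += size;
            }

            return 0;
    }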
The list of fields to not deduplicate can be configured with the
$SYSTEMD_JOURNAL_UNIQUE_FIELDS environment variable. If it's not set, we
default to deduplicating all fields except MESSAGE.
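A rough sketch of how such a list could be read is shown below.
parse_unique_fields() is a hypothetical helper and the colon separator is
an assumption; the actual syntax accepted by $SYSTEMD_JOURNAL_UNIQUE_FIELDS
may differ.

    #include <stdlib.h>
    #include <string.h>

    /* Return a NULL-terminated array of field names that should not be
     * deduplicated. Falls back to just "MESSAGE" when the variable is unset. */
    static char **parse_unique_fields(void) {
            const char *e = getenv("SYSTEMD_JOURNAL_UNIQUE_FIELDS");
            char **fields = NULL, *copy, *tok, *state = NULL;
            size_t n = 0;

            if (!e) {
                    /* Default from this commit: only MESSAGE is stored inline. */
                    fields = calloc(2, sizeof(char *));
                    if (fields)
                            fields[0] = strdup("MESSAGE");
                    return fields;
            }

            copy = strdup(e);
            if (!copy)
                    return NULL;

            /* Assumed colon-separated list, e.g. "MESSAGE:SYSLOG_TIMESTAMP". */
            for (tok = strtok_r(copy, ":", &state); tok; tok = strtok_r(NULL, ":", &state)) {
                    char **t = realloc(fields, (n + 2) * sizeof(char *));
                    if (!t)
                            break;
                    fields = t;
                    fields[n++] = strdup(tok);
                    fields[n] = NULL;
            }

            free(copy);
            return fields;
    }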
With this change, we need to have the field object available before we
append the data object, so field object allocation is moved out of
journal_file_append_data() and into journal_file_append_entry() and
journal_file_copy_entry() instead.
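The sketch below models that ordering with invented types and helpers
(Field, field_for_data(), append_entry_fields()), not the real systemd
structures: the Field record for each item is resolved up front, and its
unique flag then selects either the inline path or the deduplicated
Data-object path.

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    typedef struct Field {
            const char *name;
            bool unique;  /* true: store inline, do not deduplicate */
    } Field;

    /* Look up the Field record matching the name part of a "FIELD=value" item. */
    static const Field *field_for_data(const Field *fields, size_t n_fields, const char *item) {
            const char *eq = strchr(item, '=');
            size_t len = eq ? (size_t) (eq - item) : strlen(item);

            for (size_t i = 0; i < n_fields; i++)
                    if (strlen(fields[i].name) == len && memcmp(fields[i].name, item, len) == 0)
                            return fields + i;

            return NULL;
    }

    /* Decide per item which storage path an entry append would take. */
    static void append_entry_fields(const Field *fields, size_t n_fields,
                                    char *const *items, size_t n_items) {
            for (size_t i = 0; i < n_items; i++) {
                    const Field *f = field_for_data(fields, n_fields, items[i]);

                    printf("%s -> %s\n", items[i],
                           f && f->unique ? "stored inline in the Entry"
                                          : "deduplicated via a Data object");
            }
    }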
Before:
OBJECT TYPE         ENTRIES    SIZE
Unused                    0      0B
Data                3521895  587.0M
Field                  3140  169.4K
Entry               3499118  240.2M
Data Hash Table          14   49.7M
Field Hash Table         14   73.0K
Entry Array          577350  499.5M
Tag                       0      0B
Trie Node           5767903  220.0M
Trie Hash Table          14   74.6M
Total              13369448    1.6G
After:
OBJECT TYPE         ENTRIES    SIZE
Unused                    0      0B
Data                1022925   95.3M
Field                  2808  151.3K
Entry               3499976  667.7M
Data Hash Table          13   46.2M
Field Hash Table         13   67.8K
Entry Array          492907  576.7M
Tag                       0      0B
Trie Node           1758648   67.0M
Trie Hash Table          13   69.3M
Total               6777303    1.4G