2aa2af1 journal: Add support for not deduplicating specific fields in compact mode

Authored and Committed by daandemeyer 2 years ago
    journal: Add support for not deduplicating specific fields in compact mode
    
    For fields that are almost always unique, allocating a separate
    Data object ends up adding a noticeable amount of overhead. By
    adding support for storing these fields inline in the entry object,
    we reduce the space required to store these unique fields.
    
    If a field is marked as unique, we don't allocate a Data object and
    store it inline in the Entry instead. The entry payload for storing
    a single unique field has the following format:
    
    - 1 byte for flags (compressed, etc)
    - 4 bytes for size
    - data (optionally compressed)
    
    Each unique field is serialized to this format and the concatenation
    of all the serialized unique fields becomes the entry payload.
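
As a rough illustration of the layout above, the following sketch appends one field in the "flags, size, data" format to a payload buffer. The names inline_field_append() and INLINE_FIELD_COMPRESSED are hypothetical, and storing the 4-byte size in little-endian order is an assumption; the commit message specifies neither.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit; the actual flag values are not given in the commit message. */
#define INLINE_FIELD_COMPRESSED 0x01u

/* Bytes needed for one serialized field: 1 (flags) + 4 (size) + data. */
static size_t inline_field_size(size_t data_len) {
        return 1 + 4 + data_len;
}

/* Appends one field to buf at offset *pos and advances *pos.
 * Returns 0 on success, -1 if the field does not fit in the buffer. */
static int inline_field_append(uint8_t *buf, size_t cap, size_t *pos,
                               uint8_t flags, const void *data, uint32_t len) {
        assert(buf && pos && data);

        if (*pos > cap || cap - *pos < inline_field_size(len))
                return -1;

        uint8_t *p = buf + *pos;
        p[0] = flags;
        /* Size stored byte by byte in little-endian order (assumption). */
        p[1] = (uint8_t) (len & 0xff);
        p[2] = (uint8_t) ((len >> 8) & 0xff);
        p[3] = (uint8_t) ((len >> 16) & 0xff);
        p[4] = (uint8_t) ((len >> 24) & 0xff);
        memcpy(p + 5, data, len);

        *pos += inline_field_size(len);
        return 0;
}
```

Calling inline_field_append() once per unique field with the same buffer and offset yields the concatenated entry payload described above.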
    
    When iterating over entries, we first iterate over all the deduplicated
    fields via the trie. Once those are done, we continue with the inlined
    fields. journal_file_entry_next()'s implementation is extended to support
    this. One change in its API is that ret_offset, if provided, is set to
    zero for inline fields.
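
The inline half of that iteration might look roughly like the sketch below, which walks a payload one serialized field at a time. It is not the actual journal_file_entry_next() code: InlineField and inline_field_next() are hypothetical names, and the little-endian size encoding is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one inline field; not the journal's actual type. */
typedef struct InlineField {
        uint8_t flags;        /* compressed, etc. */
        uint32_t size;        /* length of data in bytes */
        const uint8_t *data;  /* points into the payload, not copied */
} InlineField;

/* Reads the field at offset *pos and advances *pos past it.
 * Returns 1 if a field was read, 0 at the end of the payload,
 * and -1 if the payload is truncated. */
static int inline_field_next(const uint8_t *payload, size_t len,
                             size_t *pos, InlineField *ret) {
        assert(payload && pos && ret);

        if (*pos == len)
                return 0;
        if (*pos > len || len - *pos < 5)
                return -1;

        const uint8_t *p = payload + *pos;
        /* 4-byte size, decoded as little-endian (assumption). */
        uint32_t size = (uint32_t) p[1] |
                        (uint32_t) p[2] << 8 |
                        (uint32_t) p[3] << 16 |
                        (uint32_t) p[4] << 24;
        if (len - *pos - 5 < size)
                return -1;

        ret->flags = p[0];
        ret->size = size;
        ret->data = p + 5;
        *pos += 5 + (size_t) size;
        return 1;
}
```

A caller would loop until inline_field_next() returns 0. Inline fields have no Data object of their own, so there is no object offset to report for them.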
    
    The list of fields to not deduplicate can be configured with the
    $SYSTEMD_JOURNAL_UNIQUE_FIELDS environment variable. If it's not set, we
    default to deduplicating all fields except MESSAGE.
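
A minimal sketch of how that lookup could work is shown below. The commit message does not say how the variable's value is delimited, so the comma separator here is purely an assumption, and field_list_contains() and field_is_unique() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns true if `field` appears as a whole item in `list`.
 * The ',' separator is an assumption; the commit message does not specify it. */
static bool field_list_contains(const char *list, const char *field) {
        assert(list && field);

        size_t n = strlen(field);
        if (n == 0)
                return false;

        for (const char *p = list; *p; ) {
                size_t k = strcspn(p, ",");   /* length of the current item */
                if (k == n && memcmp(p, field, n) == 0)
                        return true;
                p += k;
                if (*p == ',')
                        p++;
        }
        return false;
}

/* Decides whether a field is stored inline instead of being deduplicated.
 * Falls back to "MESSAGE" when the variable is unset, as the text describes. */
static bool field_is_unique(const char *field) {
        const char *e = getenv("SYSTEMD_JOURNAL_UNIQUE_FIELDS");
        return field_list_contains(e ? e : "MESSAGE", field);
}
```

Matching whole items rather than prefixes matters here: with the default list, MESSAGE is stored inline while MESSAGE_ID is still deduplicated.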
    
    With this change, we need to have the field object available before we
    append the data object, so we move field object allocation out of
    journal_file_append_data() and into journal_file_append_entry() and
    journal_file_copy_entry() instead.
    
    Before:
    
    OBJECT TYPE      ENTRIES  SIZE
    Unused           0        0B
    Data             3521895  587.0M
    Field            3140     169.4K
    Entry            3499118  240.2M
    Data Hash Table  14       49.7M
    Field Hash Table 14       73.0K
    Entry Array      577350   499.5M
    Tag              0        0B
    Trie Node        5767903  220.0M
    Trie Hash Table  14       74.6M
    Total            13369448 1.6G
    
    After:
    
    OBJECT TYPE      ENTRIES SIZE
    Unused           0       0B
    Data             1022925 95.3M
    Field            2808    151.3K
    Entry            3499976 667.7M
    Data Hash Table  13      46.2M
    Field Hash Table 13      67.8K
    Entry Array      492907  576.7M
    Tag              0       0B
    Trie Node        1758648 67.0M
    Trie Hash Table  13      69.3M
    Total            6777303 1.4G