bkabrda / pagure

Forked from pagure 6 years ago

75c9f6d Remove some funky bits from inline pattern regexes

Authored and Committed by adamwill 6 years ago
    Remove some funky bits from inline pattern regexes
    
    The bits removed here are *very* weird. They are inverted sets
    which will match anything but a literal pipe or any character in
    the \w class. The following bit, `(?<!\w)`, is a negative
    lookbehind assertion which means 'match unless the previous
    character is in the \w class'. So these two actually apply to the
    same character and are almost entirely redundant.
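
The redundancy described above can be checked with a small sketch. The patterns here are simplified stand-ins, not the actual pagure regexes: only the prefix before the `#` is modelled, and the names `OLD_PREFIX`, `SET_ONLY` and `matched_text` are made up for illustration.

```python
import re

# Simplified stand-ins (assumption): inverted set + negative lookbehind,
# versus the inverted set on its own.
OLD_PREFIX = re.compile(r"[^|\w](?<!\w)#(\d+)")
SET_ONLY = re.compile(r"[^|\w]#(\d+)")


def matched_text(pattern, text):
    """Return the full text consumed by the first match, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None


samples = ["see #23", "Issue.#23", "abc#23", "a|#23", "#23"]

for text in samples:
    # Both patterns give the same answer for every sample, because the
    # lookbehind inspects the very character the set just consumed.
    print(repr(text), matched_text(OLD_PREFIX, text), matched_text(SET_ONLY, text))
```

Note that in this sketch the match always includes the character before the `#`, and a bare `#23` at the start of the string does not match at all, since the set needs a character to consume.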
    
    I think the weird set was *meant* to be something like (^|\W),
    and the approximate idea here was to match 'start of string
    or any non-word character followed by a #'. If so, then in fact
    just removing the wacky inverted set is all we need to do, as
    negative lookbehind assertions are allowed to match at the start
    of the string. (There is actually a whole hidden complication
    here where markdown, behind the scenes, adds some more bits to
    the pattern we feed it to form a complete regex, but it happens
    that everything works OK with that). The tests should suffice to
    demonstrate that these regexes still behave as we expect. These
    regexes originally came from @ralph, who says he's OK with this
    change.
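
As a rough illustration of why dropping the set is enough, the sketch below uses a simplified pattern (an assumption, not the real pagure regex) with only the negative lookbehind left. The names `NEW_PREFIX` and `find_ids` are hypothetical.

```python
import re

# Simplified stand-in (assumption): only the negative lookbehind remains.
NEW_PREFIX = re.compile(r"(?<!\w)#(\d+)")


def find_ids(text):
    """Return the issue numbers the simplified pattern picks out of text."""
    return [m.group(1) for m in NEW_PREFIX.finditer(text)]


# A negative lookbehind succeeds when there is no previous character,
# so a reference at the very start of the string is matched.
print(find_ids("#23 is fixed"))            # ['23']
print(find_ids("see #23"))                 # ['23']
print(find_ids("Issue.#23, see also #7"))  # ['23', '7']
# A word character right before the '#' still blocks the match.
print(find_ids("abc#23"))                  # []
```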
    
    The changes to `handleMatch()` are all to do with the details of
    what markdown actually does with these pattern classes. In short,
    markdown expects that the pattern's `handleMatch()` method spits
    out something that corresponds to all the characters the class'
    regex pattern consumes. The weird sets we remove in this commit
    actually consume a character immediately before the `#`, and the
    space in these `text` values is intended to replace that
    character. The modified regexes no longer consume that character,
    so `handleMatch()` no longer needs to replace it.
    
    This actually fixes a subtle bug, because the character matched
    by the set was *not* in fact always a space. This meant that if
    you wrote something like 'Issue.#23' in a comment (for some
    reason...), it would be turned into 'Issue #23', with the '#23'
    linkified - the '.' was turned into a space. With this change,
    that bug no longer happens.
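
The bug can be imitated outside markdown with plain `re.sub`. This is only a sketch under stated assumptions: the patterns are simplified, `link()` and its `/issue/...` URL are invented for illustration, and the two render functions merely mimic a handler that does or does not prepend a space to its output.

```python
import re

# Simplified stand-ins (assumption), not the actual pagure regexes.
OLD = re.compile(r"[^|\w](?<!\w)#(\d+)")
NEW = re.compile(r"(?<!\w)#(\d+)")


def link(issue_id):
    """Hypothetical link markup; the URL layout is made up."""
    return "<a href='/issue/{0}'>#{0}</a>".format(issue_id)


def old_render(text):
    # The old pattern consumes the character before '#', and the handler
    # puts a space back in its place, whatever that character was.
    return OLD.sub(lambda m: " " + link(m.group(1)), text)


def new_render(text):
    # The new pattern consumes nothing before '#', so nothing is replaced.
    return NEW.sub(lambda m: link(m.group(1)), text)


print(old_render("Issue.#23"))  # Issue <a href='/issue/23'>#23</a>
print(new_render("Issue.#23"))  # Issue.<a href='/issue/23'>#23</a>
```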
    
    Signed-off-by: Adam Williamson <awilliam@redhat.com>
    
        
file modified
+4 -4