21bac10 Merging r214481:

Authored and Committed by Bill Wendling 9 years ago
    Merging r214481:
    ------------------------------------------------------------------------
    r214481 | hfinkel | 2014-07-31 22:20:41 -0700 (Thu, 31 Jul 2014) | 38 lines
    
    [PowerPC] Generate unaligned vector loads using intrinsics instead of regular loads
    
    Altivec vector loads on PowerPC have an interesting property: They always load
    from an aligned address (by rounding down the address actually provided if
    necessary). In order to generate an actual unaligned load, you can generate two
    load instructions, one with the original address, one offset by one vector
    length, and use a special permutation to extract the bytes desired.
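
    The two-load-plus-permute scheme described above can be modeled in plain
    C++. This is only a sketch on a byte buffer, not the PowerPC backend code:
    the names Vec, alignedLoad, permute and unalignedLoad are hypothetical, and
    the buffer is assumed to start on a 16-byte boundary and to extend at least
    32 bytes past the rounded-down offset.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec = std::array<std::uint8_t, 16>;

// Models an Altivec-style aligned load: the low four bits of the offset are
// ignored, so the load starts at the 16-byte boundary at or below `offset`.
Vec alignedLoad(const std::uint8_t *mem, std::size_t offset) {
  std::size_t base = offset & ~static_cast<std::size_t>(15);
  Vec v{};
  for (std::size_t i = 0; i < 16; ++i)
    v[i] = mem[base + i];
  return v;
}

// Models the permutation: picks 16 consecutive bytes, starting at `shift`,
// out of the 32-byte concatenation lo:hi.
Vec permute(const Vec &lo, const Vec &hi, std::size_t shift) {
  Vec out{};
  for (std::size_t i = 0; i < 16; ++i) {
    std::size_t idx = shift + i;
    out[i] = idx < 16 ? lo[idx] : hi[idx - 16];
  }
  return out;
}

// Unaligned load built from two aligned loads, one at the original offset
// and one a full vector length further on, followed by a permutation.
Vec unalignedLoad(const std::uint8_t *mem, std::size_t offset) {
  Vec lo = alignedLoad(mem, offset);
  Vec hi = alignedLoad(mem, offset + 16);
  return permute(lo, hi, offset & 15);
}
```

    Note that the first aligned load also reads the bytes below the requested
    offset, which is what makes the MMO question discussed further down matter.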
    
    When this was originally implemented, I generated these two loads using regular
    ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with
    this:
    
    The alignment of a load does not contribute to its identity, and SDNodes
    are uniqued. So, imagine that we have some unaligned load, L1. The routine
    will create two loads, L1(aligned) and (L1+16)(aligned).
    Further imagine that there had already existed a load (L1+16)(unaligned) with
    the same chain operand as the load L1. When (L1+16)(aligned) is created as part
    of the lowering of L1, this load *is* also the (L1+16)(unaligned) node, just
    now marked as aligned (because the new alignment overwrites the old). But the
    original users of (L1+16)(unaligned) now get the data intended for the
    permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists
    to get its own permutation-based expansion. This was PR19991.
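
    The aliasing hazard can be illustrated with a toy uniquing table. This is a
    minimal sketch, assuming a table keyed on (chain, address) only; LoadNode
    and NodeTable are hypothetical names and do not reflect the SelectionDAG
    API.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Toy load node. The alignment is an attribute, not part of the identity.
struct LoadNode {
  int chain;
  int address;
  unsigned alignment;
};

// Toy uniquing table: a request with the same (chain, address) returns the
// node that already exists, and the newly requested alignment overwrites
// the old one.
class NodeTable {
  std::map<std::pair<int, int>, LoadNode> nodes;

public:
  LoadNode *getLoad(int chain, int address, unsigned alignment) {
    auto key = std::make_pair(chain, address);
    auto it = nodes.find(key);
    if (it == nodes.end())
      it = nodes.emplace(key, LoadNode{chain, address, alignment}).first;
    else
      it->second.alignment = alignment; // silently changes the existing node
    return &it->second;
  }
};
```

    In this model, requesting the load at L1+16 first as unaligned and then as
    aligned returns the same node both times, so the earlier users now see an
    "aligned" load and the unaligned one is gone.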
    
    A second potential problem has to do with the MMOs on these loads, which can be
    used by AA during instruction scheduling to break chain-based dependencies. If
    the new "aligned" loads get the MMO from the original unaligned load, this does
    not represent the fact that it will load data from below the original address.
    Normally, this would not matter, but this load might be combined with another
    load pair for a previous vector, and then the dependency on the otherwise-
    ignored lower bytes can matter.
    
    To fix both problems, the necessary loads are now generated using
    ppc_altivec_lvx intrinsics instead of regular ISD::LOAD nodes. These are
    provided with MMOs with a conservative address range.
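
    As a rough illustration of what a conservative range has to cover, the
    sketch below compares the bytes an aligned 16-byte load actually touches
    with a range that is valid whatever the low address bits are. ByteRange
    and the helper functions are hypothetical, addresses are assumed to be
    non-negative, and the exact range used by the patch is not restated here.

```cpp
#include <cassert>
#include <cstdint>

// Half-open byte range [begin, end).
struct ByteRange {
  std::int64_t begin;
  std::int64_t end;
};

// Bytes an aligned 16-byte load touches when given `addr`: the address is
// rounded down to a 16-byte boundary first.
ByteRange touchedRange(std::int64_t addr) {
  std::int64_t base = addr & ~static_cast<std::int64_t>(15);
  return {base, base + 16};
}

// Range described by reusing the original unaligned load's MMO unchanged.
ByteRange originalRange(std::int64_t addr) { return {addr, addr + 16}; }

// Range that covers the touched bytes for any value of the low four bits.
ByteRange conservativeRange(std::int64_t addr) { return {addr - 15, addr + 16}; }

// True when `outer` fully covers `inner`.
bool contains(ByteRange outer, ByteRange inner) {
  return outer.begin <= inner.begin && inner.end <= outer.end;
}
```

    For an address such as 21, the touched bytes are [16, 32), which the
    original range [21, 37) does not cover but the conservative one does.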
    
    Unfortunately, I no longer have a failing test case (since PR19991 was
    reported, other changes in CodeGen have forced this bug back into hiding).
    Nevertheless, this should fix the underlying problem.
    ------------------------------------------------------------------------
    
    llvm-svn: 215058