public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107
@ 2023-02-20 14:20 jakub at gcc dot gnu.org
  2023-02-20 14:20 ` [Bug target/108862] " jakub at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-20 14:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

            Bug ID: 108862
           Summary: [13 Regression] CryptX miscompilation on power9 since
                    r13-2107
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

The following testcase is miscompiled with -O2 -mcpu=power9 since r13-2107 and
is not fixed with r13-5999 fix:

unsigned long long a[2] = { 0x04a13945d898c296ULL, 0x0000100000000fffULL };
unsigned long long b[4] = { 0x04a13945d898c296ULL, 0, 0, 0x0000100000000fffULL
};

__attribute__((noipa)) unsigned __int128
foo (int x, unsigned long long *y, unsigned long long *z)
{
  unsigned __int128 w = 0;
  for (int i = 0; i < x; i++)
    w += (unsigned __int128)*y++ * (unsigned __int128)*z--;
  return w;
}

int
main ()
{
  unsigned __int128 x = foo (1, &a[0], &a[1]);
  unsigned __int128 y = foo (2, &b[0], &b[3]);
  if ((unsigned long long) (x >> 64) != 0x0000004a13945dd3ULL
      || (unsigned long long) x != 0x9b1c8443b3909d6aULL
      || x != y)
    __builtin_abort ();
  return 0;
}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
@ 2023-02-20 14:20 ` jakub at gcc dot gnu.org
  2023-02-20 14:27 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-20 14:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Priority|P3                          |P1
                 CC|                            |segher at gcc dot gnu.org
   Target Milestone|---                         |13.0
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-02-20

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
  2023-02-20 14:20 ` [Bug target/108862] " jakub at gcc dot gnu.org
@ 2023-02-20 14:27 ` jakub at gcc dot gnu.org
  2023-02-20 14:36 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-20 14:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, -O2 -mcpu=power9:
__attribute__((noipa)) unsigned __int128
foo (unsigned __int128 x, unsigned long long y, unsigned long long z)
{
  return x + (unsigned __int128) y * z;
}

int
main ()
{
  unsigned __int128 x = foo (0, 0x04a13945d898c296ULL, 0x0000100000000fffULL);
  if ((unsigned long long) (x >> 64) != 0x0000004a13945dd3ULL
      || (unsigned long long) x != 0x9b1c8443b3909d6aULL)
    __builtin_abort ();
  return 0;
}
works correctly, in that case we get:
        maddhdu 10,5,6,3
        maddld 3,5,6,3
        add 4,10,4
which is correct.  But for the #c0 testcase above, e.g. with -O2
-fno-unroll-loops -mcpu=power9 we get
.L3:
        ldu 9,8(8)
        ldu 10,-8(5)
        maddld 3,9,10,3
        maddhdu 9,9,10,3
        add 4,9,4
        bdnz .L3
in the inner loop, which looks wrong because maddhdu in that case uses result
of maddld as last operand rather than the low part of the 128-bit counter (w).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
  2023-02-20 14:20 ` [Bug target/108862] " jakub at gcc dot gnu.org
  2023-02-20 14:27 ` jakub at gcc dot gnu.org
@ 2023-02-20 14:36 ` jakub at gcc dot gnu.org
  2023-02-20 14:51 ` segher at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-20 14:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The reason is that gen_umaddditi4 is called with the same pseudo as operands[0]
and operands[3] and so gen_maddlddi4 overwrites its low half before
gen_umadddi4_highpart{,_le} uses it.
We could create a temporary if some overlap is mentioned, or can create it
unconditionally and let later optimizations deal with it.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-02-20 14:36 ` jakub at gcc dot gnu.org
@ 2023-02-20 14:51 ` segher at gcc dot gnu.org
  2023-02-20 14:56 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: segher at gcc dot gnu.org @ 2023-02-20 14:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

--- Comment #3 from Segher Boessenkool <segher at gcc dot gnu.org> ---
With -fno-unroll-loops added we get

foo:
.LFB0:
        .cfi_startproc
        cmpwi 0,3,0
        ble 0,.L4
        mtctr 3
        addi 10,4,-8
        addi 5,5,8
        li 3,0
        .p2align 4,,15
.L3:
        ldu 4,8(10)
        ldu 9,-8(5)
        maddld 3,4,9,3
        maddhdu 4,4,9,3
        bdnz .L3
        blr
        .p2align 4,,15
.L4:
        li 3,0
        li 4,0
        blr

(which still fails, just is easier to read).

The destinations of the four inner loop insns were 131, 132, 135, 136,
and IRA decided for those
Disposition:
   18:r131 l0     4
   17:r132 l0     9
    1:r135 l0     3
    0:r136 l0     4

and that cannot fly.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-02-20 14:51 ` segher at gcc dot gnu.org
@ 2023-02-20 14:56 ` jakub at gcc dot gnu.org
  2023-02-20 15:07 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-20 14:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54494
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54494&action=edit
gcc13-pr108862.patch

Untested fix.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-02-20 14:56 ` jakub at gcc dot gnu.org
@ 2023-02-20 15:07 ` jakub at gcc dot gnu.org
  2023-02-20 21:08 ` cvs-commit at gcc dot gnu.org
  2023-02-20 21:12 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-20 15:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54495
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54495&action=edit
gcc13-pr108862.patch

Variant patch, instead of creating another temporary and moving it we can just
do the highpart (which goes into temporary) first, then lowpart, then addition
which depends on the highpart.
Strangely, with this patch one gets the loop unrolled (2 handled iterations at
once), while with the earlier patch it is not.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-02-20 15:07 ` jakub at gcc dot gnu.org
@ 2023-02-20 21:08 ` cvs-commit at gcc dot gnu.org
  2023-02-20 21:12 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-20 21:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:cd8b4fae52d02541c2d8bd2200caad3812f37368

commit r13-6147-gcd8b4fae52d02541c2d8bd2200caad3812f37368
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Feb 20 22:07:57 2023 +0100

    powerpc: Another umaddditi4 fix [PR108862]

    The following testcase is miscompiled on powerpc64le-linux with
    -O2 -mcpu=power9.  The problem is that gen_umaddditi4 is called with
    the same TImode register for both op0 and op3, and maddlddi4
    overwrites the low half of op0 before the low half of op3 is read,
    so when they are the same register it reads the result of maddlddi4.

    The following patch fixes that by swapping maddlddi4 and
    umadddi4_highpart{,_le} during expansion, as the latter writes into
    a temporary pseudo and so can't change anything maddlddi4 depends on.

    2023-02-20  Jakub Jelinek  <jakub2redhat.com>

            PR target/108862
            * config/rs6000/rs6000.md (umaddditi4): Swap gen_maddlddi4 with
            gen_umadddi4_highpart{,_le}.

            * gcc.dg/pr108862.c: New test.
            * gcc.target/powerpc/pr108862.c: New test.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107
  2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-02-20 21:08 ` cvs-commit at gcc dot gnu.org
@ 2023-02-20 21:12 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-20 21:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-02-20 21:12 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-20 14:20 [Bug target/108862] New: [13 Regression] CryptX miscompilation on power9 since r13-2107 jakub at gcc dot gnu.org
2023-02-20 14:20 ` [Bug target/108862] " jakub at gcc dot gnu.org
2023-02-20 14:27 ` jakub at gcc dot gnu.org
2023-02-20 14:36 ` jakub at gcc dot gnu.org
2023-02-20 14:51 ` segher at gcc dot gnu.org
2023-02-20 14:56 ` jakub at gcc dot gnu.org
2023-02-20 15:07 ` jakub at gcc dot gnu.org
2023-02-20 21:08 ` cvs-commit at gcc dot gnu.org
2023-02-20 21:12 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).