public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/113701] New: Issues with __int128 argument passing
@ 2024-02-01  9:17 ubizjak at gmail dot com
  2024-02-01  9:20 ` [Bug target/113701] " ubizjak at gmail dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01  9:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701

            Bug ID: 113701
           Summary: Issues with __int128 argument passing
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

Following testcase:

--cut here--
typedef unsigned __int128 U;

U f0 (U x, U y) { return x + y; }
U f1 (U x, U y) { return x - y; }

U f2 (U x, U y) { return x | y; }

int f3 (U x, U y) { return x == y; }
int f4 (U x, U y) { return x < y; }
--cut here--

shows some issues with __int128 parameter passing.

gcc -O2:

f0:
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        ret

f1:
        xchgq   %rdi, %rsi
        movq    %rdx, %r8
        movq    %rsi, %rax
        movq    %rdi, %rdx
        subq    %r8, %rax
        sbbq    %rcx, %rdx
        ret

f2:
        xchgq   %rdi, %rsi
        movq    %rdx, %rax
        movq    %rcx, %rdx
        orq     %rsi, %rax
        orq     %rdi, %rdx
        ret

f3:
        xchgq   %rdi, %rsi
        movq    %rdx, %r8
        movq    %rcx, %rax
        movq    %rsi, %rdx
        movq    %rdi, %rcx
        xorq    %rax, %rcx
        xorq    %r8, %rdx
        xorl    %eax, %eax
        orq     %rcx, %rdx
        sete    %al
        ret

f4:
        xorl    %eax, %eax
        cmpq    %rdx, %rdi
        sbbq    %rcx, %rsi
        setc    %al
        ret

Functions f0 and f4 are now optimal.

Functions f1, f2 and f3 emit extra XCHG, but the swap should be propagated to
MOV instructions instead.

The most problematic function is f3, which regressed noticeably from gcc-12.3:

f3:
        xorq    %rdx, %rdi
        xorq    %rcx, %rsi
        xorl    %eax, %eax
        orq     %rsi, %rdi
        sete    %al
        ret

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113701] Issues with __int128 argument passing
  2024-02-01  9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
@ 2024-02-01  9:20 ` ubizjak at gmail dot com
  2024-02-01  9:46 ` ubizjak at gmail dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
CC added.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113701] Issues with __int128 argument passing
  2024-02-01  9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
  2024-02-01  9:20 ` [Bug target/113701] " ubizjak at gmail dot com
@ 2024-02-01  9:46 ` ubizjak at gmail dot com
  2024-02-01 10:31 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01  9:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701

--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #0)
> Following testcase:
> 
> --cut here--
> typedef unsigned __int128 U;
> 
> U f0 (U x, U y) { return x + y; }
> U f1 (U x, U y) { return x - y; }
> 
> U f2 (U x, U y) { return x | y; }
> 
> int f3 (U x, U y) { return x == y; }
> int f4 (U x, U y) { return x < y; }
> --cut here--
> 
> shows some issues with __int128 parameter passing.
> 
> gcc -O2:
> 
> f0:
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %rdi, %rax
>         adcq    %rsi, %rdx
>         ret
> 
> f1:
>         xchgq   %rdi, %rsi
>         movq    %rdx, %r8
>         movq    %rsi, %rax
>         movq    %rdi, %rdx
>         subq    %r8, %rax
>         sbbq    %rcx, %rdx
>         ret
> 
> f2:
>         xchgq   %rdi, %rsi
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         orq     %rsi, %rax
>         orq     %rdi, %rdx
>         ret
> 
> f3:
>         xchgq   %rdi, %rsi
>         movq    %rdx, %r8
>         movq    %rcx, %rax
>         movq    %rsi, %rdx
>         movq    %rdi, %rcx
>         xorq    %rax, %rcx
>         xorq    %r8, %rdx
>         xorl    %eax, %eax
>         orq     %rcx, %rdx
>         sete    %al
>         ret
> 
> f4:
>         xorl    %eax, %eax
>         cmpq    %rdx, %rdi
>         sbbq    %rcx, %rsi
>         setc    %al
>         ret
> 
> Functions f0 and f4 are now optimal.
> 
> Functions f1, f2 and f3 emit extra XCHG, but the swap should be propagated
> to MOV instructions instead.
> 
> The most problematic function is f3, which regressed noticeably from gcc-12.3:

This patch solves the regression:

--cut here--
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index bac0a6ade67..02fed16db72 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1632,11 +1632,6 @@ (define_insn_and_split "*cmp<dwi>_doubleword"
              (set (match_dup 4) (ior:DWIH (match_dup 4) (match_dup 5)))])]
 {
   split_double_mode (<DWI>mode, &operands[0], 2, &operands[0], &operands[2]);
-  /* Placing the SUBREG pieces in pseudos helps reload.  */
-  for (int i = 0; i < 4; i++)
-    if (SUBREG_P (operands[i]))
-      operands[i] = force_reg (<MODE>mode, operands[i]);
-
   operands[4] = gen_reg_rtx (<MODE>mode);

   /* Special case comparisons against -1.  */
--cut here--

gcc -O2:

f3:
        xchgq   %rdi, %rsi
        xorl    %eax, %eax
        xorq    %rsi, %rdx
        xorq    %rdi, %rcx
        orq     %rcx, %rdx
        sete    %al
        ret

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113701] Issues with __int128 argument passing
  2024-02-01  9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
  2024-02-01  9:20 ` [Bug target/113701] " ubizjak at gmail dot com
  2024-02-01  9:46 ` ubizjak at gmail dot com
@ 2024-02-01 10:31 ` rguenth at gcc dot gnu.org
  2024-02-01 10:53 ` ubizjak at gmail dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-01 10:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-01
             Target|                            |x86_64-*-*
     Ever confirmed|0                           |1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113701] Issues with __int128 argument passing
  2024-02-01  9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
                   ` (2 preceding siblings ...)
  2024-02-01 10:31 ` rguenth at gcc dot gnu.org
@ 2024-02-01 10:53 ` ubizjak at gmail dot com
  2024-02-01 11:14 ` roger at nextmovesoftware dot com
  2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701

--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #2)

> > The most problematic function is f3, which regressed noticeably from gcc-12.3:
> 
> This patch solves the regression:
> 
> --cut here--
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index bac0a6ade67..02fed16db72 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1632,11 +1632,6 @@ (define_insn_and_split "*cmp<dwi>_doubleword"
>               (set (match_dup 4) (ior:DWIH (match_dup 4) (match_dup 5)))])]
>  {
>    split_double_mode (<DWI>mode, &operands[0], 2, &operands[0],
> &operands[2]);
> -  /* Placing the SUBREG pieces in pseudos helps reload.  */
> -  for (int i = 0; i < 4; i++)
> -    if (SUBREG_P (operands[i]))
> -      operands[i] = force_reg (<MODE>mode, operands[i]);
> -
>    operands[4] = gen_reg_rtx (<MODE>mode);
>  
>    /* Special case comparisons against -1.  */
> --cut here--

Roger, this part was added in [1]. Does this code improve some real reload
issue, not covered in the testsuite, because testsuite results do not show any
regression if the above code is removed.

BTW: The issue was noticed in gcc.target/i386/pr82580.c test case.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595860.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113701] Issues with __int128 argument passing
  2024-02-01  9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
                   ` (3 preceding siblings ...)
  2024-02-01 10:53 ` ubizjak at gmail dot com
@ 2024-02-01 11:14 ` roger at nextmovesoftware dot com
  2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-02-01 11:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=106518

--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
I like Uros' patch in comment #2.  There have been so many incremental changes
and improvements to x86 TImode and register allocation, that this legacy
heuristic (workaround?) is not only no longer useful, but it actually hurts
register allocation.  *cmp<dwi>_doubleword appears to be the only (remaining?)
place this idiom is used.

Additionally, I think I've mentioned in the past that it might also be useful
to have a xchg/swap sinking pass, perhaps as part of cprop_hardreg, so that for
example swap followed by swap is eliminated, that swap with one destination
REG_DEAD is transformed into mov, etc.  Swap/xchg is almost always just hard
register renaming, so these should often be eliminatable, but the abstraction
is useful to allow this to happen.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113701] Issues with __int128 argument passing
  2024-02-01  9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
                   ` (4 preceding siblings ...)
  2024-02-01 11:14 ` roger at nextmovesoftware dot com
@ 2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-01 20:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701

--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:44764984cf24e27cf7756cffd197283b9c62db8b

commit r14-8713-g44764984cf24e27cf7756cffd197283b9c62db8b
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Thu Feb 1 21:23:24 2024 +0100

    i386: Improve *cmp<dwi>_doubleword splitter [PR113701]

    The fix for PR70321 introduced a splitter that split a doubleword
    comparison into a pair of XORs followed by an IOR to set the (zero)
    flags register.  To help the reload, splitter forced SUBREG pieces of
    double-word input values to a pseudo, but this regressed
    gcc.target/i386/pr82580.c:

    int f0 (U x, U y) { return x == y; }

    from:
            xorq    %rdx, %rdi
            xorq    %rcx, %rsi
            xorl    %eax, %eax
            orq     %rsi, %rdi
            sete    %al
            ret

    to:
            xchgq   %rdi, %rsi
            movq    %rdx, %r8
            movq    %rcx, %rax
            movq    %rsi, %rdx
            movq    %rdi, %rcx
            xorq    %rax, %rcx
            xorq    %r8, %rdx
            xorl    %eax, %eax
            orq     %rcx, %rdx
            sete    %al
            ret

    To mitigate the regression, remove this legacy heuristic (workaround?).
    There have been many incremental changes and improvements to x86 TImode
    and register allocation, so this legacy workaround is not only no longer
    useful, but it actually hurts register allocation.  The patched compiler
    now produces:

            xchgq   %rdi, %rsi
            xorl    %eax, %eax
            xorq    %rsi, %rdx
            xorq    %rdi, %rcx
            orq     %rcx, %rdx
            sete    %al
            ret

            PR target/113701

    gcc/ChangeLog:

            * config/i386/i386.md (*cmp<dwi>_doubleword):
            Do not force SUBREG pieces to pseudos.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-02-01 20:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-01  9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
2024-02-01  9:20 ` [Bug target/113701] " ubizjak at gmail dot com
2024-02-01  9:46 ` ubizjak at gmail dot com
2024-02-01 10:31 ` rguenth at gcc dot gnu.org
2024-02-01 10:53 ` ubizjak at gmail dot com
2024-02-01 11:14 ` roger at nextmovesoftware dot com
2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).