public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/113701] New: Issues with __int128 argument passing
@ 2024-02-01 9:17 ubizjak at gmail dot com
2024-02-01 9:20 ` [Bug target/113701] " ubizjak at gmail dot com
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01 9:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
Bug ID: 113701
Summary: Issues with __int128 argument passing
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
typedef unsigned __int128 U;
U f0 (U x, U y) { return x + y; }
U f1 (U x, U y) { return x - y; }
U f2 (U x, U y) { return x | y; }
int f3 (U x, U y) { return x == y; }
int f4 (U x, U y) { return x < y; }
--cut here--
shows some issues with __int128 parameter passing.
gcc -O2:
f0:
movq %rdx, %rax
movq %rcx, %rdx
addq %rdi, %rax
adcq %rsi, %rdx
ret
f1:
xchgq %rdi, %rsi
movq %rdx, %r8
movq %rsi, %rax
movq %rdi, %rdx
subq %r8, %rax
sbbq %rcx, %rdx
ret
f2:
xchgq %rdi, %rsi
movq %rdx, %rax
movq %rcx, %rdx
orq %rsi, %rax
orq %rdi, %rdx
ret
f3:
xchgq %rdi, %rsi
movq %rdx, %r8
movq %rcx, %rax
movq %rsi, %rdx
movq %rdi, %rcx
xorq %rax, %rcx
xorq %r8, %rdx
xorl %eax, %eax
orq %rcx, %rdx
sete %al
ret
f4:
xorl %eax, %eax
cmpq %rdx, %rdi
sbbq %rcx, %rsi
setc %al
ret
Functions f0 and f4 are now optimal.
Functions f1, f2 and f3 emit extra XCHG, but the swap should be propagated to
MOV instructions instead.
The most problematic function is f3, which regressed noticeably from gcc-12.3:
f3:
xorq %rdx, %rdi
xorq %rcx, %rsi
xorl %eax, %eax
orq %rsi, %rdi
sete %al
ret
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/113701] Issues with __int128 argument passing
2024-02-01 9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
@ 2024-02-01 9:20 ` ubizjak at gmail dot com
2024-02-01 9:46 ` ubizjak at gmail dot com
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01 9:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |roger at nextmovesoftware dot com
--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
CC added.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/113701] Issues with __int128 argument passing
2024-02-01 9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
2024-02-01 9:20 ` [Bug target/113701] " ubizjak at gmail dot com
@ 2024-02-01 9:46 ` ubizjak at gmail dot com
2024-02-01 10:31 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01 9:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #0)
> Following testcase:
>
> --cut here--
> typedef unsigned __int128 U;
>
> U f0 (U x, U y) { return x + y; }
> U f1 (U x, U y) { return x - y; }
>
> U f2 (U x, U y) { return x | y; }
>
> int f3 (U x, U y) { return x == y; }
> int f4 (U x, U y) { return x < y; }
> --cut here--
>
> shows some issues with __int128 parameter passing.
>
> gcc -O2:
>
> f0:
> movq %rdx, %rax
> movq %rcx, %rdx
> addq %rdi, %rax
> adcq %rsi, %rdx
> ret
>
> f1:
> xchgq %rdi, %rsi
> movq %rdx, %r8
> movq %rsi, %rax
> movq %rdi, %rdx
> subq %r8, %rax
> sbbq %rcx, %rdx
> ret
>
> f2:
> xchgq %rdi, %rsi
> movq %rdx, %rax
> movq %rcx, %rdx
> orq %rsi, %rax
> orq %rdi, %rdx
> ret
>
> f3:
> xchgq %rdi, %rsi
> movq %rdx, %r8
> movq %rcx, %rax
> movq %rsi, %rdx
> movq %rdi, %rcx
> xorq %rax, %rcx
> xorq %r8, %rdx
> xorl %eax, %eax
> orq %rcx, %rdx
> sete %al
> ret
>
> f4:
> xorl %eax, %eax
> cmpq %rdx, %rdi
> sbbq %rcx, %rsi
> setc %al
> ret
>
> Functions f0 and f4 are now optimal.
>
> Functions f1, f2 and f3 emit extra XCHG, but the swap should be propagated
> to MOV instructions instead.
>
> The most problematic function is f3, which regressed noticeably from gcc-12.3:
This patch solves the regression:
--cut here--
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index bac0a6ade67..02fed16db72 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1632,11 +1632,6 @@ (define_insn_and_split "*cmp<dwi>_doubleword"
(set (match_dup 4) (ior:DWIH (match_dup 4) (match_dup 5)))])]
{
split_double_mode (<DWI>mode, &operands[0], 2, &operands[0], &operands[2]);
- /* Placing the SUBREG pieces in pseudos helps reload. */
- for (int i = 0; i < 4; i++)
- if (SUBREG_P (operands[i]))
- operands[i] = force_reg (<MODE>mode, operands[i]);
-
operands[4] = gen_reg_rtx (<MODE>mode);
/* Special case comparisons against -1. */
--cut here--
gcc -O2:
f3:
xchgq %rdi, %rsi
xorl %eax, %eax
xorq %rsi, %rdx
xorq %rdi, %rcx
orq %rcx, %rdx
sete %al
ret
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/113701] Issues with __int128 argument passing
2024-02-01 9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
2024-02-01 9:20 ` [Bug target/113701] " ubizjak at gmail dot com
2024-02-01 9:46 ` ubizjak at gmail dot com
@ 2024-02-01 10:31 ` rguenth at gcc dot gnu.org
2024-02-01 10:53 ` ubizjak at gmail dot com
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-01 10:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2024-02-01
Target| |x86_64-*-*
Ever confirmed|0 |1
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/113701] Issues with __int128 argument passing
2024-02-01 9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
` (2 preceding siblings ...)
2024-02-01 10:31 ` rguenth at gcc dot gnu.org
@ 2024-02-01 10:53 ` ubizjak at gmail dot com
2024-02-01 11:14 ` roger at nextmovesoftware dot com
2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-02-01 10:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #2)
> > The most problematic function is f3, which regressed noticeably from gcc-12.3:
>
> This patch solves the regression:
>
> --cut here--
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index bac0a6ade67..02fed16db72 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1632,11 +1632,6 @@ (define_insn_and_split "*cmp<dwi>_doubleword"
> (set (match_dup 4) (ior:DWIH (match_dup 4) (match_dup 5)))])]
> {
> split_double_mode (<DWI>mode, &operands[0], 2, &operands[0],
> &operands[2]);
> - /* Placing the SUBREG pieces in pseudos helps reload. */
> - for (int i = 0; i < 4; i++)
> - if (SUBREG_P (operands[i]))
> - operands[i] = force_reg (<MODE>mode, operands[i]);
> -
> operands[4] = gen_reg_rtx (<MODE>mode);
>
> /* Special case comparisons against -1. */
> --cut here--
Roger, this part was added in [1]. Does this code improve some real reload
issue, not covered in the testsuite, because testsuite results do not show any
regression if the above code is removed.
BTW: The issue was noticed in gcc.target/i386/pr82580.c test case.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595860.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/113701] Issues with __int128 argument passing
2024-02-01 9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
` (3 preceding siblings ...)
2024-02-01 10:53 ` ubizjak at gmail dot com
@ 2024-02-01 11:14 ` roger at nextmovesoftware dot com
2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-02-01 11:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://gcc.gnu.org/bugzill
| |a/show_bug.cgi?id=106518
--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
I like Uros' patch in comment #2. There have been so many incremental changes
and improvements to x86 TImode and register allocation, that this legacy
heuristic (workaround?) is not only no longer useful, but it actually hurts
register allocation. *cmp<dwi>_doubleword appears to be the only (remaining?)
place this idiom is used.
Additionally, I think I've mentioned in the past that it might also be useful
to have a xchg/swap sinking pass, perhaps as part of cprop_hardreg, so that for
example swap followed by swap is eliminated, that swap with one destination
REG_DEAD is transformed into mov, etc. Swap/xchg is almost always just hard
register renaming, so these should often be eliminatable, but the abstraction
is useful to allow this to happen.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/113701] Issues with __int128 argument passing
2024-02-01 9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
` (4 preceding siblings ...)
2024-02-01 11:14 ` roger at nextmovesoftware dot com
@ 2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-01 20:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:
https://gcc.gnu.org/g:44764984cf24e27cf7756cffd197283b9c62db8b
commit r14-8713-g44764984cf24e27cf7756cffd197283b9c62db8b
Author: Uros Bizjak <ubizjak@gmail.com>
Date: Thu Feb 1 21:23:24 2024 +0100
i386: Improve *cmp<dwi>_doubleword splitter [PR113701]
The fix for PR70321 introduced a splitter that split a doubleword
comparison into a pair of XORs followed by an IOR to set the (zero)
flags register. To help the reload, splitter forced SUBREG pieces of
double-word input values to a pseudo, but this regressed
gcc.target/i386/pr82580.c:
int f0 (U x, U y) { return x == y; }
from:
xorq %rdx, %rdi
xorq %rcx, %rsi
xorl %eax, %eax
orq %rsi, %rdi
sete %al
ret
to:
xchgq %rdi, %rsi
movq %rdx, %r8
movq %rcx, %rax
movq %rsi, %rdx
movq %rdi, %rcx
xorq %rax, %rcx
xorq %r8, %rdx
xorl %eax, %eax
orq %rcx, %rdx
sete %al
ret
To mitigate the regression, remove this legacy heuristic (workaround?).
There have been many incremental changes and improvements to x86 TImode
and register allocation, so this legacy workaround is not only no longer
useful, but it actually hurts register allocation. The patched compiler
now produces:
xchgq %rdi, %rsi
xorl %eax, %eax
xorq %rsi, %rdx
xorq %rdi, %rcx
orq %rcx, %rdx
sete %al
ret
PR target/113701
gcc/ChangeLog:
* config/i386/i386.md (*cmp<dwi>_doubleword):
Do not force SUBREG pieces to pseudos.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-02-01 20:25 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-01 9:17 [Bug target/113701] New: Issues with __int128 argument passing ubizjak at gmail dot com
2024-02-01 9:20 ` [Bug target/113701] " ubizjak at gmail dot com
2024-02-01 9:46 ` ubizjak at gmail dot com
2024-02-01 10:31 ` rguenth at gcc dot gnu.org
2024-02-01 10:53 ` ubizjak at gmail dot com
2024-02-01 11:14 ` roger at nextmovesoftware dot com
2024-02-01 20:25 ` cvs-commit at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).