public inbox for gcc-bugs@sourceware.org
* [Bug target/110235] New: Wrong use of us_truncate in SSE and AVX RTL representation
From: ktkachov at gcc dot gnu.org @ 2023-06-13 9:02 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
Bug ID: 110235
Summary: Wrong use of us_truncate in SSE and AVX RTL
representation
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
CC: uros at gcc dot gnu.org
Target Milestone: ---
Target: x86
After g:921b841350c4fc298d09f6c5674663e0f4208610 added constant-folding for
SS_TRUNCATE and US_TRUNCATE, some tests in i386.exp started failing:
FAIL: gcc.target/i386/avx-vpackuswb-1.c execution test
FAIL: gcc.target/i386/avx2-vpackssdw-2.c execution test
FAIL: gcc.target/i386/avx2-vpackusdw-2.c execution test
FAIL: gcc.target/i386/avx2-vpackuswb-2.c execution test
FAIL: gcc.target/i386/sse2-packuswb-1.c execution test
From what I can gather from the documentation for intrinsics like
_mm_packus_epi16 the operation they perform is not what we model as us_truncate
in RTL. That is, they don't perform a truncation while treating their input as
an unsigned value. Rather, they treat the input as a signed value and saturate
it to the unsigned min and max of the narrow mode before truncation. In that
regard they seem similar to the SQMOVUN instructions in aarch64.
I think it'd be best to change the representation of those instructions to a
truncating clamp operation, similar to
g:b747f54a2a930da55330c2861cd1e344f67a88d9 in aarch64.
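To illustrate the distinction drawn above, here is a minimal scalar sketch in C of one 16-bit to 8-bit lane. The function names are hypothetical and are not taken from GCC or from the intrinsics headers; the two models only restate the semantics as described in this report.

```c
#include <assert.h>
#include <stdint.h>

/* Model of RTL us_truncate as described in this report (hypothetical name):
   the input is read as UNSIGNED and saturated to the unsigned 8-bit max. */
uint8_t model_us_truncate(uint16_t x) {
    return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}

/* Model of one packuswb lane (hypothetical name): the input is read as
   SIGNED and clamped to the unsigned 8-bit range [0, 255]. */
uint8_t model_packuswb_lane(int16_t x) {
    if (x < 0) return 0;
    if (x > UINT8_MAX) return UINT8_MAX;
    return (uint8_t)x;
}

/* The two models agree on small non-negative inputs and differ on
   negative ones, because -1 reinterpreted as unsigned is 65535. */
void check_models_differ(void) {
    int16_t v = -1;
    assert(model_packuswb_lane(v) == 0);
    assert(model_us_truncate((uint16_t)v) == 255);
    assert(model_packuswb_lane(100) == model_us_truncate(100));
}
```

Any input with the sign bit set separates the two models, which would explain why folding constants through us_truncate changes the result of the pack tests.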
* [Bug target/110235] Wrong use of us_truncate in SSE and AVX RTL representation
From: rguenth at gcc dot gnu.org @ 2023-06-13 14:00 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |crazylht at gmail dot com
Last reconfirmed| |2023-06-13
Target|x86 |x86_64-*-*
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed (the FAILs)
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: pinskia at gcc dot gnu.org @ 2023-06-14 4:54 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |testsuite-fail
Summary|Wrong use of us_truncate in |[14 Regression] Wrong use
|SSE and AVX RTL |of us_truncate in SSE and
|representation |AVX RTL representation
Target Milestone|--- |14.0
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: crazylht at gmail dot com @ 2023-06-14 6:25 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
FAIL: gcc.target/i386/avx2-vpackssdw-2.c execution test
This one is about signed saturation, which should match RTL SS_TRUNCATE.
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: crazylht at gmail dot com @ 2023-06-14 8:43 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #2)
> FAIL: gcc.target/i386/avx2-vpackssdw-2.c execution test
>
> This one is about signed saturation, which should match RTL SS_TRUNCATE.
I realize that for 256-bit/512-bit vpackssdw, it's a 128-bit interleave of src1
and src2 followed by ss_truncate to the dest, not just a vec_concat of src1 and
src2. So the simplification exposed the bug.
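A minimal scalar sketch in C of the 256-bit lane behaviour described above, next to the simpler concat-then-truncate model it is contrasted with. All function names are hypothetical; the element layout assumes each 128-bit lane holds four 32-bit source elements and eight 16-bit result elements.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturation of one 32-bit element to 16 bits (RTL ss_truncate). */
static int16_t ss_truncate_32_to_16(int32_t x) {
    if (x < INT16_MIN) return INT16_MIN;
    if (x > INT16_MAX) return INT16_MAX;
    return (int16_t)x;
}

/* Hypothetical model of 256-bit vpackssdw: for each 128-bit lane, the low
   64 bits of the result come from src1 and the high 64 bits from src2. */
void model_vpackssdw_256(const int32_t src1[8], const int32_t src2[8],
                         int16_t dst[16]) {
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 4; i++) {
            dst[lane * 8 + i]     = ss_truncate_32_to_16(src1[lane * 4 + i]);
            dst[lane * 8 + 4 + i] = ss_truncate_32_to_16(src2[lane * 4 + i]);
        }
    }
}

/* The simpler model: ss_truncate of a plain concat of src1 and src2. */
void model_concat_then_truncate(const int32_t src1[8], const int32_t src2[8],
                                int16_t dst[16]) {
    for (int i = 0; i < 8; i++) {
        dst[i]     = ss_truncate_32_to_16(src1[i]);
        dst[8 + i] = ss_truncate_32_to_16(src2[i]);
    }
}

/* The two models place the same values in different result positions. */
void check_lane_order(void) {
    const int32_t a[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    const int32_t b[8] = {100, 101, 102, 103, 104, 105, 106, 107};
    int16_t packed[16], naive[16];
    model_vpackssdw_256(a, b, packed);
    model_concat_then_truncate(a, b, naive);
    assert(packed[4] == 100 && naive[4] == 4);
    assert(packed[8] == 4 && naive[8] == 100);
}
```

For 128-bit operands there is only one lane, so the two models coincide; they diverge only from 256 bits upward.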
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: ktkachov at gcc dot gnu.org @ 2023-06-15 8:48 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to Hongtao.liu from comment #3)
> (In reply to Hongtao.liu from comment #2)
> > FAIL: gcc.target/i386/avx2-vpackssdw-2.c execution test
> >
> > This one is about signed saturation, which should match RTL SS_TRUNCATE.
>
> I realize that for 256-bit/512-bit vpackssdw, it's a 128-bit interleave of
> src1 and src2 followed by ss_truncate to the dest, not just a vec_concat of
> src1 and src2. So the simplification exposed the bug.
Thanks for looking at it. I think it'd make sense for someone with x86/sse/avx
experience to rewrite the RTL representation of the patterns involved to match
the correct semantics for saturation and lane behaviour.
Alternatively, a quick solution would be to convert uses of
us_truncate/ss_truncate in the problematic patterns to an x86-specific UNSPEC,
which would make things work like they did before the simplification was added.
That would be just a stop-gap solution as it's better to use standard RTL
operations where possible.
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: pinskia at gcc dot gnu.org @ 2023-06-15 23:01 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |slyfox at gcc dot gnu.org
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 110274 has been marked as a duplicate of this bug. ***
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: cvs-commit at gcc dot gnu.org @ 2023-06-19 1:34 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:58e61a3ab1c13b6d5b07d86a30cf48a46e0345c8
commit r14-1916-g58e61a3ab1c13b6d5b07d86a30cf48a46e0345c8
Author: liuhongt <hongtao.liu@intel.com>
Date: Wed Jun 14 10:34:32 2023 +0800
Reimplement packuswb/packusdw with UNSPEC_US_TRUNCATE instead of original
us_truncate.
packuswb/packusdw perform unsigned saturation of a signed source, but RTL
us_truncate means unsigned saturation of an unsigned source.
So for the value -1, packuswb produces 0, but us_truncate produces
255. The patch reimplements the related patterns and functions with
UNSPEC_US_TRUNCATE instead of us_truncate.
gcc/ChangeLog:
PR target/110235
* config/i386/i386-expand.cc (ix86_split_mmx_pack): Use
UNSPEC_US_TRUNCATE instead of original us_truncate for
packusdw/packuswb.
* config/i386/mmx.md (mmx_pack<s_trunsuffix>swb): Substitute
with ..
(mmx_packsswb): .. this and ..
(mmx_packuswb): .. this.
(mmx_packusdw): Use UNSPEC_US_TRUNCATE instead of original
us_truncate.
(s_trunsuffix): Removed code iterator.
(any_s_truncate): Ditto.
* config/i386/sse.md (<sse2_avx2>_packuswb<mask_name>): Use
UNSPEC_US_TRUNCATE instead of original us_truncate.
(<sse4_1_avx2>_packusdw<mask_name>): Ditto.
* config/i386/i386.md (UNSPEC_US_TRUNCATE): New unspec_c_enum.
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: cvs-commit at gcc dot gnu.org @ 2023-06-19 1:34 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:f8e02702726d4514b8ff9f5481c9c1f5d34e1787
commit r14-1917-gf8e02702726d4514b8ff9f5481c9c1f5d34e1787
Author: liuhongt <hongtao.liu@intel.com>
Date: Thu Jun 15 16:46:14 2023 +0800
Refined 256/512-bit vpacksswb/vpackssdw patterns.
The packing in vpacksswb/vpackssdw is not a simple concat; it's an
interweave of src1 and src2 for every 128 bits (or 64 bits of the
ss_truncate result), i.e.
dst[192-255] = ss_truncate (src2[128-255])
dst[128-191] = ss_truncate (src1[128-255])
dst[64-127] = ss_truncate (src2[0-127])
dst[0-63] = ss_truncate (src1[0-127])
The patch refines those patterns with an extra vec_select for the
interweave.
gcc/ChangeLog:
PR target/110235
* config/i386/sse.md (<sse2_avx2>_packsswb<mask_name>):
Substitute with ..
(sse2_packsswb<mask_name>): .. this, ..
(avx2_packsswb<mask_name>): .. this and ..
(avx512bw_packsswb<mask_name>): .. this.
(<sse2_avx2>_packssdw<mask_name>): Substitute with ..
(sse2_packssdw<mask_name>): .. this, ..
(avx2_packssdw<mask_name>): .. this and ..
(avx512bw_packssdw<mask_name>): .. this.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512bw-vpackssdw-3.c: New test.
* gcc.target/i386/avx512bw-vpacksswb-3.c: New test.
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: crazylht at gmail dot com @ 2023-06-19 1:40 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed for GCC 14. The bug is latent on all release branches, but is not
exposed there without the RTL us/ss_truncate simplification.
* [Bug target/110235] [14 Regression] Wrong use of us_truncate in SSE and AVX RTL representation
From: pinskia at gcc dot gnu.org @ 2023-07-15 6:04 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|NEW |RESOLVED
--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed.