public inbox for gcc-bugs@sourceware.org
From: "jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/115161] [15 Regression] highway-1.0.7 miscompilation of some SSE2 intrinsics
Date: Tue, 21 May 2024 14:53:14 +0000
Message-ID: <bug-115161-4-h2XxJiCvbQ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-115161-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Sergei Trofimovich from comment #3)
> Looking at -O2's bug.cc.265t.optimized tree optimizations come up with
> unfolded saturated sub8:
> 
>   _12 = __builtin_ia32_psubusb128 ({ -65, 0, 0, 0, -65, 0, 0, 0, -65, 0, 0,
> 0, -65, 0, 0, 0 }, { -99, 0, 0, 0, -99, 0, 0, 0, -99, 0, 0, 0, -99, 0, 0, 0
> });
>   _13 = __builtin_ia32_pminub128 (_12, { 32, 0, 0, 0, 32, 0, 0, 0, 32, 0, 0,
> 0, 32, 0, 0, 0 });
>   ...
> 
> 
> bug.cc.272r.cse1 still has that subtraction:
> 
>     5: r119:V16QI=[`*.LC0']
>       REG_EQUAL const_vector
>     6: r120:V16QI=[`*.LC1']
>       REG_EQUAL const_vector
>     7: r118:V16QI=us_minus(r119:V16QI,r120:V16QI)
> 
> bug.cc.273r.fwprop1 does not anymore:
> 
>     3: NOTE_INSN_BASIC_BLOCK 2
>     2: NOTE_INSN_FUNCTION_BEG
>     9: r122:V16QI=[`*.LC2']
>       REG_EQUAL const_vector
>    13: r123:V4SI=r122:V16QI#0<<0x17
>       REG_EQUAL const_vector
>    16: r128:SI=0x5f800000
>    15: r127:V4SI=vec_duplicate(r128:SI)
> 
> Could it be that constant folder "forgot" to generate anything for
> unsupported saturated-sub instead of leaving it as is?

No.  It is normal constant folding on RTL (not done on GIMPLE because
the i386 backend doesn't try to gimple fold __builtin_ia32_psubusb128
or __builtin_ia32_pminub128).  0xbf - 0x9d is 0x22, so the us_minus actually
works exactly like minus in this case, and because 0x20 is smaller than that,
the minimum is a vector with 0x20 elements (plus min (0 - 0, 0) = 0 elements).

The reason the testcase FAILs is the same as in the other PRs: it is trying to
convert the V4SFmode vector
{0x0.8p+33f, 0x0.8p+33f, 0x0.8p+33f, 0x0.8p+33f}
to V4SImode, and because the backend sees the constant operand of the fix,
it folds it to the unspecified value, as with scalar conversion.

Consider:
int
main ()
{
  volatile float f = 0x0.8p+33f;
  volatile float __attribute__((vector_size (16))) vf
    = { 0x0.8p+33f, 0x0.8p+33f, 0x0.8p+33f, 0x0.8p+33f };
  int a = f;
  int __attribute__((vector_size (16))) vi
    = __builtin_convertvector (vf, int __attribute__((vector_size (16))));
  __builtin_printf ("%d\n", a);
  __builtin_printf ("{%d, %d, %d, %d}\n", vi[0], vi[1], vi[2], vi[3]);
}
This prints
-2147483648
{-2147483648, -2147483648, -2147483648, -2147483648}
at -O0 or -O2, but with -O2 -Dvolatile= prints
2147483647
{2147483647, 2147483647, 2147483647, 2147483647}
instead.
Either is IMHO fine; the C standard doesn't specify what the result of the
conversion should be.
Now, whether for _mm_cvttps_epi32 etc. such cases are also unspecified or not
is debatable.  The Intel spec obviously specifies what the CPU instructions do
even in those otherwise unspecified cases; the question is whether the intrinsic
must behave the same or whether those invalid conversions are still unspecified.
If they were well defined when using the intrinsics, arguably the backend
shouldn't use FIX RTL but some UNSPEC, or should use the FIX RTL conditionally
(if_then_else:SI (argument_is_in_bounds) (fix arg) (const_int 0x80000000)).
