public inbox for gcc-bugs@sourceware.org
From: "ubizjak at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/110206] [14 Regression] wrong code with -Os -march=cascadelake since r14-1246
Date: Sat, 08 Jul 2023 16:56:41 +0000
Message-ID: <bug-110206-4-iqrOlwlixf@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-110206-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206

--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
Some more digging through the code:

In cprop.cc/try_replace_reg, we try to simplify the source of the set given our
substitution:

Breakpoint 1, try_replace_reg (from=0x7fffe9f0b7f8, to=0x7fffe9f099e0,
insn=0x7fffea01b6c0) at ../../git/gcc/gcc/cprop.cc:789
789           src = simplify_replace_rtx (SET_SRC (set), from, to);

(gdb) list
784       if (!success && set && reg_mentioned_p (from, SET_SRC (set)))
785         {
786           /* If above failed and this is a single set, try to simplify the source
787              of the set given our substitution.  We could perhaps try this for
788              multiple SETs, but it probably won't buy us anything.  */
789           src = simplify_replace_rtx (SET_SRC (set), from, to);

(gdb) p debug_rtx (set)
(set (reg:V8HI 100)
    (zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
            (parallel [
                    (const_int 0 [0])
                    (const_int 1 [0x1])
                    (const_int 2 [0x2])
                    (const_int 3 [0x3])
                    (const_int 4 [0x4])
                    (const_int 5 [0x5])
                    (const_int 6 [0x6])
                    (const_int 7 [0x7])
                ]))))

(gdb) p debug_rtx (from)
(reg:V4QI 98)

(gdb) p debug_rtx (to)
(const_vector:V4QI [
        (const_int -52 [0xffffffffffffffcc]) repeated x4
    ])

and simplify_replace_rtx simplifies the above to:

(gdb) p debug_rtx (src)
(const_vector:V8HI [
        (const_int 204 [0xcc]) repeated x8
    ])

which is obviously wrong: the input register is V4QImode and holds a V4QImode
constant, so only four of the eight selected elements are known.
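
As a side note on where the 204 comes from: zero_extend treats each 8-bit lane
as unsigned, so the byte 0xcc (printed as -52 in QImode) becomes 0x00cc = 204
in HImode. A minimal standalone C++ sketch of that per-lane arithmetic follows;
the helper names are hypothetical and are not GCC functions:

```cpp
#include <cstdint>

// Toy model of (zero_extend:HI (x:QI)): reinterpret the 8-bit lane as
// unsigned, then widen it to 16 bits.
constexpr std::uint16_t zero_extend_qi_to_hi(std::int8_t lane) {
    return static_cast<std::uint16_t>(static_cast<std::uint8_t>(lane));
}

// Toy model of (sign_extend:HI (x:QI)), shown only for contrast.
constexpr std::int16_t sign_extend_qi_to_hi(std::int8_t lane) {
    return static_cast<std::int16_t>(lane);
}

// -52 and 0xcc are the same 8-bit pattern; zero extension yields 204.
static_assert(zero_extend_qi_to_hi(-52) == 204, "0xcc zero-extends to 0x00cc");
static_assert(sign_extend_qi_to_hi(-52) == -52, "sign extension keeps -52");
```

So the per-element value 204 is expected; the problem is the element count.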

Tracing through simplify-rtx.cc brings us to a recursive
simplify_replace_fn_rtx call, which gets us to:

Breakpoint 1, simplify_replace_fn_rtx (x=0x7fffe9f0b888,
old_rtx=0x7fffe9f0b7f8, fn=0x0, data=0x7fffe9f099e0) at
../../git/gcc/gcc/simplify-rtx.cc:474
474               op0 = simplify_gen_subreg (GET_MODE (x), op0,

(gdb) list
469           if (code == SUBREG)
470             {
471               op0 = simplify_replace_fn_rtx (SUBREG_REG (x), old_rtx, fn, data);
472               if (op0 == SUBREG_REG (x))
473                 return x;
474               op0 = simplify_gen_subreg (GET_MODE (x), op0,
475                                          GET_MODE (SUBREG_REG (x)),
476                                          SUBREG_BYTE (x));
477               return op0 ? op0 : x;
478             }

(gdb) p debug_rtx (op0)
(const_vector:V4QI [
        (const_int -52 [0xffffffffffffffcc]) repeated x4
    ])
(gdb) p debug_rtx (x)
(subreg:V16QI (reg:V4QI 98) 0)

and simplify_gen_subreg with the above arguments returns:

(gdb) p debug_rtx (op0)
(const_vector:V16QI [
        (const_int -52 [0xffffffffffffffcc]) repeated x16
    ])

No way! It is not possible to get a V16QImode vector from a V4QImode vector,
even when all elements are duplicates.
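
To illustrate the reading used in this comment, here is a small standalone C++
sketch that models the lanes a paradoxical subreg does not define as "unknown".
It is only a toy model, not GCC code: the Lane type and both function names are
assumptions made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// A lane is either a known byte or undefined (std::nullopt).
using Lane = std::optional<std::int8_t>;

// Toy model of (subreg:V16QI (x:V4QI) 0): the low four lanes come from the
// inner value; the remaining twelve are not defined by it.
std::array<Lane, 16>
paradoxical_subreg_v16qi(const std::array<std::int8_t, 4>& inner) {
    std::array<Lane, 16> outer{};  // every lane starts as std::nullopt
    for (std::size_t i = 0; i < inner.size(); ++i)
        outer[i] = inner[i];
    return outer;
}

// Count the lanes whose value is actually known.
std::size_t known_lanes(const std::array<Lane, 16>& v) {
    std::size_t n = 0;
    for (const Lane& lane : v)
        if (lane.has_value())
            ++n;
    return n;
}
```

In this model a V4QImode constant of four -52 bytes gives four known lanes and
twelve unknown ones, whereas the V16QImode const_vector above claims all
sixteen.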

Tracing even deeper to simplify_context::simplify_subreg, we found the
following:

Breakpoint 1, simplify_context::simplify_subreg (this=0x7fffffffd528,
outermode=E_V16QImode, op=0x7fffe9f099e0, innermode=E_V4QImode, byte=...)
    at ../../git/gcc/gcc/simplify-rtx.cc:7561
7561            return gen_vec_duplicate (outermode, elt);

(gdb) list
7556          rtx elt;
7557
7558          if (VECTOR_MODE_P (outermode)
7559              && GET_MODE_INNER (outermode) == GET_MODE_INNER (innermode)
7560              && vec_duplicate_p (op, &elt))
7561            return gen_vec_duplicate (outermode, elt);
7562
7563          if (outermode == GET_MODE_INNER (innermode)
7564              && vec_duplicate_p (op, &elt))
7565            return elt;

(gdb) p outermode
$1 = E_V16QImode
(gdb) p debug_rtx (elt)
(const_int -52 [0xffffffffffffffcc])

(gdb) fin
Run till exit from #0  simplify_context::simplify_subreg (this=0x7fffffffd528,
outermode=E_V16QImode, op=0x7fffe9f099e0, innermode=E_V4QImode, byte=...)
    at ../../git/gcc/gcc/simplify-rtx.cc:7561
0x0000000000eb24d3 in simplify_subreg (byte=..., innermode=E_V4QImode,
op=<optimized out>, outermode=<optimized out>) at ../../git/gcc/gcc/rtl.h:3513
3513      return simplify_context ().simplify_subreg (outermode, op, innermode, byte);
Value returned is $4 = (rtx_def *) 0x7fffe9f09c10

(gdb) p debug_rtx ($4)
(const_vector:V16QI [
        (const_int -52 [0xffffffffffffffcc]) repeated x16
    ])

Nope. This transformation is valid only for non-paradoxical subregs.

The patch is then obvious:

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index d7315d82aa3..87ca25086dc 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -7557,6 +7557,7 @@ simplify_context::simplify_subreg (machine_mode outermode, rtx op,

       if (VECTOR_MODE_P (outermode)
          && GET_MODE_INNER (outermode) == GET_MODE_INNER (innermode)
+         && !paradoxical_subreg_p (outermode, innermode)
          && vec_duplicate_p (op, &elt))
        return gen_vec_duplicate (outermode, elt);
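
The effect of the added check can be sketched outside of GCC with a toy model.
Everything below is an assumption for illustration: VecMode, is_paradoxical and
may_fold_to_duplicate are hypothetical names, modes are reduced to an element
size and an element count, and "same inner mode" is approximated by equal
element size.

```cpp
// Hypothetical stand-in for a vector machine mode.
struct VecMode {
    unsigned elt_bytes;  // size of one element in bytes
    unsigned nelts;      // number of elements
    unsigned size_bytes() const { return elt_bytes * nelts; }
};

// Toy analogue of paradoxical_subreg_p: the outer mode is wider than the
// inner mode.
bool is_paradoxical(VecMode outer, VecMode inner) {
    return outer.size_bytes() > inner.size_bytes();
}

// Toy analogue of the patched condition: may a subreg of a
// duplicated-element vector fold to a duplicate in the outer mode?
bool may_fold_to_duplicate(VecMode outer, VecMode inner, bool op_is_duplicate) {
    return outer.elt_bytes == inner.elt_bytes  // same element size
           && !is_paradoxical(outer, inner)    // the check the patch adds
           && op_is_duplicate;
}
```

With V4QImode modeled as {1, 4} and V16QImode as {1, 16}, the V4QI to V16QI
case from this report is rejected, while the narrowing V16QI to V4QI case
still folds.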
