public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
@ 2021-04-29 20:10 zsojka at seznam dot cz
  2021-04-30  6:51 ` [Bug rtl-optimization/100342] " rguenth at gcc dot gnu.org
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: zsojka at seznam dot cz @ 2021-04-29 20:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

            Bug ID: 100342
           Summary: [10/11 Regression] wrong code with -O2 -fno-dse
                    -fno-forward-propagate -mno-sse2
           Product: gcc
           Version: 11.1.1
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu

Created attachment 50712
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50712&action=edit
semi-reduced testcase

I am failing to further reduce the testcase. I checked the code and I think the
behavior is fully defined. Also I am unable to find out what exactly goes
wrong; please close the PR if you don't want to spend time on this.

On master, it stopped failing between r12-100 (BAD) and r12-159 (OK); but it
might have been just papered over.

$ x86_64-pc-linux-gnu-gcc -O2 -fno-dse -fno-forward-propagate -mno-sse2
testcase.c -Wno-psabi
$ ./a.out 
Aborted

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-11-branch/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-11-branch/binary-11-branch-20210427124134-gfb7c736c2f1-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.1.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-11-branch//configure --enable-languages=c,c++
--disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl
--with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-11-branch//binary-11-branch-20210427124134-gfb7c736c2f1-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.1.1 20210427 (GCC)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
@ 2021-04-30  6:51 ` rguenth at gcc dot gnu.org
  2021-04-30  8:54 ` jakub at gcc dot gnu.org
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-04-30  6:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |12.0, 9.3.1
           Priority|P3                          |P2
   Last reconfirmed|                            |2021-04-30
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
          Component|target                      |rtl-optimization
   Target Milestone|---                         |10.4

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
  2021-04-30  6:51 ` [Bug rtl-optimization/100342] " rguenth at gcc dot gnu.org
@ 2021-04-30  8:54 ` jakub at gcc dot gnu.org
  2021-05-04 12:50 ` ubizjak at gmail dot com
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-30  8:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r10-3542-g0b92cf305dcf34387a8e2564e55ca8948df3b47a
Stopped on the trunk with r12-139-g7d6bb80931b429631f63e0fd27bee95f32eb57a9
So most likely was latent before and after those.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
  2021-04-30  6:51 ` [Bug rtl-optimization/100342] " rguenth at gcc dot gnu.org
  2021-04-30  8:54 ` jakub at gcc dot gnu.org
@ 2021-05-04 12:50 ` ubizjak at gmail dot com
  2021-05-04 13:08 ` ubizjak at gmail dot com
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: ubizjak at gmail dot com @ 2021-05-04 12:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
For some reason the *input* value at BSWAP insn is truncated to 32bits.

  v256u128 v256u128_1 =
    SHLV (SHLSV (__builtin_bswap64 (u128_0), (v256u128) (0 < v256u128_0)) <=
          0, v256u128_0);

u128_0 is an argument to foo0, passes value of 123842323652213865 (decimal).

The binary has only one BSWAP insn, so:

$ objdump -dr a.out | grep bswap
  401ddd:       48 0f ca                bswap  %rdx

(gdb) b *0x401ddd
Breakpoint 1 at 0x401ddd: file pr100342.c, line 72.

Breakpoint 1, 0x0000000000401ddd in foo0 (u8_0=u8_0@entry=0 '\000',
v128u8_0=v128u8_0@entry=..., v512u8_0=..., u16_0=u16_0@entry=10, 
    v128u16_0=v128u16_0@entry=..., u32_0=u32_0@entry=4, u64_0=u64_0@entry=2,
v512u64_0=..., u128_0=<optimized out>, v256u128_0=..., v512u128_0=...)
    at zzz.c:72
72          SHLV (SHLSV (__builtin_bswap64 (u128_0), (v256u128) (0 <
v256u128_0)) <=
(gdb) p $rdx
$2 = 3983735913
(gdb) p/x $rdx
$3 = 0xed72fc69

We can "fix" the clobbered value manually in gdb session:

(gdb) set $rdx = 123842323652213865
(gdb) p/x $rdx
$4 = 0x1b7f9efed72fc69
(gdb) c
Continuing.
[Inferior 1 (process 20630) exited normally]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (2 preceding siblings ...)
  2021-05-04 12:50 ` ubizjak at gmail dot com
@ 2021-05-04 13:08 ` ubizjak at gmail dot com
  2021-05-04 15:14 ` ubizjak at gmail dot com
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: ubizjak at gmail dot com @ 2021-05-04 13:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
The problematic insn is:

  401cec:       44 89 f6                mov    %r14d,%esi

This one should be 64 bit wide,

        movl    %r14d, %esi     # 613   [c=4 l=3]  *movqi_internal/2

but is actually a QImode move from r14 to esi.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (3 preceding siblings ...)
  2021-05-04 13:08 ` ubizjak at gmail dot com
@ 2021-05-04 15:14 ` ubizjak at gmail dot com
  2021-05-04 15:29 ` jakub at gcc dot gnu.org
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: ubizjak at gmail dot com @ 2021-05-04 15:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
The problem can be seen in _.pro_and_epilogue pass:

Starting with:

_.cmpelim

 2741: r14:DI=[sp:DI+0x38]
  ...
  368: di:DI=r14:DI
  ...
  613: si:QI=r14:QI
  ...
 2737: bp:DI=r14:DI
  ...
  658: strict_low_part(ax:QI)=bp:QI
 2275: dx:DI=si:DI
  ...
 2287: dx:DI=bp:DI
  710: dx:DI=bswap(dx:DI)


The pass converts:

_.pro_and_epilogue

rescanning insn with uid = 658.
insn 658: replaced reg 6 with 4
verify found no changes in insn with uid = 658.
verify found no changes in insn with uid = 2275.
deleting insn with uid = 2287.


 2741: r14:DI=[sp:DI+0x38]
  ...
  368: di:DI=r14:DI
  ...
  613: si:QI=r14:QI
  ...
 2737: bp:DI=r14:DI
  ...
  658: strict_low_part(ax:QI)=si:QI
 2275: dx:DI=si:DI
  ...
 2287: ?
  710: dx:DI=bswap(dx:DI)

Please note how the optimization removes insn 2287, assuming that rsi holds the
whole 64bit value ... Which is not true!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (4 preceding siblings ...)
  2021-05-04 15:14 ` ubizjak at gmail dot com
@ 2021-05-04 15:29 ` jakub at gcc dot gnu.org
  2021-05-04 15:33 ` ubizjak at gmail dot com
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-04 15:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That seems to happen during shrink-wrapping:
#0  df_insn_delete (insn=0x7fffe9e3a6c0) at ../../gcc/df-scan.c:932
#1  0x0000000000978328 in delete_insn (insn=0x7fffe9e3a6c0) at
../../gcc/cfgrtl.c:178
#2  0x0000000000d5a270 in copyprop_hardreg_forward_1 (bb=<basic_block
0x7fffe9f763a8 (2)>, vd=<optimized out>) at ../../gcc/regcprop.c:825
#3  0x0000000000d5b772 in copyprop_hardreg_forward_bb_without_debug_insn
(bb=<basic_block 0x7fffe9f763a8 (2)>) at ../../gcc/regcprop.c:1211
#4  0x0000000000db616c in prepare_shrink_wrap (entry_block=<basic_block
0x7fffe9f763a8 (2)>) at ../../gcc/shrink-wrap.c:451
#5  try_shrink_wrapping (entry_edge=0x7fffffffd518,
prologue_seq=0x7fffe9e81440) at ../../gcc/shrink-wrap.c:674
#6  0x0000000000ae7ffc in thread_prologue_and_epilogue_insns () at
../../gcc/function.c:6025
#7  0x0000000000ae86c3 in rest_of_handle_thread_prologue_and_epilogue () at
../../gcc/function.c:6510

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (5 preceding siblings ...)
  2021-05-04 15:29 ` jakub at gcc dot gnu.org
@ 2021-05-04 15:33 ` ubizjak at gmail dot com
  2021-05-04 15:34 ` ubizjak at gmail dot com
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: ubizjak at gmail dot com @ 2021-05-04 15:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #7 from Uroš Bizjak <ubizjak at gmail dot com> ---
I have traced a bit where (insn 2275) and (insn 2287) come from.

In _.ira, we have:

  613: r125:QI=r2067:DI#0
  ...
  659: zero_extract(r2080:DI,0x8,0x8)=r125:QI#0

And in _.reload, a DImode reload is inserted before zero_extract RTX:

  613: si:QI=r14:QI
  ...
 2275: dx:DI=si:DI
  659: zero_extract(ax:DI,0x8,0x8)=dx:DI

However, %rsi is only initialized with QImode value in (insn 613).

This word-mode reload for zero_extract RTX interferes with register propagation
in subsequent passes, as shown in the comment #5.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (6 preceding siblings ...)
  2021-05-04 15:33 ` ubizjak at gmail dot com
@ 2021-05-04 15:34 ` ubizjak at gmail dot com
  2021-05-04 16:36 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: ubizjak at gmail dot com @ 2021-05-04 15:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
FYI, this whole analysis was done with Fedora 33 system compiler:

gcc version 10.3.1 20210422 (Red Hat 10.3.1-1) (GCC)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (7 preceding siblings ...)
  2021-05-04 15:34 ` ubizjak at gmail dot com
@ 2021-05-04 16:36 ` jakub at gcc dot gnu.org
  2021-05-04 16:57 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-04 16:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Looking at current 10 branch (previously looked at 11), I see:
(insn 2741 1965 368 2 (set (reg:DI 42 r14 [orig:2067 u128_0 ] [2067])
        (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 56 [0x38])) [9 %sfp+-1032 S8 A64]))
"pr100342.c":65:24 66 {*movdi_internal}
     (nil))
...
(insn 613 612 1970 2 (set (reg:QI 4 si [orig:125 _98 ] [125])
        (reg:QI 42 r14 [orig:2067 u128_0 ] [2067])) "pr100342.c":68:15 69
{*movqi_internal}
     (nil))
...
(insn 2737 1971 615 2 (set (reg:DI 6 bp [orig:2067 u128_0 ] [2067])
        (reg:DI 42 r14 [orig:2067 u128_0 ] [2067])) "pr100342.c":68:12 66
{*movdi_internal}
     (nil))
...
(insn 2275 658 659 2 (set (reg:DI 1 dx [2225])
        (reg:DI 4 si [orig:125 _98 ] [125])) "pr100342.c":68:12 66
{*movdi_internal}
     (nil))
...
(insn 2287 2286 710 2 (set (reg:DI 1 dx [orig:131 _104 ] [131])
        (reg:DI 6 bp [orig:2067 u128_0 ] [2067])) "pr100342.c":72:5 66
{*movdi_internal}
     (nil))
(insn 710 2287 2172 2 (set (reg:DI 1 dx [orig:131 _104 ] [131])
        (bswap:DI (reg:DI 1 dx [orig:131 _104 ] [131]))) "pr100342.c":72:5 872
{*bswapdi2}
     (nil))
in the reload dump and I must say I don't find anything wrong on that,
what looks wrong is that regcprop thinks that the insn 2287 is a redundant move
because it thinks that at that point %rdx contains the same value as %rbp -
      /* Detect noop sets and remove them before processing side effects.  */
      if (set && REG_P (SET_DEST (set)) && REG_P (SET_SRC (set)))
        {
          unsigned int regno = REGNO (SET_SRC (set));
          rtx r1 = find_oldest_value_reg (REGNO_REG_CLASS (regno),
                                          SET_DEST (set), vd);
          rtx r2 = find_oldest_value_reg (REGNO_REG_CLASS (regno),
                                          SET_SRC (set), vd);
          if (rtx_equal_p (r1 ? r1 : SET_DEST (set), r2 ? r2 : SET_SRC (set)))
            {
              bool last = insn == BB_END (bb);
              delete_insn (insn);
r1 is %rbp and r2 is NULL, so it is compared against SET_SRC (set) of %rbp.
Something misses that only the lowpart QImode is equivalent that way.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (8 preceding siblings ...)
  2021-05-04 16:36 ` jakub at gcc dot gnu.org
@ 2021-05-04 16:57 ` jakub at gcc dot gnu.org
  2021-05-05 10:59 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-04 16:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
And there is one more important insn in between 2737 and 2275, in particular
(insn 2911 2867 2853 2 (set (reg:DI 42 r14 [2223])
        (const_int 72057594037927935 [0xffffffffffffff])) "pr100342.c":68:12 66
{*movdi_internal}
     (nil))
On that one the old value of r14 is lost.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (9 preceding siblings ...)
  2021-05-04 16:57 ` jakub at gcc dot gnu.org
@ 2021-05-05 10:59 ` jakub at gcc dot gnu.org
  2021-05-05 15:12 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-05 10:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(insn 2741 1965 368 2 (set (reg:DI 42 r14 [orig:2067 u128_0 ] [2067])
        (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 56 [0x38])) [9 %sfp+-1032 S8 A64]))
"pr100342.c":65:24 66 {*movdi_internal}
     (nil))
...
(insn 613 612 1970 2 (set (reg:QI 4 si [orig:125 _98 ] [125])
        (reg:QI 42 r14 [orig:2067 u128_0 ] [2067])) "pr100342.c":68:15 69
{*movqi_internal}
     (nil))
...
(insn 2737 1971 615 2 (set (reg:DI 6 bp [orig:2067 u128_0 ] [2067])
        (reg:DI 42 r14 [orig:2067 u128_0 ] [2067])) "pr100342.c":68:12 66
{*movdi_internal}
     (nil))
...
(insn 2829 2749 2855 2 (set (reg:DI 42 r14 [2222])
        (const_int -71776119061217281 [0xff00ffffffffffff])) "pr100342.c":68:12
66 {*movdi_internal}
     (nil))
...
(insn 2275 658 659 2 (set (reg:DI 1 dx [2225])
        (reg:DI 4 si [orig:125 _98 ] [125])) "pr100342.c":68:12 66
{*movdi_internal}
     (nil))
...
(insn 2910 2860 677 2 (set (reg:DI 4 si [2227])
        (const_int -1095216660481 [0xffffff00ffffffff])) "pr100342.c":68:12 66
{*movdi_internal}
     (nil))
...
(insn 2287 2286 710 2 (set (reg:DI 1 dx [orig:131 _104 ] [131])
        (reg:DI 6 bp [orig:2067 u128_0 ] [2067])) "pr100342.c":72:5 66
{*movdi_internal}
     (nil))
(insn 710 2287 2172 2 (set (reg:DI 1 dx [orig:131 _104 ] [131])
        (bswap:DI (reg:DI 1 dx [orig:131 _104 ] [131]))) "pr100342.c":72:5 872
{*bswapdi2}
     (nil))
In copyprop_hardreg_forward_1 at the start of that insn 2737 we have:
(gdb) p vd->e[1]
$32 = {mode = E_DImode, oldest_regno = 1, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[4]
$33 = {mode = E_QImode, oldest_regno = 42, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[6]
$34 = {mode = E_SImode, oldest_regno = 6, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[42]
$35 = {mode = E_DImode, oldest_regno = 42, next_regno = 4, debug_insn_changes =
0x0}
which is I think ok, it says %sil is QImode copy of %r14, all other regs we
care about here contain their values.
At the end of that we have:
(gdb) p vd->e[1]
$36 = {mode = E_DImode, oldest_regno = 1, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[4]
$37 = {mode = E_QImode, oldest_regno = 42, next_regno = 6, debug_insn_changes =
0x0}
(gdb) p vd->e[6]
$38 = {mode = E_DImode, oldest_regno = 42, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[42]
$39 = {mode = E_DImode, oldest_regno = 42, next_regno = 4, debug_insn_changes =
0x0}
as well as before the insn 2829.  That changes it to:
(gdb) p vd->e[1]
$51 = {mode = E_DImode, oldest_regno = 1, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[4]
$52 = {mode = E_QImode, oldest_regno = 4, next_regno = 6, debug_insn_changes =
0x0}
(gdb) p vd->e[6]
$53 = {mode = E_DImode, oldest_regno = 4, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[42]
$54 = {mode = E_DImode, oldest_regno = 42, next_regno = 4294967295,
debug_insn_changes = 0x0}
Same thing at the start of insn 2275.  But after that we have:
(gdb) p vd->e[1]
$61 = {mode = E_DImode, oldest_regno = 4, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[4]
$62 = {mode = E_QImode, oldest_regno = 4, next_regno = 6, debug_insn_changes =
0x0}
(gdb) p vd->e[6]
$63 = {mode = E_DImode, oldest_regno = 4, next_regno = 1, debug_insn_changes =
0x0}
(gdb) p vd->e[42]
$64 = {mode = E_DImode, oldest_regno = 42, next_regno = 4294967295,
debug_insn_changes = 0x0}
I think this step is what looks wrong, it is true that the movq %rsi, %rdx
copied all 64-bits of %rsi, but vd->e[4].mode was just QImode, so I think
vd->e[1].mode
should be also QImode.
That remains until before insn 2910, which changes that to:
(gdb) p vd->e[1]
$84 = {mode = E_DImode, oldest_regno = 6, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[4]
$85 = {mode = E_DImode, oldest_regno = 4, next_regno = 4294967295,
debug_insn_changes = 0x0}
(gdb) p vd->e[6]
$86 = {mode = E_DImode, oldest_regno = 6, next_regno = 1, debug_insn_changes =
0x0}
(gdb) p vd->e[42]
$87 = {mode = E_DImode, oldest_regno = 42, next_regno = 4294967295,
debug_insn_changes = 0x0}
and at that point, the fact that %rdx doesn't contain the same value as %rbp -
only
%dl is guaranteed to be the same as %bpl - is lost.
Then when we look up the oldest value of %rdx on insn 2287 we are told it is
%rbp.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (10 preceding siblings ...)
  2021-05-05 10:59 ` jakub at gcc dot gnu.org
@ 2021-05-05 15:12 ` jakub at gcc dot gnu.org
  2021-05-06  5:22 ` crazylht at gmail dot com
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-05 15:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50761
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50761&action=edit
gcc11-pr100342.patch

Untested fix.  But what do I know about regcprop?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (11 preceding siblings ...)
  2021-05-05 15:12 ` jakub at gcc dot gnu.org
@ 2021-05-06  5:22 ` crazylht at gmail dot com
  2021-05-15  8:14 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: crazylht at gmail dot com @ 2021-05-06  5:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #13 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Jakub Jelinek from comment #12)
> Created attachment 50761 [details]
> gcc11-pr100342.patch
> 
> Untested fix.  But what do I know about regcprop?

I just want to confirm that this patch alone also fixed pr98694.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (12 preceding siblings ...)
  2021-05-06  5:22 ` crazylht at gmail dot com
@ 2021-05-15  8:14 ` cvs-commit at gcc dot gnu.org
  2021-05-31 14:08 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-05-15  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:425ad87dcfacbb326d8f448a0f2b4d6b53dcd98f

commit r12-808-g425ad87dcfacbb326d8f448a0f2b4d6b53dcd98f
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat May 15 10:12:11 2021 +0200

    regcprop: Fix another cprop_hardreg bug [PR100342]

    On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patches
wrote:
    > Ah, ok, thanks for the extra context.
    >
    > So AIUI the problem when recording xmm2<-di isn't just:
    >
    >  [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
    >
    > but also that:
    >
    >  [B] partial_subreg_p (vd->e[sr].mode,
vd->e[vd->e[sr].oldest_regno].mode)
    >
    > For example, all registers in this sequence can be part of the same
chain:
    >
    >     (set (reg:HI R1) (reg:HI R0))
    >     (set (reg:SI R2) (reg:SI R1)) // [A]
    >     (set (reg:DI R3) (reg:DI R2)) // [A]
    >     (set (reg:SI R4) (reg:SI R[0-3]))
    >     (set (reg:HI R5) (reg:HI R[0-4]))
    >
    > But:
    >
    >     (set (reg:SI R1) (reg:SI R0))
    >     (set (reg:HI R2) (reg:HI R1))
    >     (set (reg:SI R3) (reg:SI R2)) // [A] && [B]
    >
    > is problematic because it dips below the precision of the oldest regno
    > and then increases again.
    >
    > When this happens, I guess we have two choices:
    >
    > (1) what the patch does: treat R3 as the start of a new chain.
    > (2) pretend that the copy occured in vd->e[sr].mode instead
    >     (i.e. copy vd->e[sr].mode to vd->e[dr].mode)
    >
    > I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P.
    > Maybe the optimisation provided by (2) compared to (1) isn't common
    > enough to be worth the complication.
    >
    > I think we should test [B] as well as [A] though.  The pass is set
    > up to do some quite elaborate mode changes and I think rejecting
    > [A] on its own would make some of the other code redundant.
    > It also feels like it should be a seperate âifâ or âelse ifâ,
    > with its own comment.

    Unfortunately, we now have a testcase that shows that testing also [B]
    is a problem (unfortunately now latent on the trunk, only reproduces
    on 10 and 11 branches).

    The comment in the patch tries to list just the interesting instructions,
    we have a 64-bit value, copy low 8 bit of those to another register,
    copy full 64 bits to another register and then clobber the original
register.
    Before that (set (reg:DI r14) (const_int ...)) we have a chain
    DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain,
so
    we have QI si, DI bp , si being the oldest_regno.
    Next DI si is copied into DI dx.  Only the low 8 bits of that are defined,
    the rest is unspecified, but we would add DI dx into that same chain at the
    end, so QI si, DI bp, DI dx [*].  Next si is overwritten, so the chain is
    DI bp, DI dx.  And then we see (set (reg:DI dx) (reg:DI bp)) and remove it
    as redundant, because we think bp and dx are already equivalent, when in
    reality that is true only for the lowpart 8 bits.
    I believe the [*] marked step above is where the bug is.

    The committed regcprop.c (copy_value) change (but only committed to
    trunk/11, not to 10) added
      else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
               && partial_subreg_p (vd->e[sr].mode,
                                    vd->e[vd->e[sr].oldest_regno].mode))
        return;
    and while the first partial_subreg_p call returns true, the second one
    doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be
    true and we'd return, but as that reg got clobbered, si became the oldest
    regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode
    and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false.
    But as the testcase shows, what is the oldest_regno in the chain is
    something that changes over time, so relying on it for anything is
    problematic, something could have a different oldest_regno and later
    on get a different oldest_regno (perhaps with different mode) because
    the oldest_regno got overwritten and it can change both ways.

    The following patch effectively implements your (2) above.

    2021-05-15  Jakub Jelinek  <jakub@redhat.com>

            PR rtl-optimization/100342
            * regcprop.c (copy_value): When copying a source reg in a wider
            mode than it has recorded for the value, adjust recorded
destination
            mode too or punt if !REG_CAN_CHANGE_MODE_P.

            * gcc.target/i386/pr100342.c: New test.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (13 preceding siblings ...)
  2021-05-15  8:14 ` cvs-commit at gcc dot gnu.org
@ 2021-05-31 14:08 ` cvs-commit at gcc dot gnu.org
  2021-05-31 14:56 ` [Bug rtl-optimization/100342] [10 " jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-05-31 14:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:81c2cd08fafcdc0ff48f6cf440bef86f98822a23

commit r11-8486-g81c2cd08fafcdc0ff48f6cf440bef86f98822a23
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat May 15 10:12:11 2021 +0200

    regcprop: Fix another cprop_hardreg bug [PR100342]

    On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patches
wrote:
    > Ah, ok, thanks for the extra context.
    >
    > So AIUI the problem when recording xmm2<-di isn't just:
    >
    >  [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
    >
    > but also that:
    >
    >  [B] partial_subreg_p (vd->e[sr].mode,
vd->e[vd->e[sr].oldest_regno].mode)
    >
    > For example, all registers in this sequence can be part of the same
chain:
    >
    >     (set (reg:HI R1) (reg:HI R0))
    >     (set (reg:SI R2) (reg:SI R1)) // [A]
    >     (set (reg:DI R3) (reg:DI R2)) // [A]
    >     (set (reg:SI R4) (reg:SI R[0-3]))
    >     (set (reg:HI R5) (reg:HI R[0-4]))
    >
    > But:
    >
    >     (set (reg:SI R1) (reg:SI R0))
    >     (set (reg:HI R2) (reg:HI R1))
    >     (set (reg:SI R3) (reg:SI R2)) // [A] && [B]
    >
    > is problematic because it dips below the precision of the oldest regno
    > and then increases again.
    >
    > When this happens, I guess we have two choices:
    >
    > (1) what the patch does: treat R3 as the start of a new chain.
    > (2) pretend that the copy occured in vd->e[sr].mode instead
    >     (i.e. copy vd->e[sr].mode to vd->e[dr].mode)
    >
    > I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P.
    > Maybe the optimisation provided by (2) compared to (1) isn't common
    > enough to be worth the complication.
    >
    > I think we should test [B] as well as [A] though.  The pass is set
    > up to do some quite elaborate mode changes and I think rejecting
    > [A] on its own would make some of the other code redundant.
    > It also feels like it should be a seperate âifâ or âelse ifâ,
    > with its own comment.

    Unfortunately, we now have a testcase that shows that testing also [B]
    is a problem (unfortunately now latent on the trunk, only reproduces
    on 10 and 11 branches).

    The comment in the patch tries to list just the interesting instructions,
    we have a 64-bit value, copy low 8 bit of those to another register,
    copy full 64 bits to another register and then clobber the original
register.
    Before that (set (reg:DI r14) (const_int ...)) we have a chain
    DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain,
so
    we have QI si, DI bp , si being the oldest_regno.
    Next DI si is copied into DI dx.  Only the low 8 bits of that are defined,
    the rest is unspecified, but we would add DI dx into that same chain at the
    end, so QI si, DI bp, DI dx [*].  Next si is overwritten, so the chain is
    DI bp, DI dx.  And then we see (set (reg:DI dx) (reg:DI bp)) and remove it
    as redundant, because we think bp and dx are already equivalent, when in
    reality that is true only for the lowpart 8 bits.
    I believe the [*] marked step above is where the bug is.

    The committed regcprop.c (copy_value) change (but only committed to
    trunk/11, not to 10) added
      else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
               && partial_subreg_p (vd->e[sr].mode,
                                    vd->e[vd->e[sr].oldest_regno].mode))
        return;
    and while the first partial_subreg_p call returns true, the second one
    doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be
    true and we'd return, but as that reg got clobbered, si became the oldest
    regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode
    and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false.
    But as the testcase shows, what is the oldest_regno in the chain is
    something that changes over time, so relying on it for anything is
    problematic, something could have a different oldest_regno and later
    on get a different oldest_regno (perhaps with different mode) because
    the oldest_regno got overwritten and it can change both ways.

    The following patch effectively implements your (2) above.

    2021-05-15  Jakub Jelinek  <jakub@redhat.com>

            PR rtl-optimization/100342
            * regcprop.c (copy_value): When copying a source reg in a wider
            mode than it has recorded for the value, adjust recorded
destination
            mode too or punt if !REG_CAN_CHANGE_MODE_P.

            * gcc.target/i386/pr100342.c: New test.

    (cherry picked from commit 425ad87dcfacbb326d8f448a0f2b4d6b53dcd98f)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (14 preceding siblings ...)
  2021-05-31 14:08 ` cvs-commit at gcc dot gnu.org
@ 2021-05-31 14:56 ` jakub at gcc dot gnu.org
  2022-05-10  8:18 ` cvs-commit at gcc dot gnu.org
  2022-05-10 10:45 ` jakub at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-31 14:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[10/11 Regression] wrong    |[10 Regression] wrong code
                   |code with -O2 -fno-dse      |with -O2 -fno-dse
                   |-fno-forward-propagate      |-fno-forward-propagate
                   |-mno-sse2                   |-mno-sse2

--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed also for GCC 11.2.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (15 preceding siblings ...)
  2021-05-31 14:56 ` [Bug rtl-optimization/100342] [10 " jakub at gcc dot gnu.org
@ 2022-05-10  8:18 ` cvs-commit at gcc dot gnu.org
  2022-05-10 10:45 ` jakub at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-10  8:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

--- Comment #17 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:90330d0e046b11817382db6545bfd8e882f21d41

commit r10-10612-g90330d0e046b11817382db6545bfd8e882f21d41
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat May 15 10:12:11 2021 +0200

    regcprop: Fix another cprop_hardreg bug [PR100342]

    On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patches
wrote:
    > Ah, ok, thanks for the extra context.
    >
    > So AIUI the problem when recording xmm2<-di isn't just:
    >
    >  [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
    >
    > but also that:
    >
    >  [B] partial_subreg_p (vd->e[sr].mode,
vd->e[vd->e[sr].oldest_regno].mode)
    >
    > For example, all registers in this sequence can be part of the same
chain:
    >
    >     (set (reg:HI R1) (reg:HI R0))
    >     (set (reg:SI R2) (reg:SI R1)) // [A]
    >     (set (reg:DI R3) (reg:DI R2)) // [A]
    >     (set (reg:SI R4) (reg:SI R[0-3]))
    >     (set (reg:HI R5) (reg:HI R[0-4]))
    >
    > But:
    >
    >     (set (reg:SI R1) (reg:SI R0))
    >     (set (reg:HI R2) (reg:HI R1))
    >     (set (reg:SI R3) (reg:SI R2)) // [A] && [B]
    >
    > is problematic because it dips below the precision of the oldest regno
    > and then increases again.
    >
    > When this happens, I guess we have two choices:
    >
    > (1) what the patch does: treat R3 as the start of a new chain.
    > (2) pretend that the copy occured in vd->e[sr].mode instead
    >     (i.e. copy vd->e[sr].mode to vd->e[dr].mode)
    >
    > I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P.
    > Maybe the optimisation provided by (2) compared to (1) isn't common
    > enough to be worth the complication.
    >
    > I think we should test [B] as well as [A] though.  The pass is set
    > up to do some quite elaborate mode changes and I think rejecting
    > [A] on its own would make some of the other code redundant.
    > It also feels like it should be a seperate âifâ or âelse ifâ,
    > with its own comment.

    Unfortunately, we now have a testcase that shows that testing also [B]
    is a problem (unfortunately now latent on the trunk, only reproduces
    on 10 and 11 branches).

    The comment in the patch tries to list just the interesting instructions,
    we have a 64-bit value, copy low 8 bit of those to another register,
    copy full 64 bits to another register and then clobber the original
register.
    Before that (set (reg:DI r14) (const_int ...)) we have a chain
    DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain,
so
    we have QI si, DI bp , si being the oldest_regno.
    Next DI si is copied into DI dx.  Only the low 8 bits of that are defined,
    the rest is unspecified, but we would add DI dx into that same chain at the
    end, so QI si, DI bp, DI dx [*].  Next si is overwritten, so the chain is
    DI bp, DI dx.  And then we see (set (reg:DI dx) (reg:DI bp)) and remove it
    as redundant, because we think bp and dx are already equivalent, when in
    reality that is true only for the lowpart 8 bits.
    I believe the [*] marked step above is where the bug is.

    The committed regcprop.c (copy_value) change (but only committed to
    trunk/11, not to 10) added
      else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
               && partial_subreg_p (vd->e[sr].mode,
                                    vd->e[vd->e[sr].oldest_regno].mode))
        return;
    and while the first partial_subreg_p call returns true, the second one
    doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be
    true and we'd return, but as that reg got clobbered, si became the oldest
    regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode
    and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false.
    But as the testcase shows, what is the oldest_regno in the chain is
    something that changes over time, so relying on it for anything is
    problematic, something could have a different oldest_regno and later
    on get a different oldest_regno (perhaps with different mode) because
    the oldest_regno got overwritten and it can change both ways.

    The following patch effectively implements your (2) above.

    2021-05-15  Jakub Jelinek  <jakub@redhat.com>

            PR rtl-optimization/100342
            * regcprop.c (copy_value): When copying a source reg in a wider
            mode than it has recorded for the value, adjust recorded
destination
            mode too or punt if !REG_CAN_CHANGE_MODE_P.

            * gcc.target/i386/pr100342.c: New test.

    (cherry picked from commit 425ad87dcfacbb326d8f448a0f2b4d6b53dcd98f)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug rtl-optimization/100342] [10 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2
  2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
                   ` (16 preceding siblings ...)
  2022-05-10  8:18 ` cvs-commit at gcc dot gnu.org
@ 2022-05-10 10:45 ` jakub at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-05-10 10:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for 10.4 too.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-05-10 10:45 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-29 20:10 [Bug target/100342] New: [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 zsojka at seznam dot cz
2021-04-30  6:51 ` [Bug rtl-optimization/100342] " rguenth at gcc dot gnu.org
2021-04-30  8:54 ` jakub at gcc dot gnu.org
2021-05-04 12:50 ` ubizjak at gmail dot com
2021-05-04 13:08 ` ubizjak at gmail dot com
2021-05-04 15:14 ` ubizjak at gmail dot com
2021-05-04 15:29 ` jakub at gcc dot gnu.org
2021-05-04 15:33 ` ubizjak at gmail dot com
2021-05-04 15:34 ` ubizjak at gmail dot com
2021-05-04 16:36 ` jakub at gcc dot gnu.org
2021-05-04 16:57 ` jakub at gcc dot gnu.org
2021-05-05 10:59 ` jakub at gcc dot gnu.org
2021-05-05 15:12 ` jakub at gcc dot gnu.org
2021-05-06  5:22 ` crazylht at gmail dot com
2021-05-15  8:14 ` cvs-commit at gcc dot gnu.org
2021-05-31 14:08 ` cvs-commit at gcc dot gnu.org
2021-05-31 14:56 ` [Bug rtl-optimization/100342] [10 " jakub at gcc dot gnu.org
2022-05-10  8:18 ` cvs-commit at gcc dot gnu.org
2022-05-10 10:45 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).