public inbox for gcc-bugs@sourceware.org
* [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result
@ 2020-04-03 22:40 evan@coeus-group.com
  2020-04-04  9:36 ` [Bug target/94482] " marxin at gcc dot gnu.org
                   ` (29 more replies)
  0 siblings, 30 replies; 31+ messages in thread
From: evan@coeus-group.com @ 2020-04-03 22:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

            Bug ID: 94482
           Summary: Inserting into vector with optimization enabled on x86
                    generates incorrect result
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: evan@coeus-group.com
  Target Milestone: ---

Created attachment 48193
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48193&action=edit
Test case

I'm trying to implement _mm_insert_epi64 without relying on intrinsics.  The
GCC-generated executable fails on x86 (but not x86_64) at -O2 and above.
AFAICT it works on every other architecture and optimization level I've tried.
It happens on every version of GCC I've tested (7 - 9.3.0), in both C and C++
modes.

I've attached a test case (generated with C-Reduce, slightly modified to remove
some unnecessary macros) which reproduces the issue.  Line 4 is interesting;
the j field isn't used anywhere but if you remove it the code works
(unfortunately not an option in my project).

Please let me know if you need any additional information.
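
For readers without access to the attachment, here is a rough sketch of the
pattern involved.  It is not the attached test case: the names v2i64, vec128,
insert_epi64 and check_insert are made up for illustration, and the sketch
assumes GCC's vector extensions and GNU C union type punning.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not taken from attachment 48193. */
typedef int64_t v2i64 __attribute__((__vector_size__(16)));

typedef union {
  v2i64 v;        /* GCC vector-extension view of the 128 bits */
  int64_t i64[2]; /* element-wise view of the same 128 bits */
} vec128;

/* Stand-in for _mm_insert_epi64: replace one 64-bit lane of a. */
v2i64 insert_epi64(v2i64 a, int64_t value, int lane) {
  vec128 r;
  r.v = a;
  r.i64[lane & 1] = value; /* lane is masked to 0 or 1 */
  return r.v;
}

/* Returns 0 if both lanes hold the expected values, 1 otherwise. */
int check_insert(void) {
  v2i64 a = { 1729, 0 };
  vec128 r;
  r.v = insert_epi64(a, 2, 1);
  return !(r.i64[0] == 1729 && r.i64[1] == 2);
}
```

Inserting into lane 1 should leave lane 0 untouched, so check_insert() is
expected to return 0; the failure reported here is of that kind (lane 0 no
longer comparing equal to 1729).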


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
@ 2020-04-04  9:36 ` marxin at gcc dot gnu.org
  2020-04-04 17:33 ` evan@coeus-group.com
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-04-04  9:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2020-04-04

--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
Can you please paste the full command line used? Please also add the -v
option, which will show the -march options.


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
  2020-04-04  9:36 ` [Bug target/94482] " marxin at gcc dot gnu.org
@ 2020-04-04 17:33 ` evan@coeus-group.com
  2020-04-04 21:44 ` ubizjak at gmail dot com
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: evan@coeus-group.com @ 2020-04-04 17:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #2 from Evan Nemerson <evan@coeus-group.com> ---
Created attachment 48195
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48195&action=edit
Output from cc -v

Sure.  It's just -O2, and of course if you're on x86_64 you'll need to pass
-m32.  For example:

  cc -m32 -O2 -o 94482 94482.c

I've attached the output when adding -v.

If you drop either -m32 or -O2 from the flags, the program runs successfully. 
Otherwise, you'll get an assertion failure:

  94482: 94482.c:46: main: Assertion `r_.i64[0] == 1729' failed.
  Aborted (core dumped)


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
  2020-04-04  9:36 ` [Bug target/94482] " marxin at gcc dot gnu.org
  2020-04-04 17:33 ` evan@coeus-group.com
@ 2020-04-04 21:44 ` ubizjak at gmail dot com
  2020-04-04 21:50 ` ubizjak at gmail dot com
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: ubizjak at gmail dot com @ 2020-04-04 21:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
Confirmed, needs -m32 -mno-sse.


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (2 preceding siblings ...)
  2020-04-04 21:44 ` ubizjak at gmail dot com
@ 2020-04-04 21:50 ` ubizjak at gmail dot com
  2020-04-05 14:44 ` marxin at gcc dot gnu.org
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: ubizjak at gmail dot com @ 2020-04-04 21:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |10.0
      Known to fail|                            |8.4.1, 9.2.1
           Keywords|                            |needs-bisection

--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
Current mainline (gcc-10) works OK, possibly latent, needs bisection for
revision that fixed the failure.


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (3 preceding siblings ...)
  2020-04-04 21:50 ` ubizjak at gmail dot com
@ 2020-04-05 14:44 ` marxin at gcc dot gnu.org
  2020-04-05 14:46 ` marxin at gcc dot gnu.org
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-04-05 14:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |

--- Comment #5 from Martin Liška <marxin at gcc dot gnu.org> ---
Fixed on trunk with r10-3542-g0b92cf305dcf3438, which probably only made it
latent. The failure started with r7-7666-g8487c9a550dea622.


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (4 preceding siblings ...)
  2020-04-05 14:44 ` marxin at gcc dot gnu.org
@ 2020-04-05 14:46 ` marxin at gcc dot gnu.org
  2020-04-05 21:08 ` evan@coeus-group.com
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-04-05 14:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #6 from Martin Liška <marxin at gcc dot gnu.org> ---
But I bet it's invalid code:

$ gcc -fsanitize=undefined pr94482.c -O2  && ./a.out 
pr94482.c:14:11: runtime error: index 2 out of bounds for type 'long int [2]'
pr94482.c:14:15: runtime error: store to address 0x7fffffffe2f0 with
insufficient space for an object of type 'long int'
0x7fffffffe2f0: note: pointer points here
 00 00 00 00  00 00 00 00 00 00 00 00  eb 4c 48 f7 ff 7f 00 00  e0 d9 61 f7 ff
7f 00 00  e8 e3 ff ff
              ^ 
pr94482.c:14:11: runtime error: index 3 out of bounds for type 'long int [2]'
pr94482.c:14:15: runtime error: store to address 0x7fffffffe2f8 with
insufficient space for an object of type 'long int'
0x7fffffffe2f8: note: pointer points here
 00 00 00 00  eb 4c 48 f7 ff 7f 00 00  e0 d9 61 f7 ff 7f 00 00  e8 e3 ff ff ff
7f 00 00  00 1c 01 00
              ^ 
pr94482.c:14:11: runtime error: index 4 out of bounds for type 'long int [2]'
pr94482.c:14:15: runtime error: store to address 0x7fffffffe300 with
insufficient space for an object of type 'long int'
0x7fffffffe300: note: pointer points here
 00 00 00 00  e0 d9 61 f7 ff 7f 00 00  e8 e3 ff ff ff 7f 00 00  00 1c 01 00 01
00 00 00  70 10 40 00
              ^ 
pr94482.c:14:11: runtime error: index 5 out of bounds for type 'long int [2]'
pr94482.c:14:15: runtime error: store to address 0x7fffffffe308 with
insufficient space for an object of type 'long int'
0x7fffffffe308: note: pointer points here
 00 00 00 00  e8 e3 ff ff ff 7f 00 00  00 1c 01 00 01 00 00 00  70 10 40 00 00
00 00 00  60 15 40 00
              ^ 
pr94482.c:14:11: runtime error: index 6 out of bounds for type 'long int [2]'
pr94482.c:14:15: runtime error: store to address 0x7fffffffe310 with
insufficient space for an object of type 'long int'
0x7fffffffe310: note: pointer points here
 00 00 00 00  00 1c 01 00 01 00 00 00  70 10 40 00 00 00 00 00  60 15 40 00 00
00 00 00  1d 45 5c 9d
              ^ 
pr94482.c:14:11: runtime error: index 7 out of bounds for type 'long int [2]'
pr94482.c:14:15: runtime error: store to address 0x7fffffffe318 with
insufficient space for an object of type 'long int'
0x7fffffffe318: note: pointer points here
 00 00 00 00  70 10 40 00 00 00 00 00  60 15 40 00 00 00 00 00  1d 45 5c 9d 3a
72 cd ab  60 12 40 00
              ^ 
Segmentation fault (core dumped)

$ gcc -fsanitize=address pr94482.c -O2  && ./a.out 
=================================================================
==18733==ERROR: AddressSanitizer: stack-buffer-overflow on address
0x7fffffffe290 at pc 0x0000004015d1 bp 0x7fffffffe150 sp 0x7fffffffe148
WRITE of size 8 at 0x7fffffffe290 thread T0
    #0 0x4015d0 in main (/home/marxin/Programming/testcases/a.out+0x4015d0)
    #1 0x7ffff73c3cea in __libc_start_main ../csu/libc-start.c:308
    #2 0x401659 in _start (/home/marxin/Programming/testcases/a.out+0x401659)

Address 0x7fffffffe290 is located in stack of thread T0 at offset 304 in frame
    #0 0x40111f in main (/home/marxin/Programming/testcases/a.out+0x40111f)

  This frame has 11 object(s):
    [32, 48) 'n' (line 42)
    [64, 80) 'o' (line 43)
    [96, 112) 'r_' (line 47)
    [128, 144) 'n' (line 23)
    [160, 176) 'o' (line 24)
    [192, 208) 'r_' (line 26)
    [224, 240) 'n' (line 29)
    [256, 272) 'o' (line 30)
    [288, 304) 'r_' (line 12) <== Memory access at offset 304 overflows this
variable
    [320, 336) 'n' (line 15)
    [352, 368) 'o' (line 16)
HINT: this may be a false positive if your program uses some custom stack
unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow
(/home/marxin/Programming/testcases/a.out+0x4015d0) in main
Shadow bytes around the buggy address:
  0x10007fff7c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7c20: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10007fff7c30: 00 00 f2 f2 00 00 f2 f2 00 00 f2 f2 00 00 f2 f2
  0x10007fff7c40: 00 00 f2 f2 00 00 f2 f2 00 00 f2 f2 00 00 f2 f2
=>0x10007fff7c50: 00 00[f2]f2 00 00 f2 f2 00 00 f3 f3 00 00 00 00
  0x10007fff7c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==18733==ABORTING


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (5 preceding siblings ...)
  2020-04-05 14:46 ` marxin at gcc dot gnu.org
@ 2020-04-05 21:08 ` evan@coeus-group.com
  2020-04-05 22:19 ` evan@coeus-group.com
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: evan@coeus-group.com @ 2020-04-05 21:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #7 from Evan Nemerson <evan@coeus-group.com> ---
Created attachment 48203
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48203&action=edit
Non-reduced test case

Thanks for looking into this.

ASan didn't have any issues with the original, non-reduced test.  Here is a
compressed copy.

I'm generating a new reduced version now, checking ASan and UBSan along the way
(as well as using -Wall -Werror to make sure the result compiles cleanly).  I'll
upload it as soon as it's ready.


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (6 preceding siblings ...)
  2020-04-05 21:08 ` evan@coeus-group.com
@ 2020-04-05 22:19 ` evan@coeus-group.com
  2020-04-06  6:35 ` marxin at gcc dot gnu.org
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: evan@coeus-group.com @ 2020-04-05 22:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Evan Nemerson <evan@coeus-group.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #48193|0                           |1
        is obsolete|                            |

--- Comment #8 from Evan Nemerson <evan@coeus-group.com> ---
Created attachment 48204
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48204&action=edit
Reduced test case, ASan/UBSan clean

Here is the reduced test case which works with -fsanitize=address,undefined
-Wno-psabi -Wall -Werror.

This one is self-contained, and instead of using assert the return value is 0
on success and 1 on failure.


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (7 preceding siblings ...)
  2020-04-05 22:19 ` evan@coeus-group.com
@ 2020-04-06  6:35 ` marxin at gcc dot gnu.org
  2020-04-06  6:47 ` jakub at gcc dot gnu.org
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-04-06  6:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |6.4.0
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #9 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Evan Nemerson from comment #8)
> Created attachment 48204 [details]
> Reduced test case, ASan/UBSan clean
> 
> Here is the reduced test case which works with -fsanitize=address,undefined
> -Wno-psabi -Wall -Werror.
> 
> This one is self-contained, and instead of using assert the return value is
> 0 on success and 1 on failure.

Thank you.
The git bisection revisions remain the same for the reduced test-case.
Isn't the problem right now the violation of -Wpsabi?

pr94482-v2.c: In function ‘s’:
pr94482-v2.c:8:1: warning: SSE vector return without SSE enabled changes the
ABI [-Wpsabi]
    8 | l s(__INT64_TYPE__ a) {
      | ^
pr94482-v2.c: In function ‘p’:
pr94482-v2.c:16:3: note: the ABI for passing parameters with 16-byte alignment
has changed in GCC 4.6
   16 | l p(l a, __INT64_TYPE__ i, int q) {
      |   ^
pr94482-v2.c:16:3: warning: SSE vector argument without SSE enabled changes the
ABI [-Wpsabi]


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (8 preceding siblings ...)
  2020-04-06  6:35 ` marxin at gcc dot gnu.org
@ 2020-04-06  6:47 ` jakub at gcc dot gnu.org
  2020-04-06  6:55 ` marxin at gcc dot gnu.org
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-04-06  6:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #9)
> Isn't the problem right now the violation of -Wpsabi?

Why would that be a problem?  That warning says that if SSE is disabled the
vector arguments (or return values) will be passed differently from when it is
enabled, but as long as both the caller and callee are built the same, that is
not a problem.
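
To illustrate the point, a minimal sketch (hypothetical names v4si, add_one
and lanes_sum; not taken from the attached test case): a callee that takes and
returns a 16-byte vector by value, and a caller in the same translation unit.
Built with -m32 -mno-sse, a function like add_one is the kind that draws the
-Wpsabi warning quoted above, but since both functions are compiled with the
same flags they agree on how the vector is passed.

```c
/* Hypothetical names; assumes GCC vector extensions. */
typedef int v4si __attribute__((__vector_size__(16)));

/* Callee: vector argument and vector return value, passed by value. */
__attribute__((noinline)) v4si add_one(v4si a) {
  return a + (v4si){ 1, 1, 1, 1 };
}

/* Caller in the same translation unit, hence built with the same flags. */
int lanes_sum(void) {
  v4si r = add_one((v4si){ 0, 1, 2, 3 });
  return r[0] + r[1] + r[2] + r[3]; /* 1 + 2 + 3 + 4 */
}
```

The warning only matters when the caller and callee are built with different
SSE settings, for example when they live in separately compiled objects.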


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (9 preceding siblings ...)
  2020-04-06  6:47 ` jakub at gcc dot gnu.org
@ 2020-04-06  6:55 ` marxin at gcc dot gnu.org
  2020-04-06  7:14 ` jakub at gcc dot gnu.org
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-04-06  6:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #11 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #10)
> (In reply to Martin Liška from comment #9)
> > Isn't the problem right now the violation of -Wpsabi?
> 
> Why would that be a problem?  That warning says that if SSE is disabled the
> vector arguments (or return values) will be passed differently from when it
> is enabled, but as long as both the caller and callee are built the same,
> that is not a problem.

Ok, thank you for the explanation.


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (10 preceding siblings ...)
  2020-04-06  6:55 ` marxin at gcc dot gnu.org
@ 2020-04-06  7:14 ` jakub at gcc dot gnu.org
  2020-04-06  7:20 ` rguenth at gcc dot gnu.org
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-04-06  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Reduced testcase (-O2 -msse2 -m32):
typedef unsigned V __attribute__ ((__vector_size__ (16)));
union U
{
  V j;
  unsigned long long i __attribute__ ((__vector_size__ (16)));
};

static inline __attribute__((always_inline)) V
foo (unsigned long long a)
{
  union U z = { .j = (V) {} };
  for (unsigned long i = 0; i < 1; i++)
    z.i[i] = a;
  return z.j;
}

static inline __attribute__((always_inline)) V
bar (V a, unsigned long long i, int q)
{
  union U z = { .j = a };
  z.i[q] = i;
  return z.j;
}

int
main ()
{
  union U z = { .j = bar (foo (1729), 2, 1) };
  if (z.i[0] != 1729)
    __builtin_abort ();
  return 0;
}


* [Bug target/94482] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (11 preceding siblings ...)
  2020-04-06  7:14 ` jakub at gcc dot gnu.org
@ 2020-04-06  7:20 ` rguenth at gcc dot gnu.org
  2020-04-06  7:28 ` [Bug target/94482] [8/9/10 Regression] " jakub at gcc dot gnu.org
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-06  7:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #9)
> (In reply to Evan Nemerson from comment #8)
> > Created attachment 48204 [details]
> > Reduced test case, ASan/UBSan clean
> > 
> > Here is the reduced test case which works with -fsanitize=address,undefined
> > -Wno-psabi -Wall -Werror.
> > 
> > This one is self-contained, and instead of using assert the return value is
> > 0 on success and 1 on failure.
> 
> Thank you.
> The git bisection revisions remain the same for the reduced test-case.
> Isn't the problem right now the violation of -Wpsabi?
> 
> pr94482-v2.c: In function ‘s’:
> pr94482-v2.c:8:1: warning: SSE vector return without SSE enabled changes the
> ABI [-Wpsabi]
>     8 | l s(__INT64_TYPE__ a) {
>       | ^
> pr94482-v2.c: In function ‘p’:
> pr94482-v2.c:16:3: note: the ABI for passing parameters with 16-byte
> alignment has changed in GCC 4.6
>    16 | l p(l a, __INT64_TYPE__ i, int q) {
>       |   ^
> pr94482-v2.c:16:3: warning: SSE vector argument without SSE enabled changes
> the ABI [-Wpsabi]

No, that's not an issue here.  All of the code is inlined into main anyway;
with -fno-inline the code is fine.  Making the two non-main functions static
makes the testcase easier to look at.  You can see that after inlining the IL
has lots of redundancies that should be irrelevant, but GIMPLE IL support
is too limited on the GCC 9 branch to do that editing there.

The assembly shows:

        movl    16(%esp), %eax
        movl    20(%esp), %edx
...
        xorl    $1729, %eax
        orl     %edx, %eax
        setne   %al

which is the final comparison, but those stack slots are never written to.

If you look at the GIMPLE before RTL expansion it looks like

main ()
{
  union k r_;
  vector(4) int n;
  union k r_;
  vector(4) int n;
  long long int _1;
  _Bool _2;
  int _6;
  vector(4) int _18;
  vector(4) int _20;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  BIT_FIELD_REF <r_.i64, 64, 0> = 1729;
  _18 = MEM[(union  *)&r_];
  MEM[(char * {ref-all})&m] = _18;
  n = _18;
  MEM[(char * {ref-all})&u] = MEM[(char * {ref-all})&n];
  BIT_FIELD_REF <r_.i64, 64, 64> = 2;
  _20 = MEM[(union  *)&r_];
  MEM[(char * {ref-all})&v] = _20;
  o = _20;
  n ={v} {CLOBBER};
  n = _20;
  MEM[(char * {ref-all})&t] = MEM[(char * {ref-all})&n];
  _1 = BIT_FIELD_REF <_20, 64, 0>;
  _2 = _1 != 1729;
  _6 = (int) _2;
  n ={v} {CLOBBER};
  return _6;
;;    succ:       EXIT

but the body should be simplifiable to just

  BIT_FIELD_REF <r_.i64, 64, 0> = 1729;
  BIT_FIELD_REF <r_.i64, 64, 64> = 2;
  _20 = MEM[(union  *)&r_];
  _1 = BIT_FIELD_REF <_20, 64, 0>;
  _2 = _1 != 1729;
  _6 = (int) _2;
  return _6;

of course the unrelated stmts may actually trigger the miscompile.  GCC 9
does not have BIT_FIELD_REF support for the GIMPLE FE (but it should be
backportable I guess).
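
For reference, the simplified body can be modelled in plain C roughly as
follows.  This is only a sketch of the intended semantics, not GIMPLE FE input:
the function name simplified_body and the use of memcpy on a 16-byte buffer to
stand in for the two 64-bit BIT_FIELD_REF stores are assumptions made for
illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the simplified statement sequence. */
int simplified_body(void) {
  unsigned char r[16];          /* stands in for r_ (16 bytes) */
  const int64_t lo = 1729, hi = 2;
  int64_t first;

  memcpy(r, &lo, 8);            /* BIT_FIELD_REF <r_.i64, 64, 0> = 1729 */
  memcpy(r + 8, &hi, 8);        /* BIT_FIELD_REF <r_.i64, 64, 64> = 2   */
  memcpy(&first, r, 8);         /* _1 = BIT_FIELD_REF <_20, 64, 0>      */
  return first != 1729;         /* _2 = _1 != 1729; _6 = (int) _2       */
}
```

The second store touches only bytes 8..15, so the value read back from bytes
0..7 is still 1729 and the function returns 0, which is what a correct
compilation of the testcase should produce.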


* [Bug target/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (12 preceding siblings ...)
  2020-04-06  7:20 ` rguenth at gcc dot gnu.org
@ 2020-04-06  7:28 ` jakub at gcc dot gnu.org
  2020-04-06  7:29 ` marxin at gcc dot gnu.org
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-04-06  7:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Inserting into vector with  |[8/9/10 Regression]
                   |optimization enabled on x86 |Inserting into vector with
                   |generates incorrect result  |optimization enabled on x86
                   |                            |generates incorrect result
   Target Milestone|---                         |8.5

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The above testcase FAILs with all of 7/8/9/10, works with 6.
-fno-tree-sra fixes it.
Unless BIT_FIELD_REF with a non-SSA_NAME first operand is invalid on the lhs of
assignment, I believe this is a SRA bug.
Before SRA we have:
  z.j = { 0, 0, 0, 0 };
  BIT_FIELD_REF <z.i, 64, 0> = 1729;
  _11 = z.j;
  z ={v} {CLOBBER};
  z.j = _11;
  BIT_FIELD_REF <z.i, 64, 64> = 2;
  _6 = z.j;
  z ={v} {CLOBBER};
  z.j = _6;
  _3 = BIT_FIELD_REF <z.i, 64, 0>;
  if (_3 != 1729)
but SRA transforms it into:
  z$j = { 0, 0, 0, 0 };
  BIT_FIELD_REF <z.i, 64, 0> = 1729;
  _19 = MEM[(union U *)&z];
  z$j = _19;
  _1 = z$j;
  _11 = _1;
  z$j ={v} {CLOBBER};
  z$j = _11;
  BIT_FIELD_REF <z.i, 64, 64> = 2;
  _15 = MEM[(union U *)&z];
  z$j = _15;
  _12 = z$j;
  _6 = _12;
  z$j ={v} {CLOBBER};
  z$j_22 = _6;
  MEM[(union U *)&z] = z$j_22;
  _3 = BIT_FIELD_REF <z.i, 64, 0>;
  if (_3 != 1729)
which is definitely not equivalent.


* [Bug target/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (13 preceding siblings ...)
  2020-04-06  7:28 ` [Bug target/94482] [8/9/10 Regression] " jakub at gcc dot gnu.org
@ 2020-04-06  7:29 ` marxin at gcc dot gnu.org
  2020-04-06  7:30 ` jakub at gcc dot gnu.org
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-04-06  7:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #15 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #12)
> Reduced testcase (-O2 -msse2 -m32):
> typedef unsigned V __attribute__ ((__vector_size__ (16)));
> union U
> {
>   V j;
>   unsigned long long i __attribute__ ((__vector_size__ (16)));
> };
> 
> static inline __attribute__((always_inline)) V
> foo (unsigned long long a)
> {
>   union U z = { .j = (V) {} };
>   for (unsigned long i = 0; i < 1; i++)
>     z.i[i] = a;
>   return z.j;
> }
> 
> static inline __attribute__((always_inline)) V
> bar (V a, unsigned long long i, int q)
> {
>   union U z = { .j = a };
>   z.i[q] = i;
>   return z.j;
> }
> 
> int
> main ()
> {
>   union U z = { .j = bar (foo (1729), 2, 1) };
>   if (z.i[0] != 1729)
>     __builtin_abort ();
>   return 0;
> }

Ok, Jakub's version started to abort with r7-987-gf17a223de829cb5f and still
fails on current master.


* [Bug target/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (14 preceding siblings ...)
  2020-04-06  7:29 ` marxin at gcc dot gnu.org
@ 2020-04-06  7:30 ` jakub at gcc dot gnu.org
  2020-04-06  8:32 ` rguenth at gcc dot gnu.org
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-04-06  7:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
                 CC|                            |jamborm at gcc dot gnu.org
           Priority|P3                          |P2


* [Bug target/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (15 preceding siblings ...)
  2020-04-06  7:30 ` jakub at gcc dot gnu.org
@ 2020-04-06  8:32 ` rguenth at gcc dot gnu.org
  2020-04-06  8:44 ` rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-06  8:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #14)
> The above testcase FAILs with all of 7/8/9/10, works with 6.
> -fno-tree-sra fixes it.
> Unless BIT_FIELD_REF with a non-SSA_NAME first operand is invalid on the lhs
> of assignment, I believe this is a SRA bug.
> Before SRA we have:
>   z.j = { 0, 0, 0, 0 };
>   BIT_FIELD_REF <z.i, 64, 0> = 1729;
>   _11 = z.j;
>   z ={v} {CLOBBER};
>   z.j = _11;
>   BIT_FIELD_REF <z.i, 64, 64> = 2;
>   _6 = z.j;
>   z ={v} {CLOBBER};
>   z.j = _6;
>   _3 = BIT_FIELD_REF <z.i, 64, 0>;
>   if (_3 != 1729)
> but SRA transforms it into:
>   z$j = { 0, 0, 0, 0 };
>   BIT_FIELD_REF <z.i, 64, 0> = 1729;
>   _19 = MEM[(union U *)&z];
>   z$j = _19;
>   _1 = z$j;
>   _11 = _1;
>   z$j ={v} {CLOBBER};
>   z$j = _11;
>   BIT_FIELD_REF <z.i, 64, 64> = 2;
>   _15 = MEM[(union U *)&z];
>   z$j = _15;
>   _12 = z$j;
>   _6 = _12;
>   z$j ={v} {CLOBBER};
>   z$j_22 = _6;
>   MEM[(union U *)&z] = z$j_22;
>   _3 = BIT_FIELD_REF <z.i, 64, 0>;
>   if (_3 != 1729)
> which is definitely not equivalent.

They definitely are valid.   It looks like SRA sets grp_partial_lhs for
the stores but forgets it for the load.  Not sure why build_access_from_expr_1
special-cases BIT_FIELD_REF/IMAGPART/REALPART that way, possibly tied to
similar code in IL modification.
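
For readers without the attachment, the access pattern in the GIMPLE above can be sketched at source level roughly as follows.  This is not the attached test case: the type and function names are made up for illustration, and whether this exact code triggers the miscompile on an affected compiler is not something this thread confirms.

```c
#include <assert.h>

/* Hypothetical names.  These are GCC vector extensions (Clang accepts
   them too); they are not taken from the attached test case.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

union U
{
  v2di i;   /* written one 64-bit lane at a time (the partial store) */
  v4si j;   /* used for whole-object initialisation and copies */
};

/* Insert VALUE into 64-bit lane LANE (0 or 1); the other lane must survive.  */
static union U
insert_lane (union U a, long long value, int lane)
{
  a.i[lane] = value;
  return a;
}

/* Mirrors the sequence in the dump: zero via .j, then two partial writes.  */
static union U
build_pair (void)
{
  union U z;
  z.j = (v4si) { 0, 0, 0, 0 };
  z = insert_lane (z, 1729, 0);
  z = insert_lane (z, 2, 1);
  return z;
}

/* Read back one 64-bit lane.  */
static long long
get_lane (union U a, int lane)
{
  return a.i[lane];
}
```

A correct compiler has to keep both lanes, so get_lane (build_pair (), 0) should yield 1729 and lane 1 should yield 2; the check that fails in the dump corresponds to the lane 0 comparison.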

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug target/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (16 preceding siblings ...)
  2020-04-06  8:32 ` rguenth at gcc dot gnu.org
@ 2020-04-06  8:44 ` rguenth at gcc dot gnu.org
  2020-04-06  8:54 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-06  8:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
This:

          if (write)
            {
              gassign *stmt;

              if (access->grp_partial_lhs)
                ref = force_gimple_operand_gsi (gsi, ref, true, NULL_TREE,
                                                 false, GSI_NEW_STMT);
              stmt = gimple_build_assign (repl, ref);
              gimple_set_location (stmt, loc);
              gsi_insert_after (gsi, stmt, GSI_NEW_STMT);

is definitely wrong.  And in the else case grp_partial_lhs is never true.
The whole bfr path looks fishy, too.  The following fixes the testcase:

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index b2056b58750..e16b641c9bb 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -3737,36 +3737,8 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
          type conversion (see PR42196) and when scalarized unions are involved
          in assembler statements (see PR42398).  */
       if (!useless_type_conversion_p (type, access->type))
-       {
-         tree ref;
-
-         ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
-
-         if (write)
-           {
-             gassign *stmt;
-
-             if (access->grp_partial_lhs)
-               ref = force_gimple_operand_gsi (gsi, ref, true, NULL_TREE,
-                                                false, GSI_NEW_STMT);
-             stmt = gimple_build_assign (repl, ref);
-             gimple_set_location (stmt, loc);
-             gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
-           }
-         else
-           {
-             gassign *stmt;
-
-             if (access->grp_partial_lhs)
-               repl = force_gimple_operand_gsi (gsi, repl, true, NULL_TREE,
-                                                true, GSI_SAME_STMT);
-             stmt = gimple_build_assign (ref, repl);
-             gimple_set_location (stmt, loc);
-             gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
-           }
-       }
-      else
-       *expr = repl;
+       repl = build1 (VIEW_CONVERT_EXPR, type, repl);
+      *expr = repl;
       sra_stats.exprs++;
     }
   else if (write && access->grp_to_be_debug_replaced)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug target/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (17 preceding siblings ...)
  2020-04-06  8:44 ` rguenth at gcc dot gnu.org
@ 2020-04-06  8:54 ` rguenth at gcc dot gnu.org
  2020-04-06  9:32 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-06  8:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #17)
> This:
> 
>           if (write)
>             {
>               gassign *stmt;
> 
>               if (access->grp_partial_lhs)
>                 ref = force_gimple_operand_gsi (gsi, ref, true, NULL_TREE,
>                                                  false, GSI_NEW_STMT);
>               stmt = gimple_build_assign (repl, ref);
>               gimple_set_location (stmt, loc);
>               gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
>
> is definitely wrong.

That is, it would need to be a read-modify-write operation.

> And in the else case grp_partial_lhs is never true.

OK, I see now how grp_partial_lhs is used.

> The whole bfr path looks fishy, too.  The following fixes the testcase:

It'll likely break in some cases of course.

I understand the motivation is to analyze accesses ignoring outermost
bitfield refs and imag/realpart so we generate replacements for the
"base" accesses.  All OK I guess, so it's the replacement process that
needs fixing up.

Similar issues exist in generate_subtree_copies.

> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index b2056b58750..e16b641c9bb 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -3737,36 +3737,8 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator
> *gsi, bool write)
>           type conversion (see PR42196) and when scalarized unions are
> involved
>           in assembler statements (see PR42398).  */
>        if (!useless_type_conversion_p (type, access->type))
> -       {
> -         tree ref;
> -
> -         ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
> -
> -         if (write)
> -           {
> -             gassign *stmt;
> -
> -             if (access->grp_partial_lhs)
> -               ref = force_gimple_operand_gsi (gsi, ref, true, NULL_TREE,
> -                                                false, GSI_NEW_STMT);
> -             stmt = gimple_build_assign (repl, ref);
> -             gimple_set_location (stmt, loc);
> -             gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> -           }
> -         else
> -           {
> -             gassign *stmt;
> -
> -             if (access->grp_partial_lhs)
> -               repl = force_gimple_operand_gsi (gsi, repl, true, NULL_TREE,
> -                                                true, GSI_SAME_STMT);
> -             stmt = gimple_build_assign (ref, repl);
> -             gimple_set_location (stmt, loc);
> -             gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> -           }
> -       }
> -      else
> -       *expr = repl;
> +       repl = build1 (VIEW_CONVERT_EXPR, type, repl);
> +      *expr = repl;
>        sra_stats.exprs++;
>      }
>    else if (write && access->grp_to_be_debug_replaced)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug target/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (18 preceding siblings ...)
  2020-04-06  8:54 ` rguenth at gcc dot gnu.org
@ 2020-04-06  9:32 ` rguenth at gcc dot gnu.org
  2020-04-06 12:59 ` [Bug tree-optimization/94482] " rguenth at gcc dot gnu.org
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-06  9:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
gcc.dg/torture/pr52244.c ICEs on the generated

  VIEW_CONVERT_EXPR<union u_t>(u) = bar ();

since V_C_E on the LHS are generally unwanted (but Ada has them for aggregates
just not in outermost position).  What's always possible on the LHS is
to use a BIT_FIELD_REF.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (19 preceding siblings ...)
  2020-04-06  9:32 ` rguenth at gcc dot gnu.org
@ 2020-04-06 12:59 ` rguenth at gcc dot gnu.org
  2020-04-06 13:40 ` jamborm at gcc dot gnu.org
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-06 12:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #19)
> gcc.dg/torture/pr52244.c ICEs on the generated
> 
>   VIEW_CONVERT_EXPR<union u_t>(u) = bar ();
> 
> since V_C_E on the LHS are generally unwanted (but Ada has them for
> aggregates
> just not in outermost position).  What's always possible on the LHS is
> to use a BIT_FIELD_REF.

So the issue here is that 'u' is a register, not that V_C_E on the LHS
are invalid.  And we don't have a "general" DECL_GIMPLE_REG_P we could
unset since that's just used for complex and vector types.  Which means
we'd have to artificially set TREE_ADDRESSABLE on the replacement.  It
isn't grp_partial_lhs so SRA doesn't do that.

In the case of a call we can't move the V_C_E to the RHS so we'd
really need to keep the call and insert a compensation assignment.

  orig = bar ();
  u = VIEW_CONVERT <regtype> (orig);

but that doesn't work for a partial access since we're clobbering the
whole replacement here.  A BIT_FIELD_REF on the LHS for a _register_
is also not possible so the write to a part via an incompatible type
would represent itself as

  orig_full = VIEW_CONVERT <orig_type> (repl_full);
  <original stmt with partial write to orig_full>
  repl_full = VIEW_CONVERT <repl_type> (orig_full);

which of course makes this highly suboptimal (but at least correct which
is what we should focus on right now).
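
As a source-level analogy only (this is not SRA code, and all names below are hypothetical), the three-statement sequence above can be modelled in C with memcpy standing in for VIEW_CONVERT_EXPR.  The sketch contrasts it with a variant that performs only the partial store and the reload, which loses the untouched part whenever the original aggregate is stale.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: orig_t models the type the statement writes
   through, repl_t models the type of the scalar replacement.  */
typedef struct { uint64_t lane[2]; } orig_t;
typedef struct { uint32_t word[4]; } repl_t;

_Static_assert (sizeof (orig_t) == sizeof (repl_t), "same binary image size");

/* Only half of what is needed: store into the (possibly stale) aggregate,
   then reload the whole replacement from it.  */
static void
write_lane_store_only (orig_t *backing, repl_t *repl, int idx, uint64_t v)
{
  backing->lane[idx] = v;                   /* partial write */
  memcpy (repl, backing, sizeof *repl);     /* repl_full = VCE (orig_full) */
}

/* Full read-modify-write, following the three statements above.  */
static void
write_lane_rmw (orig_t *backing, repl_t *repl, int idx, uint64_t v)
{
  memcpy (backing, repl, sizeof *backing);  /* orig_full = VCE (repl_full) */
  backing->lane[idx] = v;                   /* partial write */
  memcpy (repl, backing, sizeof *repl);     /* repl_full = VCE (orig_full) */
}

/* Read one 64-bit lane back out of the replacement.  */
static uint64_t
read_lane (const repl_t *repl, int idx)
{
  orig_t tmp;
  memcpy (&tmp, repl, sizeof tmp);
  return tmp.lane[idx];
}

/* Write lane 0, let the aggregate go stale, write lane 1, read IDX back.  */
static uint64_t
run_two_writes (int use_rmw, int idx_to_read)
{
  repl_t repl;
  orig_t backing;

  memset (&repl, 0, sizeof repl);
  memset (&backing, 0, sizeof backing);
  write_lane_rmw (&backing, &repl, 0, 1729);
  memset (&backing, 0, sizeof backing);     /* models the stale aggregate */
  if (use_rmw)
    write_lane_rmw (&backing, &repl, 1, 2);
  else
    write_lane_store_only (&backing, &repl, 1, 2);
  return read_lane (&repl, idx_to_read);
}
```

With the read-modify-write variant lane 0 still reads 1729 after the second write; with the store-only variant it reads 0, because the reload replaces the whole replacement with the stale image.  The two extra copies are the cost referred to as suboptimal above.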

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (20 preceding siblings ...)
  2020-04-06 12:59 ` [Bug tree-optimization/94482] " rguenth at gcc dot gnu.org
@ 2020-04-06 13:40 ` jamborm at gcc dot gnu.org
  2020-04-06 16:36 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: jamborm at gcc dot gnu.org @ 2020-04-06 13:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #21 from Martin Jambor <jamborm at gcc dot gnu.org> ---
As Richi already found out, the path in sra_modify_expr handling type
incompatible replacement does not work when the replaced expr comes
from within a BIT_FIELD_REF - it does only half of what is necessary.

A conservative (not yet much tested) fix would be to emit a full RMW:

*** /tmp/UTN9NX_tree-sra.c      Mon Apr  6 15:28:23 2020
--- gcc/tree-sra.c      Mon Apr  6 15:22:40 2020
*************** sra_modify_expr (tree *expr, gimple_stmt
*** 3742,3768 ****

          ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);

!         if (write)
            {
              gassign *stmt;

              if (access->grp_partial_lhs)
!               ref = force_gimple_operand_gsi (gsi, ref, true, NULL_TREE,
!                                                false, GSI_NEW_STMT);
!             stmt = gimple_build_assign (repl, ref);
              gimple_set_location (stmt, loc);
!             gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
            }
!         else
            {
              gassign *stmt;

              if (access->grp_partial_lhs)
!               repl = force_gimple_operand_gsi (gsi, repl, true, NULL_TREE,
!                                                true, GSI_SAME_STMT);
!             stmt = gimple_build_assign (ref, repl);
              gimple_set_location (stmt, loc);
!             gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
            }
        }
        else
--- 3742,3771 ----

          ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);

!         if (!write || bfr)
            {
              gassign *stmt;
+             tree src = repl;

              if (access->grp_partial_lhs)
!               src = force_gimple_operand_gsi (gsi, repl, true, NULL_TREE,
!                                                true, GSI_SAME_STMT);
!             stmt = gimple_build_assign (ref, src);
              gimple_set_location (stmt, loc);
!             gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
            }
!         if (bfr)
!           ref = unshare_expr (ref);
!         if (write || bfr)
            {
              gassign *stmt;

              if (access->grp_partial_lhs)
!               ref = force_gimple_operand_gsi (gsi, ref, true, NULL_TREE,
!                                                false, GSI_NEW_STMT);
!             stmt = gimple_build_assign (repl, ref);
              gimple_set_location (stmt, loc);
!             gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
            }
        }
        else

But I wonder whether we care about type incompatibility within a B_F_R
at all - isn't B_F_R also an implicit V_C_E, always looking at the
binary image?  So perhaps something as simple as the following might
work?

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index b2056b58750..d22b03814d2 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -3736,7 +3736,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
          be accessed as a different type too, potentially creating a need for
          type conversion (see PR42196) and when scalarized unions are involved
          in assembler statements (see PR42398).  */
-      if (!useless_type_conversion_p (type, access->type))
+      if (!bfr && !useless_type_conversion_p (type, access->type))
        {
          tree ref;

I'll test both options ...and it seems we need the RMW one to handle
REALPART_EXPR and IMAGPART_EXPR.
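
The remark that a B_F_R always looks at the binary image can be illustrated outside the compiler with a small C sketch.  The helper name is hypothetical and it only handles byte-aligned offsets; it is meant to show why the declared type of the underlying object does not matter for such an access, not how GIMPLE implements it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 64 bits starting at BIT_OFFSET (assumed to be
   a multiple of 8) from the raw bytes of OBJECT, whatever its type is.  */
static uint64_t
bit_field_ref_u64 (const void *object, size_t bit_offset)
{
  uint64_t field;
  memcpy (&field, (const unsigned char *) object + bit_offset / 8,
          sizeof field);
  return field;
}

/* The same 16 bytes viewed as two 64-bit lanes or as four 32-bit words
   give the same result for the same bit range.  */
static int
same_field_through_both_types (void)
{
  uint64_t as_two_lanes[2] = { 1729, 2 };
  uint32_t as_four_words[4];

  memcpy (as_four_words, as_two_lanes, sizeof as_four_words);
  return bit_field_ref_u64 (as_two_lanes, 64)
           == bit_field_ref_u64 (as_four_words, 64)
         && bit_field_ref_u64 (as_four_words, 0) == 1729;
}
```

REALPART_EXPR and IMAGPART_EXPR, by contrast, need an operand of complex type, which is consistent with the observation above that the simple variant alone does not cover them.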

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (21 preceding siblings ...)
  2020-04-06 13:40 ` jamborm at gcc dot gnu.org
@ 2020-04-06 16:36 ` rguenth at gcc dot gnu.org
  2020-04-09 12:43 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-06 16:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #22 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note that when REALPART_EXPR/IMAGPART_EXPR or BIT_FIELD_REF was there using
a VIEW_CONVERT_EXPR on their op0 should be OK.  Since we then have
grp_partial_def SRA will ensure the replacement is not written into SSA
immediately.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9/10 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (22 preceding siblings ...)
  2020-04-06 16:36 ` rguenth at gcc dot gnu.org
@ 2020-04-09 12:43 ` cvs-commit at gcc dot gnu.org
  2020-04-09 12:46 ` [Bug tree-optimization/94482] [8/9 " jamborm at gcc dot gnu.org
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-04-09 12:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #23 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Martin Jambor <jamborm@gcc.gnu.org>:

https://gcc.gnu.org/g:2111d5406a4ec56d6335bde779a995914d0a36d1

commit r10-7657-g2111d5406a4ec56d6335bde779a995914d0a36d1
Author: Martin Jambor <mjambor@suse.cz>
Date:   Thu Apr 9 14:37:21 2020 +0200

    sra: Fix sra_modify_expr handling of partial writes (PR 94482)

    When sra_modify_expr is invoked on an expression that modifies only
    part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
    of an assignment and the SRA replacement's type is not compatible with
    what is being replaced (0th operand of the B_F_R in the above
    example), it does not work properly, basically throwing away the part
    of the expr that should have stayed intact.

    This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
    binary image of the replacement (and so in a way serve as a
    VIEW_CONVERT_EXPR) we just do not bother with converting.  For
    REALPART_EXPRs and IMAGPART_EXPRs, if the replacement is not a
    register, we insert a VIEW_CONVERT_EXPR under
    the complex partial access expression, which is always OK, for loads
    from registers we take the extra step of converting it to a temporary.

    This revealed a bug in fwprop which is fixed with the hunk from Richi.

    The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
    fragile because SRA prefers complex and vector types over anything
    else (and in between the two it decides based on TYPE_UID which in my
    testing today always preferred complex types) and so I only run it at
    -O1 (which is the only level where the test fails for me).

    Bootstrapped and tested on x86_64-linux, i686-linux and aarch64-linux.

    2020-04-09  Martin Jambor  <mjambor@suse.cz>
                Richard Biener  <rguenther@suse.de>

            PR tree-optimization/94482
            * tree-sra.c (create_access_replacement): Dump new replacement with
            TDF_UID.
            (sra_modify_expr): Fix handling of cases when the original EXPR
            writes to only part of the replacement.
            * tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
            the first operand of combinations into REAL/IMAGPART_EXPR and
            BIT_FIELD_REF.

            testsuite/
            * gcc.dg/torture/pr94482.c: New test.
            * gcc.dg/tree-ssa/pr94482-2.c: Likewise.
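
The REALPART_EXPR/IMAGPART_EXPR case mentioned in the commit message corresponds at source level to a partial write into a complex value.  The sketch below is an assumption about the general shape of such code, not the contents of pr94482-2.c; the union and function names are hypothetical, and __real__/__imag__ are GNU extensions.

```c
#include <assert.h>

/* Hypothetical union: a complex view and an integer view of the same
   8 bytes, so whole-object copies can use a type that differs from the
   one used for the partial write.  */
union C
{
  _Complex float c;
  unsigned long long bits;
};

/* Partial write: only the real part changes.  */
static union C
set_real_part (union C v, float re)
{
  __real__ v.c = re;
  return v;
}

/* Build 3 + 4i one part at a time.  */
static union C
make_3_plus_4i (void)
{
  union C v;
  v.bits = 0;
  __real__ v.c = 3.0f;
  __imag__ v.c = 4.0f;
  return v;
}

/* Copy through the integer view, overwrite the real part with 1, and
   return the requested part (0 = real, otherwise imaginary).  */
static float
part_after_real_write (int want_imag)
{
  union C v = make_3_plus_4i ();
  union C w;

  w.bits = v.bits;
  w = set_real_part (w, 1.0f);
  return want_imag ? __imag__ w.c : __real__ w.c;
}
```

A correct compiler must leave the imaginary part at 4 after the real part has been overwritten with 1; losing it would be the complex-number analogue of the lost vector lane in the original report.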

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (23 preceding siblings ...)
  2020-04-09 12:43 ` cvs-commit at gcc dot gnu.org
@ 2020-04-09 12:46 ` jamborm at gcc dot gnu.org
  2020-04-10  3:39 ` evan@coeus-group.com
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: jamborm at gcc dot gnu.org @ 2020-04-09 12:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jamborm at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #24 from Martin Jambor <jamborm at gcc dot gnu.org> ---
Fixed on trunk, will backport in a week or so.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (24 preceding siblings ...)
  2020-04-09 12:46 ` [Bug tree-optimization/94482] [8/9 " jamborm at gcc dot gnu.org
@ 2020-04-10  3:39 ` evan@coeus-group.com
  2020-04-11  5:51 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: evan@coeus-group.com @ 2020-04-10  3:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #25 from Evan Nemerson <evan@coeus-group.com> ---
Created attachment 48253
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48253&action=edit
Similar test which fails on armv7

I'm also getting an error on armv7-a for the same original code
(<https://github.com/nemequ/simde/blob/2c180d0cb01c79b187b9372f1ac3afe779bff832/simde/x86/sse4.1.h#L1078>)
when compiling with -O1 or above and -fstack-protector-strong.  I'm not sure if
it's the same issue or not; Jakub's test case from comment #12 doesn't abort
with the same target and flags.

I'm attaching a test case which does trigger the issue on armv7.  If it
would be better to open a new bug just let me know, and if it has already been
fixed sorry for the noise :(

Here is the output from GCC with -v:

Using built-in specs.
COLLECT_GCC=arm-linux-gnueabihf-g++-10
COLLECT_LTO_WRAPPER=/usr/lib/gcc-cross/arm-linux-gnueabihf/10/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 10-20200324-1'
--with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs
--enable-languages=c,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-10 --enable-shared
--enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm
--disable-libquadmath --disable-libquadmath-support --enable-plugin
--enable-default-pie --with-system-zlib --without-target-system-zlib
--enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --disable-werror
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=arm-linux-gnueabihf --program-prefix=arm-linux-gnueabihf-
--includedir=/usr/arm-linux-gnueabihf/include
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200324 (experimental) [master revision
596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536] (Debian
10-20200324-1) 
COLLECT_GCC_OPTIONS='-v' '-Wall' '-Werror' '-O1' '-fstack-protector-strong' 
'-o' 'insert-pp' '-shared-libgcc' '-mfloat-abi=hard' '-mfpu=vfpv3-d16'
'-mthumb' '-mtls-dialect=gnu' '-march=armv7-a+fp'
 /usr/lib/gcc-cross/arm-linux-gnueabihf/10/cc1plus -quiet -v -imultilib .
-imultiarch arm-linux-gnueabihf -D_GNU_SOURCE insert-pp.c -quiet -dumpbase
insert-pp.c -mfloat-abi=hard -mfpu=vfpv3-d16 -mthumb -mtls-dialect=gnu
-march=armv7-a+fp -auxbase insert-pp -O1 -Wall -Werror -version
-fstack-protector-strong -o /tmp/ccwvIVRJ.s
GNU C++14 (Debian 10-20200324-1) version 10.0.1 20200324 (experimental) [master
revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536]
(arm-linux-gnueabihf)
        compiled by GNU C version 10.0.1 20200324 (experimental) [master
revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536], GMP
version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version
isl-0.22.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/arm-linux-gnueabihf"
ignoring nonexistent directory
"/usr/lib/gcc-cross/arm-linux-gnueabihf/10/include-fixed"
#include "..." search starts here:
#include <...> search starts here:

/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/include/c++/10

/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/include/c++/10/arm-linux-gnueabihf/.

/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/include/c++/10/backward
 /usr/lib/gcc-cross/arm-linux-gnueabihf/10/include

/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/include
 /usr/include/arm-linux-gnueabihf
 /usr/include
End of search list.
GNU C++14 (Debian 10-20200324-1) version 10.0.1 20200324 (experimental) [master
revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536]
(arm-linux-gnueabihf)
        compiled by GNU C version 10.0.1 20200324 (experimental) [master
revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536], GMP
version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version
isl-0.22.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: f8090281bdf780936f7dd6668f41be1f
COLLECT_GCC_OPTIONS='-v' '-Wall' '-Werror' '-O1' '-fstack-protector-strong' 
'-o' 'insert-pp' '-shared-libgcc' '-mfloat-abi=hard' '-mfpu=vfpv3-d16'
'-mthumb' '-mtls-dialect=gnu' '-march=armv7-a+fp'

/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/bin/as
-v -march=armv7-a -mfloat-abi=hard -mfpu=vfpv3-d16 -meabi=5 -o /tmp/cck1klAL.o
/tmp/ccwvIVRJ.s
GNU assembler version 2.34 (arm-linux-gnueabihf) using BFD version (GNU
Binutils for Debian) 2.34
COMPILER_PATH=/usr/lib/gcc-cross/arm-linux-gnueabihf/10/:/usr/lib/gcc-cross/arm-linux-gnueabihf/10/:/usr/lib/gcc-cross/arm-linux-gnueabihf/:/usr/lib/gcc-cross/arm-linux-gnueabihf/10/:/usr/lib/gcc-cross/arm-linux-gnueabihf/:/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/bin/
LIBRARY_PATH=/usr/lib/gcc-cross/arm-linux-gnueabihf/10/:/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/lib/:/lib/arm-linux-gnueabihf/:/lib/:/usr/lib/arm-linux-gnueabihf/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-Wall' '-Werror' '-O1' '-fstack-protector-strong' 
'-o' 'insert-pp' '-shared-libgcc' '-mfloat-abi=hard' '-mfpu=vfpv3-d16'
'-mthumb' '-mtls-dialect=gnu' '-march=armv7-a+fp'
 /usr/lib/gcc-cross/arm-linux-gnueabihf/10/collect2 -plugin
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/liblto_plugin.so
-plugin-opt=/usr/lib/gcc-cross/arm-linux-gnueabihf/10/lto-wrapper
-plugin-opt=-fresolution=/tmp/cc47nTLL.res -plugin-opt=-pass-through=-lgcc_s
-plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc
-plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --sysroot=/
--build-id --eh-frame-hdr -dynamic-linker /lib/ld-linux-armhf.so.3 -X
--hash-style=gnu --as-needed -m armelf_linux_eabi -pie -o insert-pp
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/lib/Scrt1.o
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/lib/crti.o
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/crtbeginS.o
-L/usr/lib/gcc-cross/arm-linux-gnueabihf/10
-L/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/lib
-L/lib/arm-linux-gnueabihf -L/usr/lib/arm-linux-gnueabihf /tmp/cck1klAL.o
-lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/crtendS.o
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/lib/crtn.o
COLLECT_GCC_OPTIONS='-v' '-Wall' '-Werror' '-O1' '-fstack-protector-strong' 
'-o' 'insert-pp' '-shared-libgcc' '-mfloat-abi=hard' '-mfpu=vfpv3-d16'
'-mthumb' '-mtls-dialect=gnu' '-march=armv7-a+fp'

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (25 preceding siblings ...)
  2020-04-10  3:39 ` evan@coeus-group.com
@ 2020-04-11  5:51 ` cvs-commit at gcc dot gnu.org
  2020-04-21 12:22 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-04-11  5:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #26 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:bb87d5cc77db1f28083990f44e20b6c0728d925e

commit r10-7686-gbb87d5cc77db1f28083990f44e20b6c0728d925e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat Apr 11 07:50:50 2020 +0200

    testsuite: Fix up pr94482.c testcase [PR94482]

    The test FAILs on powerpc64-linux with -m32 due to psabi warnings.
    Furthermore, the test needs really -msse2 to reproduce on x86 -m32 at -O2.

    2020-04-11  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/94482
            * gcc.dg/torture/pr94482.c: Add -Wno-psabi -w.  Don't add -msse
            and sse_runtime effective target on x86, instead only add -msse2
            if target is sse2_runtime.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/94482] [8/9 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (26 preceding siblings ...)
  2020-04-11  5:51 ` cvs-commit at gcc dot gnu.org
@ 2020-04-21 12:22 ` cvs-commit at gcc dot gnu.org
  2020-04-21 15:42 ` cvs-commit at gcc dot gnu.org
  2020-04-21 16:37 ` jamborm at gcc dot gnu.org
  29 siblings, 0 replies; 31+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-04-21 12:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #27 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Martin Jambor
<jamborm@gcc.gnu.org>:

https://gcc.gnu.org/g:9300be2c74e35709ded209a378edab91a9073fbc

commit r9-8520-g9300be2c74e35709ded209a378edab91a9073fbc
Author: Martin Jambor <mjambor@suse.cz>
Date:   Tue Apr 21 14:20:37 2020 +0200

    sra-9: Fix sra_modify_expr handling of partial writes (PR 94482)

    This is a fairly straightforward backport of the mainline fix for PR 94482.

    When sra_modify_expr is invoked on an expression that modifies only
    part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
    of an assignment and the SRA replacement's type is not compatible with
    what is being replaced (0th operand of the B_F_R in the above
    example), it does not work properly, basically throwing away the part
    of the expr that should have stayed intact.

    This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
    binary image of the replacement (and so in a way serve as a
    VIEW_CONVERT_EXPR) we just do not bother with converting.  For
    REALPART_EXPRs and IMAGPART_EXPRs, if the replacement is not a
    register, we insert a VIEW_CONVERT_EXPR under the complex partial
    access expression, which is always OK; for loads from registers we
    take the extra step of converting it to a temporary.

    This revealed a bug in fwprop which is fixed with the hunk from Richi.
    This is the only difference from the mainline patch which has two
    hunks, but the code handling BIT_FIELD_REF is not present in gcc-9.

    Oh, and the testcase options were changed to what Jakub put there on
    the mainline to suppress all vector ABI warnings.

    Bootstrapped and tested on x86_64-linux.

    2020-04-21  Martin Jambor  <mjambor@suse.cz>

            Backport from master
            2020-04-09  Martin Jambor  <mjambor@suse.cz>
                        Richard Biener  <rguenther@suse.de>

            PR tree-optimization/94482
            * tree-sra.c (create_access_replacement): Dump new replacement with
            TDF_UID.
            (sra_modify_expr): Fix handling of cases when the original EXPR
            writes to only part of the replacement.
            * tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
            the first operand of combinations into REAL/IMAGPART_EXPR and
            BIT_FIELD_REF.

            testsuite/
            * gcc.dg/torture/pr94482.c: New test.
            * gcc.dg/tree-ssa/pr94482-2.c: Likewise.
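
The commit message above describes a write that modifies only part of an aggregate which SRA may have replaced with a scalar or vector of a different type. The following is a minimal sketch of that source-level pattern, not the attached test case and not the testsuite file: the names vec128, wrapper, insert_lane and partial_write_preserved are hypothetical, and the unused member only mirrors the reporter's remark about an unused field. It checks that a one-lane store leaves the other lane intact, which is the property the fix restores.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit container viewed as two 64-bit lanes. */
typedef union {
    int64_t i64[2];
    double  f64[2];
} vec128;

/* Hypothetical wrapper; `unused` is never read, echoing the report. */
typedef struct {
    vec128 v;
    int    unused;
} wrapper;

/* Partial write: only one lane of the copy is modified. */
static wrapper insert_lane(wrapper a, int64_t value, int lane)
{
    wrapper r = a;
    r.v.i64[lane] = value;   /* the other lane must stay intact */
    return r;
}

/* Returns 1 when the untouched lane survived the partial write. */
static int partial_write_preserved(void)
{
    wrapper a;
    a.v.i64[0] = 11;
    a.v.i64[1] = 22;
    a.unused = 0;

    wrapper r = insert_lane(a, 33, 1);
    return r.v.i64[0] == 11 && r.v.i64[1] == 33;
}
```

On a correct compiler partial_write_preserved() returns 1 at every optimization level. Whether this sketch exercises the affected SRA path on an unpatched compiler is an assumption and has not been verified here.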


* [Bug tree-optimization/94482] [8/9 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (27 preceding siblings ...)
  2020-04-21 12:22 ` cvs-commit at gcc dot gnu.org
@ 2020-04-21 15:42 ` cvs-commit at gcc dot gnu.org
  2020-04-21 16:37 ` jamborm at gcc dot gnu.org
  29 siblings, 0 replies; 31+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-04-21 15:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

--- Comment #28 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-8 branch has been updated by Martin Jambor
<jamborm@gcc.gnu.org>:

https://gcc.gnu.org/g:b463ced59535fddeff90d697f869d58e444568fa

commit r8-10194-gb463ced59535fddeff90d697f869d58e444568fa
Author: Martin Jambor <mjambor@suse.cz>
Date:   Tue Apr 21 17:41:01 2020 +0200

    sra-8: Fix sra_modify_expr handling of partial writes (PR 94482)

    This is a fairly straightforward backport of the mainline fix for PR 94482.

    When sra_modify_expr is invoked on an expression that modifies only
    part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
    of an assignment and the SRA replacement's type is not compatible with
    what is being replaced (0th operand of the B_F_R in the above
    example), it does not work properly, basically throwing away the part
    of the expr that should have stayed intact.

    This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
    binary image of the replacement (and so in a way serve as a
    VIEW_CONVERT_EXPR) we just do not bother with converting.  For
    REALPART_EXPRs and IMAGPART_EXPRs, if the replacement is not a
    register, we insert a VIEW_CONVERT_EXPR under the complex partial
    access expression, which is always OK; for loads from registers we
    take the extra step of converting it to a temporary.

    This revealed a bug in fwprop which is fixed with the hunk from Richi.
    This is the only difference from the mainline patch which has two
    hunks, but the code handling BIT_FIELD_REF is not present in gcc-8.

    Oh, and the testcase options were changed to what Jakub put there on
    the mainline to suppress all vector ABI warnings.

    Bootstrapped and tested on x86_64-linux.

    2020-04-21  Martin Jambor  <mjambor@suse.cz>

            Backport from master
            2020-04-09  Martin Jambor  <mjambor@suse.cz>
                        Richard Biener  <rguenther@suse.de>

            PR tree-optimization/94482
            * tree-sra.c (create_access_replacement): Dump new replacement with
            TDF_UID.
            (sra_modify_expr): Fix handling of cases when the original EXPR
            writes to only part of the replacement.
            * tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
            the first operand of combinations into REAL/IMAGPART_EXPR and
            BIT_FIELD_REF.

            testsuite/
            * gcc.dg/torture/pr94482.c: New test.
            * gcc.dg/tree-ssa/pr94482-2.c: Likewise.


* [Bug tree-optimization/94482] [8/9 Regression] Inserting into vector with optimization enabled on x86 generates incorrect result
  2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
                   ` (28 preceding siblings ...)
  2020-04-21 15:42 ` cvs-commit at gcc dot gnu.org
@ 2020-04-21 16:37 ` jamborm at gcc dot gnu.org
  29 siblings, 0 replies; 31+ messages in thread
From: jamborm at gcc dot gnu.org @ 2020-04-21 16:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94482

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #29 from Martin Jambor <jamborm at gcc dot gnu.org> ---
So this particular bug is fixed on trunk and both open release branches.

Evan, if the issue you described in comment #25 persists even with
a patched compiler, I suggest you open a new bug.


end of thread, other threads:[~2020-04-21 16:37 UTC | newest]

Thread overview: 31+ messages
2020-04-03 22:40 [Bug target/94482] New: Inserting into vector with optimization enabled on x86 generates incorrect result evan@coeus-group.com
2020-04-04  9:36 ` [Bug target/94482] " marxin at gcc dot gnu.org
2020-04-04 17:33 ` evan@coeus-group.com
2020-04-04 21:44 ` ubizjak at gmail dot com
2020-04-04 21:50 ` ubizjak at gmail dot com
2020-04-05 14:44 ` marxin at gcc dot gnu.org
2020-04-05 14:46 ` marxin at gcc dot gnu.org
2020-04-05 21:08 ` evan@coeus-group.com
2020-04-05 22:19 ` evan@coeus-group.com
2020-04-06  6:35 ` marxin at gcc dot gnu.org
2020-04-06  6:47 ` jakub at gcc dot gnu.org
2020-04-06  6:55 ` marxin at gcc dot gnu.org
2020-04-06  7:14 ` jakub at gcc dot gnu.org
2020-04-06  7:20 ` rguenth at gcc dot gnu.org
2020-04-06  7:28 ` [Bug target/94482] [8/9/10 Regression] " jakub at gcc dot gnu.org
2020-04-06  7:29 ` marxin at gcc dot gnu.org
2020-04-06  7:30 ` jakub at gcc dot gnu.org
2020-04-06  8:32 ` rguenth at gcc dot gnu.org
2020-04-06  8:44 ` rguenth at gcc dot gnu.org
2020-04-06  8:54 ` rguenth at gcc dot gnu.org
2020-04-06  9:32 ` rguenth at gcc dot gnu.org
2020-04-06 12:59 ` [Bug tree-optimization/94482] " rguenth at gcc dot gnu.org
2020-04-06 13:40 ` jamborm at gcc dot gnu.org
2020-04-06 16:36 ` rguenth at gcc dot gnu.org
2020-04-09 12:43 ` cvs-commit at gcc dot gnu.org
2020-04-09 12:46 ` [Bug tree-optimization/94482] [8/9 " jamborm at gcc dot gnu.org
2020-04-10  3:39 ` evan@coeus-group.com
2020-04-11  5:51 ` cvs-commit at gcc dot gnu.org
2020-04-21 12:22 ` cvs-commit at gcc dot gnu.org
2020-04-21 15:42 ` cvs-commit at gcc dot gnu.org
2020-04-21 16:37 ` jamborm at gcc dot gnu.org
