public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf
@ 2023-10-13 15:30 law at gcc dot gnu.org
  2023-10-16  6:36 ` [Bug tree-optimization/111798] " rguenth at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: law at gcc dot gnu.org @ 2023-10-13 15:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798

            Bug ID: 111798
           Summary: [14 Regression] Recent change causing testsuite
                    regression and poor code on mcore-elf
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
  Target Milestone: ---

This change:

commit 6decda1a35be5764101987c210b5693a0d914e58
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Oct 12 11:34:57 2023 +0200

    tree-optimization/111779 - Handle some BIT_FIELD_REFs in SRA

    The following handles byte-aligned, power-of-two and byte-multiple
    sized BIT_FIELD_REF reads in SRA.  In particular this should cover
    BIT_FIELD_REFs created by optimize_bit_field_compare.

    For gcc.dg/tree-ssa/ssa-dse-26.c we now SRA the BIT_FIELD_REF
    appearing there leading to more DSE, fully eliding the aggregates.

    This results in the same false positive -Wuninitialized as the
    older attempt to remove the folding from optimize_bit_field_compare,
    fixed by initializing part of the aggregate unconditionally.

            PR tree-optimization/111779
    gcc/
            * tree-sra.cc (sra_handled_bf_read_p): New function.
            (build_access_from_expr_1): Handle some BIT_FIELD_REFs.
            (sra_modify_expr): Likewise.
            (make_fancy_name_1): Skip over BIT_FIELD_REF.

    gcc/fortran/
            * trans-expr.cc (gfc_trans_assignment_1): Initialize
            lhs_caf_attr and rhs_caf_attr codimension flag to avoid
            false positive -Wuninitialized.

    gcc/testsuite/
            * gcc.dg/tree-ssa/ssa-dse-26.c: Adjust for more DSE.
            * gcc.dg/vect/vect-pr111779.c: New testcase.

Causes execute/20040709-2.c to fail on mcore-elf at -O2.  It also results in
what appears to be significantly poorer code generation.

Note I haven't managed to get mcore-elf-gdb to work, so debugging is, umm,
painful.  And I wouldn't put a lot of faith in the simulator correctness.

I have simplified the test to this:
extern void abort (void);
extern void exit (int);

unsigned int
myrnd (void)
{
  static unsigned int s = 1388815473;
  s *= 1103515245;
  s += 12345;
  return (s / 65536) % 2048;
}

struct __attribute__((packed)) K
{
  unsigned int k:6, l:1, j:10, i:15;
};

struct K sK;

unsigned int
fn1K (unsigned int x)
{
  struct K y = sK;
  y.k += x;
  return y.k;
}

void
testK (void)
{
  int i;
  unsigned int mask, v, a, r;
  struct K x;
  char *p = (char *) &sK;
  for (i = 0; i < sizeof (sK); ++i)
    *p++ = myrnd ();
  v = myrnd ();
  a = myrnd ();
  sK.k = v;
  x = sK;
  r = fn1K (a);
  if (x.j != sK.j || x.l != sK.l)
    abort ();
}

int
main (void)
{
  testK ();
  exit (0);
}


Which should at least make the poor code gen obvious.  I don't expect to have
time to debug this further anytime in the near future.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/111798] [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf
  2023-10-13 15:30 [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf law at gcc dot gnu.org
@ 2023-10-16  6:36 ` rguenth at gcc dot gnu.org
  2023-10-16  8:29 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-16  6:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization,
                   |                            |wrong-code
            Version|unknown                     |14.0
             Target|                            |mcore-elf
   Target Milestone|---                         |14.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/111798] [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf
  2023-10-13 15:30 [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf law at gcc dot gnu.org
  2023-10-16  6:36 ` [Bug tree-optimization/111798] " rguenth at gcc dot gnu.org
@ 2023-10-16  8:29 ` rguenth at gcc dot gnu.org
  2024-03-07 21:04 ` law at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-16  8:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-10-16
                 CC|richard.guenther at gmail dot com  |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
If I configured correctly the only change (early) SRA does is

 unsigned int fn1K (unsigned int x)
 {
+  <unnamed-unsigned:6> y$k;
...
@@ -57,13 +53,14 @@

   <bb 2> :
   y = sK;
-  _1 = y.k;
+  y$k_13 = sK.k;
+  _1 = y$k_13;
   _2 = (unsigned char) _1;
   _3 = (unsigned char) x_9(D);
   _4 = _2 + _3;
   _5 = (<unnamed-unsigned:6>) _4;
-  y.k = _5;
-  _6 = y.k;
+  y$k_14 = _5;
+  _6 = y$k_14;
   _11 = (unsigned int) _6;
   y ={v} {CLOBBER(eol)};
   return _11;

and later DSE eliminates 'y' completely.  Note I can't see any big assembly
difference -O2 vs -O2 -fno-tree-sra so I wonder if I really reproduced the
significantly worse code generation issue.  In fact generated code looks
better, the stack slot is elided:

> diff -u t.s.good t.s.bad 
--- t.s.good    2023-10-16 10:18:27.457651977 +0200
+++ t.s.bad     2023-10-16 10:07:49.997656268 +0200
@@ -19,13 +19,11 @@
        .export fn1K
        .type   fn1K, @function
 fn1K:
-       subi    sp,8
        lrw     r7, sK
        ld.b    r7,(r7)
        addu    r2,r7
        movi    r7,63
        and     r2,r7
-       addi    sp,8
        jmp     r15
        .size   fn1K, .-fn1K
        .align  1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/111798] [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf
  2023-10-13 15:30 [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf law at gcc dot gnu.org
  2023-10-16  6:36 ` [Bug tree-optimization/111798] " rguenth at gcc dot gnu.org
  2023-10-16  8:29 ` rguenth at gcc dot gnu.org
@ 2024-03-07 21:04 ` law at gcc dot gnu.org
  2024-03-09  6:03 ` law at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-07 21:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/111798] [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf
  2023-10-13 15:30 [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf law at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2024-03-07 21:04 ` law at gcc dot gnu.org
@ 2024-03-09  6:03 ` law at gcc dot gnu.org
  2024-03-11  9:49 ` rguenth at gcc dot gnu.org
  2024-05-07  7:42 ` [Bug tree-optimization/111798] [14/15 " rguenth at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-09  6:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #2 from Jeffrey A. Law <law at gcc dot gnu.org> ---
I wouldn't look at with/without tree-sra.  Instead I would look at the assembly
code before/after your change.  It's clearly worse.  testK goes from ~37
instructions to ~70 and the abort call is no longer removed.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/111798] [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf
  2023-10-13 15:30 [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf law at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2024-03-09  6:03 ` law at gcc dot gnu.org
@ 2024-03-11  9:49 ` rguenth at gcc dot gnu.org
  2024-05-07  7:42 ` [Bug tree-optimization/111798] [14/15 " rguenth at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-03-11  9:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so the abort () call is only elided during RTL optimization before the
change.  The change itself first manifests during ESRA as

 void testK ()
 {
-  <unnamed-unsigned:10> x$j;
   unsigned int D.1835;
   static unsigned int s = 1388815473;
   unsigned int D.1834;
@@ -167,14 +157,13 @@
   _5 = (<unnamed-unsigned:6>) _51;
   sK.k = _5;
   x = sK;
-  x$j_55 = sK.j;
   y$k_36 = sK.k;
   _37 = (unsigned char) y$k_36;
   _38 = (unsigned char) _46;
   _39 = _37 + _38;
   _40 = (<unnamed-unsigned:6>) _39;
   _41 = (unsigned int) _40;
-  _6 = x$j_55;
+  _6 = x.j;
   _7 = sK.j;
   if (_6 != _7)
     goto <bb 7>; [INV]

there are other uses of 'x' below, unchanged:

  else
    goto <bb 6>; [INV]

  <bb 6> :
  _8 = BIT_FIELD_REF <x, 8, 0>;
  _9 = BIT_FIELD_REF <sK, 8, 0>;
  _10 = _8 ^ _9;
  _11 = _10 & 64;
  if (_11 != 0)
    goto <bb 7>; [INV]
  else
    goto <bb 8>; [INV]

  <bb 7> :
  abort ();

and with the change we now analyze those and run into

! Disqualifying sK - Address taken in a non-call-argument context.
! Disqualifying x - No or inhibitingly overlapping accesses.


That's because fold somehow transformed one but not the other bitfield
compares.

In particular we have the following two accesses

_8 = BIT_FIELD_REF <x, 8, 0>;
_6 = x.j;

where the first is offset = 0, size = 8 and the second offset = 7, size = 10

and those partially overlap.  Previous to the change SRA was doing a
not profitable change in the end causing it to leave 'x' around.

The interesting difference is in FRE though where without the change we
are able to elide the _6 != _7 compare but with it we fail.  That's because
SRA makes the load of sK.j available - while VN can see through the
x = sK assignment it won't derive a value from the result unless there's
a leader for it in the IL.

Swapping the compare to

  if (sK.j != x.j || x.l != sK.l)
    abort ();

makes it optimized and the abort () call vanishes as well - in fact it
now vanishes already in DOM2, much before RTL optimization.

I don't think the SRA change is to be blamed for doing anything wrong, we're
running into natural limitations of CSE and premature folding inhibiting
SRA (which the SRA change itself was supposed to handle a bit better).

When there's actual wrong-code now on mcore this is a latent problem
uncovered by the change.

The missed-optimization is yet another "dead code elimination regression".

I don't see an easy way to solve the missed-optimization regression.  On the
VN side, when we just see

  x = sK;
  _3 = x.j;
  _4 = sK.j;

then visiting the x.j load, when running into x = SK, could eventually
also put sK.j into the hash table (with the same VARYING value we assign
to x.j).  This would require some extra bookkeeping as we actually compute
this expression already but do not keep it as alternate form to record
when we figure there's no actual leader for it.  Testcase:

struct X { int a; int b; } p, o;
void foo ()
{
  p = o;
  if (p.a != o.a)
    __builtin_abort ();
}

Let me take this for this particular VN issue.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/111798] [14/15 Regression] Recent change causing testsuite regression and poor code on mcore-elf
  2023-10-13 15:30 [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf law at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-03-11  9:49 ` rguenth at gcc dot gnu.org
@ 2024-05-07  7:42 ` rguenth at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-05-07  7:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|14.0                        |14.2

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 14.1 is being released, retargeting bugs to GCC 14.2.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-05-07  7:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-13 15:30 [Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf law at gcc dot gnu.org
2023-10-16  6:36 ` [Bug tree-optimization/111798] " rguenth at gcc dot gnu.org
2023-10-16  8:29 ` rguenth at gcc dot gnu.org
2024-03-07 21:04 ` law at gcc dot gnu.org
2024-03-09  6:03 ` law at gcc dot gnu.org
2024-03-11  9:49 ` rguenth at gcc dot gnu.org
2024-05-07  7:42 ` [Bug tree-optimization/111798] [14/15 " rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).