* [Bug tree-optimization/114151] New: [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b
@ 2024-02-28 13:57 tnfchris at gcc dot gnu.org
  2024-02-28 14:33 ` [Bug tree-optimization/114151] " rguenth at gcc dot gnu.org
                   ` (25 more replies)
  0 siblings, 26 replies; 27+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2024-02-28 13:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

            Bug ID: 114151
           Summary: [14 Regression] weird and inefficient codegen and
                    addressing modes since
                    g:a0b1798042d033fd2cc2c806afbb77875dd2909b
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*

Created attachment 57559
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57559&action=edit
testcase

The attached C++ testcase compiled with: -O3 -mcpu=neoverse-n2

used to compile to a nice and simple loop.  But after
g:a0b1798042d033fd2cc2c806afbb77875dd2909b

the codegen is weird and it uses horrible addressing modes.

The first odd part is that it has decided to split the loop: the "main" loop has
a guard after it that branches to the exit if the iteration count is 1.

If not, instead of just looping again it falls through to a copy of the main
loop, but one with destroyed addressing modes.

The copy of the loop seems to have unshared the address calculations. Before we
had:

  _128 = (void *) ivtmp.11_20;
  _54 = MEM <__SVFloat16_t> [(__fp16 *)_128];
  _10 = MEM <__SVFloat16_t> [(__fp16 *)_128 + POLY_INT_CST [16B, 16B]];
  _75 = MEM <__SVFloat16_t> [(__fp16 *)_128 + POLY_INT_CST [32B, 32B]];

etc., so every access is an offset from _128.  Now we have:

  col_i_61 = (int) ivtmp.11_100;
  _60 = (long unsigned int) col_i_61;
  _59 = _60 * 2;
  _58 = a_j_69 + _59;
  _54 = MEM <__SVFloat16_t> [(__fp16 *)_58];
  _53 = _59 + POLY_INT_CST [16, 16];
  _13 = a_j_69 + _53;
  _10 = MEM <__SVFloat16_t> [(__fp16 *)_13];
  _74 = _59 + POLY_INT_CST [32, 32];
  _19 = a_j_69 + _74;
  _75 = MEM <__SVFloat16_t> [(__fp16 *)_19];

and similarly for the stores as well.
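
To make the difference concrete, here is a rough C++ sketch of the two address
shapes.  It is not taken from the attached testcase: the function names and the
fixed 16-byte vector size are assumptions for illustration (POLY_INT_CST
[16, 16] is taken at the minimum SVE vector length).  Both forms yield the same
three addresses, but the second one rebuilds each address from a_j instead of
reusing one base.

```cpp
#include <array>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Assumption: one vector is 16 bytes, i.e. POLY_INT_CST [16, 16] evaluated
// at the minimum SVE vector length.
constexpr std::size_t kVecBytes = 16;

using Addrs = std::array<const unsigned char *, 3>;

// "Before" shape: one shared base (_128), each access a constant offset from it.
Addrs shared_base(const unsigned char *base) {
    return {base, base + kVecBytes, base + 2 * kVecBytes};
}

// "After" shape: every access rebuilds its address from a_j and the index.
Addrs unshared(const unsigned char *a_j, std::uint64_t ivtmp) {
    // The (int) cast in the dump must not truncate for the forms to agree.
    assert(ivtmp <= static_cast<std::uint64_t>(INT_MAX));
    int col_i = static_cast<int>(ivtmp);                    // col_i_61
    unsigned long idx = static_cast<unsigned long>(col_i);  // _60
    unsigned long off0 = idx * 2;                           // _59
    unsigned long off1 = off0 + kVecBytes;                  // _53
    unsigned long off2 = off0 + 2 * kVecBytes;              // _74
    return {a_j + off0, a_j + off1, a_j + off2};            // _58, _13, _19
}
```

In the first form only the base changes per iteration and the offsets can fold
into the load's addressing mode; in the second each access needs its own add.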

It also weirdly creates some very complicated address computations. Before
we had:

  _144 = p_mat_16(D) + 6;
  _64 = MEM <__SVFloat16_t> [(__fp16 *)_144 + ivtmp.10_100 * 2];
  _143 = p_mat_16(D) + 4;
  _84 = MEM <__SVFloat16_t> [(__fp16 *)_143 + ivtmp.10_100 * 2];

and after:

  ivtmp.23_130 = (unsigned long) p_mat_16(D);
  _123 = 2 - ivtmp.23_130;
  _124 = &MEM <__SVFloat16_t> [(__fp16 *)0B + _123 + ivtmp.12_109 * 2];
  _64 = MEM <__SVFloat16_t> [(__fp16 *)_124];

  _122 = -ivtmp.23_130;
  _120 = &MEM <__SVFloat16_t> [(__fp16 *)0B + _122 + ivtmp.12_109 * 2];
  _84 = MEM <__SVFloat16_t> [(__fp16 *)_120];
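
As a sanity check on the arithmetic, the sketch below models both forms with
plain unsigned integers.  How ivtmp.12 relates to ivtmp.10 is not shown in the
dumps above, so rebased_iv() is purely a hypothetical relation, chosen so that
the two forms agree; all function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// "Before" form: invariant base (p_mat + 6), address = base + iv * 2.
std::uintptr_t addr_before(std::uintptr_t p_mat, std::uintptr_t iv) {
    std::uintptr_t base = p_mat + 6;   // _144
    return base + iv * 2;              // [_144 + ivtmp.10 * 2]
}

// "After" form: null base plus (2 - p_mat) plus the new IV scaled by 2.
std::uintptr_t addr_after(std::uintptr_t p_mat, std::uintptr_t iv2) {
    std::uintptr_t neg = 2 - p_mat;    // _123 (unsigned wrap-around intended)
    return 0 + neg + iv2 * 2;          // [0B + _123 + ivtmp.12 * 2]
}

// Hypothetical: a new IV that has absorbed the pointer value and a constant.
// This is an assumption, not something the dumps state.
std::uintptr_t rebased_iv(std::uintptr_t p_mat, std::uintptr_t iv) {
    return iv + p_mat + 2;
}
```

Under that assumption addr_after(p, rebased_iv(p, iv)) equals
addr_before(p, iv), since 2 - p + 2 * (iv + p + 2) = p + 6 + 2 * iv.  The
"after" form still has to materialise the negated pointer and the full address
before each load.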

This results in quite a code size increase and a 7-10% performance loss.


