public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code
@ 2021-07-17  2:59 andres at anarazel dot de
  2021-07-17  3:36 ` [Bug tree-optimization/101481] [11/12 Regression] " pinskia at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: andres at anarazel dot de @ 2021-07-17  2:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

            Bug ID: 101481
           Summary: -ftree-loop-distribute-patterns can slow down and
                    increases size of code
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andres at anarazel dot de
  Target Milestone: ---

Created attachment 51168
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51168&action=edit
simplified example reproducing problem

Hi,

I found -ftree-loop-distribute-patterns to be far too aggressive in replacing
code, leading to increased code size and substantial slowdowns (12% in the
program I just hit this).

The code size increase & slowdown are partially caused by the function call
itself, and partially due to the spilling necessary to make that function call.
Worsened by the PLT call to memmove().

A very simplified example (also attached) is this:

typedef struct node
{
    unsigned char chunks[4];
    unsigned char count;
} node;

void
foo(node *a, unsigned char newchunk, unsigned char off)
{
    if (a->count > 3)
        __builtin_unreachable();

    for (int i = a->count - 1; i >= off; i--)
        a->chunks[i + 1] = a->chunks[i];
    a->chunks[off] = newchunk;
}

which with `-O2 -fPIC` boils down to:
foo(node*, unsigned char, unsigned char):
        pushq   %r12
        movl    %edx, %r8d
        movl    %esi, %r12d
        pushq   %rbp
        movq    %rdi, %rbp
        pushq   %rbx
        movzbl  4(%rdi), %ecx
        movzbl  %r8b, %ebx
        leal    -1(%rcx), %edx
        cmpl    %ebx, %edx
        jl      .L2
        movl    %ecx, %eax
        movslq  %edx, %rsi
        subl    %ebx, %ecx
        subl    $1, %ecx
        movq    %rsi, %rdx
        subq    %rcx, %rdx
        leaq    1(%rcx), %r8
        leaq    (%rdi,%rdx), %rsi
        movzbl  %al, %edi
        movq    %r8, %rdx
        movq    %rdi, %rax
        subq    %rcx, %rax
        leaq    0(%rbp,%rax), %rdi
        call    memmove@PLT
.L2:
        movb    %r12b, 0(%rbp,%rbx)
        popq    %rbx
        popq    %rbp
        popq    %r12
        ret

compare to `-O2 -fPIC -fno-tree-loop-distribute-patterns`

foo(node*, unsigned char, unsigned char):
        movzbl  4(%rdi), %eax
        movzbl  %dl, %edx
        subl    $1, %eax
        cmpl    %edx, %eax
        jl      .L2
        cltq
.L3:
        movzbl  (%rdi,%rax), %ecx
        movb    %cl, 1(%rdi,%rax)
        subq    $1, %rax
        cmpl    %eax, %edx
        jle     .L3
.L2:
        movb    %sil, (%rdi,%rdx)
        ret

Which I think makes the problem apparent.

Regards,

Andres Freund

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/101481] [11/12 Regression] -ftree-loop-distribute-patterns can slow down and increases size of code
  2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
@ 2021-07-17  3:36 ` pinskia at gcc dot gnu.org
  2021-07-17  3:37 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-17  3:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
   Target Milestone|---                         |11.2
   Last reconfirmed|                            |2021-07-17
            Summary|-ftree-loop-distribute-patt |[11/12 Regression]
                   |erns can slow down and      |-ftree-loop-distribute-patt
                   |increases size of code      |erns can slow down and
                   |                            |increases size of code
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.  The ranges should be figured out that the max size is 4.

In fact GCC knows the range really (note I think it is really [0,2] but that is
a different issue) and the loop bounds is max 4.

;;   basic block 6, loop depth 1, count 805306369 (estimated locally), maybe
hot
;;    prev block 8, next block 9, flags: (NEW, REACHABLE, VISITED)
;;    pred:       9 [always]  count:603979777 (estimated locally)
(FALLTHRU,DFS_BACK)
;;                8 [always]  count:201326592 (estimated locally) (FALLTHRU)
  # RANGE [-1, 2]
  # i_17 = PHI <i_16(9), i_11(8)>
  # .MEM_21 = PHI <.MEM_15(9), .MEM_9(D)(8)>
  # RANGE [1, 3] NONZERO 3
  _4 = i_17 + 1;
  # VUSE <.MEM_21>
  _5 = a_10(D)->chunksD.1943[i_17];
  # .MEM_15 = VDEF <.MEM_21>
  a_10(D)->chunksD.1943[_4] = _5;
  # RANGE [-1, 1]
  i_16 = i_17 + -1;
  if (i_16 >= _20)
    goto <bb 9>; [75.00%]
  else
    goto <bb 7>; [25.00%]
;;    succ:       9 [75.0% (guessed)]  count:603979777 (estimated locally)
(TRUE_VALUE,EXECUTABLE)
;;                7 [25.0% (guessed)]  count:201326592 (estimated locally)
(FALSE_VALUE,EXECUTABLE)

;;   basic block 9, loop depth 1, count 603979777 (estimated locally), maybe
hot
;;    prev block 6, next block 7, flags: (NEW)
;;    pred:       6 [75.0% (guessed)]  count:603979777 (estimated locally)
(TRUE_VALUE,EXECUTABLE)
  goto <bb 6>; [100.00%]
;;    succ:       6 [always]  count:603979777 (estimated locally)
(FALLTHRU,DFS_BACK)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/101481] [11/12 Regression] -ftree-loop-distribute-patterns can slow down and increases size of code
  2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
  2021-07-17  3:36 ` [Bug tree-optimization/101481] [11/12 Regression] " pinskia at gcc dot gnu.org
@ 2021-07-17  3:37 ` pinskia at gcc dot gnu.org
  2021-07-19  6:29 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-17  3:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note this is an older regression but I only marked it as 11/12 as
-ftree-loop-distribute-patterns is now enabled at -O2 rather than just -O3.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/101481] [11/12 Regression] -ftree-loop-distribute-patterns can slow down and increases size of code
  2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
  2021-07-17  3:36 ` [Bug tree-optimization/101481] [11/12 Regression] " pinskia at gcc dot gnu.org
  2021-07-17  3:37 ` pinskia at gcc dot gnu.org
@ 2021-07-19  6:29 ` rguenth at gcc dot gnu.org
  2021-07-28  7:07 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-07-19  6:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
            Version|unknown                     |11.1.1
           Priority|P3                          |P2

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
IIRC one of the issues is that we do not have any strategy to expand memmove
calls with bounded size inline.  On GIMPLE we're only replacing memmove
with constant power-of-two size (and properly aligned pointers depending on
targets).

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/101481] [11/12 Regression] -ftree-loop-distribute-patterns can slow down and increases size of code
  2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
                   ` (2 preceding siblings ...)
  2021-07-19  6:29 ` rguenth at gcc dot gnu.org
@ 2021-07-28  7:07 ` rguenth at gcc dot gnu.org
  2022-04-21  7:49 ` rguenth at gcc dot gnu.org
  2023-05-29 10:05 ` [Bug tree-optimization/101481] [11/12/13/14 " jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-07-28  7:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.2                        |11.3

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.2 is being released, retargeting bugs to GCC 11.3

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/101481] [11/12 Regression] -ftree-loop-distribute-patterns can slow down and increases size of code
  2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
                   ` (3 preceding siblings ...)
  2021-07-28  7:07 ` rguenth at gcc dot gnu.org
@ 2022-04-21  7:49 ` rguenth at gcc dot gnu.org
  2023-05-29 10:05 ` [Bug tree-optimization/101481] [11/12/13/14 " jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-04-21  7:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.3                        |11.4

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.3 is being released, retargeting bugs to GCC 11.4.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/101481] [11/12/13/14 Regression] -ftree-loop-distribute-patterns can slow down and increases size of code
  2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
                   ` (4 preceding siblings ...)
  2022-04-21  7:49 ` rguenth at gcc dot gnu.org
@ 2023-05-29 10:05 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-05-29 10:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.4                        |11.5

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.4 is being released, retargeting bugs to GCC 11.5.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-05-29 10:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
2021-07-17  3:36 ` [Bug tree-optimization/101481] [11/12 Regression] " pinskia at gcc dot gnu.org
2021-07-17  3:37 ` pinskia at gcc dot gnu.org
2021-07-19  6:29 ` rguenth at gcc dot gnu.org
2021-07-28  7:07 ` rguenth at gcc dot gnu.org
2022-04-21  7:49 ` rguenth at gcc dot gnu.org
2023-05-29 10:05 ` [Bug tree-optimization/101481] [11/12/13/14 " jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).