* [Bug c/104468] New: with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: erik.carstensen at intel dot com @ 2022-02-09 14:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

            Bug ID: 104468
           Summary: with -O -g, quadratic compile time of function with
                    __attribute__((optimize("O0"))) that passes large
                    structs by value
           Product: gcc
           Version: 11.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: erik.carstensen at intel dot com
  Target Milestone: ---

Created attachment 52392
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52392&action=edit
Reproducer

If a function passes many large structs by value to another function, compile
time becomes quadratic (O(n^2)) when the file is compiled with -O -g but the
function is annotated with __attribute__((optimize("O0"))).

Compile time seems (approximately) quadratic independently in the number of
calls, in the number of struct function arguments, and in the size of the
struct. In other words, quadratic in the total size of passed values.

It compiles instantaneously (30s -> 0.1s) if I remove the __attribute__, or -g,
or -O, or if the struct size is changed to <=16 bytes or >=81 bytes.
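
A minimal sketch of the kind of input described above. This is not the attached
reproducer (attachment 52392 is not shown in this thread); the 64-byte struct,
the call counter and the R2/R4/R16 macros are assumptions chosen for
illustration, and the repetition count is kept small so the sketch itself
compiles quickly. The names s_t, S and fun follow fragments quoted elsewhere in
the thread.

```c
#include <assert.h>

/* Hypothetical struct; 64 bytes falls inside the reported slow window
   (more than 16 and fewer than 81 bytes). */
typedef struct { char bytes[64]; } s_t;
static_assert(sizeof(s_t) > 16 && sizeof(s_t) < 81,
              "struct size must fall inside the slow window");

/* Assumed helper so the sketch has observable behaviour. */
static int call_count;

/* Callee that receives four large structs by value. */
void fun(s_t a, s_t b, s_t c, s_t d)
{
    (void)a; (void)b; (void)c; (void)d;
    call_count++;
}

/* Repetition macros; a larger count makes the slowdown visible. */
#define R2(x) x x
#define R4(x) R2(R2(x))
#define R16(x) R4(R4(x))

/* A struct literal passed by value at every call site. */
#define S ((s_t){{0}})

/* The attribute that triggers the slow path when the file is built with -O -g. */
__attribute__((optimize("O0")))
void f(void)
{
    R16(fun(S, S, S, S);)
}
```

Under the behaviour described above, raising the repetition count, the number
of struct arguments, or the struct size (within the window) would each be
expected to increase compile time roughly quadratically on affected versions.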

It's still slow if I pass
-O -fno-auto-inc-dec -fno-branch-count-reg -fno-combine-stack-adjustments
-fno-compare-elim -fno-cprop-registers -fno-dce -fno-defer-pop  -fno-dse
-fno-forward-propagate -fno-guess-branch-probability -fno-if-conversion
-fno-if-conversion2 -fno-inline-functions-called-once -fno-ipa-modref
-fno-ipa-profile -fno-ipa-pure-const -fno-ipa-reference
-fno-ipa-reference-addressable -fno-merge-constants -fno-move-loop-invariants
-fno-omit-frame-pointer -fno-reorder-blocks -fno-shrink-wrap
-fno-shrink-wrap-separate -fno-split-wide-types -fno-ssa-backprop
-fno-ssa-phiopt -fno-tree-bit-ccp -fno-tree-ccp -fno-tree-ch
-fno-tree-coalesce-vars -fno-tree-copy-prop -fno-tree-dce
-fno-tree-dominator-opts -fno-tree-dse -fno-tree-forwprop -fno-tree-fre
-fno-tree-phiprop -fno-tree-pta -fno-tree-scev-cprop -fno-tree-sink
-fno-tree-slsr -fno-tree-sra -fno-tree-ter -fno-unit-at-a-time
... which is documented to be the same as -O0.

This happens with native gcc from Fedora 34:
$ gcc --version
gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
$ uname -a
Linux ecarsten-mobl1.ger.corp.intel.com 5.15.12-100.fc34.x86_64 #1 SMP Wed Dec
29 15:21:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Also reproduced with gcc 6.4.

Command line:
$ gcc -g -O1 -c foo.c
or alternatively (to bypass ccache on my system):
$ /usr/libexec/gcc/x86_64-redhat-linux/11/cc1 -quiet foo.c -quiet -dumpbase
foo.c -dumpbase-ext .c -mtune=generic -march=x86-64 -g -O1 -o /tmp/ccFglVbD.s

This causes performance issues in C code generated by the DML compiler
(https://github.com/intel/device-modeling-language)

* [Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: rguenth at gcc dot gnu.org @ 2022-02-09 15:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |12.0
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-09
           Keywords|                            |compile-time-hog,
                   |                            |needs-bisection
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Seems to have been fixed on trunk?

rguenther@ryzen:/tmp/obj/gcc> /usr/bin/time ~/install/gcc-12/usr/local/bin/gcc
-S t.c -O -g -fno-checking
0.02user 0.00system 0:00.06elapsed 55%CPU (0avgtext+0avgdata 31952maxresident)k
65632inputs+0outputs (76major+2504minor)pagefaults 0swaps
rguenther@ryzen:/tmp/obj/gcc> /usr/bin/time gcc-11 -S t.c -O -g
2.12user 0.00system 0:02.13elapsed 99%CPU (0avgtext+0avgdata 24644maxresident)k
0inputs+0outputs (0major+2993minor)pagefaults 0swaps

Confirmed on the branch.

* [Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: erik.carstensen at intel dot com @ 2022-02-09 15:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #2 from Erik Carstensen <erik.carstensen at intel dot com> ---
Perhaps the problem is unrelated to function calls; the time seems to be
quadratic in the number of struct literals. If I change the argument types to
pointers, the issue remains when I pass the args as
({static s_t x; x=(s_t){{0}};&x;}),
but it vanishes when I pass them as
({static s_t x=(s_t){{0}};&x;}).
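
For reference, the two argument forms can be put side by side in a small
sketch. It relies on GNU C extensions (statement expressions, and a compound
literal as the initializer of a static object), so it assumes GCC in its
default GNU mode. The struct layout, the macro names, fun_ptr and the call
counter are hypothetical and do not come from the attached reproducer.

```c
#include <assert.h>

/* Hypothetical 64-byte struct, as in the by-value case. */
typedef struct { char bytes[64]; } s_t;

/* Assumed helper so the sketch has observable behaviour. */
static int call_count;

/* Callee taking pointers instead of structs by value. */
void fun_ptr(const s_t *a, const s_t *b)
{
    assert(a != 0 && b != 0);
    call_count++;
}

/* Form reported as still slow: declare the static object, then assign
   a struct literal to it at run time. */
#define S_ASSIGNED ({ static s_t x; x = (s_t){{0}}; &x; })

/* Form reported as fast: initialize the static object directly, so no
   run-time struct copy is emitted. */
#define S_INITIALIZED ({ static s_t x = (s_t){{0}}; &x; })

void g(void)
{
    fun_ptr(S_ASSIGNED, S_ASSIGNED);
    fun_ptr(S_INITIALIZED, S_INITIALIZED);
}
```

The only difference between the two macros is whether a struct-literal
assignment is executed inside the function body, which matches the observation
that the cost tracks the number of struct literals rather than the calls.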

* [Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: erik.carstensen at intel dot com @ 2022-02-09 20:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #3 from Erik Carstensen <erik.carstensen at intel dot com> ---
Do we know that some suspected underlying issue is fixed, or could it be that
the window of slowness (struct size ∈ [17,80]) has just moved?

* [Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: rguenth at gcc dot gnu.org @ 2022-02-10  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Erik Carstensen from comment #3)
> Do we know that some suspected underlying issue is fixed, or could it be
> that the window of slowness (struct size ∈ [17,80]) has just moved?

There is g:716a5836928ee6d8fb884d9a2fbc1b1386ec8994 which removed some
quadratic behavior with regard to large structures and name lookup.  But that
was C++ ...

I don't know of anything specific otherwise.  We did do some compile-time
improvements in var-tracking and GCC 11 shows

 var-tracking dataflow              :   2.67 ( 41%)   0.00 (  0%)   2.66 ( 41%)    96k (  2%)
 var-tracking emit                  :   2.62 ( 40%)   0.00 (  0%)   2.63 ( 40%)     0  (  0%)

so that might be it.  Maybe that's possible to backport as well.

* [Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: marxin at gcc dot gnu.org @ 2022-02-24 11:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #5 from Martin Liška <marxin at gcc dot gnu.org> ---
I see the following changes:

2.66s -> 0.8s : r12-2633-ge5e164effa30fd2b
0.8s -> 0.07s : r12-4397-g4cb52980e5d5fb64

* [Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: erik.carstensen at intel dot com @ 2022-02-24 18:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #6 from Erik Carstensen <erik.carstensen at intel dot com> ---
Thanks! Looks like the second change repairs __attribute__((optimize("O0")));
this leads to a smaller reproducer: the problem is reproduced if I remove that
attribute and compile with "-g -O0 -fvar-tracking" only.

The first commit somehow enables "QI vector mode", whatever that is. 0.8s still
seems like quite a lot; what happens in a recent gcc if you change the function
to, say, 

void f(void)
{
        // more than 4x slower if you add R2()
        R8(R256(fun(S,S,S,S);))
}

and compile with -g -O0 -fvar-tracking? If it takes ~6s, then I suppose we are
down to slow linear time; if ~50s, then we still have quadratic behaviour.
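
The two thresholds are consistent with scaling a 0.8s baseline by the factor
of 8 that R8() adds: 0.8 * 8 = 6.4 under linear growth and 0.8 * 8^2 = 51.2
under quadratic growth. That reading is an inference from the numbers in this
thread, not something stated explicitly. A small sketch of the extrapolation,
with a hypothetical helper name:

```c
#include <assert.h>

/* Predict compile time after the input grows by `scale`, assuming time is
   proportional to size^exponent. The inputs are illustrative; nothing here
   measures a real compiler. */
double predict_seconds(double base_seconds, double scale, int exponent)
{
    assert(exponent >= 0);
    double t = base_seconds;
    for (int i = 0; i < exponent; i++)
        t *= scale;  /* one factor of `scale` per power */
    return t;
}
```

With these assumptions, predict_seconds(0.8, 8, 1) gives the linear estimate
and predict_seconds(0.8, 8, 2) the quadratic one. The same rule explains the
code comment: doubling the input with R2() should cost about 4x if growth is
quadratic, so "more than 4x slower" points at quadratic or worse.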

* [Bug debug/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value
From: pinskia at gcc dot gnu.org @ 2022-02-24 18:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=SUSPENDED&bug_status=WAITING&bug_status=REOPENED&cf_known_to_fail_type=allwords&cf_known_to_work_type=allwords&keywords=compile-time-hog%2C%20&keywords_type=allwords&list_id=340184&longdesc=var-tracking&longdesc_type=allwordssubstr&query_format=advanced
