public inbox for gcc-bugs@sourceware.org
* [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox
@ 2021-03-26  7:57 mh+gcc at glandium dot org
  2021-03-26  8:26 ` [Bug c++/99785] " pinskia at gcc dot gnu.org
                   ` (21 more replies)
  0 siblings, 22 replies; 23+ messages in thread
From: mh+gcc at glandium dot org @ 2021-03-26  7:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

            Bug ID: 99785
           Summary: Awful lot of time spent building gl.cc in Firefox
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mh+gcc at glandium dot org
  Target Milestone: ---

Created attachment 50475
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50475&action=edit
gl.ii.gz

Compiling the attached preprocessed source takes 27 minutes with GCC 10 with
`g++ -o gl.o -c gl.ii -O2 -std=gnu++17 -g -pipe` and a grand total of 21GB of
memory spread between cc1plus (15GB) and gas (6GB). It's also slow to process at
-O0.

This goes up to 4 hours (!) with GCC 11 and less than 1 minute with clang.


* [Bug c++/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
@ 2021-03-26  8:26 ` pinskia at gcc dot gnu.org
  2021-03-26  8:27 ` pinskia at gcc dot gnu.org
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-03-26  8:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
-O0 -ftime-report:
 callgraph ipa passes               :   6.16 ( 19%)   0.94 ( 20%)   7.10 ( 19%)    62M (  6%)

(NOTE this is the trunk with checking enabled and not GCC built with
--enable-checking=release).


* [Bug c++/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
  2021-03-26  8:26 ` [Bug c++/99785] " pinskia at gcc dot gnu.org
@ 2021-03-26  8:27 ` pinskia at gcc dot gnu.org
  2021-03-26  8:29 ` pinskia at gcc dot gnu.org
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-03-26  8:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>This goes up to 4 hours (!) with GCC 11 

How did you configure trunk GCC?  Did you use --enable-checking=release ?


* [Bug c++/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
  2021-03-26  8:26 ` [Bug c++/99785] " pinskia at gcc dot gnu.org
  2021-03-26  8:27 ` pinskia at gcc dot gnu.org
@ 2021-03-26  8:29 ` pinskia at gcc dot gnu.org
  2021-03-26  8:33 ` [Bug ipa/99785] " mh+gcc at glandium dot org
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-03-26  8:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Oh I think this is the inliner.  Because I have -Dalways_inline= on the command
line because I think the code is using it in the wrong places ....
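
For readers unfamiliar with the trick: defining always_inline as an empty macro turns `__attribute__((always_inline))` into `__attribute__(())`, an empty attribute list that GCC ignores. A minimal sketch, where the `#define` stands in for `-Dalways_inline=` on the command line and the function names are hypothetical (they are not taken from gl.cc):

```cpp
// Stand-in for passing -Dalways_inline= on the command line.
#define always_inline

// After preprocessing this reads __attribute__(()), an empty attribute
// list, so the inliner is left to its own heuristics for this function.
static __attribute__((always_inline)) inline int scale(int x) { return 3 * x; }

// Hypothetical caller, only here so that the helper is actually used.
int scale_twice(int x) { return scale(scale(x)); }
```

This only affects attributes spelled exactly `always_inline`; the reserved spelling `__always_inline__` is left alone.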


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (2 preceding siblings ...)
  2021-03-26  8:29 ` pinskia at gcc dot gnu.org
@ 2021-03-26  8:33 ` mh+gcc at glandium dot org
  2021-03-26  8:41 ` pinskia at gcc dot gnu.org
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: mh+gcc at glandium dot org @ 2021-03-26  8:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #4 from Mike Hommey <mh+gcc at glandium dot org> ---
GCC 11 is the package in Debian experimental, so however it's built.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (3 preceding siblings ...)
  2021-03-26  8:33 ` [Bug ipa/99785] " mh+gcc at glandium dot org
@ 2021-03-26  8:41 ` pinskia at gcc dot gnu.org
  2021-03-26  8:49 ` mh+gcc at glandium dot org
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-03-26  8:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Ok for -O0 case:
 integration                        :  29.75 (  6%)   7.37 ( 40%)  37.61 (  7%)  3900M ( 39%)
 expand                             :  11.13 (  2%)   0.10 (  1%)  11.21 (  2%)  1900M ( 19%)
 dominance computation              :  18.08 (  4%)   0.11 (  1%)  18.40 (  3%)     0  (  0%)
 thread pro- & epilogue             :  24.92 (  5%)   0.03 (  0%)  24.78 (  5%)    13M (  0%)


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (4 preceding siblings ...)
  2021-03-26  8:41 ` pinskia at gcc dot gnu.org
@ 2021-03-26  8:49 ` mh+gcc at glandium dot org
  2021-03-26  9:08 ` mh+gcc at glandium dot org
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: mh+gcc at glandium dot org @ 2021-03-26  8:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #6 from Mike Hommey <mh+gcc at glandium dot org> ---
Replacing __attribute__((always_inline)) with inline on the two blend_pixels
functions makes it go down to 30s with GCC 10.

See https://bugzilla.mozilla.org/show_bug.cgi?id=1700520#c9 for why the
functions were marked always_inline in the first place.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (5 preceding siblings ...)
  2021-03-26  8:49 ` mh+gcc at glandium dot org
@ 2021-03-26  9:08 ` mh+gcc at glandium dot org
  2021-03-26  9:27 ` pinskia at gcc dot gnu.org
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: mh+gcc at glandium dot org @ 2021-03-26  9:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #7 from Mike Hommey <mh+gcc at glandium dot org> ---
It's worth noting that the clang variant of the code makes use of
__builtin_shufflevector, which the gcc variant doesn't (per
https://searchfox.org/mozilla-central/source/gfx/wr/swgl/src/vector_type.h), so
the build time comparison might be influenced by that. clang does manage to
inline blend_pixels, though, and the resulting code is much smaller than what
GCC produces.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (6 preceding siblings ...)
  2021-03-26  9:08 ` mh+gcc at glandium dot org
@ 2021-03-26  9:27 ` pinskia at gcc dot gnu.org
  2021-03-26  9:29 ` jakub at gcc dot gnu.org
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-03-26  9:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Mike Hommey from comment #7)
> It's worth noting that the clang variant of the code makes use of
> __builtin_shufflevector, which the gcc variant doesn't (per
> https://searchfox.org/mozilla-central/source/gfx/wr/swgl/src/vector_type.h),
> so the build time comparison might be influenced by that. clang does manage
> to inline blend_pixels, though, and the resulting code is much smaller than
> what GCC produces.

It is not exactly __builtin_shufflevector; rather, VectorType in clang is just
a plain vector type (using ext_vector_type), while GCC's version uses a template
class for it.

Can you disable the ext_vector_type usage in the header and see how slow clang
becomes?
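
To make the distinction concrete, here is a rough sketch of the two shapes being contrasted. It is not the actual swgl header: the class template is a hypothetical, heavily simplified wrapper, and the helper names `make4` and `sum_lane` are made up for illustration.

```cpp
#include <cstddef>

#ifdef __clang__
// clang path: a native vector type; arithmetic and indexing are built in.
typedef float Float4 __attribute__((ext_vector_type(4)));
#else
// Other compilers (hypothetical, simplified): a class template wrapping an
// array, so every operation is a member function the inliner must handle.
template <typename T, std::size_t N>
struct VectorType {
  T data[N];
  T& operator[](std::size_t i) { return data[i]; }
  const T& operator[](std::size_t i) const { return data[i]; }
  VectorType operator+(const VectorType& o) const {
    VectorType r;
    for (std::size_t i = 0; i < N; ++i) r.data[i] = data[i] + o.data[i];
    return r;
  }
};
typedef VectorType<float, 4> Float4;
#endif

// The code using the type is identical on both paths.
Float4 make4(float x, float y, float z, float w) {
  Float4 v = {x, y, z, w};
  return v;
}

float sum_lane(Float4 a, Float4 b, int lane) {
  Float4 s = a + b;
  return s[lane];
}
```

With the native type, `a + b` is a single vector operation in the front end; with the wrapper it is a call that has to be inlined before the optimizers see plain arithmetic.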


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (7 preceding siblings ...)
  2021-03-26  9:27 ` pinskia at gcc dot gnu.org
@ 2021-03-26  9:29 ` jakub at gcc dot gnu.org
  2021-03-26  9:33 ` rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-26  9:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
gcc does have __builtin_convertvector (which is used only for clang
apparently), and while it doesn't have __builtin_shufflevector, it does have
__builtin_shuffle which can achieve everything that the code does with
__builtin_shufflevector, just with different syntax.
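
As an illustration of the "different syntax" point, the same same-size permutation can be written with either builtin. This is a minimal sketch; the typedef and the helper name `interleave_low` are hypothetical and not taken from swgl.

```cpp
// Four 32-bit ints in one 16-byte vector; accepted by both GCC and clang.
typedef int v4si __attribute__((vector_size(16)));

// Interleave the low halves of a and b: {a0, b0, a1, b1}.
// Indices 0..3 select from a, indices 4..7 select from b.
v4si interleave_low(v4si a, v4si b) {
#if defined(__clang__)
  // clang: the indices are constant expressions passed as extra arguments.
  return __builtin_shufflevector(a, b, 0, 4, 1, 5);
#else
  // GCC: the indices are passed as an integer vector (the mask).
  const v4si mask = {0, 4, 1, 5};
  return __builtin_shuffle(a, b, mask);
#endif
}
```

Both forms return a vector with the same number of elements as the inputs here, which is the case where the two builtins overlap.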


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (8 preceding siblings ...)
  2021-03-26  9:29 ` jakub at gcc dot gnu.org
@ 2021-03-26  9:33 ` rguenth at gcc dot gnu.org
  2021-03-26  9:35 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-26  9:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
            Version|unknown                     |11.0

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Did anybody check the actual output from clang as to whether it performs the
desired optimizations?  I only have clang 9 around and that rejects the TU
(maybe there's clang specific code paths and the preprocessed source is not
representative here)

Inlining blend_pixels without first constant propagating 'blend_key' (I suppose
at all call paths that's eventually supposed to be constant propagated
somehow?)
looks quite stupid given the large switch.  Sure, saving %xmm around calls can
have a cost but trashing icache should be worse.  If all of this is
auto-generated the auto-generation might also be able to improve the
blend_key dispatch.
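
One possible reading of "improve the blend_key dispatch" is to make the key a compile-time constant and switch on it once per span instead of once per inlined call site. The sketch below is only an assumption about what such generated code could look like; the enum, the blend modes and all function names are hypothetical and unrelated to the real gl.cc.

```cpp
#include <cstddef>
#include <cstdint>

enum BlendKey { BLEND_COPY, BLEND_ADD, BLEND_MIN };  // hypothetical modes

// KEY is a template parameter, so inside each instantiation the switch is
// on a constant and the optimizer can drop the other cases.
template <BlendKey KEY>
inline std::uint8_t blend_pixel(std::uint8_t src, std::uint8_t dst) {
  switch (KEY) {
    case BLEND_ADD: {
      unsigned sum = unsigned(src) + unsigned(dst);
      return static_cast<std::uint8_t>(sum > 255u ? 255u : sum);  // saturate
    }
    case BLEND_MIN:
      return src < dst ? src : dst;
    default:
      return src;
  }
}

template <BlendKey KEY>
void blend_span(const std::uint8_t* src, std::uint8_t* dst, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) dst[i] = blend_pixel<KEY>(src[i], dst[i]);
}

// The run-time switch happens once per span, outside the pixel loop.
void blend_span_dispatch(BlendKey key, const std::uint8_t* src,
                         std::uint8_t* dst, std::size_t n) {
  switch (key) {
    case BLEND_ADD: blend_span<BLEND_ADD>(src, dst, n); break;
    case BLEND_MIN: blend_span<BLEND_MIN>(src, dst, n); break;
    default:        blend_span<BLEND_COPY>(src, dst, n); break;
  }
}
```

Each instantiation then contains the code for a single blend mode, so inlining it does not drag the whole switch along.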

Another strategy might be to not put always_inline on everything
(because that in turn will cause exponential growth) but instead inline
everything into the finally important function(s) via 'flatten'.

That is, you do sth like

static __attribute__((always_inline)) inline void large_leaf () { /* large */ }

static __attribute__((always_inline)) inline void inter1 () { large_leaf (); }

static __attribute__((always_inline)) inline void inter2 ()
{ inter1 (); inter1 (); }

static __attribute__((always_inline)) inline void inter3 ()
{ inter2 (); inter2 (); }

and what you get is (intermediate) 8 copies of the large_leaf body.  Which
is because we inline expand from leafs rather than first inlining the small
always-inline wrappers (and throwing them away before inlining into them).
I suppose we could try to not inline into always-inline functions at the
expense of needing to iterate on inlined always-inline bodies.  Or somehow
at least delay inlining large bodies into always-inline bodies.
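
One way to arrive at the figure of 8, assuming the count includes the bodies of all four functions as they exist after inlining from the leaves upward (the constant names are made up for this sketch):

```cpp
// Copies of the large_leaf body contained in each function once everything
// it calls has been inlined into it.
constexpr int copies_large_leaf = 1;                  // the body itself
constexpr int copies_inter1 = 1 * copies_large_leaf;  // one call to large_leaf
constexpr int copies_inter2 = 2 * copies_inter1;      // two calls to inter1
constexpr int copies_inter3 = 2 * copies_inter2;      // two calls to inter2

// All four expanded bodies exist at the same time until the now-unused
// wrappers are thrown away.
constexpr int total_intermediate_copies() {
  return copies_large_leaf + copies_inter1 + copies_inter2 + copies_inter3;
}

static_assert(total_intermediate_copies() == 8, "1 + 1 + 2 + 4");
```

Only the 4 copies in inter3 are wanted in the end; every further doubling level in the wrappers doubles both numbers.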

Anyway, marking such large functions as always-inline is asking for trouble.
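
The 'flatten' alternative mentioned above could look roughly like this. It is a sketch with hypothetical function names and a tiny stand-in for the large body; the helpers are plain `inline`, and only the outermost function carries an attribute.

```cpp
// Hypothetical helpers: plain 'inline', no always_inline anywhere.
static inline int large_leaf(int x) { return x * x + 1; }  // stands in for a large body
static inline int inter1(int x) { return large_leaf(x); }
static inline int inter2(int x) { return inter1(x) + inter1(x + 1); }

// flatten asks the compiler to inline, where possible, every call inside
// this one function, so the copies are made in a single place.
__attribute__((flatten)) int hot_entry(int x) {
  return inter2(x) + inter2(x + 2);
}
```

Other callers of `inter1` or `inter2` are not forced to inline them, which is the difference from marking the helpers always_inline.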


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (9 preceding siblings ...)
  2021-03-26  9:33 ` rguenth at gcc dot gnu.org
@ 2021-03-26  9:35 ` rguenth at gcc dot gnu.org
  2021-03-26 10:11 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-26  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, GCC 10 branch tip with -O1:

 ipa inlining heuristics            : 962.91 ( 85%)   0.39 (  4%) 971.66 ( 84%) 1103801 kB ( 10%)
 alias stmt walking                 :  40.95 (  4%)   1.07 ( 11%)  42.13 (  4%)   25965 kB (  0%)
 integration                        :  13.66 (  1%)   3.23 ( 33%)  16.18 (  1%) 4931059 kB ( 43%)
 TOTAL                              :1135.79          9.69       1153.79        11462287 kB
1135.79user 9.73system 19:13.82elapsed 99%CPU (0avgtext+0avgdata 4090216maxresident)k
0inputs+0outputs (0major+959014minor)pagefaults 0swaps


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (10 preceding siblings ...)
  2021-03-26  9:35 ` rguenth at gcc dot gnu.org
@ 2021-03-26 10:11 ` rguenth at gcc dot gnu.org
  2021-03-26 11:16 ` rguenth at gcc dot gnu.org
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-26 10:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #11)
> Btw, GCC 10 branch tip with -O1:
> 
>  ipa inlining heuristics            : 962.91 ( 85%)   0.39 (  4%) 971.66 ( 84%) 1103801 kB ( 10%)
>  alias stmt walking                 :  40.95 (  4%)   1.07 ( 11%)  42.13 (  4%)   25965 kB (  0%)
>  integration                        :  13.66 (  1%)   3.23 ( 33%)  16.18 (  1%) 4931059 kB ( 43%)
>  TOTAL                              :1135.79          9.69       1153.79        11462287 kB
> 1135.79user 9.73system 19:13.82elapsed 99%CPU (0avgtext+0avgdata 4090216maxresident)k
> 0inputs+0outputs (0major+959014minor)pagefaults 0swaps

And a profile of trunk with release checking shows

Samples: 4M of event 'cycles:u', Event count (approx.): 4811472248176
Overhead       Samples  Command  Shared Object     Symbol
  31.07%       1371974  cc1plus  cc1plus           [.] update_callee_keys
  25.52%       1128500  cc1plus  cc1plus           [.] edge_badness
   4.91%        216874  cc1plus  cc1plus           [.] evaluate_properties_for_edge
   3.90%        172368  cc1plus  cc1plus           [.] cgraph_node::get_availability
   3.83%        169169  cc1plus  cc1plus           [.] do_estimate_edge_time
   2.56%        113399  cc1plus  cc1plus           [.] symtab_node::ultimate_alias_target_1

I wonder why we even run into this for the always inlines.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (11 preceding siblings ...)
  2021-03-26 10:11 ` rguenth at gcc dot gnu.org
@ 2021-03-26 11:16 ` rguenth at gcc dot gnu.org
  2021-03-26 19:31 ` jmuizelaar at mozilla dot com
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-26 11:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
On trunk (with release checking) at -O2 the situation is not different from -O1
or the GCC 10 branch (so it's not 4 hours), the profile looks the same as well.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (12 preceding siblings ...)
  2021-03-26 11:16 ` rguenth at gcc dot gnu.org
@ 2021-03-26 19:31 ` jmuizelaar at mozilla dot com
  2021-03-26 21:38 ` hubicka at gcc dot gnu.org
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: jmuizelaar at mozilla dot com @ 2021-03-26 19:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

Jeff Muizelaar <jmuizelaar at mozilla dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmuizelaar at mozilla dot com

--- Comment #14 from Jeff Muizelaar <jmuizelaar at mozilla dot com> ---
re: __builtin_shuffle vs __builtin_shufflevector - It looks like
__builtin_shuffle doesn't support constructing vectors of a different size than
input type. That's mostly what we're using __builtin_shufflevector for.
__builtin_shufflevector
https://github.com/servo/webrender/blob/master/swgl/src/vector_type.h
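
A small sketch of the size-changing case described here. The typedefs and the helper name `high_half` are hypothetical; the fallback branch is only an assumption about how one might cope on a compiler without `__builtin_shufflevector`, not what swgl does.

```cpp
typedef int v4si __attribute__((vector_size(16)));  // 4 x int
typedef int v2si __attribute__((vector_size(8)));   // 2 x int

#ifndef __has_builtin
#define __has_builtin(x) 0  // compilers that lack __has_builtin
#endif

// Extract the high half {v2, v3} of a 4-element vector as a 2-element vector.
v2si high_half(v4si v) {
#if __has_builtin(__builtin_shufflevector)
  // The result has as many elements as there are indices (two here).
  return __builtin_shufflevector(v, v, 2, 3);
#else
  // __builtin_shuffle always returns the input type, so without
  // __builtin_shufflevector the narrower vector is built element by element.
  v2si r = {v[2], v[3]};
  return r;
#endif
}
```

The first branch is the form that has no direct `__builtin_shuffle` equivalent, since the result type differs from the operand type.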

I briefly tried to get the gcc variant of the code compiling with clang but ran
into a number of issues, including clang's lack of support for
'__builtin_shuffle'. If you'd like to try, the swgl code is pretty easy to
build locally. You should be able to just check out
https://github.com/servo/webrender/, navigate to the 'swgl' directory and
run 'cargo build --release'.

re: inlining huge functions - We tried not inlining blend_pixels with clang and
it seems to have a negative impact on a number of benchmarks.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (13 preceding siblings ...)
  2021-03-26 19:31 ` jmuizelaar at mozilla dot com
@ 2021-03-26 21:38 ` hubicka at gcc dot gnu.org
  2021-03-26 22:13 ` hubicka at gcc dot gnu.org
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: hubicka at gcc dot gnu.org @ 2021-03-26 21:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #15 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
We run into the size estimate with always inlines because after inlining we
update the size of caller (because that does matter when inlining normal
functions).

We already have a special-purpose always inliner to avoid some of the issues, so
I guess we keep running into this during the late IPA inlining?

Honza


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (14 preceding siblings ...)
  2021-03-26 21:38 ` hubicka at gcc dot gnu.org
@ 2021-03-26 22:13 ` hubicka at gcc dot gnu.org
  2021-03-31 13:32 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: hubicka at gcc dot gnu.org @ 2021-03-26 22:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #16 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
OK, we seem to handle all relevant always_inlines in early passes, and then we
produce large functions with many non-always_inline calls that we
spend a lot of time inlining.  This is because we have relative function growth
bounds that are quite high and we manage to get a lot of inlining done.
I guess clang hits its cap on those earlier. I will check if I can save some
compile time.

Honza


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (15 preceding siblings ...)
  2021-03-26 22:13 ` hubicka at gcc dot gnu.org
@ 2021-03-31 13:32 ` rguenth at gcc dot gnu.org
  2021-03-31 13:37 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-31 13:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jeff Muizelaar from comment #14)
> re: __builtin_shuffle vs __builtin_shufflevector - It looks like
> __builtin_shuffle doesn't support constructing vectors of a different size
> than input type. That's mostly what we're using __builtin_shufflevector for.
> __builtin_shufflevector
> https://github.com/servo/webrender/blob/master/swgl/src/vector_type.h

Indeed that's more powerful than __builtin_shuffle.  It would map loosely
to what (vec_select (vec_concat ....) ...) on RTL can do, but on
GIMPLE we don't have a 1:1 match, though I've thought of extending it this
way on multiple occasions (by extending the existing VEC_PERM_EXPR, basically
relaxing the set of valid operands/results).  The complication of doing that
is always that targets need to be made aware of the possibilities.

At least internally I've also pondered allowing a scalar as input serving
as single-element vector.

> I briefly tried to get the gcc variant of the code compiling with clang but
> ran into a number of issues including clang's lack of support for
> '__builtin_shuffle'. If you'd like to try, the swgl code is pretty easy to
> build locally if you. You should be able to just checkout
> https://github.com/servo/webrender/ navigate to the the 'swgl' directory and
> run 'cargo build --release'
>
> re: inlining huge functions - We tried not inlining blend_pixels with clang
> and it seems to have a negative impact on a number of benchmarks.

Did you report this as an issue to them?  That is, leaving auto-inlining
to clang's heuristics and those not working?


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (16 preceding siblings ...)
  2021-03-31 13:32 ` rguenth at gcc dot gnu.org
@ 2021-03-31 13:37 ` jakub at gcc dot gnu.org
  2021-05-21  7:29 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-31 13:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Also, __builtin_shufflevector allows to say -1 as a don't care element, our
current infrastructure doesn't allow that, but it would be nice even for
internal uses.  On the other side, I think __builtin_shufflevector allows only
constant indices, while __builtin_shuffle allows arbitrary runtime reshuffling.
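
The two differences can be shown side by side. This is a sketch under stated assumptions: the typedef and function names are hypothetical, the non-GCC branch of `permute_runtime` is a hand-written stand-in that assumes `__builtin_shuffle` is unavailable there, and the fallback in `keep_even_lanes` simply returns its input.

```cpp
typedef int v4si __attribute__((vector_size(16)));

#ifndef __has_builtin
#define __has_builtin(x) 0  // compilers that lack __has_builtin
#endif

// Permute v by indices that are only known at run time.
v4si permute_runtime(v4si v, v4si idx) {
#if defined(__GNUC__) && !defined(__clang__)
  // GCC: the mask is an ordinary vector value, so it may be computed at run time.
  return __builtin_shuffle(v, idx);
#else
  // Assumed fallback: spell out the same semantics by hand; each index is
  // taken modulo the element count.
  v4si r;
  for (int i = 0; i < 4; ++i) r[i] = v[idx[i] & 3];
  return r;
#endif
}

// Keep lanes 0 and 2; lanes 1 and 3 are "don't care" (-1) where supported.
v4si keep_even_lanes(v4si v) {
#if __has_builtin(__builtin_shufflevector)
  // Indices must be constants; -1 leaves the lane's value unspecified.
  return __builtin_shufflevector(v, v, 0, -1, 2, -1);
#else
  return v;  // fallback: lanes 0 and 2 are already in place
#endif
}
```

Callers of `keep_even_lanes` may only rely on lanes 0 and 2 of the result.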


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (17 preceding siblings ...)
  2021-03-31 13:37 ` jakub at gcc dot gnu.org
@ 2021-05-21  7:29 ` rguenth at gcc dot gnu.org
  2021-05-21  9:35 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-05-21  7:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #18)
> Also, __builtin_shufflevector allows to say -1 as a don't care element, our
> current infrastructure doesn't allow that, but it would be nice even for
> internal uses.  On the other side, I think __builtin_shufflevector allows
> only constant indices, while __builtin_shuffle allows arbitrary runtime
> reshuffling.

Yes, I think they complement each other.  The question would be whether we'd
want to represent both with VEC_PERM_EXPR on GIMPLE.  And how to present the
more flexible cases to the RTL expander and targets.  const permutes seem
to be handled via the vec_perm_const target hook and not the vec_perm
optab, so a possibility would be to create a new hook with relaxed mode
requirements - either by passing in three modes or some dummy RTXen.

OTOH it should be possible to handle some cases purely in the expander
by using paradoxical subregs when sources are of smaller size.  With
larger size sources the -1 would come in handy allowing for larger
results and subregging them.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (18 preceding siblings ...)
  2021-05-21  7:29 ` rguenth at gcc dot gnu.org
@ 2021-05-21  9:35 ` rguenth at gcc dot gnu.org
  2022-05-17 11:26 ` asolokha at gmx dot com
  2022-05-17 15:09 ` jmuizelaar at mozilla dot com
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-05-21  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 50852
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50852&action=edit
prototype for __builtin_shufflevector

So this is a prototype for __builtin_shufflevector support.  Most notably the
C++ FE misses support for templates.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (19 preceding siblings ...)
  2021-05-21  9:35 ` rguenth at gcc dot gnu.org
@ 2022-05-17 11:26 ` asolokha at gmx dot com
  2022-05-17 15:09 ` jmuizelaar at mozilla dot com
  21 siblings, 0 replies; 23+ messages in thread
From: asolokha at gmx dot com @ 2022-05-17 11:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

Arseny Solokha <asolokha at gmx dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |asolokha at gmx dot com

--- Comment #21 from Arseny Solokha <asolokha at gmx dot com> ---
I believe this PR needs some reassessment now, given __builtin_shufflevector
has finally been implemented.


* [Bug ipa/99785] Awful lot of time spent building gl.cc in Firefox
  2021-03-26  7:57 [Bug c++/99785] New: Awful lot of time spent building gl.cc in Firefox mh+gcc at glandium dot org
                   ` (20 preceding siblings ...)
  2022-05-17 11:26 ` asolokha at gmx dot com
@ 2022-05-17 15:09 ` jmuizelaar at mozilla dot com
  21 siblings, 0 replies; 23+ messages in thread
From: jmuizelaar at mozilla dot com @ 2022-05-17 15:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785

--- Comment #22 from Jeff Muizelaar <jmuizelaar at mozilla dot com> ---
GCC doesn't support clang's xyzw vector attributes, so it's still not easy to
build the clang path in GCC.

