* [PATCH] constexprify some tree variables
@ 2022-11-18 18:05 apinski
  2022-11-18 20:06 ` Jeff Law
  2023-01-26 14:45 ` Patrick Palka
  0 siblings, 2 replies; 15+ messages in thread
From: apinski @ 2022-11-18 18:05 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

From: Andrew Pinski <apinski@marvell.com>

Since we use C++11 by default now, we can
use constexpr for some const decls in tree-core.h.

This patch does that and it allows for better optimizations
of GCC code with checking enabled and without LTO.

For example, compiling generic-match.cc is sped up due
to fewer basic blocks and less debugging info being
produced. I did not check the speed of compiling the same source
but rather the speed of compiling the old vs new sources here
(but with the same compiler base).

The small slowdown in the parsing of the arrays in each TU
is mitigated by the reduction in how much code/debugging info
is produced in the end.

Note I looked at generic-match.cc since it is one of the
sources whose compilation causes parallel builds to stall and
I wanted to speed it up.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Or should this wait until GCC 13 branches off?

gcc/ChangeLog:

	PR middle-end/14840
	* tree-core.h (tree_code_type): Constexprify
	by including all-tree.def.
	(tree_code_length): Likewise.
	* tree.cc (tree_code_type): Remove.
	(tree_code_length): Remove.
---
 gcc/tree-core.h | 21 +++++++++++++++++++--
 gcc/tree.cc     | 24 ------------------------
 2 files changed, 19 insertions(+), 26 deletions(-)

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index af75522504f..e146b133dbd 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -2284,15 +2284,32 @@ struct floatn_type_info {
 /* Matrix describing the structures contained in a given tree code. */
 extern bool tree_contains_struct[MAX_TREE_CODES][64];

+#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
+#define END_OF_BASE_TREE_CODES tcc_exceptional,
+
+
 /* Class of tree given its code. */
-extern const enum tree_code_class tree_code_type[];
+constexpr enum tree_code_class tree_code_type[] = {
+#include "all-tree.def"
+};
+
+#undef DEFTREECODE
+#undef END_OF_BASE_TREE_CODES

 /* Each tree code class has an associated string representation.
    These must correspond to the tree_code_class entries. */
 extern const char *const tree_code_class_strings[];

 /* Number of argument-words in each kind of tree-node. */
-extern const unsigned char tree_code_length[];
+
+#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
+#define END_OF_BASE_TREE_CODES 0,
+constexpr unsigned char tree_code_length[] = {
+#include "all-tree.def"
+};
+
+#undef DEFTREECODE
+#undef END_OF_BASE_TREE_CODES

 /* Vector of all alias pairs for global symbols. */
 extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 574bd2e65d9..254b2373dcf 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -74,31 +74,7 @@ along with GCC; see the file COPYING3. If not see
 #include "asan.h"
 #include "ubsan.h"

-/* Tree code classes. */

-#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
-#define END_OF_BASE_TREE_CODES tcc_exceptional,
-
-const enum tree_code_class tree_code_type[] = {
-#include "all-tree.def"
-};
-
-#undef DEFTREECODE
-#undef END_OF_BASE_TREE_CODES
-
-/* Table indexed by tree code giving number of expression
-   operands beyond the fixed part of the node structure.
-   Not used for types or decls. */
-
-#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
-#define END_OF_BASE_TREE_CODES 0,
-
-const unsigned char tree_code_length[] = {
-#include "all-tree.def"
-};
-
-#undef DEFTREECODE
-#undef END_OF_BASE_TREE_CODES

 /* Names of tree components.
    Used for printing out the tree and error messages. */
--
2.17.1

^ permalink raw reply	[flat|nested] 15+ messages in thread
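For readers unfamiliar with the DEFTREECODE idiom used in the patch above, here is a minimal standalone sketch of the same X-macro pattern. It is an illustration only, not GCC code: all `demo_`/`DEMO_` names are hypothetical, and a list macro stands in for `#include "all-tree.def"` so that the example fits in one file.

```cpp
// Hypothetical stand-in for all-tree.def.  In GCC the entries come from
// an #include of a .def file; a list macro is used here instead so the
// sketch is self-contained.
#define DEMO_ALL_CODES \
  DEFDEMOCODE (DEMO_ERROR, "error", demo_exceptional, 0) \
  DEFDEMOCODE (DEMO_NEG,   "neg",   demo_unary,       1) \
  DEFDEMOCODE (DEMO_PLUS,  "plus",  demo_binary,      2)

enum demo_code_class { demo_exceptional, demo_unary, demo_binary };

// Expansion 1: keep only SYM to build the enumerators.
#define DEFDEMOCODE(SYM, NAME, TYPE, LENGTH) SYM,
enum demo_code { DEMO_ALL_CODES DEMO_NUM_CODES };
#undef DEFDEMOCODE

// Expansion 2: keep only TYPE to build the class table.
#define DEFDEMOCODE(SYM, NAME, TYPE, LENGTH) TYPE,
constexpr demo_code_class demo_code_type[] = { DEMO_ALL_CODES };
#undef DEFDEMOCODE

// Expansion 3: keep only LENGTH to build the operand-count table.
#define DEFDEMOCODE(SYM, NAME, TYPE, LENGTH) LENGTH,
constexpr unsigned char demo_code_length[] = { DEMO_ALL_CODES };
#undef DEFDEMOCODE

// Because the tables are constexpr, lookups with constant indices can be
// folded at compile time, which is the optimization opportunity the
// patch description refers to.
static_assert (demo_code_type[DEMO_PLUS] == demo_binary,
	       "class table is indexed by code");
static_assert (demo_code_length[DEMO_NEG] == 1,
	       "length table is indexed by code");
```

Each table is generated from the same list, so adding an entry in one place keeps the enumerators and both arrays in sync.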
* Re: [PATCH] constexprify some tree variables
  2022-11-18 18:05 [PATCH] constexprify some tree variables apinski
@ 2022-11-18 20:06 ` Jeff Law
  2022-11-19 2:53   ` Andrew Pinski
  2023-01-26 14:45 ` Patrick Palka
  1 sibling, 1 reply; 15+ messages in thread
From: Jeff Law @ 2022-11-18 20:06 UTC (permalink / raw)
  To: apinski, gcc-patches

On 11/18/22 11:05, apinski--- via Gcc-patches wrote:
> From: Andrew Pinski <apinski@marvell.com>
>
> Since we use C++11 by default now, we can
> use constexpr for some const decls in tree-core.h.
>
> This patch does that and it allows for better optimizations
> of GCC code with checking enabled and without LTO.
>
> For example, compiling generic-match.cc is sped up due
> to fewer basic blocks and less debugging info being
> produced. I did not check the speed of compiling the same source
> but rather the speed of compiling the old vs new sources here
> (but with the same compiler base).
>
> The small slowdown in the parsing of the arrays in each TU
> is mitigated by the reduction in how much code/debugging info
> is produced in the end.
>
> Note I looked at generic-match.cc since it is one of the
> sources whose compilation causes parallel builds to stall and
> I wanted to speed it up.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> Or should this wait until GCC 13 branches off?
>
> gcc/ChangeLog:
>
> 	PR middle-end/14840
> 	* tree-core.h (tree_code_type): Constexprify
> 	by including all-tree.def.
> 	(tree_code_length): Likewise.
> 	* tree.cc (tree_code_type): Remove.
> 	(tree_code_length): Remove.

I would have preferred this a week ago :-) And if it was just
const-ifying, I'd ACK it without hesitation.

Can you share any of the build-time speedups you're seeing, even if
they're not perfect? It'd help to get a sense of the potential gain
here and whether or not there's enough gain to gate it into gcc-13 or
have it wait for gcc-14.

And if we can improve the compile-time of the files generated by
match.pd, that's a win. It's definitely a serialization point -- it
becomes *painfully* obvious when doing a bootstrap using qemu, when that
file takes 1-2hrs after everything else has finished.

Jeff

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH] constexprify some tree variables
  2022-11-18 20:06 ` Jeff Law
@ 2022-11-19 2:53   ` Andrew Pinski
  2022-11-19 16:33     ` Jeff Law
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2022-11-19 2:53 UTC (permalink / raw)
  To: Jeff Law; +Cc: apinski, gcc-patches

On Fri, Nov 18, 2022 at 12:06 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> On 11/18/22 11:05, apinski--- via Gcc-patches wrote:
> > From: Andrew Pinski <apinski@marvell.com>
> >
> > Since we use C++11 by default now, we can
> > use constexpr for some const decls in tree-core.h.
> >
> > This patch does that and it allows for better optimizations
> > of GCC code with checking enabled and without LTO.
> >
> > For example, compiling generic-match.cc is sped up due
> > to fewer basic blocks and less debugging info being
> > produced. I did not check the speed of compiling the same source
> > but rather the speed of compiling the old vs new sources here
> > (but with the same compiler base).
> >
> > The small slowdown in the parsing of the arrays in each TU
> > is mitigated by the reduction in how much code/debugging info
> > is produced in the end.
> >
> > Note I looked at generic-match.cc since it is one of the
> > sources whose compilation causes parallel builds to stall and
> > I wanted to speed it up.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > Or should this wait until GCC 13 branches off?
> >
> > gcc/ChangeLog:
> >
> > 	PR middle-end/14840
> > 	* tree-core.h (tree_code_type): Constexprify
> > 	by including all-tree.def.
> > 	(tree_code_length): Likewise.
> > 	* tree.cc (tree_code_type): Remove.
> > 	(tree_code_length): Remove.
>
> I would have preferred this a week ago :-) And if it was just
> const-ifying, I'd ACK it without hesitation.

Yes, I know, which is why I am ok with waiting for GCC 14 really. I
decided to try to clear out some of the old bug reports assigned to
myself and this one was one of the oldest and also one of the easiest
to do.

>
> Can you share any of the build-time speedups you're seeing, even if
> they're not perfect? It'd help to get a sense of the potential gain
> here and whether or not there's enough gain to gate it into gcc-13 or
> have it wait for gcc-14.
>
>
> And if we can improve the compile-time of the files generated by
> match.pd, that's a win. It's definitely a serialization point -- it
> becomes *painfully* obvious when doing a bootstrap using qemu, when that
> file takes 1-2hrs after everything else has finished.

I recorded some of the timings in the bug report:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14840#c14

Summary: using the same compiler as a base, compiling
generic-match.cc is now ~7% faster. I have not looked into why but I
can only assume it is due to less debug info and fewer basic blocks.
Without checking enabled (or rather with release checking) on the
sources, I can only assume the speedup is not going to be seen. Most of
the constant reads are in the checking part of the code.

Thanks,
Andrew Pinski

>
>
> Jeff

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH] constexprify some tree variables
  2022-11-19 2:53   ` Andrew Pinski
@ 2022-11-19 16:33     ` Jeff Law
  0 siblings, 0 replies; 15+ messages in thread
From: Jeff Law @ 2022-11-19 16:33 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: apinski, gcc-patches

On 11/18/22 19:53, Andrew Pinski wrote:
> On Fri, Nov 18, 2022 at 12:06 PM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> On 11/18/22 11:05, apinski--- via Gcc-patches wrote:
>>> From: Andrew Pinski <apinski@marvell.com>
>>>
>>> Since we use C++11 by default now, we can
>>> use constexpr for some const decls in tree-core.h.
>>>
>>> This patch does that and it allows for better optimizations
>>> of GCC code with checking enabled and without LTO.
>>>
>>> For example, compiling generic-match.cc is sped up due
>>> to fewer basic blocks and less debugging info being
>>> produced. I did not check the speed of compiling the same source
>>> but rather the speed of compiling the old vs new sources here
>>> (but with the same compiler base).
>>>
>>> The small slowdown in the parsing of the arrays in each TU
>>> is mitigated by the reduction in how much code/debugging info
>>> is produced in the end.
>>>
>>> Note I looked at generic-match.cc since it is one of the
>>> sources whose compilation causes parallel builds to stall and
>>> I wanted to speed it up.
>>>
>>> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>>> Or should this wait until GCC 13 branches off?
>>>
>>> gcc/ChangeLog:
>>>
>>> 	PR middle-end/14840
>>> 	* tree-core.h (tree_code_type): Constexprify
>>> 	by including all-tree.def.
>>> 	(tree_code_length): Likewise.
>>> 	* tree.cc (tree_code_type): Remove.
>>> 	(tree_code_length): Remove.
>> I would have preferred this a week ago :-) And if it was just
>> const-ifying, I'd ACK it without hesitation.
> Yes, I know, which is why I am ok with waiting for GCC 14 really. I
> decided to try to clear out some of the old bug reports assigned to
> myself and this one was one of the oldest and also one of the easiest
> to do.

Let's go ahead. It addresses a bug and should be quite safe.

Jeff

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH] constexprify some tree variables
  2022-11-18 18:05 [PATCH] constexprify some tree variables apinski
  2022-11-18 20:06 ` Jeff Law
@ 2023-01-26 14:45 ` Patrick Palka
  2023-01-26 14:51   ` Jakub Jelinek
  2023-01-26 15:59   ` [PATCH] tree: Fix up tree_code_{length,type} Jakub Jelinek
  1 sibling, 2 replies; 15+ messages in thread
From: Patrick Palka @ 2023-01-26 14:45 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches, jason

On Fri, 18 Nov 2022, apinski--- via Gcc-patches wrote:

> From: Andrew Pinski <apinski@marvell.com>
>
> Since we use C++11 by default now, we can
> use constexpr for some const decls in tree-core.h.
>
> This patch does that and it allows for better optimizations
> of GCC code with checking enabled and without LTO.
>
> For example, compiling generic-match.cc is sped up due
> to fewer basic blocks and less debugging info being
> produced. I did not check the speed of compiling the same source
> but rather the speed of compiling the old vs new sources here
> (but with the same compiler base).
>
> The small slowdown in the parsing of the arrays in each TU
> is mitigated by the reduction in how much code/debugging info
> is produced in the end.
>
> Note I looked at generic-match.cc since it is one of the
> sources whose compilation causes parallel builds to stall and
> I wanted to speed it up.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> Or should this wait until GCC 13 branches off?
>
> gcc/ChangeLog:
>
> 	PR middle-end/14840
> 	* tree-core.h (tree_code_type): Constexprify
> 	by including all-tree.def.
> 	(tree_code_length): Likewise.
> 	* tree.cc (tree_code_type): Remove.
> 	(tree_code_length): Remove.
> ---
>  gcc/tree-core.h | 21 +++++++++++++++++++--
>  gcc/tree.cc     | 24 ------------------------
>  2 files changed, 19 insertions(+), 26 deletions(-)
>
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index af75522504f..e146b133dbd 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -2284,15 +2284,32 @@ struct floatn_type_info {
>  /* Matrix describing the structures contained in a given tree code. */
>  extern bool tree_contains_struct[MAX_TREE_CODES][64];
>
> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> +
> +
>  /* Class of tree given its code. */
> -extern const enum tree_code_class tree_code_type[];
> +constexpr enum tree_code_class tree_code_type[] = {
> +#include "all-tree.def"
> +};
> +
> +#undef DEFTREECODE
> +#undef END_OF_BASE_TREE_CODES
>
>  /* Each tree code class has an associated string representation.
>     These must correspond to the tree_code_class entries. */
>  extern const char *const tree_code_class_strings[];
>
>  /* Number of argument-words in each kind of tree-node. */
> -extern const unsigned char tree_code_length[];
> +
> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> +#define END_OF_BASE_TREE_CODES 0,
> +constexpr unsigned char tree_code_length[] = {
> +#include "all-tree.def"
> +};
> +
> +#undef DEFTREECODE
> +#undef END_OF_BASE_TREE_CODES

IIUC defining these globals as non-inline constexpr gives them internal
linkage, and so each TU contains its own unique copy of these globals.
This bloats cc1plus by a tiny bit and is technically an ODR violation
because some inline functions such as tree_class_check also ODR-use
these variables and so each defn of tree_class_check will refer to a
"different" tree_code_class. Since inline variables are a C++17
feature, I guess we could fix this by defining the globals the old way
before C++17 and as inline constexpr otherwise?

>
>  /* Vector of all alias pairs for global symbols. */
>  extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
> diff --git a/gcc/tree.cc b/gcc/tree.cc
> index 574bd2e65d9..254b2373dcf 100644
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc
> @@ -74,31 +74,7 @@ along with GCC; see the file COPYING3. If not see
>  #include "asan.h"
>  #include "ubsan.h"
>
> -/* Tree code classes. */
>
> -#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> -#define END_OF_BASE_TREE_CODES tcc_exceptional,
> -
> -const enum tree_code_class tree_code_type[] = {
> -#include "all-tree.def"
> -};
> -
> -#undef DEFTREECODE
> -#undef END_OF_BASE_TREE_CODES
> -
> -/* Table indexed by tree code giving number of expression
> -   operands beyond the fixed part of the node structure.
> -   Not used for types or decls. */
> -
> -#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> -#define END_OF_BASE_TREE_CODES 0,
> -
> -const unsigned char tree_code_length[] = {
> -#include "all-tree.def"
> -};
> -
> -#undef DEFTREECODE
> -#undef END_OF_BASE_TREE_CODES
>
>  /* Names of tree components.
>     Used for printing out the tree and error messages. */
> --
> 2.17.1
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH] constexprify some tree variables
  2023-01-26 14:45 ` Patrick Palka
@ 2023-01-26 14:51   ` Jakub Jelinek
  2023-01-26 14:58     ` Jakub Jelinek
  2023-01-26 15:59   ` [PATCH] tree: Fix up tree_code_{length,type} Jakub Jelinek
  1 sibling, 1 reply; 15+ messages in thread
From: Jakub Jelinek @ 2023-01-26 14:51 UTC (permalink / raw)
  To: Patrick Palka; +Cc: Andrew Pinski, gcc-patches, jason

On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > -extern const unsigned char tree_code_length[];
> > +
> > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > +#define END_OF_BASE_TREE_CODES 0,
> > +constexpr unsigned char tree_code_length[] = {
> > +#include "all-tree.def"
> > +};
> > +
> > +#undef DEFTREECODE
> > +#undef END_OF_BASE_TREE_CODES
>
> IIUC defining these globals as non-inline constexpr gives them internal
> linkage, and so each TU contains its own unique copy of these globals.
> This bloats cc1plus by a tiny bit and is technically an ODR violation
> because some inline functions such as tree_class_check also ODR-use
> these variables and so each defn of tree_class_check will refer to a
> "different" tree_code_class. Since inline variables are a C++17
> feature, I guess we could fix this by defining the globals the old way
> before C++17 and as inline constexpr otherwise?

Agreed, just use
__cpp_inline_variables >= 201606L
to select between the old and new ways.

	Jakub

^ permalink raw reply	[flat|nested] 15+ messages in thread
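The selection suggested above can be sketched in a single file as follows. This is only an illustration under stated assumptions: the `demo_` names are hypothetical, and the "header" and "source file" parts are marked by comments instead of living in separate files. With C++17 inline variables every translation unit shares one array; otherwise the header only declares the array and exactly one source file defines it, which likewise avoids a per-TU internal-linkage copy.

```cpp
// ---- would live in a header (hypothetical demo-core.h) ----
#if __cpp_inline_variables >= 201606L
// C++17 and later: an inline variable is one entity with external
// linkage, shared by every TU that includes the header.
inline constexpr unsigned char demo_code_length[] = { 0, 1, 2, 3 };
#else
// Before C++17: declaration only.  A plain namespace-scope constexpr
// array here would have internal linkage and be duplicated per TU.
extern const unsigned char demo_code_length[];
#endif

// An inline function in the header that ODR-uses the array.  Every TU
// must see it refer to the same array, which both branches guarantee.
inline unsigned
demo_operand_count (unsigned code)
{
  return demo_code_length[code];
}

// ---- would live in exactly one source file (hypothetical demo.cc) ----
#if !(__cpp_inline_variables >= 201606L)
// The single definition for older language modes.  It has external
// linkage because of the earlier extern declaration.
const unsigned char demo_code_length[] = { 0, 1, 2, 3 };
#endif
```

An undefined `__cpp_inline_variables` evaluates to 0 inside `#if`, so compilers without the feature-test macro take the pre-C++17 branch automatically.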
* Re: [PATCH] constexprify some tree variables
  2023-01-26 14:51   ` Jakub Jelinek
@ 2023-01-26 14:58     ` Jakub Jelinek
  0 siblings, 0 replies; 15+ messages in thread
From: Jakub Jelinek @ 2023-01-26 14:58 UTC (permalink / raw)
  To: Patrick Palka, Andrew Pinski, gcc-patches, jason

On Thu, Jan 26, 2023 at 03:51:07PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > > -extern const unsigned char tree_code_length[];
> > > +
> > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > > +#define END_OF_BASE_TREE_CODES 0,
> > > +constexpr unsigned char tree_code_length[] = {
> > > +#include "all-tree.def"
> > > +};
> > > +
> > > +#undef DEFTREECODE
> > > +#undef END_OF_BASE_TREE_CODES
> >
> > IIUC defining these globals as non-inline constexpr gives them internal
> > linkage, and so each TU contains its own unique copy of these globals.
> > This bloats cc1plus by a tiny bit and is technically an ODR violation
> > because some inline functions such as tree_class_check also ODR-use
> > these variables and so each defn of tree_class_check will refer to a
> > "different" tree_code_class. Since inline variables are a C++17
> > feature, I guess we could fix this by defining the globals the old way
> > before C++17 and as inline constexpr otherwise?
>
> Agreed, just use
> __cpp_inline_variables >= 201606L
> to select between the old and new ways.

And I'd argue with the tiny bit.
In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
So, that means waste of 555016 .rodata bytes, plus being highly non-cache
friendly.

	Jakub

^ permalink raw reply	[flat|nested] 15+ messages in thread
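As a quick check of the figures quoted above, the arithmetic can be written out as compile-time constants. The constant names are hypothetical, and the reading that the waste is "all copies minus the one copy of each array that is actually needed" is an inference; it is simply the interpretation under which the numbers add up to the quoted 555016 bytes.

```cpp
#include <cstddef>

// Figures taken from the message above.
constexpr std::size_t length_copies = 193;  // _ZL16tree_code_length symbols
constexpr std::size_t length_bytes = 374;   // size of each copy
constexpr std::size_t type_copies = 324;    // _ZL14tree_code_type symbols
constexpr std::size_t type_bytes = 1496;    // size of each copy

// Total .rodata occupied by all the per-TU copies.
constexpr std::size_t total_bytes
  = length_copies * length_bytes + type_copies * type_bytes;

// What would be needed if each array existed exactly once.
constexpr std::size_t needed_bytes = length_bytes + type_bytes;

// Assumed interpretation: everything beyond one copy of each is waste.
constexpr std::size_t wasted_bytes = total_bytes - needed_bytes;

static_assert (total_bytes == 556886, "72182 + 484704");
static_assert (wasted_bytes == 555016, "matches the quoted figure");
```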
* [PATCH] tree: Fix up tree_code_{length,type}
  2023-01-26 14:45 ` Patrick Palka
  2023-01-26 14:51   ` Jakub Jelinek
@ 2023-01-26 15:59   ` Jakub Jelinek
  2023-01-26 18:03     ` Patrick Palka
  2023-01-27 7:42     ` Richard Biener
  1 sibling, 2 replies; 15+ messages in thread
From: Jakub Jelinek @ 2023-01-26 15:59 UTC (permalink / raw)
  To: Richard Biener, Jeff Law, Andrew Pinski, Patrick Palka; +Cc: gcc-patches

On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > +
> > +
> >  /* Class of tree given its code. */
> > -extern const enum tree_code_class tree_code_type[];
> > +constexpr enum tree_code_class tree_code_type[] = {
> > +#include "all-tree.def"
> > +};
> > +
> > +#undef DEFTREECODE
> > +#undef END_OF_BASE_TREE_CODES
> >
> >  /* Each tree code class has an associated string representation.
> >     These must correspond to the tree_code_class entries. */
> >  extern const char *const tree_code_class_strings[];
> >
> >  /* Number of argument-words in each kind of tree-node. */
> > -extern const unsigned char tree_code_length[];
> > +
> > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > +#define END_OF_BASE_TREE_CODES 0,
> > +constexpr unsigned char tree_code_length[] = {
> > +#include "all-tree.def"
> > +};
> > +
> > +#undef DEFTREECODE
> > +#undef END_OF_BASE_TREE_CODES
>
> IIUC defining these globals as non-inline constexpr gives them internal
> linkage, and so each TU contains its own unique copy of these globals.
> This bloats cc1plus by a tiny bit and is technically an ODR violation
> because some inline functions such as tree_class_check also ODR-use
> these variables and so each defn of tree_class_check will refer to a
> "different" tree_code_class. Since inline variables are a C++17
> feature, I guess we could fix this by defining the globals the old way
> before C++17 and as inline constexpr otherwise?

And I'd argue with the tiny bit.
In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
So, that means waste of 555016 .rodata bytes, plus being highly non-cache
friendly.

The following patch does that.

So far tested on x86_64-linux in my -O0 working tree (system gcc 12
compiler) where .rodata shrunk with the patch by 928896 bytes, in last
stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
shrunk by 561728 bytes (in neither case .text or most other sections
changed sizes) and on powerpc64le-linux --disable-bootstrap
(system gcc 4.8.5) to test also the non-C++17 case.

Ok for trunk if it passes full bootstrap/regtest?

BTW, wonder if tree_code_type couldn't be an array of unsigned char
elements rather than enum tree_code_class and we'd then cast it
to the enum in the macro, that would shrink that array from 1496 bytes
to 374. Of course, that sounds like stage1 material.

2023-01-26  Patrick Palka  <ppalka@redhat.com>
	    Jakub Jelinek  <jakub@redhat.com>

	* tree-core.h (tree_code_type, tree_code_length): For
	C++17 and later, add inline keyword, otherwise don't define
	the arrays, but declare extern arrays.
	* tree.cc (tree_code_type, tree_code_length): Define these
	arrays for C++14 and older.

--- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
+++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
@@ -2284,17 +2284,20 @@ struct floatn_type_info {
 /* Matrix describing the structures contained in a given tree code. */
 extern bool tree_contains_struct[MAX_TREE_CODES][64];

+/* Class of tree given its code. */
+#if __cpp_inline_variables >= 201606L
 #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
 #define END_OF_BASE_TREE_CODES tcc_exceptional,

-
-/* Class of tree given its code. */
-constexpr enum tree_code_class tree_code_type[] = {
+constexpr inline enum tree_code_class tree_code_type[] = {
 #include "all-tree.def"
 };

 #undef DEFTREECODE
 #undef END_OF_BASE_TREE_CODES
+#else
+extern const enum tree_code_class tree_code_type[];
+#endif

 /* Each tree code class has an associated string representation.
    These must correspond to the tree_code_class entries. */
@@ -2302,14 +2305,18 @@ extern const char *const tree_code_class

 /* Number of argument-words in each kind of tree-node. */

+#if __cpp_inline_variables >= 201606L
 #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
 #define END_OF_BASE_TREE_CODES 0,
-constexpr unsigned char tree_code_length[] = {
+constexpr inline unsigned char tree_code_length[] = {
 #include "all-tree.def"
 };

 #undef DEFTREECODE
 #undef END_OF_BASE_TREE_CODES
+#else
+extern const unsigned char tree_code_length[];
+#endif

 /* Vector of all alias pairs for global symbols. */
 extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
--- gcc/tree.cc.jj	2023-01-13 17:37:45.259482663 +0100
+++ gcc/tree.cc	2023-01-26 16:03:59.796878082 +0100
@@ -74,7 +74,33 @@ along with GCC; see the file COPYING3.
 #include "asan.h"
 #include "ubsan.h"

+#if __cpp_inline_variables < 201606L
+/* Tree code classes. */

+#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
+#define END_OF_BASE_TREE_CODES tcc_exceptional,
+
+const enum tree_code_class tree_code_type[] = {
+#include "all-tree.def"
+};
+
+#undef DEFTREECODE
+#undef END_OF_BASE_TREE_CODES
+
+/* Table indexed by tree code giving number of expression
+   operands beyond the fixed part of the node structure.
+   Not used for types or decls. */
+
+#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
+#define END_OF_BASE_TREE_CODES 0,
+
+const unsigned char tree_code_length[] = {
+#include "all-tree.def"
+};
+
+#undef DEFTREECODE
+#undef END_OF_BASE_TREE_CODES
+#endif

 /* Names of tree components.
    Used for printing out the tree and error messages. */

	Jakub

^ permalink raw reply	[flat|nested] 15+ messages in thread
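The "BTW" idea in the message above (store the class table as unsigned char and cast back to the enum on access) might look roughly like the sketch below. It is not part of the posted patch; the `demo_` names are hypothetical, and a constexpr accessor function is used here where the message speaks of doing the cast in the existing macro.

```cpp
enum demo_code_class
{
  demo_exceptional,
  demo_constant,
  demo_unary,
  demo_binary
};

// Current layout: one enum-sized element per code.
constexpr demo_code_class demo_type_wide[] = {
  demo_exceptional, demo_constant, demo_unary, demo_binary
};

// Proposed layout: one byte per code.  The enumerator values are
// constant expressions that fit in unsigned char, so this is not a
// narrowing conversion.
constexpr unsigned char demo_type_narrow[] = {
  demo_exceptional, demo_constant, demo_unary, demo_binary
};

// Accessor that restores the enum type at the point of use.
constexpr demo_code_class
demo_class_of (unsigned code)
{
  return static_cast<demo_code_class> (demo_type_narrow[code]);
}

static_assert (sizeof (demo_type_narrow) == 4, "one byte per entry");
static_assert (sizeof (demo_type_wide) == 4 * sizeof (demo_code_class),
	       "one enum per entry");
static_assert (demo_class_of (3) == demo_binary, "lookup round-trips");
```

The 1496 to 374 byte figure in the message corresponds to 374 entries shrinking from a 4-byte enum to a single byte each; the size of an unscoped enum without a fixed underlying type is implementation-defined, which is why the sketch compares against `sizeof (demo_code_class)` instead of hard-coding 4.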
* Re: [PATCH] tree: Fix up tree_code_{length,type}
  2023-01-26 15:59   ` [PATCH] tree: Fix up tree_code_{length,type} Jakub Jelinek
@ 2023-01-26 18:03     ` Patrick Palka
  2023-01-27 12:40       ` Patrick Palka
  2023-01-27 7:42     ` Richard Biener
  1 sibling, 1 reply; 15+ messages in thread
From: Patrick Palka @ 2023-01-26 18:03 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Jeff Law, Andrew Pinski, Patrick Palka, gcc-patches

On Thu, 26 Jan 2023, Jakub Jelinek wrote:

> On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > > +
> > > +
> > >  /* Class of tree given its code. */
> > > -extern const enum tree_code_class tree_code_type[];
> > > +constexpr enum tree_code_class tree_code_type[] = {
> > > +#include "all-tree.def"
> > > +};
> > > +
> > > +#undef DEFTREECODE
> > > +#undef END_OF_BASE_TREE_CODES
> > >
> > >  /* Each tree code class has an associated string representation.
> > >     These must correspond to the tree_code_class entries. */
> > >  extern const char *const tree_code_class_strings[];
> > >
> > >  /* Number of argument-words in each kind of tree-node. */
> > > -extern const unsigned char tree_code_length[];
> > > +
> > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > > +#define END_OF_BASE_TREE_CODES 0,
> > > +constexpr unsigned char tree_code_length[] = {
> > > +#include "all-tree.def"
> > > +};
> > > +
> > > +#undef DEFTREECODE
> > > +#undef END_OF_BASE_TREE_CODES
> >
> > IIUC defining these globals as non-inline constexpr gives them internal
> > linkage, and so each TU contains its own unique copy of these globals.
> > This bloats cc1plus by a tiny bit and is technically an ODR violation
> > because some inline functions such as tree_class_check also ODR-use
> > these variables and so each defn of tree_class_check will refer to a
> > "different" tree_code_class. Since inline variables are a C++17
> > feature, I guess we could fix this by defining the globals the old way
> > before C++17 and as inline constexpr otherwise?
>
> And I'd argue with the tiny bit.
> In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
> 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
> So, that means waste of 555016 .rodata bytes, plus being highly non-cache
> friendly.
>
> The following patch does that.
>
> So far tested on x86_64-linux in my -O0 working tree (system gcc 12
> compiler) where .rodata shrunk with the patch by 928896 bytes, in last
> stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
> shrunk by 561728 bytes (in neither case .text or most other sections
> changed sizes) and on powerpc64le-linux --disable-bootstrap
> (system gcc 4.8.5) to test also the non-C++17 case.

LGTM FWIW. On a related note I noticed the function
tree.h:tree_operand_length is declared static and is then used in the
non-static inline functions tree_operand_check etc, which seems to
also be a (harmless) ODR violation?

We probably should do s/static inline/inline throughout the header files
at some point, which'd hopefully reduce the size of and speed up stage1
cc1plus.

>
> Ok for trunk if it passes full bootstrap/regtest?
>
> BTW, wonder if tree_code_type couldn't be an array of unsigned char
> elements rather than enum tree_code_class and we'd then cast it
> to the enum in the macro, that would shrink that array from 1496 bytes
> to 374. Of course, that sounds like stage1 material.
>
> 2023-01-26  Patrick Palka  <ppalka@redhat.com>
> 	    Jakub Jelinek  <jakub@redhat.com>
>
> 	* tree-core.h (tree_code_type, tree_code_length): For
> 	C++17 and later, add inline keyword, otherwise don't define
> 	the arrays, but declare extern arrays.
> 	* tree.cc (tree_code_type, tree_code_length): Define these
> 	arrays for C++14 and older.
>
> --- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
> +++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
> @@ -2284,17 +2284,20 @@ struct floatn_type_info {
>  /* Matrix describing the structures contained in a given tree code. */
>  extern bool tree_contains_struct[MAX_TREE_CODES][64];
>
> +/* Class of tree given its code. */
> +#if __cpp_inline_variables >= 201606L
>  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
>  #define END_OF_BASE_TREE_CODES tcc_exceptional,
>
> -
> -/* Class of tree given its code. */
> -constexpr enum tree_code_class tree_code_type[] = {
> +constexpr inline enum tree_code_class tree_code_type[] = {
>  #include "all-tree.def"
>  };
>
>  #undef DEFTREECODE
>  #undef END_OF_BASE_TREE_CODES
> +#else
> +extern const enum tree_code_class tree_code_type[];
> +#endif
>
>  /* Each tree code class has an associated string representation.
>     These must correspond to the tree_code_class entries. */
> @@ -2302,14 +2305,18 @@ extern const char *const tree_code_class
>
>  /* Number of argument-words in each kind of tree-node. */
>
> +#if __cpp_inline_variables >= 201606L
>  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
>  #define END_OF_BASE_TREE_CODES 0,
> -constexpr unsigned char tree_code_length[] = {
> +constexpr inline unsigned char tree_code_length[] = {
>  #include "all-tree.def"
>  };
>
>  #undef DEFTREECODE
>  #undef END_OF_BASE_TREE_CODES
> +#else
> +extern const unsigned char tree_code_length[];
> +#endif
>
>  /* Vector of all alias pairs for global symbols. */
>  extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
> --- gcc/tree.cc.jj	2023-01-13 17:37:45.259482663 +0100
> +++ gcc/tree.cc	2023-01-26 16:03:59.796878082 +0100
> @@ -74,7 +74,33 @@ along with GCC; see the file COPYING3.
>  #include "asan.h"
>  #include "ubsan.h"
>
> +#if __cpp_inline_variables < 201606L
> +/* Tree code classes. */
>
> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> +
> +const enum tree_code_class tree_code_type[] = {
> +#include "all-tree.def"
> +};
> +
> +#undef DEFTREECODE
> +#undef END_OF_BASE_TREE_CODES
> +
> +/* Table indexed by tree code giving number of expression
> +   operands beyond the fixed part of the node structure.
> +   Not used for types or decls. */
> +
> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> +#define END_OF_BASE_TREE_CODES 0,
> +
> +const unsigned char tree_code_length[] = {
> +#include "all-tree.def"
> +};
> +
> +#undef DEFTREECODE
> +#undef END_OF_BASE_TREE_CODES
> +#endif
>
>  /* Names of tree components.
>     Used for printing out the tree and error messages. */
>
> 	Jakub
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH] tree: Fix up tree_code_{length,type}
  2023-01-26 18:03     ` Patrick Palka
@ 2023-01-27 12:40       ` Patrick Palka
  2023-01-27 13:14         ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Patrick Palka @ 2023-01-27 12:40 UTC (permalink / raw)
  To: Patrick Palka
  Cc: Jakub Jelinek, Richard Biener, Jeff Law, Andrew Pinski, gcc-patches

On Thu, 26 Jan 2023, Patrick Palka wrote:

> On Thu, 26 Jan 2023, Jakub Jelinek wrote:
>
> > On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > > > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > > > +
> > > > +
> > > >  /* Class of tree given its code. */
> > > > -extern const enum tree_code_class tree_code_type[];
> > > > +constexpr enum tree_code_class tree_code_type[] = {
> > > > +#include "all-tree.def"
> > > > +};
> > > > +
> > > > +#undef DEFTREECODE
> > > > +#undef END_OF_BASE_TREE_CODES
> > > >
> > > >  /* Each tree code class has an associated string representation.
> > > >     These must correspond to the tree_code_class entries. */
> > > >  extern const char *const tree_code_class_strings[];
> > > >
> > > >  /* Number of argument-words in each kind of tree-node. */
> > > > -extern const unsigned char tree_code_length[];
> > > > +
> > > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > > > +#define END_OF_BASE_TREE_CODES 0,
> > > > +constexpr unsigned char tree_code_length[] = {
> > > > +#include "all-tree.def"
> > > > +};
> > > > +
> > > > +#undef DEFTREECODE
> > > > +#undef END_OF_BASE_TREE_CODES
> > >
> > > IIUC defining these globals as non-inline constexpr gives them internal
> > > linkage, and so each TU contains its own unique copy of these globals.
> > > This bloats cc1plus by a tiny bit and is technically an ODR violation
> > > because some inline functions such as tree_class_check also ODR-use
> > > these variables and so each defn of tree_class_check will refer to a
> > > "different" tree_code_class. Since inline variables are a C++17
> > > feature, I guess we could fix this by defining the globals the old way
> > > before C++17 and as inline constexpr otherwise?
> >
> > And I'd argue with the tiny bit.
> > In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
> > 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
> > So, that means waste of 555016 .rodata bytes, plus being highly non-cache
> > friendly.
> >
> > The following patch does that.
> >
> > So far tested on x86_64-linux in my -O0 working tree (system gcc 12
> > compiler) where .rodata shrunk with the patch by 928896 bytes, in last
> > stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
> > shrunk by 561728 bytes (in neither case .text or most other sections
> > changed sizes) and on powerpc64le-linux --disable-bootstrap
> > (system gcc 4.8.5) to test also the non-C++17 case.
>
> LGTM FWIW. On a related note I noticed the function
> tree.h:tree_operand_length is declared static and is then used in the
> non-static inline functions tree_operand_check etc, which seems to
> also be a (harmless) ODR violation?
>
> We probably should do s/static inline/inline throughout the header files
> at some point, which'd hopefully reduce the size of and speed up stage1
> cc1plus.

Mechanically replacing uses of static inline in headers via

  echo gcc/*.h gcc/*/*.h | xargs sed -i 's/^static inline/inline/g'

reduces rodata size of stage1 cc1plus by ~1.5MB and seems to make it ~2%
faster. Not bad..

>
> >
> > Ok for trunk if it passes full bootstrap/regtest?
> >
> > BTW, wonder if tree_code_type couldn't be an array of unsigned char
> > elements rather than enum tree_code_class and we'd then cast it
> > to the enum in the macro, that would shrink that array from 1496 bytes
> > to 374. Of course, that sounds like stage1 material.
> >
> > 2023-01-26  Patrick Palka  <ppalka@redhat.com>
> > 	    Jakub Jelinek  <jakub@redhat.com>
> >
> > 	* tree-core.h (tree_code_type, tree_code_length): For
> > 	C++17 and later, add inline keyword, otherwise don't define
> > 	the arrays, but declare extern arrays.
> > 	* tree.cc (tree_code_type, tree_code_length): Define these
> > 	arrays for C++14 and older.
> >
> > --- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
> > +++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
> > @@ -2284,17 +2284,20 @@ struct floatn_type_info {
> >  /* Matrix describing the structures contained in a given tree code. */
> >  extern bool tree_contains_struct[MAX_TREE_CODES][64];
> >
> > +/* Class of tree given its code. */
> > +#if __cpp_inline_variables >= 201606L
> >  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> >  #define END_OF_BASE_TREE_CODES tcc_exceptional,
> >
> > -
> > -/* Class of tree given its code. */
> > -constexpr enum tree_code_class tree_code_type[] = {
> > +constexpr inline enum tree_code_class tree_code_type[] = {
> >  #include "all-tree.def"
> >  };
> >
> >  #undef DEFTREECODE
> >  #undef END_OF_BASE_TREE_CODES
> > +#else
> > +extern const enum tree_code_class tree_code_type[];
> > +#endif
> >
> >  /* Each tree code class has an associated string representation.
> >     These must correspond to the tree_code_class entries. */
> > @@ -2302,14 +2305,18 @@ extern const char *const tree_code_class
> >
> >  /* Number of argument-words in each kind of tree-node. */
> >
> > +#if __cpp_inline_variables >= 201606L
> >  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> >  #define END_OF_BASE_TREE_CODES 0,
> > -constexpr unsigned char tree_code_length[] = {
> > +constexpr inline unsigned char tree_code_length[] = {
> >  #include "all-tree.def"
> >  };
> >
> >  #undef DEFTREECODE
> >  #undef END_OF_BASE_TREE_CODES
> > +#else
> > +extern const unsigned char tree_code_length[];
> > +#endif
> >
> >  /* Vector of all alias pairs for global symbols.
*/ > > extern GTY(()) vec<alias_pair, va_gc> *alias_pairs; > > --- gcc/tree.cc.jj 2023-01-13 17:37:45.259482663 +0100 > > +++ gcc/tree.cc 2023-01-26 16:03:59.796878082 +0100 > > @@ -74,7 +74,33 @@ along with GCC; see the file COPYING3. > > #include "asan.h" > > #include "ubsan.h" > > > > +#if __cpp_inline_variables < 201606L > > +/* Tree code classes. */ > > > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE, > > +#define END_OF_BASE_TREE_CODES tcc_exceptional, > > + > > +const enum tree_code_class tree_code_type[] = { > > +#include "all-tree.def" > > +}; > > + > > +#undef DEFTREECODE > > +#undef END_OF_BASE_TREE_CODES > > + > > +/* Table indexed by tree code giving number of expression > > + operands beyond the fixed part of the node structure. > > + Not used for types or decls. */ > > + > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH, > > +#define END_OF_BASE_TREE_CODES 0, > > + > > +const unsigned char tree_code_length[] = { > > +#include "all-tree.def" > > +}; > > + > > +#undef DEFTREECODE > > +#undef END_OF_BASE_TREE_CODES > > +#endif > > > > /* Names of tree components. > > Used for printing out the tree and error messages. */ > > > > Jakub > > > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] tree: Fix up tree_code_{length,type}
  2023-01-27 12:40 ` Patrick Palka
@ 2023-01-27 13:14 ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-01-27 13:14 UTC (permalink / raw)
  To: Patrick Palka via Gcc-patches
  Cc: Patrick Palka, Jakub Jelinek, Jeff Law, Andrew Pinski

> On 27.01.2023 at 13:41, Patrick Palka via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, 26 Jan 2023, Patrick Palka wrote:
>
>>> On Thu, 26 Jan 2023, Jakub Jelinek wrote:
>>>
>>> On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
>>>>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
>>>>> +#define END_OF_BASE_TREE_CODES tcc_exceptional,
>>>>> +
>>>>> +
>>>>> /* Class of tree given its code. */
>>>>> -extern const enum tree_code_class tree_code_type[];
>>>>> +constexpr enum tree_code_class tree_code_type[] = {
>>>>> +#include "all-tree.def"
>>>>> +};
>>>>> +
>>>>> +#undef DEFTREECODE
>>>>> +#undef END_OF_BASE_TREE_CODES
>>>>>
>>>>> /* Each tree code class has an associated string representation.
>>>>> These must correspond to the tree_code_class entries. */
>>>>> extern const char *const tree_code_class_strings[];
>>>>>
>>>>> /* Number of argument-words in each kind of tree-node. */
>>>>> -extern const unsigned char tree_code_length[];
>>>>> +
>>>>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
>>>>> +#define END_OF_BASE_TREE_CODES 0,
>>>>> +constexpr unsigned char tree_code_length[] = {
>>>>> +#include "all-tree.def"
>>>>> +};
>>>>> +
>>>>> +#undef DEFTREECODE
>>>>> +#undef END_OF_BASE_TREE_CODES
>>>>
>>>> IIUC defining these globals as non-inline constexpr gives them internal
>>>> linkage, and so each TU contains its own unique copy of these globals.
>>>> This bloats cc1plus by a tiny bit and is technically an ODR violation >>>> because some inline functions such as tree_class_check also ODR-use >>>> these variables and so each defn of tree_class_check will refer to a >>>> "different" tree_code_class. Since inline variables are a C++17 >>>> feature, I guess we could fix this by defining the globals the old way >>>> before C++17 and as inline constexpr otherwise? >>> >>> And I'd argue with the tiny bit. >>> In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars, >>> 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each. >>> So, that means waste of 555016 .rodata bytes, plus being highly non-cache >>> friendly. >>> >>> The following patch does that. >>> >>> So far tested on x86_64-linux in my -O0 working tree (system gcc 12 >>> compiler) where .rodata shrunk with the patch by 928896 bytes, in last >>> stage of a bootstrapped tree (built by today's prev-gcc) where .rodata >>> shrunk by 561728 bytes (in neither case .text or most other sections >>> changed sizes) and on powerpc64le-linux --disable-bootstrap >>> (system gcc 4.8.5) to test also the non-C++17 case. >> >> LGTM FWIW. On a related note I noticed the function >> tree.h:tree_operand_length is declared static and is then used in the >> non-static inline functions tree_operand_check etc, which seems to be >> also be a (harmless) ODR violation? >> >> We probably should do s/static inline/inline throughout the header files >> at some point, which'd hopefully reduce the size of and speed up stage1 >> cc1plus. > > Mechanically replacing uses of static inline in headers via > > echo gcc/*.h gcc/*/*.h | xargs sed -i 's/^static inline/inline/g' > > reduces rodata size of stage1 cc1plus by ~1.5MB and seems to make it ~2% > faster. Not bad.. Nice. Richard >> >>> >>> Ok for trunk if it passes full bootstrap/regtest? 
>>> >>> BTW, wonder if tree_code_type couldn't be an array of unsigned char >>> elements rather than enum tree_code_class and we'd then cast it >>> to the enum in the macro, that would shrink that array from 1496 bytes >>> to 374. Of course, that sounds like stage1 material. >>> >>> 2023-01-26 Patrick Palka <ppalka@redhat.com> >>> Jakub Jelinek <jakub@redhat.com> >>> >>> * tree-core.h (tree_code_type, tree_code_length): For >>> C++17 and later, add inline keyword, otherwise don't define >>> the arrays, but declare extern arrays. >>> * tree.cc (tree_code_type, tree_code_length): Define these >>> arrays for C++14 and older. >>> >>> --- gcc/tree-core.h.jj 2023-01-02 09:32:31.188158094 +0100 >>> +++ gcc/tree-core.h 2023-01-26 16:02:34.212113251 +0100 >>> @@ -2284,17 +2284,20 @@ struct floatn_type_info { >>> /* Matrix describing the structures contained in a given tree code. */ >>> extern bool tree_contains_struct[MAX_TREE_CODES][64]; >>> >>> +/* Class of tree given its code. */ >>> +#if __cpp_inline_variables >= 201606L >>> #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE, >>> #define END_OF_BASE_TREE_CODES tcc_exceptional, >>> >>> - >>> -/* Class of tree given its code. */ >>> -constexpr enum tree_code_class tree_code_type[] = { >>> +constexpr inline enum tree_code_class tree_code_type[] = { >>> #include "all-tree.def" >>> }; >>> >>> #undef DEFTREECODE >>> #undef END_OF_BASE_TREE_CODES >>> +#else >>> +extern const enum tree_code_class tree_code_type[]; >>> +#endif >>> >>> /* Each tree code class has an associated string representation. >>> These must correspond to the tree_code_class entries. */ >>> @@ -2302,14 +2305,18 @@ extern const char *const tree_code_class >>> >>> /* Number of argument-words in each kind of tree-node. 
*/ >>> >>> +#if __cpp_inline_variables >= 201606L >>> #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH, >>> #define END_OF_BASE_TREE_CODES 0, >>> -constexpr unsigned char tree_code_length[] = { >>> +constexpr inline unsigned char tree_code_length[] = { >>> #include "all-tree.def" >>> }; >>> >>> #undef DEFTREECODE >>> #undef END_OF_BASE_TREE_CODES >>> +#else >>> +extern const unsigned char tree_code_length[]; >>> +#endif >>> >>> /* Vector of all alias pairs for global symbols. */ >>> extern GTY(()) vec<alias_pair, va_gc> *alias_pairs; >>> --- gcc/tree.cc.jj 2023-01-13 17:37:45.259482663 +0100 >>> +++ gcc/tree.cc 2023-01-26 16:03:59.796878082 +0100 >>> @@ -74,7 +74,33 @@ along with GCC; see the file COPYING3. >>> #include "asan.h" >>> #include "ubsan.h" >>> >>> +#if __cpp_inline_variables < 201606L >>> +/* Tree code classes. */ >>> >>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE, >>> +#define END_OF_BASE_TREE_CODES tcc_exceptional, >>> + >>> +const enum tree_code_class tree_code_type[] = { >>> +#include "all-tree.def" >>> +}; >>> + >>> +#undef DEFTREECODE >>> +#undef END_OF_BASE_TREE_CODES >>> + >>> +/* Table indexed by tree code giving number of expression >>> + operands beyond the fixed part of the node structure. >>> + Not used for types or decls. */ >>> + >>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH, >>> +#define END_OF_BASE_TREE_CODES 0, >>> + >>> +const unsigned char tree_code_length[] = { >>> +#include "all-tree.def" >>> +}; >>> + >>> +#undef DEFTREECODE >>> +#undef END_OF_BASE_TREE_CODES >>> +#endif >>> >>> /* Names of tree components. >>> Used for printing out the tree and error messages. */ >>> >>> Jakub >>> >>> >> > ^ permalink raw reply [flat|nested] 15+ messages in thread
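The 555016-byte figure quoted above is consistent with counting only the redundant copies, that is, every copy beyond the single one that would exist anyway. A small compile-time check of that arithmetic, using only the numbers from Jakub's message. The helper name demo_wasted_bytes is made up for illustration, and the 4-byte enum size is an inference from 1496 / 374, not something stated in the thread:

```cpp
#include <cstddef>

// Hypothetical helper: bytes wasted when `copies` translation units each
// carry a private copy of a `bytes`-sized table (one copy is needed anyway).
constexpr std::size_t
demo_wasted_bytes (std::size_t copies, std::size_t bytes)
{
  return (copies - 1) * bytes;
}

// Figures from the message: 193 copies of the 374-byte tree_code_length
// and 324 copies of the 1496-byte tree_code_type.
static_assert (demo_wasted_bytes (193, 374) + demo_wasted_bytes (324, 1496)
               == 555016,
               "matches the .rodata waste quoted in the thread");

// 1496 / 374 == 4, which suggests (an inference) 374 entries of a 4-byte
// enum; an unsigned char element type would need only 374 bytes.
static_assert (1496 / 374 == 4 && 1496 % 374 == 0, "");
```

Both assertions hold at compile time, so the block has no runtime behaviour of its own.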
* Re: [PATCH] tree: Fix up tree_code_{length,type} 2023-01-26 15:59 ` [PATCH] tree: Fix up tree_code_{length,type} Jakub Jelinek 2023-01-26 18:03 ` Patrick Palka @ 2023-01-27 7:42 ` Richard Biener 2023-01-27 8:57 ` Jakub Jelinek 1 sibling, 1 reply; 15+ messages in thread From: Richard Biener @ 2023-01-27 7:42 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jeff Law, Andrew Pinski, Patrick Palka, gcc-patches On Thu, 26 Jan 2023, Jakub Jelinek wrote: > On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote: > > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE, > > > +#define END_OF_BASE_TREE_CODES tcc_exceptional, > > > + > > > + > > > /* Class of tree given its code. */ > > > -extern const enum tree_code_class tree_code_type[]; > > > +constexpr enum tree_code_class tree_code_type[] = { > > > +#include "all-tree.def" > > > +}; > > > + > > > +#undef DEFTREECODE > > > +#undef END_OF_BASE_TREE_CODES > > > > > > /* Each tree code class has an associated string representation. > > > These must correspond to the tree_code_class entries. */ > > > extern const char *const tree_code_class_strings[]; > > > > > > /* Number of argument-words in each kind of tree-node. */ > > > -extern const unsigned char tree_code_length[]; > > > + > > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH, > > > +#define END_OF_BASE_TREE_CODES 0, > > > +constexpr unsigned char tree_code_length[] = { > > > +#include "all-tree.def" > > > +}; > > > + > > > +#undef DEFTREECODE > > > +#undef END_OF_BASE_TREE_CODES > > > > IIUC defining these globals as non-inline constexpr gives them internal > > linkage, and so each TU contains its own unique copy of these globals. > > This bloats cc1plus by a tiny bit and is technically an ODR violation > > because some inline functions such as tree_class_check also ODR-use > > these variables and so each defn of tree_class_check will refer to a > > "different" tree_code_class. 
Since inline variables are a C++17 > > feature, I guess we could fix this by defining the globals the old way > > before C++17 and as inline constexpr otherwise? > > And I'd argue with the tiny bit. > In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars, > 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each. > So, that means waste of 555016 .rodata bytes, plus being highly non-cache > friendly. > > The following patch does that. > > So far tested on x86_64-linux in my -O0 working tree (system gcc 12 > compiler) where .rodata shrunk with the patch by 928896 bytes, in last > stage of a bootstrapped tree (built by today's prev-gcc) where .rodata > shrunk by 561728 bytes (in neither case .text or most other sections > changed sizes) and on powerpc64le-linux --disable-bootstrap > (system gcc 4.8.5) to test also the non-C++17 case. > > Ok for trunk if it passes full bootstrap/regtest? > > BTW, wonder if tree_code_type couldn't be an array of unsigned char > elements rather than enum tree_code_class and we'd then cast it > to the enum in the macro, that would shrink that array from 1496 bytes > to 374. Of course, that sounds like stage1 material. One could argue the same way for this patch (and instead revert), I'd say if we tweak this now then tweak it to the maximum extent? Isn't sth like 'enum unsigned char tree_code_class' now possible? (and a static assert the enum values all fit, though that would be diagnosed anyway?) > 2023-01-26 Patrick Palka <ppalka@redhat.com> > Jakub Jelinek <jakub@redhat.com> > > * tree-core.h (tree_code_type, tree_code_length): For > C++17 and later, add inline keyword, otherwise don't define > the arrays, but declare extern arrays. > * tree.cc (tree_code_type, tree_code_length): Define these > arrays for C++14 and older. 
> > --- gcc/tree-core.h.jj 2023-01-02 09:32:31.188158094 +0100 > +++ gcc/tree-core.h 2023-01-26 16:02:34.212113251 +0100 > @@ -2284,17 +2284,20 @@ struct floatn_type_info { > /* Matrix describing the structures contained in a given tree code. */ > extern bool tree_contains_struct[MAX_TREE_CODES][64]; > > +/* Class of tree given its code. */ > +#if __cpp_inline_variables >= 201606L > #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE, > #define END_OF_BASE_TREE_CODES tcc_exceptional, > > - > -/* Class of tree given its code. */ > -constexpr enum tree_code_class tree_code_type[] = { > +constexpr inline enum tree_code_class tree_code_type[] = { > #include "all-tree.def" > }; Do we need an explicit external definition somewhere when constant folding isn't possible? Otherwise looks good to me. Thanks, Richard. > #undef DEFTREECODE > #undef END_OF_BASE_TREE_CODES > +#else > +extern const enum tree_code_class tree_code_type[]; > +#endif > > /* Each tree code class has an associated string representation. > These must correspond to the tree_code_class entries. */ > @@ -2302,14 +2305,18 @@ extern const char *const tree_code_class > > /* Number of argument-words in each kind of tree-node. */ > > +#if __cpp_inline_variables >= 201606L > #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH, > #define END_OF_BASE_TREE_CODES 0, > -constexpr unsigned char tree_code_length[] = { > +constexpr inline unsigned char tree_code_length[] = { > #include "all-tree.def" > }; > > #undef DEFTREECODE > #undef END_OF_BASE_TREE_CODES > +#else > +extern const unsigned char tree_code_length[]; > +#endif > > /* Vector of all alias pairs for global symbols. */ > extern GTY(()) vec<alias_pair, va_gc> *alias_pairs; > --- gcc/tree.cc.jj 2023-01-13 17:37:45.259482663 +0100 > +++ gcc/tree.cc 2023-01-26 16:03:59.796878082 +0100 > @@ -74,7 +74,33 @@ along with GCC; see the file COPYING3. > #include "asan.h" > #include "ubsan.h" > > +#if __cpp_inline_variables < 201606L > +/* Tree code classes. 
*/ > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE, > +#define END_OF_BASE_TREE_CODES tcc_exceptional, > + > +const enum tree_code_class tree_code_type[] = { > +#include "all-tree.def" > +}; > + > +#undef DEFTREECODE > +#undef END_OF_BASE_TREE_CODES > + > +/* Table indexed by tree code giving number of expression > + operands beyond the fixed part of the node structure. > + Not used for types or decls. */ > + > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH, > +#define END_OF_BASE_TREE_CODES 0, > + > +const unsigned char tree_code_length[] = { > +#include "all-tree.def" > +}; > + > +#undef DEFTREECODE > +#undef END_OF_BASE_TREE_CODES > +#endif > > /* Names of tree components. > Used for printing out the tree and error messages. */ > > Jakub > > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
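The quoted patch builds both tables by redefining DEFTREECODE before each inclusion of all-tree.def. A minimal sketch of that X-macro pattern follows, assuming a three-entry list macro (DEMO_ALL_TREE_DEF) in place of the real all-tree.def. Every demo_/DEMO_ name is hypothetical and the entries are not GCC's actual tree codes:

```cpp
#include <cassert>

// Hypothetical stand-in for all-tree.def: one DEFTREECODE line per code.
#define DEMO_ALL_TREE_DEF \
  DEFTREECODE (DEMO_ERROR_MARK, "error_mark", demo_tcc_exceptional, 0) \
  DEFTREECODE (DEMO_NEGATE_EXPR, "negate_expr", demo_tcc_unary, 1) \
  DEFTREECODE (DEMO_PLUS_EXPR, "plus_expr", demo_tcc_binary, 2)

enum demo_tree_code_class
{
  demo_tcc_exceptional,
  demo_tcc_unary,
  demo_tcc_binary
};

// First expansion: the enumerators themselves.
#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) SYM,
enum demo_tree_code { DEMO_ALL_TREE_DEF DEMO_MAX_TREE_CODES };
#undef DEFTREECODE

// Second expansion: class of each code.
#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
constexpr demo_tree_code_class demo_tree_code_type[] = { DEMO_ALL_TREE_DEF };
#undef DEFTREECODE

// Third expansion: operand count of each code.
#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
constexpr unsigned char demo_tree_code_length[] = { DEMO_ALL_TREE_DEF };
#undef DEFTREECODE

// With constexpr tables, lookups by constant index fold at compile time.
static_assert (demo_tree_code_type[DEMO_PLUS_EXPR] == demo_tcc_binary, "");
static_assert (demo_tree_code_length[DEMO_NEGATE_EXPR] == 1, "");
static_assert (sizeof (demo_tree_code_length) == 3, "one byte per code");

// Runtime lookup; this is the kind of use that needs the array object.
unsigned
demo_operand_count (demo_tree_code code)
{
  assert (code < DEMO_MAX_TREE_CODES);
  return demo_tree_code_length[code];
}
```

In a header, a namespace-scope constexpr array like these has internal linkage before C++17, which is exactly the per-TU duplication the thread is about.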
* Re: [PATCH] tree: Fix up tree_code_{length,type}
  2023-01-27 7:42 ` Richard Biener
@ 2023-01-27 8:57 ` Jakub Jelinek
  2023-01-27 9:49 ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Jelinek @ 2023-01-27 8:57 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, Andrew Pinski, Patrick Palka, gcc-patches

On Fri, Jan 27, 2023 at 07:42:39AM +0000, Richard Biener wrote:
> > BTW, wonder if tree_code_type couldn't be an array of unsigned char
> > elements rather than enum tree_code_class and we'd then cast it
> > to the enum in the macro, that would shrink that array from 1496 bytes
> > to 374. Of course, that sounds like stage1 material.
>
> One could argue the same way for this patch (and instead revert),

Well, this patch is in fact a conditional reversion (revert for
C++11/14, add one keyword to 2 declarations otherwise).

> I'd say if we tweak this now then tweak it to the maximum extent?
> Isn't sth like 'enum unsigned char tree_code_class' now possible?
> (and a static assert the enum values all fit, though that would
> be diagnosed anyway?)

C++11 indeed has
enum tree_code_class : unsigned char {
  tcc_exceptional,
  ...
  tcc_expression
};
and one indeed gets an error if some enumerator doesn't fit.
The problem I see with this is that the type is 8-bit everywhere,
which I'd be afraid could cause worse code generation (of course,
one would need to try to see how much; e.g. build the compiler
unmodified, with the unsigned char array plus explicit casts from
the array and finally with unsigned char as underlying type).
When passing around enum tree_code_class etc., it is fine if it
is 32-bit. And there isn't a way to create an enum with different
underlying type but with the same enumerators as in another enum.
Perhaps for tree_code_class we could get away with the underlying type
because it is mostly used in the macros which immediately compare
it, in gcc/*.cc just in the following explicitly:
expr.cc:get_def_for_expr_class (tree name, enum tree_code_class tclass)
fold-const.cc: enum tree_code_class tclass;
fold-const.cc: enum tree_code_class tclass = TREE_CODE_CLASS (code);
fold-const.cc: enum tree_code_class tclass = TREE_CODE_CLASS (code);
fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code);
fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code);
fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code);
fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code);
gimple-fold.cc: enum tree_code_class kind = TREE_CODE_CLASS (subcode);
print-tree.cc: enum tree_code_class tclass;
print-tree.cc: enum tree_code_class tclass;
tree.cc: These must correspond to the tree_code_class entries. */
tree.cc:const char *const tree_code_class_strings[] =
tree.cc: enum tree_code_class type = TREE_CODE_CLASS (code);
tree.cc: enum tree_code_class type = TREE_CODE_CLASS (code);
tree.cc:tree_class_check_failed (const_tree node, const enum tree_code_class cl,
tree.cc:tree_not_class_check_failed (const_tree node, const enum tree_code_class cl,
tree.cc: const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
tree.cc: const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
tree-dump.cc: enum tree_code_class code_class;
tree-inline.cc: enum tree_code_class cl = TREE_CODE_CLASS (code);
tree-pretty-print.cc: enum tree_code_class tclass;
tree-ssa-live.cc: enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
tree-ssa-operands.cc: enum tree_code_class codeclass;
But as I said, one would need to watch for code generation at least on
a couple of common hosts, and while x86_64 should be one of them, it might
have bigger effects on others as x86 has byte comparison etc. instructions.

>
> > 2023-01-26 Patrick Palka <ppalka@redhat.com>
> > Jakub Jelinek <jakub@redhat.com>
> >
> > * tree-core.h (tree_code_type, tree_code_length): For
> > C++17 and later, add inline keyword, otherwise don't define
> > the arrays, but declare extern arrays.
> > * tree.cc (tree_code_type, tree_code_length): Define these
> > arrays for C++14 and older.
> >
> > --- gcc/tree-core.h.jj 2023-01-02 09:32:31.188158094 +0100
> > +++ gcc/tree-core.h 2023-01-26 16:02:34.212113251 +0100
> > @@ -2284,17 +2284,20 @@ struct floatn_type_info {
> > /* Matrix describing the structures contained in a given tree code. */
> > extern bool tree_contains_struct[MAX_TREE_CODES][64];
> >
> > +/* Class of tree given its code. */
> > +#if __cpp_inline_variables >= 201606L
> > #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > #define END_OF_BASE_TREE_CODES tcc_exceptional,
> >
> > -
> > -/* Class of tree given its code. */
> > -constexpr enum tree_code_class tree_code_type[] = {
> > +constexpr inline enum tree_code_class tree_code_type[] = {
> > #include "all-tree.def"
> > };
>
> Do we need an explicit external definition somewhere when
> constant folding isn't possible?
>
> Otherwise looks good to me.
>
> Thanks,
> Richard.
>
> > #undef DEFTREECODE
> > #undef END_OF_BASE_TREE_CODES
> > +#else
> > +extern const enum tree_code_class tree_code_type[];

There is one here for the C++11 and C++14 cases.
For C++17 and later it isn't needed,
constexpr inline enum tree_code_class tree_code_type[] = {
  ...
};
means this is a comdat variable in all TUs which need non-ODR
uses of it (tree_code_type[23] evaluates to constant expression,
but tree_code_type[x] or &tree_code_type[23] etc. often don't) and then
the comdat var is emitted and all TUs share one copy of the variable.

	Jakub

^ permalink raw reply [flat|nested] 15+ messages in thread
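The conditional structure Jakub describes can be sketched in a single file as follows. This is an illustration under stated assumptions, not the patch itself: demo_code_length and both functions are hypothetical names, and the #else branch folds into one file what the real patch splits between a header declaration (tree-core.h) and a single definition (tree.cc):

```cpp
#include <cassert>
#include <cstddef>

#if __cpp_inline_variables >= 201606L
// C++17 and later: an inline variable may be defined in a header; the
// linker keeps a single (COMDAT) copy shared by all translation units.
constexpr inline unsigned char demo_code_length[] = { 0, 1, 2 };
#else
// C++11/14: a namespace-scope constexpr array would have internal linkage,
// so a header only declares the array ...
extern const unsigned char demo_code_length[];
// ... and exactly one source file defines it (shown here in the same file).
const unsigned char demo_code_length[] = { 0, 1, 2 };
#endif

// A runtime index cannot be constant-folded, so the array object itself
// must exist somewhere.
unsigned
demo_operand_count (std::size_t code)
{
  assert (code < 3);
  return demo_code_length[code];
}

// Taking an element's address likewise needs the object.
const unsigned char *
demo_entry_address (std::size_t code)
{
  assert (code < 3);
  return &demo_code_length[code];
}
```

Either branch gives every translation unit the same array object, which is what removes the duplicated .rodata copies.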
* Re: [PATCH] tree: Fix up tree_code_{length,type} 2023-01-27 8:57 ` Jakub Jelinek @ 2023-01-27 9:49 ` Richard Biener 0 siblings, 0 replies; 15+ messages in thread From: Richard Biener @ 2023-01-27 9:49 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jeff Law, Andrew Pinski, Patrick Palka, gcc-patches On Fri, 27 Jan 2023, Jakub Jelinek wrote: > On Fri, Jan 27, 2023 at 07:42:39AM +0000, Richard Biener wrote: > > > BTW, wonder if tree_code_type couldn't be an array of unsigned char > > > elements rather than enum tree_code_class and we'd then cast it > > > to the enum in the macro, that would shrink that array from 1496 bytes > > > to 374. Of course, that sounds like stage1 material. > > > > One could argue the same way for this patch (and instead revert), > > Well, this patch is in fact a conditional reversion (revert for > C++11/14, add one keyword to 2 declarations otherwise). > > > I'd say if we tweak this now then tweak it to the maximum extent? > > Isn't sth like 'enum unsigned char tree_code_class' now possible? > > (and a static assert the enum values all fit, though that would > > be diagnosed anyway?) > > C++11 indeed has > enum tree_code_class : unsigned char { > tcc_exceptional, > ... > tcc_expression > }; > and one indeed gets an error if some enumerator doesn't fit. > The problem I see with this is that the type is 8-bit everywhere, > which I'd be afraid could cause worse code generation (of course, > one would need to try to see how much; e.g. build the compiler > unmodified, with the unsigned char array plus explicit casts from > the array and finally with unsigned char as underlying type). > When passing around enum tree_code_class etc., it is fine if it > is 32-bit. And there isn't a way to create an enum with different > underlying type but with the same enumerators as in another enum. 
> Perhaps for tree_code_class we could away with the underlying type > because it is mostly used in the macros which immediately compare > it, in gcc/*.cc just in the following explicitly: > expr.cc:get_def_for_expr_class (tree name, enum tree_code_class tclass) > fold-const.cc: enum tree_code_class tclass; > fold-const.cc: enum tree_code_class tclass = TREE_CODE_CLASS (code); > fold-const.cc: enum tree_code_class tclass = TREE_CODE_CLASS (code); > fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code); > fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code); > fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code); > fold-const.cc: enum tree_code_class kind = TREE_CODE_CLASS (code); > gimple-fold.cc: enum tree_code_class kind = TREE_CODE_CLASS (subcode); > print-tree.cc: enum tree_code_class tclass; > print-tree.cc: enum tree_code_class tclass; > tree.cc: These must correspond to the tree_code_class entries. */ > tree.cc:const char *const tree_code_class_strings[] = > tree.cc: enum tree_code_class type = TREE_CODE_CLASS (code); > tree.cc: enum tree_code_class type = TREE_CODE_CLASS (code); > tree.cc:tree_class_check_failed (const_tree node, const enum tree_code_class cl, > tree.cc:tree_not_class_check_failed (const_tree node, const enum tree_code_class cl, > tree.cc: const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t)); > tree.cc: const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t)); > tree-dump.cc: enum tree_code_class code_class; > tree-inline.cc: enum tree_code_class cl = TREE_CODE_CLASS (code); > tree-pretty-print.cc: enum tree_code_class tclass; > tree-ssa-live.cc: enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t)); > tree-ssa-operands.cc: enum tree_code_class codeclass; > But as I said, one would need to watch for code generation at least on > a couple of common hosts, and while x86_64 should be one of them, it might > have bigger effects on others as x86 has byte comparison etc. instructions. 
Hm, yes. Not sure if using uint_fast8_t would make a difference where
it should. So let's keep this change separate.

Richard.

> >
> > > 2023-01-26 Patrick Palka <ppalka@redhat.com>
> > > Jakub Jelinek <jakub@redhat.com>
> > >
> > > * tree-core.h (tree_code_type, tree_code_length): For
> > > C++17 and later, add inline keyword, otherwise don't define
> > > the arrays, but declare extern arrays.
> > > * tree.cc (tree_code_type, tree_code_length): Define these
> > > arrays for C++14 and older.
> > >
> > > --- gcc/tree-core.h.jj 2023-01-02 09:32:31.188158094 +0100
> > > +++ gcc/tree-core.h 2023-01-26 16:02:34.212113251 +0100
> > > @@ -2284,17 +2284,20 @@ struct floatn_type_info {
> > > /* Matrix describing the structures contained in a given tree code. */
> > > extern bool tree_contains_struct[MAX_TREE_CODES][64];
> > >
> > > +/* Class of tree given its code. */
> > > +#if __cpp_inline_variables >= 201606L
> > > #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > > #define END_OF_BASE_TREE_CODES tcc_exceptional,
> > >
> > > -
> > > -/* Class of tree given its code. */
> > > -constexpr enum tree_code_class tree_code_type[] = {
> > > +constexpr inline enum tree_code_class tree_code_type[] = {
> > > #include "all-tree.def"
> > > };
> >
> > Do we need an explicit external definition somewhere when
> > constant folding isn't possible?
> >
> > > Otherwise looks good to me.
> >
> > Thanks,
> > Richard.
> >
> > > #undef DEFTREECODE
> > > #undef END_OF_BASE_TREE_CODES
> > > +#else
> > > +extern const enum tree_code_class tree_code_type[];
>
> There is one here for the C++11 and C++14 cases.
> For C++17 and later it isn't needed,
> constexpr inline enum tree_code_class tree_code_type[] = {
> ...
> };
> means this is a comdat variable in all TUs which need non-ODR
> uses of it (tree_code_type[23] evaluates to constant expression,
> but tree_code_type[x] or &tree_code_type[23] etc. often don't and then
> the comdat var is emitted and all TUs share one copy of the variable.
> > Jakub > > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
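The two size-reduction options weighed in this exchange can be put side by side. A minimal sketch, assuming made-up enumerators and a three-entry table; none of the demo_ names exist in GCC, and which option generates better code is exactly the open question the thread defers:

```cpp
#include <cassert>
#include <cstddef>

// Option A (Jakub's "BTW"): keep the enum as it is, store unsigned char in
// the table, and cast back to the enum on lookup.
enum demo_class { demo_exceptional, demo_constant, demo_expression };

constexpr unsigned char demo_class_table[] = {
  demo_exceptional, demo_expression, demo_constant
};
static_assert (sizeof (demo_class_table) == 3, "one byte per entry");

demo_class
demo_code_class (std::size_t code)
{
  assert (code < sizeof (demo_class_table));
  return static_cast<demo_class> (demo_class_table[code]);
}

// Option B (Richard's question): give the enum itself a one-byte underlying
// type (C++11).  Every variable of this type is then 8-bit, which is the
// code-generation concern raised above.
enum demo_class_narrow : unsigned char
{
  demo_n_exceptional,
  demo_n_constant,
  demo_n_expression
};
static_assert (sizeof (demo_class_narrow) == 1, "fixed one-byte enum");
```

With option A the enum keeps its implementation-chosen size everywhere outside the table, so only the lookup pays for the narrowing.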
* [PATCH] tree: Fix up tree_code_{length,type}
@ 2023-01-27 20:44 Maciej Cencora
  0 siblings, 0 replies; 15+ messages in thread
From: Maciej Cencora @ 2023-01-27 20:44 UTC (permalink / raw)
  To: gcc-patches

Hi,

you can emulate C++17 inline variables in C++11 with either of the two ways:

1) via a template helper

template <typename = void>
struct Helper {
  static constexpr unsigned value[4] = {1, 2, 3, 4};
};

template <typename T>
constexpr unsigned Helper<T>::value[4];

static constexpr auto& arr = Helper<>::value;

2) extern constexpr + weak attribute

[[gnu::weak]] extern constexpr unsigned arr[] = {1, 2, 3, 4};

Regards,
Maciej

^ permalink raw reply [flat|nested] 15+ messages in thread
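The first variant above can be fleshed out into a compilable sketch. It relies on the rule that a static data member of a class template may be defined in a header, with the definitions merged across translation units. The names demo_table_holder, demo_table and demo_lookup are hypothetical, and this only illustrates the suggestion; nothing in the thread says it was adopted:

```cpp
#include <cassert>
#include <cstddef>

// The table lives in a static data member of a class template.
template <typename = void>
struct demo_table_holder
{
  static constexpr unsigned value[4] = { 1, 2, 3, 4 };
};

// Out-of-class definition: required before C++17, redundant afterwards.
template <typename T>
constexpr unsigned demo_table_holder<T>::value[4];

// Per-TU reference with internal linkage; the array it refers to is shared.
static constexpr auto &demo_table = demo_table_holder<>::value;

// Constant indices still fold at compile time.
static_assert (demo_table[2] == 3, "constant index folds");

// Runtime indices go through the single shared array object.
unsigned
demo_lookup (std::size_t i)
{
  assert (i < 4);
  return demo_table[i];
}
```

The reference itself is still duplicated per translation unit, but it is pointer-sized at most rather than a copy of the whole table.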
end of thread, other threads:[~2023-01-27 20:44 UTC | newest]

Thread overview: 15+ messages
2022-11-18 18:05 [PATCH] constexprify some tree variables apinski
2022-11-18 20:06 ` Jeff Law
2022-11-19 2:53 ` Andrew Pinski
2022-11-19 16:33 ` Jeff Law
2023-01-26 14:45 ` Patrick Palka
2023-01-26 14:51 ` Jakub Jelinek
2023-01-26 14:58 ` Jakub Jelinek
2023-01-26 15:59 ` [PATCH] tree: Fix up tree_code_{length,type} Jakub Jelinek
2023-01-26 18:03 ` Patrick Palka
2023-01-27 12:40 ` Patrick Palka
2023-01-27 13:14 ` Richard Biener
2023-01-27 7:42 ` Richard Biener
2023-01-27 8:57 ` Jakub Jelinek
2023-01-27 9:49 ` Richard Biener
2023-01-27 20:44 Maciej Cencora