* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
@ 2024-06-03 9:09 ` jakub at gcc dot gnu.org
2024-06-03 9:16 ` jakub at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-03 9:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Target Milestone|--- |12.4
Summary|[12/13/14/15 Regression] |[12/13/14/15 Regression]
|PCH (and maybe GC) of |PCH of rs6000 builtins
|rs6000 builtins broken |broken
Ever confirmed|0 |1
Last reconfirmed| |2024-06-03
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
While for GC everything looks ok, in particular we have
{
&rs6000_instance_info[0].fntype,
1 * (RS6000_INST_MAX),
sizeof (rs6000_instance_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
{
&rs6000_builtin_info[0].fntype,
1 * (RS6000_BIF_MAX),
sizeof (rs6000_builtin_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
GC roots which are strided so that they cover just the fntype fields in
rs6000_instance_info and rs6000_builtin_info, it causes problems for PCH,
because
PCH saves/restores the whole rs6000_instance_info/rs6000_builtin_info arrays,
which are huge and worse the contain quite a lot of pointers to .rodata/.data
sections besides the tree fntype members.
In struct bifdata it is
const char *GTY((skip(""))) bifname;
...
const char *GTY((skip(""))) attr_string;
and in struct ovlddata
const char *GTY((skip(""))) bifname;
...
ovlddata *GTY((skip(""))) next;
fields. bifname and attr_string members point to string literals, so .rodata
section,
and next is either NULL or &rs6000_instance_info[RS6000_INST_XXX]
Now, the problem is when cc1/cc1plus etc. are PIEs, the saving is done when
.rodata/.data sections are loaded at one address, but the restoring can be done
when they are loaded at different address, so the pointers are garbage after
PCH restore.
We could extend GTY, to have a similar flag to callback where we'd somehow fix
up those pointers in global GTY roots.
Or I think we can just move the fntype members from those structures to
separate arrays, making the original arrays non-GTY and only GTY the new
*_fntype arrays.
I've implemented the latter, will attach it momentarily.
It seems to decrease the size of .data arrays.
vanilla patched
rs6000_builtin_info 130832 110704
rs6000_instance_info 81568 40784
rs6000_overload_info 7392 7392
rs6000_builtin_info_fntype 0 10064
rs6000_instance_info_fntype 0 20392
sum 219792 189336
On a cross from x86_64 -> powerpc64le it increases .text size in the
_Z30rs6000_init_generated_builtinsv function, but will need to find out if that
is also the case on powerpc64le native where section anchors are used, the
changes in the source are like:
rs6000_overload_info[RS6000_OVLD_VEC_VSUBFP - base].first_instance
- = &rs6000_instance_info[RS6000_INST_VSUBFP_DEPR1];
+ = RS6000_INST_VSUBFP_DEPR1;
- rs6000_instance_info[RS6000_INST_VSUBSBS_DEPR1].fntype
+ rs6000_instance_info_fntype[RS6000_INST_VSUBSBS_DEPR1]
= v16qi_ftype_v16qi_v16qi;
and
- rs6000_builtin_info[RS6000_BIF_CFSTRING].fntype
+ rs6000_builtin_info_fntype[RS6000_BIF_CFSTRING]
= v_ftype_v;
so there is IMO nothing inherently larger.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
2024-06-03 9:09 ` [Bug target/115324] [12/13/14/15 Regression] PCH " jakub at gcc dot gnu.org
@ 2024-06-03 9:16 ` jakub at gcc dot gnu.org
2024-06-03 21:12 ` cvs-commit at gcc dot gnu.org
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-03 9:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 58331
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58331&action=edit
gcc15-pr115324.patch
Untested fix.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
2024-06-03 9:09 ` [Bug target/115324] [12/13/14/15 Regression] PCH " jakub at gcc dot gnu.org
2024-06-03 9:16 ` jakub at gcc dot gnu.org
@ 2024-06-03 21:12 ` cvs-commit at gcc dot gnu.org
2024-06-04 14:26 ` cvs-commit at gcc dot gnu.org
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-03 21:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
--- Comment #3 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8
commit r15-1001-g4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8
Author: Jakub Jelinek <jakub@redhat.com>
Date: Mon Jun 3 23:11:06 2024 +0200
rs6000: Fix up PCH in --enable-host-pie builds [PR115324]
PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
{
&rs6000_instance_info[0].fntype,
1 * (RS6000_INST_MAX),
sizeof (rs6000_instance_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
{
&rs6000_builtin_info[0].fntype,
1 * (RS6000_BIF_MAX),
sizeof (rs6000_builtin_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them
back
in full. While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when it is
loaded from the PCH file, it can point in a completely different addresses
where nothing is mapped at all or some random different thing appears at.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.
So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.
Size-wise in .data sections it is (in bytes):
vanilla patched
rs6000_builtin_info 130832 110704
rs6000_instance_info 81568 40784
rs6000_overload_info 7392 7392
rs6000_builtin_info_fntype 0 10064
rs6000_instance_info_fntype 0 20392
sum 219792 189336
where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.
Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.
When I applied
void
rs6000_init_generated_builtins ()
{
+ bifdata *rs6000_builtin_info_p;
+ tree *rs6000_builtin_info_fntype_p;
+ ovlddata *rs6000_instance_info_p;
+ tree *rs6000_instance_info_fntype_p;
+ ovldrecord *rs6000_overload_info_p;
+ __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+ __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
+ __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+ __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
+ __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+ #define rs6000_builtin_info rs6000_builtin_info_p
+ #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+ #define rs6000_instance_info rs6000_instance_info_p
+ #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+ #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments. If you want such a change,
could that be done incrementally?
2024-06-03 Jakub Jelinek <jakub@redhat.com>
PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members. Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *. Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise. Make
static.
(altivec_resolve_overloaded_builtin): Adjust for the above changes.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
` (2 preceding siblings ...)
2024-06-03 21:12 ` cvs-commit at gcc dot gnu.org
@ 2024-06-04 14:26 ` cvs-commit at gcc dot gnu.org
2024-06-04 14:29 ` [Bug target/115324] [12/13 " jakub at gcc dot gnu.org
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-04 14:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-14 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:a7dd44c02ec1047166b4bacc3faa6255c816da2a
commit r14-10277-ga7dd44c02ec1047166b4bacc3faa6255c816da2a
Author: Jakub Jelinek <jakub@redhat.com>
Date: Mon Jun 3 23:11:06 2024 +0200
rs6000: Fix up PCH in --enable-host-pie builds [PR115324]
PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
{
&rs6000_instance_info[0].fntype,
1 * (RS6000_INST_MAX),
sizeof (rs6000_instance_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
{
&rs6000_builtin_info[0].fntype,
1 * (RS6000_BIF_MAX),
sizeof (rs6000_builtin_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them
back
in full. While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when it is
loaded from the PCH file, it can point in a completely different addresses
where nothing is mapped at all or some random different thing appears at.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.
So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.
Size-wise in .data sections it is (in bytes):
vanilla patched
rs6000_builtin_info 130832 110704
rs6000_instance_info 81568 40784
rs6000_overload_info 7392 7392
rs6000_builtin_info_fntype 0 10064
rs6000_instance_info_fntype 0 20392
sum 219792 189336
where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.
Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.
When I applied
void
rs6000_init_generated_builtins ()
{
+ bifdata *rs6000_builtin_info_p;
+ tree *rs6000_builtin_info_fntype_p;
+ ovlddata *rs6000_instance_info_p;
+ tree *rs6000_instance_info_fntype_p;
+ ovldrecord *rs6000_overload_info_p;
+ __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+ __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
+ __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+ __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
+ __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+ #define rs6000_builtin_info rs6000_builtin_info_p
+ #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+ #define rs6000_instance_info rs6000_instance_info_p
+ #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+ #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments. If you want such a change,
could that be done incrementally?
2024-06-03 Jakub Jelinek <jakub@redhat.com>
PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members. Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *. Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise. Make
static.
(altivec_resolve_overloaded_builtin): Adjust for the above changes.
(cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
` (3 preceding siblings ...)
2024-06-04 14:26 ` cvs-commit at gcc dot gnu.org
@ 2024-06-04 14:29 ` jakub at gcc dot gnu.org
2024-06-11 6:17 ` cvs-commit at gcc dot gnu.org
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-04 14:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[12/13/14/15 Regression] |[12/13 Regression] PCH of
|PCH of rs6000 builtins |rs6000 builtins broken
|broken |
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed for 14.2+/15.1+ for now.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
` (4 preceding siblings ...)
2024-06-04 14:29 ` [Bug target/115324] [12/13 " jakub at gcc dot gnu.org
@ 2024-06-11 6:17 ` cvs-commit at gcc dot gnu.org
2024-06-11 10:38 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-11 6:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:50b5019fde97c20a377e004c9d73df62e4898773
commit r13-8834-g50b5019fde97c20a377e004c9d73df62e4898773
Author: Jakub Jelinek <jakub@redhat.com>
Date: Mon Jun 3 23:11:06 2024 +0200
rs6000: Fix up PCH in --enable-host-pie builds [PR115324]
PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
{
&rs6000_instance_info[0].fntype,
1 * (RS6000_INST_MAX),
sizeof (rs6000_instance_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
{
&rs6000_builtin_info[0].fntype,
1 * (RS6000_BIF_MAX),
sizeof (rs6000_builtin_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them
back
in full. While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when it is
loaded from the PCH file, it can point in a completely different addresses
where nothing is mapped at all or some random different thing appears at.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.
So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.
Size-wise in .data sections it is (in bytes):
vanilla patched
rs6000_builtin_info 130832 110704
rs6000_instance_info 81568 40784
rs6000_overload_info 7392 7392
rs6000_builtin_info_fntype 0 10064
rs6000_instance_info_fntype 0 20392
sum 219792 189336
where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.
Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.
When I applied
void
rs6000_init_generated_builtins ()
{
+ bifdata *rs6000_builtin_info_p;
+ tree *rs6000_builtin_info_fntype_p;
+ ovlddata *rs6000_instance_info_p;
+ tree *rs6000_instance_info_fntype_p;
+ ovldrecord *rs6000_overload_info_p;
+ __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+ __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
+ __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+ __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
+ __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+ #define rs6000_builtin_info rs6000_builtin_info_p
+ #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+ #define rs6000_instance_info rs6000_instance_info_p
+ #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+ #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments. If you want such a change,
could that be done incrementally?
2024-06-03 Jakub Jelinek <jakub@redhat.com>
PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members. Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *. Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise. Make
static.
(altivec_resolve_overloaded_builtin): Adjust for the above changes.
(cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
` (5 preceding siblings ...)
2024-06-11 6:17 ` cvs-commit at gcc dot gnu.org
@ 2024-06-11 10:38 ` cvs-commit at gcc dot gnu.org
2024-06-11 10:54 ` jakub at gcc dot gnu.org
2024-06-18 6:33 ` cvs-commit at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-11 10:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:bda8c28e6fcdbe0b486b54616877eec32c86d322
commit r12-10531-gbda8c28e6fcdbe0b486b54616877eec32c86d322
Author: Jakub Jelinek <jakub@redhat.com>
Date: Mon Jun 3 23:11:06 2024 +0200
rs6000: Fix up PCH in --enable-host-pie builds [PR115324]
PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
{
&rs6000_instance_info[0].fntype,
1 * (RS6000_INST_MAX),
sizeof (rs6000_instance_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
{
&rs6000_builtin_info[0].fntype,
1 * (RS6000_BIF_MAX),
sizeof (rs6000_builtin_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them
back
in full. While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when it is
loaded from the PCH file, it can point in a completely different addresses
where nothing is mapped at all or some random different thing appears at.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.
So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.
Size-wise in .data sections it is (in bytes):
vanilla patched
rs6000_builtin_info 130832 110704
rs6000_instance_info 81568 40784
rs6000_overload_info 7392 7392
rs6000_builtin_info_fntype 0 10064
rs6000_instance_info_fntype 0 20392
sum 219792 189336
where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.
Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.
When I applied
void
rs6000_init_generated_builtins ()
{
+ bifdata *rs6000_builtin_info_p;
+ tree *rs6000_builtin_info_fntype_p;
+ ovlddata *rs6000_instance_info_p;
+ tree *rs6000_instance_info_fntype_p;
+ ovldrecord *rs6000_overload_info_p;
+ __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+ __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
+ __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+ __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
+ __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+ #define rs6000_builtin_info rs6000_builtin_info_p
+ #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+ #define rs6000_instance_info rs6000_instance_info_p
+ #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+ #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments. If you want such a change,
could that be done incrementally?
2024-06-03 Jakub Jelinek <jakub@redhat.com>
PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members. Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *. Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise. Make
static.
(altivec_resolve_overloaded_builtin): Adjust for the above changes.
(cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
` (6 preceding siblings ...)
2024-06-11 10:38 ` cvs-commit at gcc dot gnu.org
@ 2024-06-11 10:54 ` jakub at gcc dot gnu.org
2024-06-18 6:33 ` cvs-commit at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-11 10:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed for 12.4+ and 13.4+ too.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
2024-06-03 8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
` (7 preceding siblings ...)
2024-06-11 10:54 ` jakub at gcc dot gnu.org
@ 2024-06-18 6:33 ` cvs-commit at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-18 6:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324
--- Comment #9 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:e17114f99c9ea754787573679b3b4d2b52434b61
commit r15-1393-ge17114f99c9ea754787573679b3b4d2b52434b61
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Jun 18 08:32:37 2024 +0200
rs6000: Shrink rs6000_init_generated_builtins size [PR115324]
While my r15-1001-g4cf2de9b5268224 PCH PIE power fix change decreased the
.data section sizes (219792 -> 189336), it increased the size of already
huge rs6000_init_generated_builtins generated function, from 218328
to 228668 bytes. That is because there are thousands of array references
to global arrays and we keep constructing the addresses of the arrays
again and again.
Ideally some optimization would figure out we have a single function which
has
461 rs6000_overload_info
1257 rs6000_builtin_info_fntype
1768 rs6000_builtin_decls
2548 rs6000_instance_info_fntype
array references and that maybe it might be a good idea to just preload
the addresses of those arrays into some register if it decreases code size
and doesn't slow things down.
The function actually is called just once and is huge, so code size is even
more important than speed, which is dominated by all the GC allocations
anyway.
Until that is done, here is a slightly cleaner version of the hack, which
makes the function noipa (so that LTO doesn't undo it) for GCC 8.1+ and
passes the 4 arrays as arguments to the function from the caller.
This decreases the function size from 228668 bytes to 207572 bytes.
2024-06-18 Jakub Jelinek <jakub@redhat.com>
PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Change
declaration of rs6000_init_generated_builtins from no arguments
to 4 pointer arguments.
(write_init_bif_table): Change rs6000_builtin_info_fntype to
builtin_info_fntype and rs6000_builtin_decls to builtin_decls.
(write_init_ovld_table): Change rs6000_instance_info_fntype to
instance_info_fntype, rs6000_builtin_decls to builtin_decls and
rs6000_overload_info to overload_info.
(write_init_file): Add __noipa__ attribute to
rs6000_init_generated_builtins for GCC 8.1+ and change the function
from no arguments to 4 pointer arguments. Change
rs6000_builtin_decls
to builtin_decls.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Adjust
rs6000_init_generated_builtins caller.
^ permalink raw reply [flat|nested] 10+ messages in thread