public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken
@ 2024-06-03  8:44 jakub at gcc dot gnu.org
  2024-06-03  9:09 ` [Bug target/115324] [12/13/14/15 Regression] PCH " jakub at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-03  8:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

            Bug ID: 115324
           Summary: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000
                    builtins broken
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
                CC: belegdol at gmail dot com, dan at danny dot cz
        Depends on: 104323
  Target Milestone: ---

+++ This bug was initially created as a clone of Bug #104323 +++

The PR104323 fix doesn't seem to work with --enable-host-pie.
cat pr104323.h 
#include <altivec.h>
cat pr104323.c 
/* { dg-require-effective-target powerpc_altivec_ok } */
/* { dg-options "-maltivec" } */

#include "pr104323.h"

__vector int a1 = { 100, 200, 300, 400 };
__vector int a2 = { 500, 600, 700, 800 };
__vector int r;

int
main ()
{
  r = vec_add (a1, a2);
  return 0;
}
./xgcc -B ./ -c pr104323.h -o pr104323.h.gch
./xgcc -B ./ -S pr104323.c
pr104323.c: In function ‘main’:
pr104323.c:13:3: internal compiler error: Segmentation fault
   13 |   r = vec_add (a1, a2);
      |   ^
0x5625e2834b83 crash_signal
        ../../gcc/toplev.cc:319
0x5625e228c56d altivec_resolve_overloaded_builtin(unsigned int, tree_node*,
void*)
        ../../gcc/config/rs6000/rs6000-c.cc:1997
0x5625e2190392 c_build_function_call_vec(unsigned int, vec<unsigned int,
va_heap, vl_ptr> const&, tree_node*, vec<tree_node*, va_gc, vl_embed>*,
vec<tree_node*, va_gc, vl_embed>*)
        ../../gcc/c/c-typeck.cc:3440
0x5625e21c4536 c_parser_postfix_expression_after_primary
        ../../gcc/c/c-parser.cc:12682
0x5625e21a99a5 c_parser_postfix_expression
        ../../gcc/c/c-parser.cc:12234
0x5625e21ae60a c_parser_unary_expression
        ../../gcc/c/c-parser.cc:9894
0x5625e21b01df c_parser_cast_expression
        ../../gcc/c/c-parser.cc:9735
0x5625e21b04db c_parser_binary_expression
        ../../gcc/c/c-parser.cc:9503
0x5625e21b1a93 c_parser_conditional_expression
        ../../gcc/c/c-parser.cc:9198
0x5625e21b2324 c_parser_expr_no_commas
        ../../gcc/c/c-parser.cc:9111
0x5625e21b23ea c_parser_expr_no_commas
        ../../gcc/c/c-parser.cc:9154
0x5625e21b2773 c_parser_expression
        ../../gcc/c/c-parser.cc:12822
0x5625e21b2c57 c_parser_expression_conv
        ../../gcc/c/c-parser.cc:12862
0x5625e21d5b57 c_parser_statement_after_labels
        ../../gcc/c/c-parser.cc:7800
0x5625e21a942b c_parser_compound_statement_nostart
        ../../gcc/c/c-parser.cc:7287
0x5625e21d2136 c_parser_compound_statement
        ../../gcc/c/c-parser.cc:6564
0x5625e21d45af c_parser_declaration_or_fndef
        ../../gcc/c/c-parser.cc:3019
0x5625e21deb73 c_parser_external_declaration
        ../../gcc/c/c-parser.cc:2048
0x5625e21df5cb c_parser_translation_unit
        ../../gcc/c/c-parser.cc:1902
0x5625e21df5cb c_parse_file()
        ../../gcc/c/c-parser.cc:27269
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104323
[Bug 104323] [12 Regression] PCH (and maybe GC) of rs6000 builtins broken

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
@ 2024-06-03  9:09 ` jakub at gcc dot gnu.org
  2024-06-03  9:16 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-03  9:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
   Target Milestone|---                         |12.4
            Summary|[12/13/14/15 Regression]    |[12/13/14/15 Regression]
                   |PCH (and maybe GC) of       |PCH of rs6000 builtins
                   |rs6000 builtins broken      |broken
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-06-03

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
While for GC everything looks ok, in particular we have
  {
    &rs6000_instance_info[0].fntype,
    1 * (RS6000_INST_MAX),
    sizeof (rs6000_instance_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
  {
    &rs6000_builtin_info[0].fntype,
    1 * (RS6000_BIF_MAX),
    sizeof (rs6000_builtin_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
GC roots which are strided so that they cover just the fntype fields in
rs6000_instance_info and rs6000_builtin_info, it causes problems for PCH,
because
PCH saves/restores the whole rs6000_instance_info/rs6000_builtin_info arrays,
which are huge and worse the contain quite a lot of pointers to .rodata/.data
sections besides the tree fntype members.
In struct bifdata it is
  const char *GTY((skip(""))) bifname;
...
  const char *GTY((skip(""))) attr_string;
and in struct ovlddata
  const char *GTY((skip(""))) bifname;
...
  ovlddata *GTY((skip(""))) next;
fields.  bifname and attr_string members point to string literals, so .rodata
section,
and next is either NULL or &rs6000_instance_info[RS6000_INST_XXX]
Now, the problem is when cc1/cc1plus etc. are PIEs, the saving is done when
.rodata/.data sections are loaded at one address, but the restoring can be done
when they are loaded at different address, so the pointers are garbage after
PCH restore.

We could extend GTY, to have a similar flag to callback where we'd somehow fix
up those pointers in global GTY roots.
Or I think we can just move the fntype members from those structures to
separate arrays, making the original arrays non-GTY and only GTY the new
*_fntype arrays.

I've implemented the latter, will attach it momentarily.
It seems to decrease the size of .data arrays.
                             vanilla    patched
rs6000_builtin_info          130832     110704
rs6000_instance_info          81568      40784
rs6000_overload_info           7392       7392
rs6000_builtin_info_fntype        0      10064
rs6000_instance_info_fntype       0      20392
sum                          219792     189336

On a cross from x86_64 -> powerpc64le it increases .text size in the
_Z30rs6000_init_generated_builtinsv function, but will need to find out if that
is also the case on powerpc64le native where section anchors are used, the
changes in the source are like:
   rs6000_overload_info[RS6000_OVLD_VEC_VSUBFP - base].first_instance
-    = &rs6000_instance_info[RS6000_INST_VSUBFP_DEPR1];
+    = RS6000_INST_VSUBFP_DEPR1;

-  rs6000_instance_info[RS6000_INST_VSUBSBS_DEPR1].fntype
+  rs6000_instance_info_fntype[RS6000_INST_VSUBSBS_DEPR1]
     = v16qi_ftype_v16qi_v16qi;
and
-  rs6000_builtin_info[RS6000_BIF_CFSTRING].fntype
+  rs6000_builtin_info_fntype[RS6000_BIF_CFSTRING]
     = v_ftype_v;
so there is IMO nothing inherently larger.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
  2024-06-03  9:09 ` [Bug target/115324] [12/13/14/15 Regression] PCH " jakub at gcc dot gnu.org
@ 2024-06-03  9:16 ` jakub at gcc dot gnu.org
  2024-06-03 21:12 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-03  9:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 58331
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58331&action=edit
gcc15-pr115324.patch

Untested fix.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
  2024-06-03  9:09 ` [Bug target/115324] [12/13/14/15 Regression] PCH " jakub at gcc dot gnu.org
  2024-06-03  9:16 ` jakub at gcc dot gnu.org
@ 2024-06-03 21:12 ` cvs-commit at gcc dot gnu.org
  2024-06-04 14:26 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-03 21:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

--- Comment #3 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8

commit r15-1001-g4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 3 23:11:06 2024 +0200

    rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

    PCH doesn't work properly in --enable-host-pie configurations on
    powerpc*-linux*.
    The problem is that the rs6000_builtin_info and rs6000_instance_info
    arrays mix pointers to .rodata/.data (bifname and attr_string point
    to string literals in .rodata section, and the next member is either NULL
    or &rs6000_instance_info[XXX]) and GC member (tree fntype).
    Now, for normal GC this works just fine, we emit
      {
        &rs6000_instance_info[0].fntype,
        1 * (RS6000_INST_MAX),
        sizeof (rs6000_instance_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
      {
        &rs6000_builtin_info[0].fntype,
        1 * (RS6000_BIF_MAX),
        sizeof (rs6000_builtin_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
    GC roots which are strided and thus cover only the fntype members of all
    the elements of the two arrays.
    For PCH though it actually results in saving those huge arrays (one is
    130832 bytes, another 81568 bytes) into the .gch files and loading them
back
    in full.  While the bifname and attr_string and next pointers are marked as
    GTY((skip)), they are actually saved to point to the .rodata and .data
    sections of the process which writes the PCH, but because cc1/cc1plus etc.
    are position independent executables with --enable-host-pie, when it is
    loaded from the PCH file, it can point in a completely different addresses
    where nothing is mapped at all or some random different thing appears at.
    While gengtype supports the callback option, that one is meant for
    relocatable function pointers and doesn't work in the case of GTY arrays
    inside of .data section anyway.

    So, either we'd need to add some further GTY extensions, or the following
    patch instead reworks it such that the fntype members which were the only
    reason for PCH in those arrays are moved to separate arrays.

    Size-wise in .data sections it is (in bytes):

                                 vanilla    patched
    rs6000_builtin_info          130832     110704
    rs6000_instance_info          81568      40784
    rs6000_overload_info           7392       7392
    rs6000_builtin_info_fntype        0      10064
    rs6000_instance_info_fntype       0      20392
    sum                          219792     189336

    where previously we saved/restored for PCH those 130832+81568 bytes, now we
    save/restore just 10064+20392 bytes, so this change is beneficial for the
    data section size.

    Unfortunately, it grows the size of the rs6000_init_generated_builtins
    function, vanilla had 218328 bytes, patched has 228668.

    When I applied
     void
     rs6000_init_generated_builtins ()
     {
    +  bifdata *rs6000_builtin_info_p;
    +  tree *rs6000_builtin_info_fntype_p;
    +  ovlddata *rs6000_instance_info_p;
    +  tree *rs6000_instance_info_fntype_p;
    +  ovldrecord *rs6000_overload_info_p;
    +  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
    +  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
    +  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
    +  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
    +  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
    +  #define rs6000_builtin_info rs6000_builtin_info_p
    +  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
    +  #define rs6000_instance_info rs6000_instance_info_p
    +  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
    +  #define rs6000_overload_info rs6000_overload_info_p
    +
    hack by hand, the size of the function is 209700 though, so if really
    wanted, we could add __attribute__((__noipa__)) to the function when
    building with recent enough GCC and pass pointers to the first elements
    of the 5 arrays to the function as arguments.  If you want such a change,
    could that be done incrementally?

    2024-06-03  Jakub Jelinek  <jakub@redhat.com>

            PR target/115324
            * config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
            GTY markup from struct bifdata and struct ovlddata and remove their
            fntype members.  Change next member in struct ovlddata and
            first_instance member of struct ovldrecord to have int type rather
            than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
            and rs6000_instance_info arrays, declare new
            rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
            which have GTY markup.
            (write_bif_static_init): Adjust for the above changes.
            (write_ovld_static_init): Likewise.
            (write_init_bif_table): Likewise.
            (write_init_ovld_table): Likewise.
            * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
            * config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make
static.
            (altivec_resolve_overloaded_builtin): Adjust for the above changes.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13/14/15 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2024-06-03 21:12 ` cvs-commit at gcc dot gnu.org
@ 2024-06-04 14:26 ` cvs-commit at gcc dot gnu.org
  2024-06-04 14:29 ` [Bug target/115324] [12/13 " jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-04 14:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-14 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a7dd44c02ec1047166b4bacc3faa6255c816da2a

commit r14-10277-ga7dd44c02ec1047166b4bacc3faa6255c816da2a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 3 23:11:06 2024 +0200

    rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

    PCH doesn't work properly in --enable-host-pie configurations on
    powerpc*-linux*.
    The problem is that the rs6000_builtin_info and rs6000_instance_info
    arrays mix pointers to .rodata/.data (bifname and attr_string point
    to string literals in .rodata section, and the next member is either NULL
    or &rs6000_instance_info[XXX]) and GC member (tree fntype).
    Now, for normal GC this works just fine, we emit
      {
        &rs6000_instance_info[0].fntype,
        1 * (RS6000_INST_MAX),
        sizeof (rs6000_instance_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
      {
        &rs6000_builtin_info[0].fntype,
        1 * (RS6000_BIF_MAX),
        sizeof (rs6000_builtin_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
    GC roots which are strided and thus cover only the fntype members of all
    the elements of the two arrays.
    For PCH though it actually results in saving those huge arrays (one is
    130832 bytes, another 81568 bytes) into the .gch files and loading them
back
    in full.  While the bifname and attr_string and next pointers are marked as
    GTY((skip)), they are actually saved to point to the .rodata and .data
    sections of the process which writes the PCH, but because cc1/cc1plus etc.
    are position independent executables with --enable-host-pie, when it is
    loaded from the PCH file, it can point in a completely different addresses
    where nothing is mapped at all or some random different thing appears at.
    While gengtype supports the callback option, that one is meant for
    relocatable function pointers and doesn't work in the case of GTY arrays
    inside of .data section anyway.

    So, either we'd need to add some further GTY extensions, or the following
    patch instead reworks it such that the fntype members which were the only
    reason for PCH in those arrays are moved to separate arrays.

    Size-wise in .data sections it is (in bytes):

                                 vanilla    patched
    rs6000_builtin_info          130832     110704
    rs6000_instance_info          81568      40784
    rs6000_overload_info           7392       7392
    rs6000_builtin_info_fntype        0      10064
    rs6000_instance_info_fntype       0      20392
    sum                          219792     189336

    where previously we saved/restored for PCH those 130832+81568 bytes, now we
    save/restore just 10064+20392 bytes, so this change is beneficial for the
    data section size.

    Unfortunately, it grows the size of the rs6000_init_generated_builtins
    function, vanilla had 218328 bytes, patched has 228668.

    When I applied
     void
     rs6000_init_generated_builtins ()
     {
    +  bifdata *rs6000_builtin_info_p;
    +  tree *rs6000_builtin_info_fntype_p;
    +  ovlddata *rs6000_instance_info_p;
    +  tree *rs6000_instance_info_fntype_p;
    +  ovldrecord *rs6000_overload_info_p;
    +  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
    +  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
    +  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
    +  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
    +  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
    +  #define rs6000_builtin_info rs6000_builtin_info_p
    +  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
    +  #define rs6000_instance_info rs6000_instance_info_p
    +  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
    +  #define rs6000_overload_info rs6000_overload_info_p
    +
    hack by hand, the size of the function is 209700 though, so if really
    wanted, we could add __attribute__((__noipa__)) to the function when
    building with recent enough GCC and pass pointers to the first elements
    of the 5 arrays to the function as arguments.  If you want such a change,
    could that be done incrementally?

    2024-06-03  Jakub Jelinek  <jakub@redhat.com>

            PR target/115324
            * config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
            GTY markup from struct bifdata and struct ovlddata and remove their
            fntype members.  Change next member in struct ovlddata and
            first_instance member of struct ovldrecord to have int type rather
            than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
            and rs6000_instance_info arrays, declare new
            rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
            which have GTY markup.
            (write_bif_static_init): Adjust for the above changes.
            (write_ovld_static_init): Likewise.
            (write_init_bif_table): Likewise.
            (write_init_ovld_table): Likewise.
            * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
            * config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make
static.
            (altivec_resolve_overloaded_builtin): Adjust for the above changes.

    (cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2024-06-04 14:26 ` cvs-commit at gcc dot gnu.org
@ 2024-06-04 14:29 ` jakub at gcc dot gnu.org
  2024-06-11  6:17 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-04 14:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12/13/14/15 Regression]    |[12/13 Regression] PCH of
                   |PCH of rs6000 builtins      |rs6000 builtins broken
                   |broken                      |

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed for 14.2+/15.1+ for now.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-06-04 14:29 ` [Bug target/115324] [12/13 " jakub at gcc dot gnu.org
@ 2024-06-11  6:17 ` cvs-commit at gcc dot gnu.org
  2024-06-11 10:38 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-11  6:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:50b5019fde97c20a377e004c9d73df62e4898773

commit r13-8834-g50b5019fde97c20a377e004c9d73df62e4898773
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 3 23:11:06 2024 +0200

    rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

    PCH doesn't work properly in --enable-host-pie configurations on
    powerpc*-linux*.
    The problem is that the rs6000_builtin_info and rs6000_instance_info
    arrays mix pointers to .rodata/.data (bifname and attr_string point
    to string literals in .rodata section, and the next member is either NULL
    or &rs6000_instance_info[XXX]) and GC member (tree fntype).
    Now, for normal GC this works just fine, we emit
      {
        &rs6000_instance_info[0].fntype,
        1 * (RS6000_INST_MAX),
        sizeof (rs6000_instance_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
      {
        &rs6000_builtin_info[0].fntype,
        1 * (RS6000_BIF_MAX),
        sizeof (rs6000_builtin_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
    GC roots which are strided and thus cover only the fntype members of all
    the elements of the two arrays.
    For PCH though it actually results in saving those huge arrays (one is
    130832 bytes, another 81568 bytes) into the .gch files and loading them
back
    in full.  While the bifname and attr_string and next pointers are marked as
    GTY((skip)), they are actually saved to point to the .rodata and .data
    sections of the process which writes the PCH, but because cc1/cc1plus etc.
    are position independent executables with --enable-host-pie, when it is
    loaded from the PCH file, it can point in a completely different addresses
    where nothing is mapped at all or some random different thing appears at.
    While gengtype supports the callback option, that one is meant for
    relocatable function pointers and doesn't work in the case of GTY arrays
    inside of .data section anyway.

    So, either we'd need to add some further GTY extensions, or the following
    patch instead reworks it such that the fntype members which were the only
    reason for PCH in those arrays are moved to separate arrays.

    Size-wise in .data sections it is (in bytes):

                                 vanilla    patched
    rs6000_builtin_info          130832     110704
    rs6000_instance_info          81568      40784
    rs6000_overload_info           7392       7392
    rs6000_builtin_info_fntype        0      10064
    rs6000_instance_info_fntype       0      20392
    sum                          219792     189336

    where previously we saved/restored for PCH those 130832+81568 bytes, now we
    save/restore just 10064+20392 bytes, so this change is beneficial for the
    data section size.

    Unfortunately, it grows the size of the rs6000_init_generated_builtins
    function, vanilla had 218328 bytes, patched has 228668.

    When I applied
     void
     rs6000_init_generated_builtins ()
     {
    +  bifdata *rs6000_builtin_info_p;
    +  tree *rs6000_builtin_info_fntype_p;
    +  ovlddata *rs6000_instance_info_p;
    +  tree *rs6000_instance_info_fntype_p;
    +  ovldrecord *rs6000_overload_info_p;
    +  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
    +  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
    +  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
    +  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
    +  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
    +  #define rs6000_builtin_info rs6000_builtin_info_p
    +  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
    +  #define rs6000_instance_info rs6000_instance_info_p
    +  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
    +  #define rs6000_overload_info rs6000_overload_info_p
    +
    hack by hand, the size of the function is 209700 though, so if really
    wanted, we could add __attribute__((__noipa__)) to the function when
    building with recent enough GCC and pass pointers to the first elements
    of the 5 arrays to the function as arguments.  If you want such a change,
    could that be done incrementally?

    2024-06-03  Jakub Jelinek  <jakub@redhat.com>

            PR target/115324
            * config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
            GTY markup from struct bifdata and struct ovlddata and remove their
            fntype members.  Change next member in struct ovlddata and
            first_instance member of struct ovldrecord to have int type rather
            than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
            and rs6000_instance_info arrays, declare new
            rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
            which have GTY markup.
            (write_bif_static_init): Adjust for the above changes.
            (write_ovld_static_init): Likewise.
            (write_init_bif_table): Likewise.
            (write_init_ovld_table): Likewise.
            * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
            * config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make
static.
            (altivec_resolve_overloaded_builtin): Adjust for the above changes.

    (cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-06-11  6:17 ` cvs-commit at gcc dot gnu.org
@ 2024-06-11 10:38 ` cvs-commit at gcc dot gnu.org
  2024-06-11 10:54 ` jakub at gcc dot gnu.org
  2024-06-18  6:33 ` cvs-commit at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-11 10:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:bda8c28e6fcdbe0b486b54616877eec32c86d322

commit r12-10531-gbda8c28e6fcdbe0b486b54616877eec32c86d322
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 3 23:11:06 2024 +0200

    rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

    PCH doesn't work properly in --enable-host-pie configurations on
    powerpc*-linux*.
    The problem is that the rs6000_builtin_info and rs6000_instance_info
    arrays mix pointers to .rodata/.data (bifname and attr_string point
    to string literals in .rodata section, and the next member is either NULL
    or &rs6000_instance_info[XXX]) and GC member (tree fntype).
    Now, for normal GC this works just fine, we emit
      {
        &rs6000_instance_info[0].fntype,
        1 * (RS6000_INST_MAX),
        sizeof (rs6000_instance_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
      {
        &rs6000_builtin_info[0].fntype,
        1 * (RS6000_BIF_MAX),
        sizeof (rs6000_builtin_info[0]),
        &gt_ggc_mx_tree_node,
        &gt_pch_nx_tree_node
      },
    GC roots which are strided and thus cover only the fntype members of all
    the elements of the two arrays.
    For PCH though it actually results in saving those huge arrays (one is
    130832 bytes, another 81568 bytes) into the .gch files and loading them
back
    in full.  While the bifname and attr_string and next pointers are marked as
    GTY((skip)), they are actually saved to point to the .rodata and .data
    sections of the process which writes the PCH, but because cc1/cc1plus etc.
    are position independent executables with --enable-host-pie, when it is
    loaded from the PCH file, it can point in a completely different addresses
    where nothing is mapped at all or some random different thing appears at.
    While gengtype supports the callback option, that one is meant for
    relocatable function pointers and doesn't work in the case of GTY arrays
    inside of .data section anyway.

    So, either we'd need to add some further GTY extensions, or the following
    patch instead reworks it such that the fntype members which were the only
    reason for PCH in those arrays are moved to separate arrays.

    Size-wise in .data sections it is (in bytes):

                                 vanilla    patched
    rs6000_builtin_info          130832     110704
    rs6000_instance_info          81568      40784
    rs6000_overload_info           7392       7392
    rs6000_builtin_info_fntype        0      10064
    rs6000_instance_info_fntype       0      20392
    sum                          219792     189336

    where previously we saved/restored for PCH those 130832+81568 bytes, now we
    save/restore just 10064+20392 bytes, so this change is beneficial for the
    data section size.

    Unfortunately, it grows the size of the rs6000_init_generated_builtins
    function, vanilla had 218328 bytes, patched has 228668.

    When I applied
     void
     rs6000_init_generated_builtins ()
     {
    +  bifdata *rs6000_builtin_info_p;
    +  tree *rs6000_builtin_info_fntype_p;
    +  ovlddata *rs6000_instance_info_p;
    +  tree *rs6000_instance_info_fntype_p;
    +  ovldrecord *rs6000_overload_info_p;
    +  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
    +  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0"
(rs6000_builtin_info_fntype));
    +  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
    +  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0"
(rs6000_instance_info_fntype));
    +  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
    +  #define rs6000_builtin_info rs6000_builtin_info_p
    +  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
    +  #define rs6000_instance_info rs6000_instance_info_p
    +  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
    +  #define rs6000_overload_info rs6000_overload_info_p
    +
    hack by hand, the size of the function is 209700 though, so if really
    wanted, we could add __attribute__((__noipa__)) to the function when
    building with recent enough GCC and pass pointers to the first elements
    of the 5 arrays to the function as arguments.  If you want such a change,
    could that be done incrementally?

    2024-06-03  Jakub Jelinek  <jakub@redhat.com>

            PR target/115324
            * config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
            GTY markup from struct bifdata and struct ovlddata and remove their
            fntype members.  Change next member in struct ovlddata and
            first_instance member of struct ovldrecord to have int type rather
            than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
            and rs6000_instance_info arrays, declare new
            rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
            which have GTY markup.
            (write_bif_static_init): Adjust for the above changes.
            (write_ovld_static_init): Likewise.
            (write_init_bif_table): Likewise.
            (write_init_ovld_table): Likewise.
            * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
            * config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make
static.
            (altivec_resolve_overloaded_builtin): Adjust for the above changes.

    (cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2024-06-11 10:38 ` cvs-commit at gcc dot gnu.org
@ 2024-06-11 10:54 ` jakub at gcc dot gnu.org
  2024-06-18  6:33 ` cvs-commit at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-06-11 10:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed for 12.4+ and 13.4+ too.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken
  2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2024-06-11 10:54 ` jakub at gcc dot gnu.org
@ 2024-06-18  6:33 ` cvs-commit at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-18  6:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

--- Comment #9 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:e17114f99c9ea754787573679b3b4d2b52434b61

commit r15-1393-ge17114f99c9ea754787573679b3b4d2b52434b61
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jun 18 08:32:37 2024 +0200

    rs6000: Shrink rs6000_init_generated_builtins size [PR115324]

    While my r15-1001-g4cf2de9b5268224 PCH PIE power fix change decreased the
    .data section sizes (219792 -> 189336), it increased the size of already
    huge rs6000_init_generated_builtins generated function, from 218328
    to 228668 bytes.  That is because there are thousands of array references
    to global arrays and we keep constructing the addresses of the arrays
    again and again.

    Ideally some optimization would figure out we have a single function which
    has
        461   rs6000_overload_info
       1257   rs6000_builtin_info_fntype
       1768   rs6000_builtin_decls
       2548   rs6000_instance_info_fntype
    array references and that maybe it might be a good idea to just preload
    the addresses of those arrays into some register if it decreases code size
    and doesn't slow things down.
    The function actually is called just once and is huge, so code size is even
    more important than speed, which is dominated by all the GC allocations
    anyway.

    Until that is done, here is a slightly cleaner version of the hack, which
    makes the function noipa (so that LTO doesn't undo it) for GCC 8.1+ and
    passes the 4 arrays as arguments to the function from the caller.
    This decreases the function size from 228668 bytes to 207572 bytes.

    2024-06-18  Jakub Jelinek  <jakub@redhat.com>

            PR target/115324
            * config/rs6000/rs6000-gen-builtins.cc (write_decls): Change
            declaration of rs6000_init_generated_builtins from no arguments
            to 4 pointer arguments.
            (write_init_bif_table): Change rs6000_builtin_info_fntype to
            builtin_info_fntype and rs6000_builtin_decls to builtin_decls.
            (write_init_ovld_table): Change rs6000_instance_info_fntype to
            instance_info_fntype, rs6000_builtin_decls to builtin_decls and
            rs6000_overload_info to overload_info.
            (write_init_file): Add __noipa__ attribute to
            rs6000_init_generated_builtins for GCC 8.1+ and change the function
            from no arguments to 4 pointer arguments.  Change
rs6000_builtin_decls
            to builtin_decls.
            * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Adjust
            rs6000_init_generated_builtins caller.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-06-18  6:33 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-03  8:44 [Bug target/115324] New: [12/13/14/15 Regression] PCH (and maybe GC) of rs6000 builtins broken jakub at gcc dot gnu.org
2024-06-03  9:09 ` [Bug target/115324] [12/13/14/15 Regression] PCH " jakub at gcc dot gnu.org
2024-06-03  9:16 ` jakub at gcc dot gnu.org
2024-06-03 21:12 ` cvs-commit at gcc dot gnu.org
2024-06-04 14:26 ` cvs-commit at gcc dot gnu.org
2024-06-04 14:29 ` [Bug target/115324] [12/13 " jakub at gcc dot gnu.org
2024-06-11  6:17 ` cvs-commit at gcc dot gnu.org
2024-06-11 10:38 ` cvs-commit at gcc dot gnu.org
2024-06-11 10:54 ` jakub at gcc dot gnu.org
2024-06-18  6:33 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).