public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Eric Gallager <egall@gwmail.gwu.edu>
To: Jeff Law <jeffreyalaw@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
	Richard Biener <rguenther@suse.de>,
	 gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] pch: Add support for relocation of the PCH data [PR71934]
Date: Thu, 9 Dec 2021 20:20:59 -0500	[thread overview]
Message-ID: <CAMfHzOvYCROTg4fsg5+cGZE_HQwh0SiQ1JAvRnE05hSVBzWiiw@mail.gmail.com> (raw)
In-Reply-To: <42aa9a74-56ee-980f-5be7-57ca218f5023@gmail.com>

On Wed, Dec 8, 2021 at 6:10 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> On 12/7/2021 2:55 AM, Jakub Jelinek wrote:
> > Hi!
> >
> > The following patch adds support for relocation of the PCH blob on PCH
> > restore if we don't manage to get the preferred map slot for it.
> > The GTY stuff knows where all the pointers are, after all it relocates
> > it once during PCH save from the addresses where it was initially allocated
> > to addresses in the preferred map slot.
> > But, if we were to do it solely using GTY info upon PCH restore, we'd need
> > another set of GTY functions, which I think would make it less maintainable
> > and I think it would also be more costly at PCH restore time.  Those
> > functions would need to call something to add bias to pointers that haven't
> > been marked yet and make sure not to add bias to any pointer twice.
> >
> > So, this patch instead builds a relocation table (sorted list of addresses
> > in the blob which needs relocation) at PCH save time, stores it in a very
> > compact form into the gch file and upon restore, adjusts pointers in GTY
> > roots (that is right away in the root structures) and the addresses in the
> > relocation table.
> > The cost on stdc++.gch/O2g.gch (previously 85MB large) is about 3% file size
> > growth, there are 2.5 million pointers that need relocation in the gch blob
> > and the relocation table uses uleb128 for address deltas and needs ~1.01 bytes
> > for one address that needs relocation, and about 20% compile time during
> > PCH save (I think it is mainly because of the need to qsort those 2.5
> > million pointers).  On PCH restore, if it doesn't need relocation (the usual
> > case), it is just an extra fread of sizeof (size_t) data and fseek
> > (in my tests real time on vanilla tree for #include <bits/stdc++.h> CU
> > was ~0.175s and with the patch but no relocation ~0.173s), while if it needs
> > relocation it took ~0.193s, i.e. 11.5% slower.
> >
> > The discovery of the pointers in the blob that need relocation is done
> > in the relocate_ptrs hook which does the pointer relocation during PCH save.
> > Unfortunately, I had to make one change to the gengtype stuff due to the
> > nested_ptr feature of GTY, which some libcpp headers and stringpool.c use.
> > The relocate_ptrs hook had 2 arguments, pointer to the pointer and a cookie.
> > When relocate_ptrs is done, in most cases it is called solely on the
> > subfields of the current object, so e.g.
> >            if ((void *)(x) == this_obj)
> >              op (&((*x).u.fld[0].rt_rtx), cookie);
> > so relocate_ptrs can assert that ptr_p is within the
> > state->ptrs[state->ptrs_i]->obj ..
> > state->ptrs[state->ptrs_i]->obj+state->ptrs[state->ptrs_i]->size-sizeof(void*)
> > range and compute from that the address in the blob which will need
> > relocation (state->ptrs[state->ptrs_i]->new_addr is the new address
> > given to it and ptr_p-state->ptrs[state->ptrs_i]->obj is the relative
> > offset.  Unfortunately, for nested_ptr gengtype emits something like:
> >        {
> >          union tree_node * x0 =
> >            ((*x).val.node.node) ? HT_IDENT_TO_GCC_IDENT (HT_NODE (((*x).val.node.node))) : NULL;
> >          if ((void *)(x) == this_obj)
> >            op (&(x0), cookie);
> >          (*x).val.node.node = (x0) ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT ((x0))) : NULL;
> >        }
> > so relocate_ptrs is called with an address of some temporary variable and
> > so doesn't know where the pointer will finally be.
> > So, I've added another argument to relocate_ptrs (and to
> > gt_pointer_operator).  For the most common case I pass NULL as the new middle
> > argument to that function, first one remains pointer to the pointer that
> > needs adjustment and last the cookie.  The NULL seems to be cheap to compute
> > and short in the gt*.[ch] files and stands for ptr_p is an address within
> > the this_obj's range, remember its address.  For the nested_ptr case, the
> > new middle argument contains actual address of the pointer that might need
> > to be relocated, so instead of the above
> >            op (&(x0), &((*x).val.node.node), cookie);
> > in there.  And finally, e.g. for the reorder case I need a way to tell
> > restore_ptrs to ignore a particular address for the relocation purposes
> > and only treat it the old way.  I've used for that the case when
> > the first and second arguments are equal.
> >
> > In order to enable support for mapping PCH as fallback at different
> > addresses than the preferred ones, a small change is needed to the
> > host pch_use_address hooks.  One change I've done to all of them is
> > the change of the type of the first argument from void * to void *&,
> > such that the actual address can be told to the callers (or shall I
> > instead use void **?), but another change that still needs to be done
> > in them if they want the relocation is actually not fail if they couldn't
> > get a preferred address, but instead modify what the first argument
> > refers to.  I've done that only for host-linux.c and Iain is testing
> > similar change for host-darwin.c.  Didn't change hpux, netbsd, openbsd,
> > solaris, mingw32 or the fallbacks because I can't test those.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux (both as is and with
> > incremental
> > --- gcc/config/host-linux.c.jj        2021-12-06 22:22:42.007777367 +0100
> > +++ gcc/config/host-linux.c   2021-12-07 00:21:53.052674040 +0100
> > @@ -191,6 +191,8 @@ linux_gt_pch_use_address (void *&base, s
> >     if (size == 0)
> >       return -1;
> >
> > +base = (char *) base + ((size + 8191) & (size_t) -4096);
> > +
> >     /* Try to map the file with MAP_PRIVATE.  */
> >     addr = mmap (base, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, offset);
> >
> > which forces all PCH restores to be relocated.  An earlier version of the
> > patch has been also regrest with base = (char *) base + 16384; in that spot,
> > so both relocation to a non-overlapping spot and to an overlapping spot have
> > been tested.
> >
> > 2021-12-07  Jakub Jelinek  <jakub@redhat.com>
> >
> >       PR pch/71934
> >       * coretypes.h (gt_pointer_operator): Use 3 pointer arguments instead
> >       of two.
> >       * gengtype.c (struct walk_type_data): Add in_nested_ptr argument.
> >       (walk_type): Temporarily set d->in_nested_ptr around nested_ptr
> >       handling.
> >       (write_types_local_user_process_field): Pass a new middle pointer
> >       to gt_pointer_operator op calls, if d->in_nested_ptr pass there
> >       address of d->prev_val[2], otherwise NULL.
> >       (write_types_local_process_field): Likewise.
> >       * ggc-common.c (relocate_ptrs): Add real_ptr_p argument.  If equal
> >       to ptr_p, do nothing, otherwise if NULL remember ptr_p's
> >       or if non-NULL real_ptr_p's corresponding new address in
> >       reloc_addrs_vec.
> >       (reloc_addrs_vec): New variable.
> >       (compare_ptr, read_uleb128, write_uleb128): New functions.
> >       (gt_pch_save): When iterating over objects through relocate_ptrs,
> >       save current i into state.ptrs_i.  Sort reloc_addrs_vec and emit
> >       it as uleb128 of differences between pointer addresses into the
> >       PCH file.
> >       (gt_pch_restore): Allow restoring of PCH to a different address
> >       than the preferred one, in that case adjust global pointers by bias
> >       and also adjust by bias addresses read from the relocation table
> >       as uleb128 differences.  Otherwise fseek over it.  Perform
> >       gt_pch_restore_stringpool only after adjusting callbacks and for
> >       callback adjustments also take into account the bias.
> >       (default_gt_pch_use_address): Change type of first argument from
> >       void * to void *&.
> >       (mmap_gt_pch_use_address): Likewise.
> >       * ggc-tests.c (gt_pch_nx): Pass NULL as new middle argument to op.
> >       * hash-map.h (hash_map::pch_nx_helper): Likewise.
> >       (gt_pch_nx): Likewise.
> >       * hash-set.h (gt_pch_nx): Likewise.
> >       * hash-table.h (gt_pch_nx): Likewise.
> >       * hash-traits.h (ggc_remove::pch_nx): Likewise.
> >       * hosthooks-def.h (default_gt_pch_use_address): Change type of first
> >       argument from void * to void *&.
> >       (mmap_gt_pch_use_address): Likewise.
> >       * hosthooks.h (struct host_hooks): Change type of first argument of
> >       gt_pch_use_address hook from void * to void *&.
> >       * machmode.h (gt_pch_nx): Expect a callback with 3 pointers instead of
> >       two in the middle argument.
> >       * poly-int.h (gt_pch_nx): Likewise.
> >       * stringpool.c (gt_pch_nx): Pass NULL as new middle argument to op.
> >       * tree-cfg.c (gt_pch_nx): Likewise, except for LOCATION_BLOCK pass
> >       the same &(block) twice.
> >       * value-range.h (gt_pch_nx): Pass NULL as new middle argument to op.
> >       * vec.h (gt_pch_nx): Likewise.
> >       * wide-int.h (gt_pch_nx): Likewise.
> >       * config/host-darwin.c (darwin_gt_pch_use_address): Change type of
> >       first argument from void * to void *&.
> >       * config/host-darwin.h (darwin_gt_pch_use_address): Likewise.
> >       * config/host-hpux.c (hpux_gt_pch_use_address): Likewise.
> >       * config/host-linux.c (linux_gt_pch_use_address): Likewise.  If
> >       it couldn't succeed to mmap at the preferred location, set base
> >       to the actual one.  Update addr in the manual reading loop instead of
> >       base.
> >       * config/host-netbsd.c (netbsd_gt_pch_use_address): Change type of
> >       first argument from void * to void *&.
> >       * config/host-openbsd.c (openbsd_gt_pch_use_address): Likewise.
> >       * config/host-solaris.c (sol_gt_pch_use_address): Likewise.
> >       * config/i386/host-mingw32.c (mingw32_gt_pch_use_address): Likewise.
> >       * config/rs6000/rs6000-gen-builtins.c (write_init_file): Pass NULL
> >       as new middle argument to op in the generated code.
> >       * doc/gty.texi: Adjust samples for the addition of middle pointer
> >       to gt_pointer_operator callback.
> > gcc/ada/
> >       * gcc-interface/decl.c (gt_pch_nx): Pass NULL as new middle argument
> >       to op.
> > gcc/c-family/
> >       * c-pch.c (c_common_no_more_pch): Pass a temporary void * var
> >       with NULL value instead of NULL to host_hooks.gt_pch_use_address.
> > gcc/c/
> >       * c-decl.c (resort_field_decl_cmp): Pass the same pointer twice
> >       to resort_data.new_value.
> > gcc/cp/
> >       * module.cc (nop): Add another void * argument.
> >       * name-lookup.c (resort_member_name_cmp): Pass the same pointer twice
> >       to resort_data.new_value.
> Like the prior patch in this space, you know this better than anyone, so
> if you're comfortable, I think you should commit.
>
> I wonder if this is sufficient for the Ruby team to re-enable GCC PCH
> support.  IIRC the whole problem was we couldn't deal with PIE & PCH and
> I think this fixes that problem, right?  If so, you might want to reach
> out to them.
>
> Jeff

Having a note in changes.html would be helpful to point to for those purposes.

Eric

      reply	other threads:[~2021-12-10  1:21 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-07  9:55 Jakub Jelinek
2021-12-07 14:50 ` Jakub Jelinek
2021-12-08  8:00   ` Iain Sandoe
2021-12-08 23:10     ` Jeff Law
2021-12-09 14:59     ` Jakub Jelinek
2021-12-09 15:03       ` Iain Sandoe
2021-12-09 16:42       ` Christophe Lyon
2021-12-09 16:52         ` Jeff Law
2021-12-09 16:59         ` [committed] pch: Fix aarch64 build [PR71934] Jakub Jelinek
2021-12-10 12:56           ` [PATCH] pch: Small cleanup Jakub Jelinek
2021-12-10 21:20             ` Jeff Law
2021-12-08 23:09 ` [PATCH] pch: Add support for relocation of the PCH data [PR71934] Jeff Law
2021-12-10  1:20   ` Eric Gallager [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMfHzOvYCROTg4fsg5+cGZE_HQwh0SiQ1JAvRnE05hSVBzWiiw@mail.gmail.com \
    --to=egall@gwmail.gwu.edu \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=jeffreyalaw@gmail.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).