From: Eric Gallager <egall@gwmail.gwu.edu>
To: Jeff Law <jeffreyalaw@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
Richard Biener <rguenther@suse.de>,
gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] pch: Add support for relocation of the PCH data [PR71934]
Date: Thu, 9 Dec 2021 20:20:59 -0500 [thread overview]
Message-ID: <CAMfHzOvYCROTg4fsg5+cGZE_HQwh0SiQ1JAvRnE05hSVBzWiiw@mail.gmail.com> (raw)
In-Reply-To: <42aa9a74-56ee-980f-5be7-57ca218f5023@gmail.com>
On Wed, Dec 8, 2021 at 6:10 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> On 12/7/2021 2:55 AM, Jakub Jelinek wrote:
> > Hi!
> >
> > The following patch adds support for relocation of the PCH blob on PCH
> > restore if we don't manage to get the preferred map slot for it.
> > The GTY stuff knows where all the pointers are, after all it relocates
> > it once during PCH save from the addresses where it was initially allocated
> > to addresses in the preferred map slot.
> > But, if we were to do it solely using GTY info upon PCH restore, we'd need
> > another set of GTY functions, which I think would make it less maintainable
> > and I think it would also be more costly at PCH restore time. Those
> > functions would need to call something to add bias to pointers that haven't
> > been marked yet and make sure not to add bias to any pointer twice.
> >
> > So, this patch instead builds a relocation table (sorted list of addresses
> > in the blob which needs relocation) at PCH save time, stores it in a very
> > compact form into the gch file and upon restore, adjusts pointers in GTY
> > roots (that is right away in the root structures) and the addresses in the
> > relocation table.
> > The cost on stdc++.gch/O2g.gch (previously 85MB large) is about 3% file size
> > growth, there are 2.5 million pointers that need relocation in the gch blob
> > and the relocation table uses uleb128 for address deltas and needs ~1.01 bytes
> > for one address that needs relocation, and about 20% compile time during
> > PCH save (I think it is mainly because of the need to qsort those 2.5
> > million pointers). On PCH restore, if it doesn't need relocation (the usual
> > case), it is just an extra fread of sizeof (size_t) data and fseek
> > (in my tests real time on vanilla tree for #include <bits/stdc++.h> CU
> > was ~0.175s and with the patch but no relocation ~0.173s), while if it needs
> > relocation it took ~0.193s, i.e. 11.5% slower.
> >
> > The discovery of the pointers in the blob that need relocation is done
> > in the relocate_ptrs hook which does the pointer relocation during PCH save.
> > Unfortunately, I had to make one change to the gengtype stuff due to the
> > nested_ptr feature of GTY, which some libcpp headers and stringpool.c use.
> > The relocate_ptrs hook had 2 arguments, pointer to the pointer and a cookie.
> > When relocate_ptrs is done, in most cases it is called solely on the
> > subfields of the current object, so e.g.
> > if ((void *)(x) == this_obj)
> > op (&((*x).u.fld[0].rt_rtx), cookie);
> > so relocate_ptrs can assert that ptr_p is within the
> > state->ptrs[state->ptrs_i]->obj ..
> > state->ptrs[state->ptrs_i]->obj+state->ptrs[state->ptrs_i]->size-sizeof(void*)
> > range and compute from that the address in the blob which will need
> > relocation (state->ptrs[state->ptrs_i]->new_addr is the new address
> > given to it and ptr_p-state->ptrs[state->ptrs_i]->obj is the relative
> > offset. Unfortunately, for nested_ptr gengtype emits something like:
> > {
> > union tree_node * x0 =
> > ((*x).val.node.node) ? HT_IDENT_TO_GCC_IDENT (HT_NODE (((*x).val.node.node))) : NULL;
> > if ((void *)(x) == this_obj)
> > op (&(x0), cookie);
> > (*x).val.node.node = (x0) ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT ((x0))) : NULL;
> > }
> > so relocate_ptrs is called with an address of some temporary variable and
> > so doesn't know where the pointer will finally be.
> > So, I've added another argument to relocate_ptrs (and to
> > gt_pointer_operator). For the most common case I pass NULL as the new middle
> > argument to that function, first one remains pointer to the pointer that
> > needs adjustment and last the cookie. The NULL seems to be cheap to compute
> > and short in the gt*.[ch] files and stands for ptr_p is an address within
> > the this_obj's range, remember its address. For the nested_ptr case, the
> > new middle argument contains actual address of the pointer that might need
> > to be relocated, so instead of the above
> > op (&(x0), &((*x).val.node.node), cookie);
> > in there. And finally, e.g. for the reorder case I need a way to tell
> > restore_ptrs to ignore a particular address for the relocation purposes
> > and only treat it the old way. I've used for that the case when
> > the first and second arguments are equal.
> >
> > In order to enable support for mapping PCH as fallback at different
> > addresses than the preferred ones, a small change is needed to the
> > host pch_use_address hooks. One change I've done to all of them is
> > the change of the type of the first argument from void * to void *&,
> > such that the actual address can be told to the callers (or shall I
> > instead use void **?), but another change that still needs to be done
> > in them if they want the relocation is actually not fail if they couldn't
> > get a preferred address, but instead modify what the first argument
> > refers to. I've done that only for host-linux.c and Iain is testing
> > similar change for host-darwin.c. Didn't change hpux, netbsd, openbsd,
> > solaris, mingw32 or the fallbacks because I can't test those.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux (both as is and with
> > incremental
> > --- gcc/config/host-linux.c.jj 2021-12-06 22:22:42.007777367 +0100
> > +++ gcc/config/host-linux.c 2021-12-07 00:21:53.052674040 +0100
> > @@ -191,6 +191,8 @@ linux_gt_pch_use_address (void *&base, s
> > if (size == 0)
> > return -1;
> >
> > +base = (char *) base + ((size + 8191) & (size_t) -4096);
> > +
> > /* Try to map the file with MAP_PRIVATE. */
> > addr = mmap (base, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, offset);
> >
> > which forces all PCH restores to be relocated. An earlier version of the
> > patch has been also regrest with base = (char *) base + 16384; in that spot,
> > so both relocation to a non-overlapping spot and to an overlapping spot have
> > been tested.
> >
> > 2021-12-07 Jakub Jelinek <jakub@redhat.com>
> >
> > PR pch/71934
> > * coretypes.h (gt_pointer_operator): Use 3 pointer arguments instead
> > of two.
> > * gengtype.c (struct walk_type_data): Add in_nested_ptr argument.
> > (walk_type): Temporarily set d->in_nested_ptr around nested_ptr
> > handling.
> > (write_types_local_user_process_field): Pass a new middle pointer
> > to gt_pointer_operator op calls, if d->in_nested_ptr pass there
> > address of d->prev_val[2], otherwise NULL.
> > (write_types_local_process_field): Likewise.
> > * ggc-common.c (relocate_ptrs): Add real_ptr_p argument. If equal
> > to ptr_p, do nothing, otherwise if NULL remember ptr_p's
> > or if non-NULL real_ptr_p's corresponding new address in
> > reloc_addrs_vec.
> > (reloc_addrs_vec): New variable.
> > (compare_ptr, read_uleb128, write_uleb128): New functions.
> > (gt_pch_save): When iterating over objects through relocate_ptrs,
> > save current i into state.ptrs_i. Sort reloc_addrs_vec and emit
> > it as uleb128 of differences between pointer addresses into the
> > PCH file.
> > (gt_pch_restore): Allow restoring of PCH to a different address
> > than the preferred one, in that case adjust global pointers by bias
> > and also adjust by bias addresses read from the relocation table
> > as uleb128 differences. Otherwise fseek over it. Perform
> > gt_pch_restore_stringpool only after adjusting callbacks and for
> > callback adjustments also take into account the bias.
> > (default_gt_pch_use_address): Change type of first argument from
> > void * to void *&.
> > (mmap_gt_pch_use_address): Likewise.
> > * ggc-tests.c (gt_pch_nx): Pass NULL as new middle argument to op.
> > * hash-map.h (hash_map::pch_nx_helper): Likewise.
> > (gt_pch_nx): Likewise.
> > * hash-set.h (gt_pch_nx): Likewise.
> > * hash-table.h (gt_pch_nx): Likewise.
> > * hash-traits.h (ggc_remove::pch_nx): Likewise.
> > * hosthooks-def.h (default_gt_pch_use_address): Change type of first
> > argument from void * to void *&.
> > (mmap_gt_pch_use_address): Likewise.
> > * hosthooks.h (struct host_hooks): Change type of first argument of
> > gt_pch_use_address hook from void * to void *&.
> > * machmode.h (gt_pch_nx): Expect a callback with 3 pointers instead of
> > two in the middle argument.
> > * poly-int.h (gt_pch_nx): Likewise.
> > * stringpool.c (gt_pch_nx): Pass NULL as new middle argument to op.
> > * tree-cfg.c (gt_pch_nx): Likewise, except for LOCATION_BLOCK pass
> > the same &(block) twice.
> > * value-range.h (gt_pch_nx): Pass NULL as new middle argument to op.
> > * vec.h (gt_pch_nx): Likewise.
> > * wide-int.h (gt_pch_nx): Likewise.
> > * config/host-darwin.c (darwin_gt_pch_use_address): Change type of
> > first argument from void * to void *&.
> > * config/host-darwin.h (darwin_gt_pch_use_address): Likewise.
> > * config/host-hpux.c (hpux_gt_pch_use_address): Likewise.
> > * config/host-linux.c (linux_gt_pch_use_address): Likewise. If
> > it couldn't succeed to mmap at the preferred location, set base
> > to the actual one. Update addr in the manual reading loop instead of
> > base.
> > * config/host-netbsd.c (netbsd_gt_pch_use_address): Change type of
> > first argument from void * to void *&.
> > * config/host-openbsd.c (openbsd_gt_pch_use_address): Likewise.
> > * config/host-solaris.c (sol_gt_pch_use_address): Likewise.
> > * config/i386/host-mingw32.c (mingw32_gt_pch_use_address): Likewise.
> > * config/rs6000/rs6000-gen-builtins.c (write_init_file): Pass NULL
> > as new middle argument to op in the generated code.
> > * doc/gty.texi: Adjust samples for the addition of middle pointer
> > to gt_pointer_operator callback.
> > gcc/ada/
> > * gcc-interface/decl.c (gt_pch_nx): Pass NULL as new middle argument
> > to op.
> > gcc/c-family/
> > * c-pch.c (c_common_no_more_pch): Pass a temporary void * var
> > with NULL value instead of NULL to host_hooks.gt_pch_use_address.
> > gcc/c/
> > * c-decl.c (resort_field_decl_cmp): Pass the same pointer twice
> > to resort_data.new_value.
> > gcc/cp/
> > * module.cc (nop): Add another void * argument.
> > * name-lookup.c (resort_member_name_cmp): Pass the same pointer twice
> > to resort_data.new_value.
> Like the prior patch in this space, you know this better than anyone, so
> if you're comfortable, I think you should commit.
>
> I wonder if this is sufficient for the Ruby team to re-enable GCC PCH
> support. IIRC the whole problem was we couldn't deal with PIE & PCH and
> I think this fixes that problem, right? If so, you might want to reach
> out to them.
>
> Jeff
Having a note in changes.html would be helpful to point to for those purposes.
Eric
prev parent reply other threads:[~2021-12-10 1:21 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-07 9:55 Jakub Jelinek
2021-12-07 14:50 ` Jakub Jelinek
2021-12-08 8:00 ` Iain Sandoe
2021-12-08 23:10 ` Jeff Law
2021-12-09 14:59 ` Jakub Jelinek
2021-12-09 15:03 ` Iain Sandoe
2021-12-09 16:42 ` Christophe Lyon
2021-12-09 16:52 ` Jeff Law
2021-12-09 16:59 ` [committed] pch: Fix aarch64 build [PR71934] Jakub Jelinek
2021-12-10 12:56 ` [PATCH] pch: Small cleanup Jakub Jelinek
2021-12-10 21:20 ` Jeff Law
2021-12-08 23:09 ` [PATCH] pch: Add support for relocation of the PCH data [PR71934] Jeff Law
2021-12-10 1:20 ` Eric Gallager [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAMfHzOvYCROTg4fsg5+cGZE_HQwh0SiQ1JAvRnE05hSVBzWiiw@mail.gmail.com \
--to=egall@gwmail.gwu.edu \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=jeffreyalaw@gmail.com \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).