From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by sourceware.org (Postfix) with ESMTPS id 868AB3858C3A for ; Fri, 10 Dec 2021 01:21:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 868AB3858C3A Received: by mail-wm1-x32a.google.com with SMTP id 77-20020a1c0450000000b0033123de3425so7973238wme.0 for ; Thu, 09 Dec 2021 17:21:12 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=6oxggretnxJAF2I1KFYY3nr75uH1ms+Fdydd5zsZo/4=; b=SSoT4JSs56zAXFGOtd48hv3UQw6UtDaxMr9znDmIIU6czZK+ap24AcBfn6a9VcJeqy 4ADlnZI+sMJL7+QrCWp2WlR5ICxxMYVtu9zpibusUXdm/43bCjSXFeinMqGdwcQ9B87A rYu4Zbc+iMhcqV02XQW1YpFxavAKBVItPZv55ZuO+tvEIEBmiCu+84jiRD325wZgzv0v pILf8oEz6iewRUC6x+qkSYuOFWN1KDsIJK4/aGDdk34yQ2Ab6DSriN2BWxt8nG9BODXz WY8s7O0SmFoYaYRuh4tUjerCr5h6VV5/9t27Tb3p7FxGWRX9I3sMNGYpxvlsMsqCME1H aYnA== X-Gm-Message-State: AOAM533pC/VZOKaaB/YGE/rMsJ5O1Jfxx/Xu+m9EsWlE3Lzyys+dWSjR jFRhxlkcuGJvOm2g+5RVMu6eef+PvYIDSLALti3S9w== X-Google-Smtp-Source: ABdhPJz9I9gloqxl0a8TyijxtUNBeD7vEulHzE3ZypvvHtCeLqgJbJ+UTudCW6dUifVdDpHBzZK88LXzOkL6Gelrp24= X-Received: by 2002:a05:600c:501f:: with SMTP id n31mr11992763wmr.101.1639099270991; Thu, 09 Dec 2021 17:21:10 -0800 (PST) MIME-Version: 1.0 References: <20211207095507.GV2646553@tucnak> <42aa9a74-56ee-980f-5be7-57ca218f5023@gmail.com> In-Reply-To: <42aa9a74-56ee-980f-5be7-57ca218f5023@gmail.com> From: Eric Gallager Date: Thu, 9 Dec 2021 20:20:59 -0500 Message-ID: Subject: Re: [PATCH] pch: Add support for relocation of the PCH data [PR71934] To: Jeff Law Cc: Jakub Jelinek , Richard Biener , gcc-patches Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, JMQ_SPF_NEUTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Dec 2021 01:21:16 -0000 On Wed, Dec 8, 2021 at 6:10 PM Jeff Law via Gcc-patches wrote: > > > > On 12/7/2021 2:55 AM, Jakub Jelinek wrote: > > Hi! > > > > The following patch adds support for relocation of the PCH blob on PCH > > restore if we don't manage to get the preferred map slot for it. > > The GTY stuff knows where all the pointers are, after all it relocates > > it once during PCH save from the addresses where it was initially allocated > > to addresses in the preferred map slot. > > But, if we were to do it solely using GTY info upon PCH restore, we'd need > > another set of GTY functions, which I think would make it less maintainable > > and I think it would also be more costly at PCH restore time. Those > > functions would need to call something to add bias to pointers that haven't > > been marked yet and make sure not to add bias to any pointer twice. > > > > So, this patch instead builds a relocation table (sorted list of addresses > > in the blob which needs relocation) at PCH save time, stores it in a very > > compact form into the gch file and upon restore, adjusts pointers in GTY > > roots (that is right away in the root structures) and the addresses in the > > relocation table. > > The cost on stdc++.gch/O2g.gch (previously 85MB large) is about 3% file size > > growth, there are 2.5 million pointers that need relocation in the gch blob > > and the relocation table uses uleb128 for address deltas and needs ~1.01 bytes > > for one address that needs relocation, and about 20% compile time during > > PCH save (I think it is mainly because of the need to qsort those 2.5 > > million pointers). On PCH restore, if it doesn't need relocation (the usual > > case), it is just an extra fread of sizeof (size_t) data and fseek > > (in my tests real time on vanilla tree for #include CU > > was ~0.175s and with the patch but no relocation ~0.173s), while if it needs > > relocation it took ~0.193s, i.e. 11.5% slower. > > > > The discovery of the pointers in the blob that need relocation is done > > in the relocate_ptrs hook which does the pointer relocation during PCH save. > > Unfortunately, I had to make one change to the gengtype stuff due to the > > nested_ptr feature of GTY, which some libcpp headers and stringpool.c use. > > The relocate_ptrs hook had 2 arguments, pointer to the pointer and a cookie. > > When relocate_ptrs is done, in most cases it is called solely on the > > subfields of the current object, so e.g. > > if ((void *)(x) == this_obj) > > op (&((*x).u.fld[0].rt_rtx), cookie); > > so relocate_ptrs can assert that ptr_p is within the > > state->ptrs[state->ptrs_i]->obj .. > > state->ptrs[state->ptrs_i]->obj+state->ptrs[state->ptrs_i]->size-sizeof(void*) > > range and compute from that the address in the blob which will need > > relocation (state->ptrs[state->ptrs_i]->new_addr is the new address > > given to it and ptr_p-state->ptrs[state->ptrs_i]->obj is the relative > > offset. Unfortunately, for nested_ptr gengtype emits something like: > > { > > union tree_node * x0 = > > ((*x).val.node.node) ? HT_IDENT_TO_GCC_IDENT (HT_NODE (((*x).val.node.node))) : NULL; > > if ((void *)(x) == this_obj) > > op (&(x0), cookie); > > (*x).val.node.node = (x0) ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT ((x0))) : NULL; > > } > > so relocate_ptrs is called with an address of some temporary variable and > > so doesn't know where the pointer will finally be. > > So, I've added another argument to relocate_ptrs (and to > > gt_pointer_operator). For the most common case I pass NULL as the new middle > > argument to that function, first one remains pointer to the pointer that > > needs adjustment and last the cookie. The NULL seems to be cheap to compute > > and short in the gt*.[ch] files and stands for ptr_p is an address within > > the this_obj's range, remember its address. For the nested_ptr case, the > > new middle argument contains actual address of the pointer that might need > > to be relocated, so instead of the above > > op (&(x0), &((*x).val.node.node), cookie); > > in there. And finally, e.g. for the reorder case I need a way to tell > > restore_ptrs to ignore a particular address for the relocation purposes > > and only treat it the old way. I've used for that the case when > > the first and second arguments are equal. > > > > In order to enable support for mapping PCH as fallback at different > > addresses than the preferred ones, a small change is needed to the > > host pch_use_address hooks. One change I've done to all of them is > > the change of the type of the first argument from void * to void *&, > > such that the actual address can be told to the callers (or shall I > > instead use void **?), but another change that still needs to be done > > in them if they want the relocation is actually not fail if they couldn't > > get a preferred address, but instead modify what the first argument > > refers to. I've done that only for host-linux.c and Iain is testing > > similar change for host-darwin.c. Didn't change hpux, netbsd, openbsd, > > solaris, mingw32 or the fallbacks because I can't test those. > > > > Bootstrapped/regtested on x86_64-linux and i686-linux (both as is and with > > incremental > > --- gcc/config/host-linux.c.jj 2021-12-06 22:22:42.007777367 +0100 > > +++ gcc/config/host-linux.c 2021-12-07 00:21:53.052674040 +0100 > > @@ -191,6 +191,8 @@ linux_gt_pch_use_address (void *&base, s > > if (size == 0) > > return -1; > > > > +base = (char *) base + ((size + 8191) & (size_t) -4096); > > + > > /* Try to map the file with MAP_PRIVATE. */ > > addr = mmap (base, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, offset); > > > > which forces all PCH restores to be relocated. An earlier version of the > > patch has been also regrest with base = (char *) base + 16384; in that spot, > > so both relocation to a non-overlapping spot and to an overlapping spot have > > been tested. > > > > 2021-12-07 Jakub Jelinek > > > > PR pch/71934 > > * coretypes.h (gt_pointer_operator): Use 3 pointer arguments instead > > of two. > > * gengtype.c (struct walk_type_data): Add in_nested_ptr argument. > > (walk_type): Temporarily set d->in_nested_ptr around nested_ptr > > handling. > > (write_types_local_user_process_field): Pass a new middle pointer > > to gt_pointer_operator op calls, if d->in_nested_ptr pass there > > address of d->prev_val[2], otherwise NULL. > > (write_types_local_process_field): Likewise. > > * ggc-common.c (relocate_ptrs): Add real_ptr_p argument. If equal > > to ptr_p, do nothing, otherwise if NULL remember ptr_p's > > or if non-NULL real_ptr_p's corresponding new address in > > reloc_addrs_vec. > > (reloc_addrs_vec): New variable. > > (compare_ptr, read_uleb128, write_uleb128): New functions. > > (gt_pch_save): When iterating over objects through relocate_ptrs, > > save current i into state.ptrs_i. Sort reloc_addrs_vec and emit > > it as uleb128 of differences between pointer addresses into the > > PCH file. > > (gt_pch_restore): Allow restoring of PCH to a different address > > than the preferred one, in that case adjust global pointers by bias > > and also adjust by bias addresses read from the relocation table > > as uleb128 differences. Otherwise fseek over it. Perform > > gt_pch_restore_stringpool only after adjusting callbacks and for > > callback adjustments also take into account the bias. > > (default_gt_pch_use_address): Change type of first argument from > > void * to void *&. > > (mmap_gt_pch_use_address): Likewise. > > * ggc-tests.c (gt_pch_nx): Pass NULL as new middle argument to op. > > * hash-map.h (hash_map::pch_nx_helper): Likewise. > > (gt_pch_nx): Likewise. > > * hash-set.h (gt_pch_nx): Likewise. > > * hash-table.h (gt_pch_nx): Likewise. > > * hash-traits.h (ggc_remove::pch_nx): Likewise. > > * hosthooks-def.h (default_gt_pch_use_address): Change type of first > > argument from void * to void *&. > > (mmap_gt_pch_use_address): Likewise. > > * hosthooks.h (struct host_hooks): Change type of first argument of > > gt_pch_use_address hook from void * to void *&. > > * machmode.h (gt_pch_nx): Expect a callback with 3 pointers instead of > > two in the middle argument. > > * poly-int.h (gt_pch_nx): Likewise. > > * stringpool.c (gt_pch_nx): Pass NULL as new middle argument to op. > > * tree-cfg.c (gt_pch_nx): Likewise, except for LOCATION_BLOCK pass > > the same &(block) twice. > > * value-range.h (gt_pch_nx): Pass NULL as new middle argument to op. > > * vec.h (gt_pch_nx): Likewise. > > * wide-int.h (gt_pch_nx): Likewise. > > * config/host-darwin.c (darwin_gt_pch_use_address): Change type of > > first argument from void * to void *&. > > * config/host-darwin.h (darwin_gt_pch_use_address): Likewise. > > * config/host-hpux.c (hpux_gt_pch_use_address): Likewise. > > * config/host-linux.c (linux_gt_pch_use_address): Likewise. If > > it couldn't succeed to mmap at the preferred location, set base > > to the actual one. Update addr in the manual reading loop instead of > > base. > > * config/host-netbsd.c (netbsd_gt_pch_use_address): Change type of > > first argument from void * to void *&. > > * config/host-openbsd.c (openbsd_gt_pch_use_address): Likewise. > > * config/host-solaris.c (sol_gt_pch_use_address): Likewise. > > * config/i386/host-mingw32.c (mingw32_gt_pch_use_address): Likewise. > > * config/rs6000/rs6000-gen-builtins.c (write_init_file): Pass NULL > > as new middle argument to op in the generated code. > > * doc/gty.texi: Adjust samples for the addition of middle pointer > > to gt_pointer_operator callback. > > gcc/ada/ > > * gcc-interface/decl.c (gt_pch_nx): Pass NULL as new middle argument > > to op. > > gcc/c-family/ > > * c-pch.c (c_common_no_more_pch): Pass a temporary void * var > > with NULL value instead of NULL to host_hooks.gt_pch_use_address. > > gcc/c/ > > * c-decl.c (resort_field_decl_cmp): Pass the same pointer twice > > to resort_data.new_value. > > gcc/cp/ > > * module.cc (nop): Add another void * argument. > > * name-lookup.c (resort_member_name_cmp): Pass the same pointer twice > > to resort_data.new_value. > Like the prior patch in this space, you know this better than anyone, so > if you're comfortable, I think you should commit. > > I wonder if this is sufficient for the Ruby team to re-enable GCC PCH > support. IIRC the whole problem was we couldn't deal with PIE & PCH and > I think this fixes that problem, right? If so, you might want to reach > out to them. > > Jeff Having a note in changes.html would be helpful to point to for those purposes. Eric