From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 37775 invoked by alias); 16 Sep 2016 08:59:31 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 37737 invoked by uid 89); 16 Sep 2016 08:59:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RCVD_IN_SORBS_SPAM,SPF_PASS,URIBL_RED autolearn=ham version=3.3.2 spammy=yesterday, H*i:sk:87twdgb, H*f:sk:87twdgb, you! X-Spam-User: qpsmtpd, 2 recipients X-HELO: mail-wm0-f47.google.com Received: from mail-wm0-f47.google.com (HELO mail-wm0-f47.google.com) (74.125.82.47) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 16 Sep 2016 08:59:19 +0000 Received: by mail-wm0-f47.google.com with SMTP id b187so25333555wme.1; Fri, 16 Sep 2016 01:59:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=ZeEhes+5I7jpQMsOLOmGzNEvc8QhVvLHgeSH+KY9Wus=; b=TMC5QGv8zVNdZYaUQeQIJ4kbiuy/cPo5LX+wnvC8n67rJ+8czmP1PE2zKkdhJm6EDc vZxyPeCyxVUUmsvWea8X9dEHNYKnwRCkX/l2zJCw//I218PF05rqpI7S4t5oOPyPldLc 9e3alASRqQKLo8Q8adG2jcX44HQ+/Ct85b1L6RFxsv/MYAIC4/IY12maRiPQzY+P3sbl 9MCwlIdU6MTUy0S99pi9vDUwGGMYw/N1nSwSrDTyvr1mncnUfZ9o9o3rMadTJiVKz7QD /Xz9Na6fWZts581aAb+nHV+gTbn7aQEsdFsz6V1oJudAi+NuQmNs9stFN3B7dKMa+5nR cCHA== X-Gm-Message-State: AE9vXwPDkvIjn0aq+2JvhYs76a5AhBQ9A/tdKskdnfkfkzgX4MPT+LFNdbdNj4AMp/G8Jw0oqW3qik+G99BNkg== X-Received: by 10.194.88.137 with SMTP id bg9mr11465002wjb.155.1474016357457; Fri, 16 Sep 2016 01:59:17 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.137.129 with HTTP; Fri, 16 Sep 2016 01:59:16 -0700 (PDT) In-Reply-To: <87twdgb9zi.fsf@hertz.schwinge.homeip.net> References: <20160817154244.GA39270@arm.com> <87h99s53ls.fsf@kepler.schwinge.homeip.net> <87d1kefwgt.fsf@hertz.schwinge.homeip.net> <87twdgb9zi.fsf@hertz.schwinge.homeip.net> From: Richard Biener Date: Fri, 16 Sep 2016 09:02:00 -0000 Message-ID: Subject: Re: [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types (was: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6])) To: Thomas Schwinge Cc: GCC Patches , Joseph Myers , GCC Development , Jakub Jelinek , Bernd Schmidt Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2016-09/txt/msg01016.txt.bz2 On Fri, Sep 16, 2016 at 9:05 AM, Thomas Schwinge wrote: > Hi! > > (CCing Bernd and Jakub -- for your information, or: "amusement" -- as > you've discussed offloading preload_common_nodes issues before...) > > Got to look into this some more, yesterday: > > On Thu, 08 Sep 2016 13:43:30 +0200, I wrote: >> On Wed, 7 Sep 2016 14:23:18 +0200, Richard Biener wrote: >> > On Wed, Sep 7, 2016 at 1:52 PM, Thomas Schwinge wrote: >> > > On Fri, 19 Aug 2016 11:05:59 +0000, Joseph Myers wrote: >> > >> On Fri, 19 Aug 2016, Richard Biener wrote: >> > >> > Can you quickly verify if LTO works with the new types? I don't = see anything >> > >> > that would prevent it but having new global trees and backends in= itializing them >> > >> > might come up with surprises (see tree-streamer.c:preload_common_= nodes) >> > >> >> > >> Well, the execution tests are in gcc.dg/torture, which is run with = various >> > >> options including -flto (and I've checked the testsuite logs to con= firm >> > >> these tests are indeed run with such options). Is there something = else >> > >> you think should be tested? >> > > >> > > As I noted in : >> > > >> > > As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx = types", nvptx >> > > offloading is broken, ICEs in LTO stream-in. Probably some kind= of data-type >> > > mismatch that is not visible with Intel MIC offloading (using th= e same data >> > > types) but explodes with nvptx. I'm having a look. > >> > [...] preload_common_nodes. This is carefully crafted to _not_ diverg= e by >> > frontend (!) it wasn't even designed to cope with global trees being p= resent >> > or not dependent on target (well, because the target is always the >> > same! mind you!) >> >> Scary. ;-/ >> >> > Now -- in theory it should deal with NULLs just fine (resulting in >> > error_mark_node), but it can diverge when there are additional >> > compount types (like vectors, complex >> > or array or record types) whose element types are not in the set of >> > global trees. >> > The complex _FloatN types would be such a case given they appear befor= e their >> > components. That mixes up the ordering at least. >> >> ACK, but that's also an issue for "regular" float/complex float, which >> also is in "reverse" order. But that's "fixed" by the recursion in >> gcc/tree-streamer.c:record_common_node for "TREE_CODE (node) =3D=3D >> COMPLEX_TYPE". This likewise seems to work for the _FloatN types. > > So, that mechanism does work, but what's going wrong is the following: > with differing target vs. offload target, we potentially (and actually > do, in the case of x86_64 vs. nvptx) have different sets of _FloatN and > _FloatNx types. That is, for nvptx, a few of these don't exist (so, > NULL_TREE), and so it follows their complex variants also don't exist, > and thus the recursion that I just mentioned for complex types is no > longer done in lockstep in the x86_64 cc1 vs. the nvptx lto1, hence we > get an offset (of two, in this specific case), and consequently streaming > explodes, for example, as soon as it hits a forward-reference (due to > looking for tree 185 (x86_64 cc1 view; as encoded in the stream) when it > should be looking for tree 183 (nvptx lto1 view). > >> (I've >> put "fixed" in quotes -- doesn't that recursion mean that we're thus >> putting "complex float", "float", [...], "float" (again) into the cache? >> Anyway that's for later...) > > Maybe it would make sense to do this tree streaming in two passes: first > build a set of what we actually need, and then stream that, without > duplicates. (Or, is also for these "pickled" trees their order relevant, > so that one tree may only refer to other earlier but not later ones? > Anyway, we could still remember the set of trees already streamed, and > avoid the double streaming I described?) > > So I now understand that due to the stream format, the integer tree IDs > (cache->next_id) have to match in all cc1/lto1s (etc.), so we'll have to > make that work for x86_64 target vs. nvptx offload target being > different. (I'm pondering some ideas about how to rework that integer > tree ID generation.) > > (I have not digested completely yet what the implications are for the > skipping we're doing for some trees in preload_common_nodes), but here is > a first patch that at least gets us rid of the immediate problem. (I'll > fix the TODO by adding a "#define TI_COMPLEX_FLOATN_NX_TYPE_LAST" to > gcc/tree-core.h, OK?) Will such a patch be OK for trunk, at least for > now? Humm ... do we anywhere compare to those global trees by pointer equivalenc= e? If so then it breaks LTO support for those types. I think forcing proper ordering so that we can assert that at the point we'd like to recurse the nodes we recurse for are already in the cache would be a bet= ter fix. This might need some additional global trees in case components do not explicitely exist and it might need re-ordering of a few globals. Can you try if all hell breaks lose if you change the recursions to instead= do gcc_assert (streamer_tree_cache_lookup (cache, , &te= m)); ? Thanks, Richard. > commit d2045bd028104be2ede5ae1d5d7b9395e67a8180 > Author: Thomas Schwinge > Date: Thu Sep 15 21:56:16 2016 +0200 > > [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _Float= Nx types > > gcc/ > PR lto/77458 > * tree-streamer.c (preload_common_nodes): Skip _FloatN and > _FloatNx types if offloading. > --- > gcc/tree-streamer.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git gcc/tree-streamer.c gcc/tree-streamer.c > index 7ea7096..061d831 100644 > --- gcc/tree-streamer.c > +++ gcc/tree-streamer.c > @@ -333,7 +333,12 @@ preload_common_nodes (struct streamer_tree_cache_d *= cache) > && (!lto_stream_offload_p > || (i !=3D TI_VA_LIST_TYPE > && i !=3D TI_VA_LIST_GPR_COUNTER_FIELD > - && i !=3D TI_VA_LIST_FPR_COUNTER_FIELD))) > + && i !=3D TI_VA_LIST_FPR_COUNTER_FIELD > + /* Likewise for the _FloatN and _FloatNx types. */ > + && (i < TI_FLOATN_NX_TYPE_FIRST > + || i > TI_FLOATN_NX_TYPE_LAST) > + && (i < TI_COMPLEX_FLOATN_NX_TYPE_FIRST > + || i > /* TODO */ TI_COMPLEX_FLOAT128X_TYPE)))) > record_common_node (cache, global_trees[i]); > } > > > > Gr=C3=BC=C3=9Fe > Thomas