From: Richard Biener
Date: Fri, 15 Sep 2023 15:10:10 +0200
Subject: Re: [WIP] Re-introduce 'TREE_USED' in tree streaming
To: Thomas Schwinge
Cc: Jan Hubicka, gcc-patches@gcc.gnu.org, Tobias Burnus, Jakub Jelinek
References: <87jzsryerg.fsf@euler.schwinge.homeip.net> <871qez7fqm.fsf@euler.schwinge.homeip.net>

On Fri, Sep 15, 2023 at 3:05 PM Richard Biener wrote:
>
> On Fri, Sep 15, 2023 at 3:01 PM Thomas Schwinge wrote:
> >
> > Hi!
> >
> > On 2023-09-15T12:11:44+0200, Richard Biener via Gcc-patches wrote:
> > > On Fri, Sep 15, 2023 at 11:20 AM Thomas Schwinge
> > > wrote:
> > >> Now, that was another quirky debug session: in
> > >> 'gcc/omp-low.cc:create_omp_child_function' we clearly do set
> > >> 'TREE_USED (t) = 1;' for '.omp_data_i', which ends up as formal parameter
> > >> for outlined '[...]._omp_fn.[...]' functions, pointing to the "OMP blob".
> > >> Yet, in offloading compilation, I only ever got '!TREE_USED' for the
> > >> formal parameter '.omp_data_i'. This greatly disturbs a nvptx back end
> > >> expand-time transformation that I have implemented, that's active
> > >> 'if (!TREE_USED ([formal parameter]))'.
> > >>
> > >> After checking along all the host-side OMP handling, eventually (in
> > >> hindsight: "obvious"...) I found that, "simply", we're not streaming
> > >> 'TREE_USED'! With that changed (see attached
> > >> "Re-introduce 'TREE_USED' in tree streaming"; no visible changes in
> > >> x86_64-pc-linux-gnu and powerpc64le-unknown-linux-gnu 'make check'), my
> > >> issue was quickly addressed -- if not for the question *why* 'TREE_USED'
> > >> isn't streamed (..., and apparently, that's a problem only for my
> > >> case..?), and then I found that it's *intentionally been removed*
> > >> in one-decade-old commit ee03e71d472a3f73cbc1a132a284309f36565972
> > >> (Subversion r200151) "Re-write LTO type merging again, do tree merging".
> > >>
> > >> At this point, I need help: is this OK to re-introduce unconditionally,
> > >> or in some conditionalized form (but, "ugh..."), or be done differently
> > >> altogether in the nvptx back end (is 'TREE_USED' considered "stale" at
> > >> some point in the compilation pipeline?), or do we need some logic in
> > >> tree stream read-in (?) to achieve the same thing that removing
> > >> 'TREE_USED' streaming apparently did achieve, or yet something else?
> > >> Indeed, from a quick look, most use of 'TREE_USED' seems to be "early",
> > >> but I saw no reason that it couldn't be used "late", either?
> > >
> > > TREE_USED is considered stale, it doesn't reflect reality and is used with
> > > different semantics throughout the pass pipeline
> >
> > Aha, thanks. Any suggestion about how to update 'gcc/tree.h:TREE_USED',
> > for next time, to detail at which stages the properties indicated there
> > are meaningful? (..., and we shall also add some such comment in the two
> > tree streamer functions.)
> >
> > > so it doesn't make much sense
> > > to stream it also because it will needlessly cause divergence between TUs
> > > during tree merging.
> >
> > Right, that's what I'd assumed from quickly skimming the 2013 discussion.
> >
> > > So we definitely do not want to stream TREE_USED for
> > > every tree.
> > >
> > > Why would you guard anything late on TREE_USED? If you want to know
> > > whether a formal parameter is "used" (used in code generation? used in the
> > > source?) you have to compute this property. As you can see using TREE_USED
> > > is fragile.
> >
> > The issue is: for function call outgoing/incoming arguments, the nvptx
> > back end has (to use) a mechanism different from usual targets. For the
> > latter, the incoming arguments are readily available in registers or on
> > the stack, without requiring emission of any setup instructions. For
> > nvptx, we have to generate boilerplate code for every function incoming
> > argument, to load the argument value into a local register. (The latter
> > are then, at least for '-O0', spilled to and restored from the stack
> > frame, before the first actual use -- if there's any use at all.)
> >
> > This generates some bulky PTX code, which goes so far that we run into
> > timeout or OOM-killed 'ptxas' for 'gcc.c-torture/compile/limits-fndefn.c'
> > at '-O0', for example, where we've got half a million lines of
> > boilerplate PTX code. That one certainly is a rogue test case, but I
> > then found that if I conditionalize emission of that incoming argument
> > setup code on 'TREE_USED' of the respective element of the chain of
> > 'DECL_ARGUMENTS', then I do get the desired behavior: zero-instructions
> > 'limits-fndefn.S'. So this "late" use of 'TREE_USED' does work -- just
> > that, as discussed, 'TREE_USED' isn't available in the offloading
> > setting. ;-)
> >
> > I'll look into computing "unused" locally, before/for nvptx expand time.
> > (To make the '-O0' case work, I figure this has to happen early, instead
> > of later DCEing the mess that we generated earlier.) Any quick
> > suggestions?
> > My naïve first idea would be to simply in
> > 'TARGET_FUNCTION_INCOMING_ARG' scan if the corresponding element of
> > 'DECL_ARGUMENTS' is used in the function, or maybe do that once for all
> > 'DECL_ARGUMENTS' in 'INIT_CUMULATIVE_INCOMING_ARGS'.
>
> RTL expansion re-computes TREE_USED (well, it computes something into
> it related to use), but it does so only for BLOCK scope variables and
> local decls.
> I suppose extending it to also re-compute TREE_USED for formal parameters
> should be straight-forward.

Btw, it does sound somewhat like premature optimization for the
limits-fndefn testcase, doesn't it?

> Richard.
>
> >
> >
> > Regards
> > Thomas
> >
> >
> > >> Original discussion "not streaming and comparing TREE_USED":
> > >>
> > >> "[RFC] Re-write LTO type merging again, do tree merging", continued
> > >>
> > >> "Re-write LTO type merging again, do tree merging".
> > >>
> > >>
> > >> In 2013, offloading compilation was just around the corner --
> > >>
> > >> "Summary of the Accelerator BOF at Cauldron" -- and you easily could've
> > >> foreseen this issue, no? ;-P
> > >>
> > >>
> > >> Regards
> > >> Thomas
> > >>
> > >>
> > >> -----------------
> > >> Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company (GmbH); Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955
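
For illustration only, here is a toy C++ model of the "compute this property" idea discussed above: walk the function body once, record which declarations are referenced, and emit incoming-argument setup only for parameters that were seen. This is a sketch under stated assumptions, not GCC code. 'Node', 'Function', 'make_ref', 'make_op', 'compute_param_used' and 'count_arg_setups' are all hypothetical names; 'Function::params' merely plays the role of 'DECL_ARGUMENTS', and no GCC internals or target hooks are used.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tree node: either a reference to a
// declaration (decl is non-empty) or an operation with operands.
struct Node {
    std::string decl;
    std::vector<std::unique_ptr<Node>> ops;
};

// Hypothetical stand-in for a function: 'params' plays the role of
// DECL_ARGUMENTS, 'body' is the list of statements.
struct Function {
    std::vector<std::string> params;
    std::vector<std::unique_ptr<Node>> body;
};

std::unique_ptr<Node> make_ref(const std::string &name) {
    auto n = std::make_unique<Node>();
    n->decl = name;
    return n;
}

std::unique_ptr<Node> make_op(std::unique_ptr<Node> a, std::unique_ptr<Node> b) {
    auto n = std::make_unique<Node>();
    n->ops.push_back(std::move(a));
    n->ops.push_back(std::move(b));
    return n;
}

// Record every declaration name referenced below 'n'.
static void collect_refs(const Node &n, std::unordered_set<std::string> &seen) {
    if (!n.decl.empty())
        seen.insert(n.decl);
    for (const auto &op : n.ops)
        collect_refs(*op, seen);
}

// One walk over the body for all parameters, then one lookup per parameter.
std::vector<bool> compute_param_used(const Function &fn) {
    std::unordered_set<std::string> seen;
    for (const auto &stmt : fn.body)
        collect_refs(*stmt, seen);
    std::vector<bool> used;
    used.reserve(fn.params.size());
    for (const auto &p : fn.params)
        used.push_back(seen.count(p) != 0);
    return used;
}

// Number of incoming-argument setup sequences that would be emitted if
// setup is skipped for parameters never referenced in the body.
std::size_t count_arg_setups(const Function &fn) {
    const std::vector<bool> used = compute_param_used(fn);
    assert(used.size() == fn.params.size());
    std::size_t n = 0;
    for (bool u : used)
        if (u)
            ++n;
    return n;
}
```

Computing the set once per function, rather than rescanning the body for each parameter, keeps the cost linear in body size plus parameter count, which presumably matters for a test case with a very large number of parameters. Whether such a scan belongs in a target hook or in RTL expansion is exactly the open question in the thread, and this model takes no position on it.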