From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x529.google.com (mail-ed1-x529.google.com [IPv6:2a00:1450:4864:20::529]) by sourceware.org (Postfix) with ESMTPS id EB311385781F for ; Thu, 2 Sep 2021 09:25:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org EB311385781F Received: by mail-ed1-x529.google.com with SMTP id n11so1861701edv.11 for ; Thu, 02 Sep 2021 02:25:58 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=hpwagmcq+qvLba0t5agecTTjOeloPh+0znjZfFxZh7A=; b=TspA8qf/wGssv7TP14f07w2qhjN1qMMe6YPuGnrYz4nLImKf5YZUqGdq6E6Y4ntbo2 Px0F7w05Tmtyz9P+4dNT8jqff7IPOjLT3RwW16QGm5a4E3j0HlKN/QdxreHhoC1Ku02w GIe3yoOcUqYLFIIARyolD0VaJpS3+B/E3d6Z4GeHgD4PIr4Os4g6bEhHgyD+ih+VpbbL jLpy6l2LQWNI4hgqSas5WCNT8ZJiRo2+cov1BAoEI2DBIQzSMzBLWFLsiHxPT8589kki mtE4b0PjCVulMJphSH1uRv43Q47gMMwXHTFlqiBgOCKQwU5gj3gtczKmX9gbBEkmgoRt J1HQ== X-Gm-Message-State: AOAM533R8PfMwawKRiFna+hKncx2d0bfYU9lWIg97vNVu38qebnARYRT SnCRzJwILPbtZkIP5uzrvFmC5Is8zaFGIVMyDxg= X-Google-Smtp-Source: ABdhPJw/NepWGXceP2nB/qjneiFOjGEtl37akTED5VwWeATrch1QhFSVFh6ndNvK1vO1OrtCexiSXLWoAgFwDAkmJAA= X-Received: by 2002:aa7:cc88:: with SMTP id p8mr1408579edt.361.1630574757907; Thu, 02 Sep 2021 02:25:57 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Thu, 2 Sep 2021 11:25:46 +0200 Message-ID: Subject: Re: [RFC/PATCH] ipa-inline: Add target info into fn summary [PR102059] To: "Kewen.Lin" Cc: GCC Patches , Jan Hubicka , =?UTF-8?Q?Martin_Li=C5=A1ka?= , Segher Boessenkool , Bill Schmidt , Florian Weimer , Martin Jambor Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Sep 2021 09:26:09 -0000 On Wed, Sep 1, 2021 at 9:02 AM Kewen.Lin wrote: > > Hi! > > Power ISA 2.07 (Power8) introduces transactional memory feature > but ISA3.1 (Power10) removes it. It exposes one troublesome > issue as PR102059 shows. Users define some function with > target pragma cpu=power10 then it calls one function with > attribute always_inline which inherits command line option > cpu=power8 which enables HTM implicitly. The current isa_flags > check doesn't allow this inlining due to "target specific > option mismatch" and error mesasge is emitted. > > Normally, the callee function isn't intended to exploit HTM > feature, but the default flag setting make it look it has. > As Richi raised in the PR, we have fp_expressions flag in > function summary, and allow us to check the function actually > contains any floating point expressions to avoid overkill. > So this patch follows the similar idea but is more target > specific, for this rs6000 port specific requirement on HTM > feature check, we would like to check rs6000 specific HTM > built-in functions and inline assembly, it allows targets > to do their own customized checks and updates. > > It introduces two target hooks need_ipa_fn_target_info and > update_ipa_fn_target_info. The former allows target to do > some previous check and decides to collect target specific > information for this function or not. For some special case, > it can predict the analysis result and push it early without > any scannings. The latter allows the analyze_function_body > to pass gimple stmts down just like fp_expressions handlings, > target can do its own tricks. I put them as one hook initially > with one boolean to indicates whether it's initial time, but > the code looks a bit ugly, to separate them seems to have > better readability. > > To make it simple, this patch uses HOST_WIDE_INT to record the > flags just like what we use for isa_flags. For rs6000's HTM > need, one HOST_WIDE_INT variable is quite enough, but it seems > good to have one auto_vec for scalability as I noticed some > targets have more than one HOST_WIDE_INT flag. For now, this > target information collection is only for always_inline function, > function ipa_merge_fn_summary_after_inlining deals with target > information merging. > > The patch has been bootstrapped and regress-tested on > powerpc64le-linux-gnu Power9. > > Is it on the right track? + if (always_inline) + { + cgraph_node *callee_node = cgraph_node::get (callee); + if (ipa_fn_summaries && ipa_fn_summaries->get (callee_node) != NULL) + { + if (dump_file) + ipa_dump_fn_summary (dump_file, callee_node); + const vec &info = + ipa_fn_summaries->get (callee_node)->target_info; + if (!info.is_empty ()) + always_inline_safe_mask |= ~info[0] & OPTION_MASK_HTM; + } + + caller_isa &= ~always_inline_safe_mask; + callee_isa &= ~always_inline_safe_mask; + } that's a bit convoluted but obviously the IPA info can be used for non-always_inline cases as well. As said above the info can be produced for not always-inline functions as well, the usual case would be for LTO inlining across TUs compiled with different target options. In your case the special -mcpu=power10 TU would otherwise not be able to inline from a general -mcpu=power8 TU. On the streaming side we possibly have to take care about the GPU offloading path where we likely want to avoid pushing host target bits to the GPU target in some way. Your case is specifically looking for HTM target builtins - for more general cases, like for example deciding whether we can inline across, say, -mlzcnt on x86 the scanning would have to include at least direct internal-fns mapping to optabs that could change. With such inlining we might also work against heuristic tuning decisions based on the ISA availability making code (much) more expensive to expand without such ISA availability, but that wouldn't be a correctness issue then. Otherwise the overall bits look OK but I'll leave the details to our IPA folks. Thanks, Richard. > > Any comments are highly appreciated! > > BR, > Kewen > ------ > gcc/ChangeLog: > > PR ipa/102059 > * config/rs6000/rs6000-call.c (rs6000_builtin_mask_set_p): New > function. > * config/rs6000/rs6000-internal.h (rs6000_builtin_mask_set_p): New > declare. > * config/rs6000/rs6000.c (TARGET_NEED_IPA_FN_TARGET_INFO): New macro. > (TARGET_UPDATE_IPA_FN_TARGET_INFO): Likewise. > (rs6000_need_ipa_fn_target_info): New function. > (rs6000_update_ipa_fn_target_info): Likewise. > (rs6000_can_inline_p): Adjust for ipa function summary target info. > * ipa-fnsummary.c (ipa_dump_fn_summary): Adjust for ipa function > summary target info. > (analyze_function_body): Adjust for ipa function summary target > info and call hook rs6000_need_ipa_fn_target_info and > rs6000_update_ipa_fn_target_info. > (ipa_merge_fn_summary_after_inlining): Adjust for ipa function > summary target info. > (inline_read_section): Likewise. > (ipa_fn_summary_write): Likewise. > * ipa-fnsummary.h (ipa_fn_summary::target_info): New member. > * doc/tm.texi: Regenerate. > * doc/tm.texi.in (TARGET_UPDATE_IPA_FN_TARGET_INFO): Document new > hook. > (TARGET_NEED_IPA_FN_TARGET_INFO): Likewise. > * target.def (update_ipa_fn_target_info): New hook. > (need_ipa_fn_target_info): Likewise. > * targhooks.c (default_need_ipa_fn_target_info): New function. > (default_update_ipa_fn_target_info): Likewise. > * targhooks.h (default_update_ipa_fn_target_info): New declare. > (default_need_ipa_fn_target_info): Likewise. > > gcc/testsuite/ChangeLog: > > PR ipa/102059 > * gcc.dg/lto/pr102059_0.c: New test. > * gcc.dg/lto/pr102059_1.c: New test. > * gcc.dg/lto/pr102059_2.c: New test. > * gcc.target/powerpc/pr102059-5.c: New test. > * gcc.target/powerpc/pr102059-6.c: New test. >