From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 72543 invoked by alias); 26 Oct 2018 08:42:18 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 36550 invoked by uid 89); 26 Oct 2018 08:40:12 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=HX-Received:sk:h2-v6mr, CALL_EXPR_ARG, call_expr_arg X-HELO: mail-lf1-f66.google.com Received: from mail-lf1-f66.google.com (HELO mail-lf1-f66.google.com) (209.85.167.66) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 26 Oct 2018 08:40:10 +0000 Received: by mail-lf1-f66.google.com with SMTP id c16so297032lfj.8 for ; Fri, 26 Oct 2018 01:40:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ULdcfAS58m4aRwK4zFPmaMH21gtGZqWHICLWQx3QDQQ=; b=RI2zNEi8icSbDomnm5jXfcXSKhfIeRSq8Yb3amTQnVUdrr7S8uJzMC9FPx30SQS3iI 1lEPpgB/jtf3KectoRkOwNluJF2zN3tYtenlcR3d7CngCFtPOqB8z0hhvb2dWqwdGIXv zaPY5dYyogyw+ctS83yh7IC/Yy7nk5p1/vH/r+NZpPjmb+c5X9QtK12bf8HlQlA8ql0A pBk5A6PqqBg9nmFgLVSlR/FtyOKGfUa4PGVrn+J6aNHxJFgWrFooihBWF4HLRTdBt1Oq clmmWY73h+EKadQVkDOWmlRhy39+W+t3f83wJijGaNblEXgT69Fk9pvLe4xt9l4FZSl3 dMmA== MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Fri, 26 Oct 2018 09:18:00 -0000 Message-ID: Subject: Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it To: Jeff Law , Richard Sandiford Cc: Kugan Vivekanandarajah , GCC Patches Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2018-10/txt/msg01677.txt.bz2 On Fri, Oct 26, 2018 at 4:55 AM Jeff Law wrote: > > On 10/25/18 4:33 PM, Kugan Vivekanandarajah wrote: > > Hi, > > > > PR87528 showed a case where libgcc generated popcount is causing > > regression for Skylake. > > We also have PR86677 where kernel build is failing because the kernel > > does not use the libgcc (when backend is not defining popcount > > pattern). While I agree that the kernel should implement its own > > functionality when it is not using the libgcc, I am afraid that the > > implementation can have the same performance issues reported for > > Skylake in PR87528. > > > > Therefore, I would like to propose that we disable popcount detection > > when we don't have a pattern for that. The attached patch (based on > > previous discussions) does this. > > > > Bootstrapped and regression tested on x86_64-linux-gnu with no new > > regressions. We need to disable the popcount* testcases. I will have > > to define a effective_target_with_popcount in > > gcc/testsuite/lib/target-supports.exp if this patch is OK? > > Thanks, > > Kugan > > > > > > gcc/ChangeLog: > > > > 2018-10-25 Kugan Vivekanandarajah > > > > * tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN POPCOUNT > > as expensive when backend does not define it. > > > > > > gcc/testsuite/ChangeLog: > > > > 2018-10-25 Kugan Vivekanandarajah > > > > * gcc.target/aarch64/popcount4.c: New test. > > > FWIW, I've been disabling by checking direct_optab_handler elsewhere > (number_of_iterations_popcount) in my tester. It may in fact be an old > patch from you. > > Richi argued that it's the kernel team's responsibility to provide a > popcount since they don't link with libgcc. And I'm generally in > agreement with that position, though it does tend to generate some > friction with the kernel developers. We also run the real risk of GCC 9 > not being able to build the kernel which, IMHO, would be a disaster from > a PR standpoint. > > I'd like to hear from others here. I fully realize we're beyond the > realm of what is strictly technically correct here from a review standpoint. As said final value replacement to a library call is probably not wanted for optimization purpose, so adjusting expression_expensive_p is OK with me. It might not fully solve the (non-)issue in case another optimization pass chooses to materialize niter computation result. Few comments on the patch: + tree fndecl = get_callee_fndecl (expr); + + if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL) + { + combined_fn cfn = as_combined_fn (DECL_FUNCTION_CODE (fndecl)); combined_fn cfn = gimple_call_combined_fn (expr); switch (cfn) { ... cfn will be CFN_LAST for a non-builtin/internal call. I know Richard is mostly offline but eventually he knows whether there is a better way to query + CASE_CFN_POPCOUNT: + /* Check if opcode for popcount is available. */ + if (optab_handler (popcount_optab, + TYPE_MODE (TREE_TYPE (CALL_EXPR_ARG (expr, 0)))) + == CODE_FOR_nothing) + return true; note that we currently generate builtin calls rather than IFN calls (when a direct optab is supported). Another comment on the patch is that you probably have to adjust existing popcount testcases to add architecture specific flags enabling suport for the instructions, otherwise you won't see loop replacement. Also I think that the expression is only expensive (for final value replacement!) if you consider optimizing for speed. When optimizing for size getting rid of the loop is probably beneificial unconditionally. That would leave the possibility to switch said testcases to -Os. It would require adding a bool size_p flag to expression_expensive and passing down optimize_loop_for_size_p (). _NOTE_ that expression_expensive_p is also used by IVOPTs and there replacing sth with an expression based on the niter analysis result doesn't mean we get rid of the loop (but only of an IV), so maybe that reasoning doesn't apply there. Richard. > Jeff >