From: Richard Biener
Date: Thu, 3 Aug 2023 09:04:28 +0200
Subject: Re: [PATCH v1] [RFC] Improve folding for comparisons with zero in tree-ssa-forwprop.
To: Manolis Tsamis
Cc: Jeff Law, Philipp Tomsich, Andrew MacLeod, gcc-patches@gcc.gnu.org
References: <20230316152706.2214124-1-manolis.tsamis@vrull.eu> <8bd8b246-a252-0e05-414c-ab1e35975aea@gmail.com>

On Wed, Aug 2, 2023 at 4:08 PM Manolis Tsamis wrote:
>
> Hi all,
>
> I'm pinging to discuss again if we want to move this forward for GCC14.
>
> I did some testing again and I haven't been able to find obvious
> regressions, including testing the code from PR86270 and PR70359 that
> Richard mentioned.
> I still believe that zero can be considered a special case even for
> hardware that doesn't directly benefit in the comparison.
> For example it happens that the testcase from the commit compiles to
> one instruction less in x86:
>
> .LFB0:
>         movl    (%rdi), %eax
>         leal    1(%rax), %edx
>         movl    %edx, (%rdi)
>         testl   %eax, %eax
>         je      .L4
>         ret
> .L4:
>         jmp     g
>
> vs
>
> .LFB0:
>         movl    (%rdi), %eax
>         addl    $1, %eax
>         movl    %eax, (%rdi)
>         cmpl    $1, %eax
>         je      .L4
>         ret
> .L4:
>         xorl    %eax, %eax
>         jmp     g
>
> (The xorl is not emitted when testl is used. LLVM uses testl but also
> does xor eax, eax :) )
> Although this is accidental, I believe it also showcases that zero is
> a preferential value in various ways.
>
> I'm running benchmarks comparing the effects of this change and I'm
> also still looking for testcases that result in problematic
> regressions.
> Any feedback or other concerns about this are appreciated!

My comment from Apr 24th still holds, IMO this is something for
instruction selection (aka the ISEL pass) or the out-of-SSA tweaks we do
during RTL expansion (see insert_backedge_copies)

Richard.

> Thanks,
> Manolis
>
> On Wed, Apr 26, 2023 at 9:43 AM Richard Biener
> wrote:
> >
> > On Wed, Apr 26, 2023 at 4:30 AM Jeff Law wrote:
> > >
> > >
> > >
> > > On 4/25/23 01:21, Richard Biener wrote:
> > > > On Tue, Apr 25, 2023 at 1:05 AM Jeff Law wrote
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On 4/24/23 02:06, Richard Biener via Gcc-patches wrote:
> > > >>> On Fri, Apr 21, 2023 at 11:01 PM Philipp Tomsich
> > > >>> wrote:
> > > >>>>
> > > >>>> Any guidance on the next steps for this patch?
> > > >>>
> > > >>> I think we want to perform this transform later, in particular when
> > > >>> the test is a loop exit test we do not want to do it as it prevents
> > > >>> coalescing of the IV on the backedge at out-of-SSA time.
> > > >>>
> > > >>> That means doing the transform in folding and/or before inlining
> > > >>> (the test could become a loop exit test) would be a no-go.
> > > >>> In fact for SSA coalescing we'd want the reverse transform in some
> > > >>> cases, see PRs 86270 and 70359.
> > > >>>
> > > >>> If we can reliably undo for the loop case I suppose we can do the
> > > >>> canonicalization to compare against zero.  In any case please split
> > > >>> up the patch (note I've also hoped we could eventually get rid of
> > > >>> that part of tree-ssa-forwprop.cc in favor of match.pd patterns
> > > >>> since it uses GENERIC folding :/).
> > > >>>
> > > >> Do we have enough information to do this at expansion time?  That would
> > > >> avoid introducing the target dependencies to drive this in gimple.
> > > >
> > > > I think so, but there isn't any convenient place to do this I think.  I suppose
> > > > there's no hope to catch it during RTL opts?
> > > Combine would be the most natural place in the RTL pipeline, but it'd be
> > > a 2->2 combination which would be rejected.
> > >
> > > We could possibly do it as a define_insn_and_split, but the gimple->RTL
> > > interface seems like a better fit to me.  If TER has done its job, we
> > > should see a complex enough tree node to do the right thing.
> >
> > Of course we'd want to get rid of TER in favor of ISEL
> >
> > Richard.
> >
> > > jeff
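
For readers following the thread, the two code shapes being compared can be
sketched in C.  This is an illustration only: the actual testcase from the
commit is not quoted in this message, so f_cmp_one, f_cmp_zero, g and
forms_agree are hypothetical names, and unsigned arithmetic is assumed so that
the wrap-around case is well defined.  The first function tests the
incremented value against 1 (the cmpl $1 shape), the second tests the original
value against 0 (the testl shape).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the callee `g` in the quoted assembly;
   it only counts how often it was called. */
static int g_calls;
static void g(void) { g_calls++; }

/* Shape 1: increment, store, then compare the NEW value against 1. */
static void f_cmp_one(unsigned *p)
{
    unsigned t = *p + 1;
    *p = t;
    if (t == 1)
        g();
}

/* Shape 2: increment, store, then compare the OLD value against 0.
   Note that both the old and the new value are live across the store. */
static void f_cmp_zero(unsigned *p)
{
    unsigned old = *p;
    *p = old + 1;
    if (old == 0)
        g();
}

/* Runs both shapes from the same starting value and reports whether they
   store the same result and make the same call decision. */
static bool forms_agree(unsigned start)
{
    unsigned a = start, b = start;
    int calls_one, calls_zero;

    g_calls = 0;
    f_cmp_one(&a);
    calls_one = g_calls;

    g_calls = 0;
    f_cmp_zero(&b);
    calls_zero = g_calls;

    return a == b && calls_one == calls_zero;
}
```

The two shapes are interchangeable for every starting value, including the
wrapping one, so the choice between them is purely about code generation.
The second shape keeps the pre-increment value alive next to the incremented
one, which is the property the coalescing concern for loop exit tests is
about.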