From: Manolis Tsamis
Date: Wed, 2 Aug 2023 17:07:55 +0300
Subject: Re: [PATCH v1] [RFC] Improve folding for comparisons with zero in tree-ssa-forwprop.
To: Richard Biener
Cc: Jeff Law, Philipp Tomsich, Andrew MacLeod, gcc-patches@gcc.gnu.org
References: <20230316152706.2214124-1-manolis.tsamis@vrull.eu> <8bd8b246-a252-0e05-414c-ab1e35975aea@gmail.com>

Hi all,

I'm pinging to discuss again if we want to move this forward for GCC14.

I did some testing again and I haven't been able to find obvious
regressions, including testing the code from PR86270 and PR70359 that
Richard mentioned. I still believe that zero can be considered a
special case even for hardware that doesn't directly benefit in the
comparison. For example it happens that the testcase from the commit
compiles to one instruction less in x86:

.LFB0:
    movl    (%rdi), %eax
    leal    1(%rax), %edx
    movl    %edx, (%rdi)
    testl   %eax, %eax
    je      .L4
    ret
.L4:
    jmp     g

vs

.LFB0:
    movl    (%rdi), %eax
    addl    $1, %eax
    movl    %eax, (%rdi)
    cmpl    $1, %eax
    je      .L4
    ret
.L4:
    xorl    %eax, %eax
    jmp     g

(The xorl is not emitted when testl is used.
LLVM uses testl but also does xor eax, eax :) )

Although this is accidental, I believe it also showcases that zero is
a preferential value in various ways.

I'm running benchmarks comparing the effects of this change and I'm
also still looking for testcases that result in problematic
regressions.

Any feedback or other concerns about this are appreciated!

Thanks,
Manolis

On Wed, Apr 26, 2023 at 9:43 AM Richard Biener wrote:
>
> On Wed, Apr 26, 2023 at 4:30 AM Jeff Law wrote:
> >
> > On 4/25/23 01:21, Richard Biener wrote:
> > > On Tue, Apr 25, 2023 at 1:05 AM Jeff Law wrote:
> > >>
> > >> On 4/24/23 02:06, Richard Biener via Gcc-patches wrote:
> > >>> On Fri, Apr 21, 2023 at 11:01 PM Philipp Tomsich
> > >>> wrote:
> > >>>>
> > >>>> Any guidance on the next steps for this patch?
> > >>>
> > >>> I think we want to perform this transform later, in particular when
> > >>> the test is a loop exit test we do not want to do it as it prevents
> > >>> coalescing of the IV on the backedge at out-of-SSA time.
> > >>>
> > >>> That means doing the transform in folding and/or before inlining
> > >>> (the test could become a loop exit test) would be a no-go.  In fact
> > >>> for SSA coalescing we'd want the reverse transform in some cases, see
> > >>> PRs 86270 and 70359.
> > >>>
> > >>> If we can reliably undo for the loop case I suppose we can do the
> > >>> canonicalization to compare against zero.  In any case please split
> > >>> up the patch (note I've also hoped we could eventually get rid of
> > >>> that part of tree-ssa-forwprop.cc in favor of match.pd patterns
> > >>> since it uses GENERIC folding :/).
> > >>>
> > >> Do we have enough information to do this at expansion time?  That would
> > >> avoid introducing the target dependencies to drive this in gimple.
> > >
> > > I think so, but there isn't any convenient place to do this I think.
> > > I suppose there's no hope to catch it during RTL opts?
> > Combine would be the most natural place in the RTL pipeline, but it'd be
> > a 2->2 combination which would be rejected.
> >
> > We could possibly do it as a define_insn_and_split, but the gimple->RTL
> > interface seems like a better fit to me.  If TER has done its job, we
> > should see a complex enough tree node to do the right thing.
>
> Of course we'd want to get rid of TER in favor of ISEL
>
> Richard.
>
> > jeff
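
For reference, the canonicalization discussed in this thread can be
sketched in C roughly as below. This is only an illustration under
stated assumptions: the testcase from the commit is not reproduced in
this message, so the shape of the function is inferred from the two
assembly listings, the operand type is assumed to be unsigned int (so
that wraparound is well defined), and the names f_cmp_one, f_cmp_zero,
same_behavior, check_all and g_calls are hypothetical. The first form
compares the incremented value against 1 (the cmpl $1 listing); the
second compares the pre-increment value against 0 (the testl listing).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical counter standing in for the side effect of calling g(). */
static int g_calls;

static void g(void) { g_calls++; }

/* Compare the incremented value against 1 (cf. the cmpl $1 listing). */
static void f_cmp_one(unsigned int *p)
{
    unsigned int x = *p + 1;
    *p = x;
    if (x == 1)
        g();
}

/* Compare the pre-increment value against 0 (cf. the testl listing). */
static void f_cmp_zero(unsigned int *p)
{
    unsigned int old = *p;
    *p = old + 1;
    if (old == 0)
        g();
}

/* Returns 1 if both forms store the same value and call g() equally often. */
static int same_behavior(unsigned int start)
{
    unsigned int a = start, b = start;
    int calls_one, calls_zero;

    g_calls = 0;
    f_cmp_one(&a);
    calls_one = g_calls;

    g_calls = 0;
    f_cmp_zero(&b);
    calls_zero = g_calls;

    return a == b && calls_one == calls_zero;
}

/* Spot-check boundary values, including wraparound at UINT_MAX. */
static void check_all(void)
{
    const unsigned int inputs[] = { 0u, 1u, 2u, UINT_MAX - 1u, UINT_MAX };

    for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
        assert(same_behavior(inputs[i]));
}
```

Calling check_all() from a small driver exercises both forms on a few
boundary inputs. The sketch only shows that the two source-level forms
are equivalent for unsigned operands; it says nothing about which one
is preferable for IV coalescing in the loop-exit case raised above.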