From: Uros Bizjak
Date: Mon, 23 May 2022 14:16:21 +0200
Subject: Re: [x86 PING] Peephole pand;pxor into pandn
To: Roger Sayle
Cc: "gcc-patches@gcc.gnu.org"

On Mon, May 23, 2022 at 12:49 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Hopefully, if I explain even more of the context, you'll better understand why
> this harmless (and at worse seemingly redundant) peephole2 is actually critical
> for addressing significant regressions in the compiler without introducing new
> testsuite failures. I wouldn't ask (again), if I didn't feel it's important.
>
> Basically, I'm trying to unblock Hongtao's patch (for PR target/104610)
> which in your own review, explained is better handled by/during STV:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594070.html
>
> Unfortunately, that patch of mine to STV (that I want to ping next) that solves
> the P2 code quality regression PR target/70321, is itself blocked by another
> review of yours:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593200.html
> where this fix (alone) leads to a regression of the test case pr65105-5.c.
>
> This pending regression has nothing to do with TARGET_BMI's andn, but
> the idiom "if ((x & y) != y)" on ia32, where x and y are DImode, and stv/reload
> has decided to place these values in SSE registers.
>
> After combine we have an *anddi3_doubleword and *cmpdi3_doubleword:
> (insn 22 21 23 4 (parallel [
>             (set (reg:DI 97)
>                 (and:DI (reg/v:DI 92 [ p2 ])
>                     (reg:DI 88 [ _25 ])))
>             (clobber (reg:CC 17 flags))
>         ]) "pr65105-5.c":20:18 530 {*anddi3_doubleword}
>      (expr_list:REG_UNUSED (reg:CC 17 flags)
>         (nil)))
> (insn 23 22 24 4 (set (reg:CCZ 17 flags)
>         (compare:CCZ (reg/v:DI 92 [ p2 ])
>             (reg:DI 97))) "pr65105-5.c":20:8 29 {*cmpdi_doubleword}
>      (expr_list:REG_DEAD (reg:DI 97)
>         (nil)))

But originally, during combine we have (pr65105-5.c):

Trying 22 -> 23:
   22: {r97:DI=r92:DI&r88:DI;clobber flags:CC;}
      REG_UNUSED flags:CC
   23: {r98:DI=r92:DI^r97:DI;clobber flags:CC;}
      REG_DEAD r97:DI
      REG_UNUSED flags:CC
Successfully matched this instruction:
(parallel [
        (set (reg:DI 98)
            (and:DI (not:DI (reg:DI 88 [ _25 ]))
                (reg/v:DI 92 [ p2 ])))
        (clobber (reg:CC 17 flags))
    ])
allowing combination of insns 22 and 23
original costs 8 + 8 = 16
replacement cost 16
deferring deletion of insn with uid = 22.
modifying insn i3    23: {r98:DI=~r88:DI&r92:DI;clobber flags:CC;}
      REG_UNUSED flags:CC
deferring rescan insn with uid = 23.

so combine is creating:

(insn 23 22 24 4 (parallel [
            (set (reg:DI 98)
                (and:DI (not:DI (reg:DI 88 [ _25 ]))
                    (reg/v:DI 92 [ p2 ])))
            (clobber (reg:CC 17 flags))
        ]) "pr65105-5.c":20:8 552 {*andndi3_doubleword}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

why is this not the case anymore with your patch?

Uros.
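
P.S. For readers following the dump, here is a rough C sketch of the identity that the 22 -> 23 combination relies on: an AND followed by an XOR with the same operand equals a single and-not. The function names are hypothetical and only for illustration; they are not taken from i386.md or from pr65105-5.c, and uint64_t merely stands in for DImode.

```c
#include <assert.h>
#include <stdint.h>

/* The two insns as they look before combine (illustrative names). */
uint64_t and_then_xor(uint64_t p2, uint64_t mask)
{
    uint64_t t = p2 & mask;   /* insn 22: r97 = r92 & r88 */
    return p2 ^ t;            /* insn 23: r98 = r92 ^ r97 */
}

/* The single insn combine produces: r98 = ~r88 & r92. */
uint64_t and_not(uint64_t p2, uint64_t mask)
{
    return ~mask & p2;
}

/* Returns 1 when both forms give the same result for these inputs. */
int forms_agree(uint64_t p2, uint64_t mask)
{
    return and_then_xor(p2, mask) == and_not(p2, mask);
}

/* Spot-check a few bit patterns; aborts via assert on a mismatch. */
void check_samples(void)
{
    const uint64_t samples[] = { 0, 1, 0xF0F0, 0xFF00, UINT64_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(forms_agree(samples[i], samples[j]));
}
```

Bitwise, wherever mask has a 1 the XOR cancels the bit of p2, and wherever mask has a 0 the bit of p2 passes through, which is exactly ~mask & p2.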