From: Jakub Jelinek
To: Richard Biener
Cc: Drew Ross, gcc-patches@gcc.gnu.org
Date: Tue, 11 Jul 2023 15:08:12 +0200
Subject: Re: [PATCH] match.pd: Implement missed optimization (~X | Y) ^ X -> ~(X & Y) [PR109986]
References: <20230705134147.13325-1-drross@redhat.com>

On Thu, Jul 06, 2023 at 03:00:28PM +0200, Richard Biener via Gcc-patches wrote:
> On Wed, Jul 5, 2023 at 3:42 PM Drew Ross via Gcc-patches wrote:
> >
> > Adds a simplification for (~X | Y) ^ X to be folded into ~(X & Y).
> > Tested successfully on x86_64 and x86 targets.
> >
> >         PR middle-end/109986
> >
> > gcc/ChangeLog:
> >
> >         * match.pd ((~X | Y) ^ X -> ~(X & Y)): New simplification.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.c-torture/execute/pr109986.c: New test.
> >         * gcc.dg/tree-ssa/pr109986.c: New test.
> > ---
> >  gcc/match.pd                                  |  11 ++
> >  .../gcc.c-torture/execute/pr109986.c          |  41 ++++
> >  gcc/testsuite/gcc.dg/tree-ssa/pr109986.c      | 177 ++++++++++++++++++
> >  3 files changed, 229 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr109986.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr109986.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index a17d6838c14..d9d7d932881 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -1627,6 +1627,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> >    (convert (bit_and @1 (bit_not @0)))))
> >
> > +/* (~X | Y) ^ X -> ~(X & Y).  */
> > +(simplify
> > + (bit_xor:c (nop_convert1?
> > +             (bit_ior:c (nop_convert2? (bit_not (nop_convert3? @0)))
> > +                        @1)) (nop_convert4? @0))
>
> you want to reduce the number of nop_convert? - for example
> I wonder if we can canonicalize
>
>  (T)~X  and ~(T)X
>
> for nop-conversions.  The same might apply to binary bitwise operations
> where we should push those to a direction where they are likely eliminated.
> Usually we'd push them outwards.
>
> The issue with the above pattern is that nop_convertN? expands to 2^N
> separate patterns.  Together with the two :c you get 64 out of this.
>
> I do not see that all of the combinations can happen when X has to
> match unless we fail to contract some of them like if we have
> (unsigned)(~(signed)X | Y) ^ X which we could rewrite like
> -> (unsigned)((signed)~X | Y) ^ X -> (~X | (unsigned) Y) ^ X
> with the last step being somewhat difficult unless we do
> (signed)~X | Y -> (signed)(~X | (unsigned)Y).  It feels like a
> propagation problem and less of a direct pattern matching one.

The nop_convert1?
in the pattern might seem to be unnecessary for cases like:

int i, j, k, l;
unsigned u, v, w, x;

void
foo (void)
{
  int t0 = i;
  int t1 = (~t0) | j;
  x = t1 ^ (unsigned) t0;
  unsigned t2 = u;
  unsigned t3 = (~t2) | v;
  i = ((int) t3) ^ (int) t2;
}

we actually optimize it with or without the nop_convert1? in place,
because we have the

/* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
   when profitable.
...
  (bitop (convert@2 @0) (convert?@3 @1))
...
   (convert (bitop @0 (convert @1)))))

simplification.  Except that on

void
bar (void)
{
  unsigned t0 = u;
  int t1 = (~(int) t0) | j;
  x = t1 ^ t0;
  int t2 = i;
  unsigned t3 = (~(unsigned) t2) | v;
  i = ((int) t3) ^ t2;
}

the optimization doesn't trigger without the nop_convert1? and does with it.

Perhaps we could get rid of nop_convert3? and nop_convert4? by introducing
a macro/inline function predicate like:
  bitwise_equal_p (expr1, expr2)
and instead of using (nop_convert3? @0) and (nop_convert4? @0) in the
pattern use @0 and @2 and then add
  if (bitwise_equal_p (@0, @2))
to the condition.

For GENERIC (i.e. in generic-match-head.cc) it could be something like:

static inline bool
bitwise_equal_p (tree expr1, tree expr2)
{
  STRIP_NOPS (expr1);
  STRIP_NOPS (expr2);
  if (expr1 == expr2)
    return true;
  if (!tree_nop_conversion_p (TREE_TYPE (expr1), TREE_TYPE (expr2)))
    return false;
  if (TREE_CODE (expr1) == INTEGER_CST && TREE_CODE (expr2) == INTEGER_CST)
    return wi::to_wide (expr1) == wi::to_wide (expr2);
  return operand_equal_p (expr1, expr2, 0);
}

(the INTEGER_CST special case because operand_equal_p compares wi::to_widest
which could be different if one constant is signed and the other unsigned).

For GIMPLE, I wonder if it shouldn't be a macro that takes valueize into
account, and do something like:

#define bitwise_equal_p(expr1, expr2) gimple_bitwise_equal_p (expr1, expr2, valueize)

bool gimple_nop_convert (tree, tree *, tree (*)(tree));

static inline bool
gimple_bitwise_equal_p (tree expr1, tree expr2, tree (*valueize) (tree))
{
  if (expr1 == expr2)
    return true;
  if (!tree_nop_conversion_p (TREE_TYPE (expr1), TREE_TYPE (expr2)))
    return false;
  if (TREE_CODE (expr1) == INTEGER_CST && TREE_CODE (expr2) == INTEGER_CST)
    return wi::to_wide (expr1) == wi::to_wide (expr2);
  if (operand_equal_p (expr1, expr2, 0))
    return true;
  tree expr3, expr4;
  if (!gimple_nop_convert (expr1, &expr3, valueize))
    expr3 = expr1;
  if (!gimple_nop_convert (expr2, &expr4, valueize))
    expr4 = expr2;
  if (expr1 != expr3)
    {
      if (operand_equal_p (expr3, expr2, 0))
	return true;
      if (expr2 != expr4 && operand_equal_p (expr3, expr4, 0))
	return true;
    }
  if (expr2 != expr4 && operand_equal_p (expr1, expr4, 0))
    return true;
  return false;
}

Completely untested.  What do you think?
Though that still only brings us down to 16 variants of this pattern
(two nop_convert? and two :c).

	Jakub