From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 80938 invoked by alias); 28 May 2015 19:48:37 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 80922 invoked by uid 89); 28 May 2015 19:48:36 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,T_RP_MATCHES_RCVD autolearn=no version=3.3.2 X-HELO: mail2-relais-roc.national.inria.fr Received: from mail2-relais-roc.national.inria.fr (HELO mail2-relais-roc.national.inria.fr) (192.134.164.83) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Thu, 28 May 2015 19:48:35 +0000 Received: from ip-111.net-81-220-140.rev.numericable.fr (HELO laptop-mg.local) ([81.220.140.111]) by mail2-relais-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 28 May 2015 21:48:16 +0200 Date: Thu, 28 May 2015 20:33:00 -0000 From: Marc Glisse Reply-To: gcc-patches@gcc.gnu.org To: Marek Polacek , Jakub Jelinek cc: GCC Patches Subject: Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299) In-Reply-To: <20150528123436.GM10247@tucnak.redhat.com> Message-ID: References: <20150528121545.GE27320@redhat.com> <20150528123436.GM10247@tucnak.redhat.com> User-Agent: Alpine 2.11 (DEB 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII X-SW-Source: 2015-05/txt/msg02690.txt.bz2 On Thu, 28 May 2015, Marek Polacek wrote: > This PR points out that we weren't able to optimize 1 << x == 2 to just > x == 1. Side note: if we are looking for extra patterns to simplify, llvm has an almost unlimited supply. Here are a few we don't seem to have (there are more where those came from), of course several need constraining / generalizing, it is just a list of hints I wrote for myself. (A|B) & ~(A&B) -> A^B (A | B) & ((~A) ^ B) -> (A & B) (A & (~B)) | (A ^ B) -> (A ^ B) ((B | C) & A) | B -> B | (A & C) A | ( A ^ B) -> A | B A | (~A ^ B) -> A | ~B (A ^ B) & ((B ^ C) ^ A) -> (A ^ B) & ~C (A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C (A & B) | (A ^ B) -> (A | B) A | ~(A ^ B) -> A | ~B (A & B) | ((~A) ^ B) -> (~A ^ B) ~(~X & Y) -> (X | ~Y) ~(~X >>s Y) -> (X >>s Y) (A & B)^(A | B) -> A ^ B (A | ~B) ^ (~A | B) -> A ^ B (A & ~B) ^ (~A & B) -> A ^ B (A ^ C)^(A | B) -> ((~A) & B) ^ C (A & B) ^ (A ^ B) -> (A | B) (A & ~B) ^ (~A) -> ~(A & B) (A&B)+(A^B) -> A|B (A&B)+(A|B) -> A+B (A|B)-(A^B) -> A&B ((X | Y) - X) -> (~X & Y) fmax(x,NaN) -> x fmax(a,fmax(a,b)) -> fmax(a,b) (X+2) >u X -> x X <= 4 ((X & ~7) == 0) -> X < 8 2 * X < 5 -> X <= 2 ((1 << x)&8) == 0 -> x != 3 ((1 << x)&7) == 0 -> x > 2 Y - Z < X - Z -> Y < X 3 * X == 3 * Y -> X == Y A >> 3 == B >> 3 -> (A ^ B) < 8 (float)int <= 4.4 -> int <= 4 x unle x -> x ord x > +/* (CST1 << A) == CST2 -> A == log2 (CST2 / CST1) > + (CST1 << A) != CST2 -> A != log2 (CST2 / CST1) > + if CST2 is a multiple of CST1. */ > +(for cmp (ne eq) > + (simplify > + (cmp (lshift@3 INTEGER_CST@0 @1) INTEGER_CST@2) > + (if ((TREE_CODE (@3) != SSA_NAME || has_single_use (@3)) > + && wi::multiple_of_p (@2, @0, TYPE_SIGN (type))) Doesn't "type" refer to the result of the EQ_EXPR here? On Thu, 28 May 2015, Jakub Jelinek wrote: > Is CST2 a multiple of CST1 the best test though? > I mean say in > (0x8001U << x) == 0x20000U > 0x20000U isn't a multiple of 0x8001U, yet there is only one > valid value of x for which it holds (17), so we could very well > optimize that to x == 17. > If popcount of the CST1 is 1, then multiple_of_p is supposedly sufficient > (have you checked if CST1 is negative that it still works?), for others > supposedly we could have a helper function that would just try > in a loop all shift counts from 0 to precision - 1, and note when > (CST1 << b) == CST2 - if for no b, then it should fold regardless of > has_single_use to false or true, if for exactly one shift count, then > use a comparison against that shift count, otherwise give up? ctz(CST2)-ctz(CST1) should provide a single candidate without looping. ctz(CST1) is also relevant when CST2==0. -- Marc Glisse