From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x531.google.com (mail-ed1-x531.google.com [IPv6:2a00:1450:4864:20::531]) by sourceware.org (Postfix) with ESMTPS id BFE4F3858C2C for ; Tue, 30 Nov 2021 10:00:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BFE4F3858C2C Received: by mail-ed1-x531.google.com with SMTP id w1so84539989edc.6 for ; Tue, 30 Nov 2021 02:00:17 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=hlSUu9TG/dym7SSxb2utFibR0t02cOJwtvKxzuZlx8c=; b=J9Yzg3zqZEfHnSWnsUHOYm6t0G9HbNqZGFPpQyNyOOncPMlXU8aGciO4Ul2J9LDXNO V9AJoTKEeUjX6p3VLVydmbeAgDj1ZnEXdqExMkUfJDhaFIU88yATkAVAuDf/CNNO7jl+ 4LYi9tspTKGeDPDjlgyc+uSgRwyh5aRxStHWa9d7qn+N8jy70Kxfk6wRvp1Pg46LkUyC SYoDRni7VGz4za1l39wepukDmCmrndGOMsA6R/b6EJEnMoN4ZS0lLp6nN5tj/TgeJduH 6Gqv0z5bFSABpNwigHJoXu14nbgw66hpoY89g39GQZIgELWC5CQz/yJGGQdHtji2IiIi QLpw== X-Gm-Message-State: AOAM531P0fZO8E66tiI/HJiCnzJO/xkBFfKk/wqt2jYc9eg8DjFI5pBE zUnGCzljcnawk0M8RzFroA+mBO5UdUwEqgTGt0TLo65/Jmg= X-Google-Smtp-Source: ABdhPJz0n+ZptUY09mqcvOpvPZYbkFYUdrQNBnTHyxmRHla5JvroEgCTKq2gVHPVmXk57Mt0Q8IMXjLLe5dT21SmOVs= X-Received: by 2002:aa7:c406:: with SMTP id j6mr83571820edq.76.1638266416585; Tue, 30 Nov 2021 02:00:16 -0800 (PST) MIME-Version: 1.0 References: <016001d7e500$2628ae80$727a0b80$@nextmovesoftware.com> In-Reply-To: <016001d7e500$2628ae80$727a0b80$@nextmovesoftware.com> From: Richard Biener Date: Tue, 30 Nov 2021 11:00:05 +0100 Message-ID: Subject: Re: [PATCH] Final value replacement improvements for until-wrap loops. To: Roger Sayle Cc: GCC Patches Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Nov 2021 10:00:20 -0000 On Mon, Nov 29, 2021 at 10:07 AM Roger Sayle wrote: > > > This middle-end patch is inspired by Richard Biener's until-wrap > loop example in PR tree-optimization/101145. > > unsigned foo(unsigned val, unsigned start) > { > unsigned cnt = 0; > for (unsigned i = start; i > val; ++i) > cnt++; > return cnt; > } > > For this loop, the tree optimizers currently generate: > > unsigned int foo (unsigned int val, unsigned int start) > { > unsigned int cnt; > unsigned int _1; > unsigned int _5; > > [local count: 118111600]: > if (start_3(D) > val_4(D)) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 105119324]: > _1 = start_3(D) + 1; > _5 = -start_3(D); > cnt_2 = _1 > val_4(D) ? _5 : 1; > > [local count: 118111600]: > # cnt_11 = PHI > return cnt_11; > } > > or perhaps slightly easier to read: > > if (start > val) { > cnt = (start+1) > val ? -start : 1; > } else cnt = 0; > > In this snippet, if we know start > val, then (start+1) > val > unless start+1 overflows, i.e. (start+1) == 0 and start == ~0. > We can use this (loop header) context to simplify the ternary > expression to "(start != -1) ? -start : 1", which with a little > help from match.pd can be folded to -start. Hence the optimal > final value replacement should be: > > cnt = (start > val) ? -start : 0; > > Or as now generated by this patch: > > unsigned int foo (unsigned int val, unsigned int start) > { > unsigned int cnt; > > [local count: 118111600]: > if (start_3(D) > val_4(D)) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 105119324]: > cnt_2 = -start_3(D); > > [local count: 118111600]: > # cnt_11 = PHI > return cnt_11; > } > > > We can also improve until-wrap loops that don't have a (suitable) loop > header, as determined by simplify_using_initial_conditions. > > unsigned bar(unsigned val, unsigned start) > { > unsigned cnt = 0; > unsigned i = start; > do { > cnt++; > i++; > } while (i > val); > return cnt; > } > > which is currently optimized to: > > unsigned int foo (unsigned int val, unsigned int start) > { > unsigned int cnt; > unsigned int _9; > unsigned int _10; > > [local count: 118111600]: > _9 = start_4(D) + 1; > _10 = -start_4(D); > cnt_3 = val_7(D) < _9 ? _10 : 1; > return cnt_3; > } > > Here we have "val < (start+1) ? -start : 1", which again with the > help of match.pd can be slightly simplified to "val <= start ? -start : 1" > when dealing with unsigned types, because at the complicating value where > start == ~0, we fortunately have -start == 1, hence it doesn't matter > whether the second or third operand of the ternary operator is returned. > > To summarize, this patch (in addition to tweaking may_be_zero in > number_of_iterations_until_wrap) adds three new constant folding > transforms to match.pd. > > X != C1 ? -X : C2 simplifies to -X when -C1 == C2. > which is the generalized form of the simplification above. > > X != C1 ? ~X : C2 simplifies to ~X when ~C1 == C2. > which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case. > > and the "until-wrap final value replacement without context": > > (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when > X is unsigned, as when X + 1 overflows, X is -1, so -X == 1. > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check with no new failures. Ok for mainline? +/* X != C1 ? -X : C2 simplifies to -X when -C1 == C2. */ +(simplify + (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2) + (if ((!wi::only_sign_bit_p (wi::to_wide (@1)) + || TYPE_UNSIGNED (type) + || TYPE_OVERFLOW_WRAPS (type) + || (!TYPE_OVERFLOW_SANITIZED (type) + && !TYPE_OVERFLOW_TRAPS (type) + && !TYPE_SATURATING (type))) I'm wondering about TYPE_UNSIGNED && TYPE_SATURATING. Also I think unsigned cannot trap or be sanitized and with TYPE_OVERFLOW_WRAPS sanitizing or trapping cannot be true either. Also unsigned types always wrap on overflow. So maybe (if (!TYPE_SATURATING (type) && (TYPE_OVERFLOW_WRAPS (type) || !wi::only_sign_bit_p (..))) ? + && wi::eq_p (wi::neg (wi::to_wide (@1)), wi::to_wide (@2))) + @3)) +/* (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when + X is unsigned, as when X + 1 overflows, X is -1, so -X == 1. */ +(simplify + (cond (gt (plus @0 integer_onep) @1) (negate @0) integer_onep@2) + (if (TYPE_UNSIGNED (type) + && TYPE_UNSIGNED (TREE_TYPE (@0))) I think the second test is redundant since @0 participates in both the comparison and the condition value. + (cond (ge @0 @1) (negate @0) @2))) Otherwise looks OK to me. Thanks, Richard. > > 2021-11-29 Roger Sayle > > gcc/ChangeLog > * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): > Check if simplify_using_initial_conditions allows us to > simplify the expression for may_be_zero. > * match.pd (X != C ? -X : -C -> -X): New transform. > (X != C ? ~X : ~C -> ~X): Likewise. > ((X+1) > Y ? -X : 1 -> X >= Y ? -X : 1): Likewise. > > gcc/testsuite/ChangeLog > * gcc.dg/fold-condneg-1.c: New test case. > * gcc.dg/fold-condneg-2.c: New test case. > * gcc.dg/fold-condnot-1.c: New test case. > * gcc.dg/pr101145-1.c: New test case. > * gcc.dg/pr101145-2.c: New test case. > > > Thanks in advance, > Roger > -- >