From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 101469 invoked by alias); 13 Jun 2017 11:28:17 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 101418 invoked by uid 89); 13 Jun 2017 11:28:14 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-6.8 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_2,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mail-ua0-f182.google.com Received: from mail-ua0-f182.google.com (HELO mail-ua0-f182.google.com) (209.85.217.182) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 13 Jun 2017 11:28:12 +0000 Received: by mail-ua0-f182.google.com with SMTP id 68so55893426uas.0 for ; Tue, 13 Jun 2017 04:28:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=xLmahtZ+CeU7Xbrh6EoqZKfw70PI2rOQDlQy1LdoXtM=; b=Myvk9Eg/gZUV6wdj2xR0oOmAMKlCHrAF7WlRiQdHaeDat5rfOBQSPnUWhi5HYnuvHm lTDiKb75J2QM0NEsVOKlj2mEzqeTF/K2eOtw2dm6QIkvefd798bRD7XGBr90q4pbZ0ch vs0lokgtPQ0q+sjioRI1WPPj31KecJE30BBqbW0KDEpeY13wccyDIeRJfL5vEK4FOdKh +nMdZKzvafGiopbUni7JoabqVp9Zot3pVybsZTHEox6nSIBuNlEdovSEHJQ4g8YZbpaG ycKKBxo0bgReZOFM18xgTaWKFvmILb4tLDu3ZUJWEC8+sFRKEcVd4wl61gfDm2s9XL1m PKXA== X-Gm-Message-State: AKS2vOyfrcsy8qKzZv+FpfEMpJGS4EmhthMF9YFMWTdgXn5LBnlXOkHk ZcwQw42VCNoXDGZXSmlirMJFJLGIPg== X-Received: by 10.159.48.1 with SMTP id h1mr2321504uab.102.1497353295355; Tue, 13 Jun 2017 04:28:15 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.47.19 with HTTP; Tue, 13 Jun 2017 04:28:14 -0700 (PDT) In-Reply-To: <8737b4f7wj.fsf@linaro.org> References: <877f0gfbwz.fsf@linaro.org> <8737b4f7wj.fsf@linaro.org> From: "Bin.Cheng" Date: Tue, 13 Jun 2017 11:28:00 -0000 Message-ID: Subject: Re: [PATCH] Fix PR66313 To: Richard Biener , gcc-patches List , Richard Sandiford Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2017-06/txt/msg00905.txt.bz2 On Tue, Jun 13, 2017 at 12:23 PM, Richard Sandiford wrote: > Richard Biener writes: >> On Tue, 13 Jun 2017, Richard Sandiford wrote: >>> Richard Biener writes: >>> > So I've come back to PR66313 and found a solution to the tailrecursion >>> > missed optimization when fixing the factoring folding to use an unsigned >>> > type when we're not sure of overflow. >>> > >>> > The folding part is identical to my last try from 2015, the tailrecursion >>> > part makes us handle intermittent stmts that were introduced by foldings >>> > that "clobber" our quest walking the single-use chain of stmts between >>> > the call and the return (and failing at all stmts that are not part >>> > of said chain). A simple solution is to move the stmts that are not >>> > part of the chain and that we can move before the call. That handles >>> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c >>> > >>> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. >>> > >>> > Richard. >>> > >>> > 2017-05-31 Richard Biener >>> > >>> > PR middle-end/66313 >>> > * fold-const.c (fold_plusminus_mult_expr): If the factored >>> > factor may be zero use a wrapping type for the inner operation. >>> > * tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap >>> > and handle moved defs. >>> > (process_assignment): Properly guard the unary op case. Return a >>> > tri-state indicating that moving the stmt before the call may allow >>> > to continue. Pass through to_move. >>> > (find_tail_calls): Handle moving unrelated defs before >>> > the call. >>> > >>> > * c-c++-common/ubsan/pr66313.c: New testcase. >>> > * gcc.dg/tree-ssa/loop-15.c: Adjust. >>> > >>> > Index: gcc/fold-const.c >>> > =================================================================== >>> > *** gcc/fold-const.c.orig 2015-10-29 12:32:33.302782318 +0100 >>> > --- gcc/fold-const.c 2015-10-29 14:08:39.936497739 +0100 >>> > *************** fold_plusminus_mult_expr (location_t loc >>> > *** 6916,6925 **** >>> > } >>> > same = NULL_TREE; >>> > >>> > ! if (operand_equal_p (arg01, arg11, 0)) >>> > ! same = arg01, alt0 = arg00, alt1 = arg10; >>> > ! else if (operand_equal_p (arg00, arg10, 0)) >>> > same = arg00, alt0 = arg01, alt1 = arg11; >>> > else if (operand_equal_p (arg00, arg11, 0)) >>> > same = arg00, alt0 = arg01, alt1 = arg10; >>> > else if (operand_equal_p (arg01, arg10, 0)) >>> > --- 6916,6926 ---- >>> > } >>> > same = NULL_TREE; >>> > >>> > ! /* Prefer factoring a common non-constant. */ >>> > ! if (operand_equal_p (arg00, arg10, 0)) >>> > same = arg00, alt0 = arg01, alt1 = arg11; >>> > + else if (operand_equal_p (arg01, arg11, 0)) >>> > + same = arg01, alt0 = arg00, alt1 = arg10; >>> > else if (operand_equal_p (arg00, arg11, 0)) >>> > same = arg00, alt0 = arg01, alt1 = arg10; >>> > else if (operand_equal_p (arg01, arg10, 0)) >>> > *************** fold_plusminus_mult_expr (location_t loc >>> > *** 6974,6987 **** >>> > } >>> > } >>> > >>> > ! if (same) >>> > return fold_build2_loc (loc, MULT_EXPR, type, >>> > fold_build2_loc (loc, code, type, >>> > fold_convert_loc (loc, type, alt0), >>> > fold_convert_loc (loc, type, alt1)), >>> > fold_convert_loc (loc, type, same)); >>> > >>> > ! return NULL_TREE; >>> > } >>> > >>> > /* Subroutine of native_encode_expr. Encode the INTEGER_CST >>> > --- 6975,7010 ---- >>> > } >>> > } >>> > >>> > ! if (!same) >>> > ! return NULL_TREE; >>> > ! >>> > ! if (! INTEGRAL_TYPE_P (type) >>> > ! || TYPE_OVERFLOW_WRAPS (type) >>> > ! /* We are neither factoring zero nor minus one. */ >>> > ! || TREE_CODE (same) == INTEGER_CST) >>> > return fold_build2_loc (loc, MULT_EXPR, type, >>> > fold_build2_loc (loc, code, type, >>> > fold_convert_loc (loc, type, alt0), >>> > fold_convert_loc (loc, type, alt1)), >>> > fold_convert_loc (loc, type, same)); >>> > >>> > ! /* Same may be zero and thus the operation 'code' may overflow. Likewise >>> > ! same may be minus one and thus the multiplication may overflow. Perform >>> > ! the operations in an unsigned type. */ >>> > ! tree utype = unsigned_type_for (type); >>> > ! tree tem = fold_build2_loc (loc, code, utype, >>> > ! fold_convert_loc (loc, utype, alt0), >>> > ! fold_convert_loc (loc, utype, alt1)); >>> > ! /* If the sum evaluated to a constant that is not -INF the multiplication >>> > ! cannot overflow. */ >>> > ! if (TREE_CODE (tem) == INTEGER_CST >>> > ! && ! wi::eq_p (tem, wi::min_value (TYPE_PRECISION (utype), SIGNED))) >>> > ! return fold_build2_loc (loc, MULT_EXPR, type, >>> > ! fold_convert (type, tem), same); >>> > ! >>> > ! return fold_convert_loc (loc, type, >>> > ! fold_build2_loc (loc, MULT_EXPR, utype, tem, >>> > ! fold_convert_loc (loc, utype, same))); >>> > } >>> > >>> > /* Subroutine of native_encode_expr. Encode the INTEGER_CST >>> >>> Sorry if you already know, but this part means that we can no longer >>> vectorise: >>> >>> int >>> f (int *x, int b1, int b2, int b3) >>> { >>> int foo = 0; >>> for (int i1 = 0; i1 < b1; ++i1) >>> for (int i2 = 0; i2 < b2; ++i2) >>> for (int i3 = 0; i3 < b3; ++i3) >>> foo += x[i1 * b2 * b3 + i2 * b3 + (i3 - 1)]; >>> return foo; >>> } >>> >>> We now convert all the arithmetic in the [...] to unsigned int >>> and reassociate it so that the "- 1" is applied last. We then assume >>> that the overflow in this subtraction is well-defined and that the >>> &x[...] might not be linear for the inner loop. >> >> Can you file a PR? > > OK, filed as PR81082. > >> # i3_44 = PHI >> i3.2_7 = (unsigned int) i3_44; >> _9 = i3.2_7 + _34; >> _10 = (int) _9; >> _11 = (long unsigned int) _10; >> _12 = _11 * 4; >> _13 = x_30(D) + _12; >> _14 = *_13; >> ... >> i3_32 = i3_44 + 1; >> >> so we have *(x_30(D) + 4 * ((sizetype)(int)((unsigned){ 0, + , 1} + X)) >> >> It might mean that we need to avoid this re-association. Not sure how >> to detect worthwhile vs. not worthwhile cases during folding. At least >> I see no easy way to recover from it in SCEV analysis unless the >> number of iterations is constrained more than in your example. > > Yeah, in the example that this was reduced from, not reassociating > is far more preferable to changing the types. But like you say, > I've no idea how we'd know that at this stage. Looks like it has become a general problem. IIRC, loop-im also hoists signed computation out of loop into unsigned computations. Thanks, bin > > Thanks, > Richard > >> Ideas welcome ;) >> >> Thanks, >> Richard.