From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 34308 invoked by alias); 13 Jun 2017 09:57:03 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 34001 invoked by uid 89); 13 Jun 2017 09:57:02 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-7.4 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=quest, Prefer X-HELO: mail-wr0-f170.google.com Received: from mail-wr0-f170.google.com (HELO mail-wr0-f170.google.com) (209.85.128.170) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 13 Jun 2017 09:57:00 +0000 Received: by mail-wr0-f170.google.com with SMTP id 36so12964316wry.3 for ; Tue, 13 Jun 2017 02:57:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:mail-followup-to:cc:subject:references :date:in-reply-to:message-id:user-agent:mime-version; bh=9DeLE+mXRHryKK+/U5tMVZEZyYGQ8BDxnAwZaZjs6+E=; b=pYJ9BRDeoNbrs7RaMa09CpyYlzTYWpzYzLMDYfb9/1bhAlDygTKpw+zLXPRc0PqnAH mUsIDp5MGTfuHJZuiWXiizAmXJSg+k+Hc8AhbwLkgIUcmQKMvOU+8BXCUaVQOvoqwWH3 EAnqIf6xEE8GXasJX8Pz9nHgLUbXrxGFkGb8a/jxHvHCPue3tmMIq/IMhew/+pCWqrUP tNG1DBCeXrnxnnWqsMXwRhUEtpaPYJ9zbtI1uRwYHQRSPp3iApQp3N/REDhvZQHW4IXR fSu6nlp53QrH+V60C+9RMFuKPaKAFns+AwuF7E/O/ZpdE1WU2X+Zb73GS5nIQvEnFz9G /KNQ== X-Gm-Message-State: AKS2vOxDuYht4ijWNjNaiL4c2V2fnqnY8C7wczBR5EDS04sDNQNuvRgZ dU+HRmb4oljsEKYGeYzd2w== X-Received: by 10.223.130.199 with SMTP id 65mr2335429wrc.150.1497347823028; Tue, 13 Jun 2017 02:57:03 -0700 (PDT) Received: from localhost (92.40.249.225.threembb.co.uk. [92.40.249.225]) by smtp.gmail.com with ESMTPSA id v5sm2022337wrd.22.2017.06.13.02.57.01 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 13 Jun 2017 02:57:02 -0700 (PDT) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener ,gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Fix PR66313 References: Date: Tue, 13 Jun 2017 09:57:00 -0000 In-Reply-To: (Richard Biener's message of "Wed, 31 May 2017 16:07:18 +0200 (CEST)") Message-ID: <877f0gfbwz.fsf@linaro.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-SW-Source: 2017-06/txt/msg00879.txt.bz2 Richard Biener writes: > So I've come back to PR66313 and found a solution to the tailrecursion > missed optimization when fixing the factoring folding to use an unsigned > type when we're not sure of overflow. > > The folding part is identical to my last try from 2015, the tailrecursion > part makes us handle intermittent stmts that were introduced by foldings > that "clobber" our quest walking the single-use chain of stmts between > the call and the return (and failing at all stmts that are not part > of said chain). A simple solution is to move the stmts that are not > part of the chain and that we can move before the call. That handles > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > Richard. > > 2017-05-31 Richard Biener > > PR middle-end/66313 > * fold-const.c (fold_plusminus_mult_expr): If the factored > factor may be zero use a wrapping type for the inner operation. > * tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap > and handle moved defs. > (process_assignment): Properly guard the unary op case. Return a > tri-state indicating that moving the stmt before the call may allow > to continue. Pass through to_move. > (find_tail_calls): Handle moving unrelated defs before > the call. > > * c-c++-common/ubsan/pr66313.c: New testcase. > * gcc.dg/tree-ssa/loop-15.c: Adjust. > > Index: gcc/fold-const.c > =================================================================== > *** gcc/fold-const.c.orig 2015-10-29 12:32:33.302782318 +0100 > --- gcc/fold-const.c 2015-10-29 14:08:39.936497739 +0100 > *************** fold_plusminus_mult_expr (location_t loc > *** 6916,6925 **** > } > same = NULL_TREE; > > ! if (operand_equal_p (arg01, arg11, 0)) > ! same = arg01, alt0 = arg00, alt1 = arg10; > ! else if (operand_equal_p (arg00, arg10, 0)) > same = arg00, alt0 = arg01, alt1 = arg11; > else if (operand_equal_p (arg00, arg11, 0)) > same = arg00, alt0 = arg01, alt1 = arg10; > else if (operand_equal_p (arg01, arg10, 0)) > --- 6916,6926 ---- > } > same = NULL_TREE; > > ! /* Prefer factoring a common non-constant. */ > ! if (operand_equal_p (arg00, arg10, 0)) > same = arg00, alt0 = arg01, alt1 = arg11; > + else if (operand_equal_p (arg01, arg11, 0)) > + same = arg01, alt0 = arg00, alt1 = arg10; > else if (operand_equal_p (arg00, arg11, 0)) > same = arg00, alt0 = arg01, alt1 = arg10; > else if (operand_equal_p (arg01, arg10, 0)) > *************** fold_plusminus_mult_expr (location_t loc > *** 6974,6987 **** > } > } > > ! if (same) > return fold_build2_loc (loc, MULT_EXPR, type, > fold_build2_loc (loc, code, type, > fold_convert_loc (loc, type, alt0), > fold_convert_loc (loc, type, alt1)), > fold_convert_loc (loc, type, same)); > > ! return NULL_TREE; > } > > /* Subroutine of native_encode_expr. Encode the INTEGER_CST > --- 6975,7010 ---- > } > } > > ! if (!same) > ! return NULL_TREE; > ! > ! if (! INTEGRAL_TYPE_P (type) > ! || TYPE_OVERFLOW_WRAPS (type) > ! /* We are neither factoring zero nor minus one. */ > ! || TREE_CODE (same) == INTEGER_CST) > return fold_build2_loc (loc, MULT_EXPR, type, > fold_build2_loc (loc, code, type, > fold_convert_loc (loc, type, alt0), > fold_convert_loc (loc, type, alt1)), > fold_convert_loc (loc, type, same)); > > ! /* Same may be zero and thus the operation 'code' may overflow. Likewise > ! same may be minus one and thus the multiplication may overflow. Perform > ! the operations in an unsigned type. */ > ! tree utype = unsigned_type_for (type); > ! tree tem = fold_build2_loc (loc, code, utype, > ! fold_convert_loc (loc, utype, alt0), > ! fold_convert_loc (loc, utype, alt1)); > ! /* If the sum evaluated to a constant that is not -INF the multiplication > ! cannot overflow. */ > ! if (TREE_CODE (tem) == INTEGER_CST > ! && ! wi::eq_p (tem, wi::min_value (TYPE_PRECISION (utype), SIGNED))) > ! return fold_build2_loc (loc, MULT_EXPR, type, > ! fold_convert (type, tem), same); > ! > ! return fold_convert_loc (loc, type, > ! fold_build2_loc (loc, MULT_EXPR, utype, tem, > ! fold_convert_loc (loc, utype, same))); > } > > /* Subroutine of native_encode_expr. Encode the INTEGER_CST Sorry if you already know, but this part means that we can no longer vectorise: int f (int *x, int b1, int b2, int b3) { int foo = 0; for (int i1 = 0; i1 < b1; ++i1) for (int i2 = 0; i2 < b2; ++i2) for (int i3 = 0; i3 < b3; ++i3) foo += x[i1 * b2 * b3 + i2 * b3 + (i3 - 1)]; return foo; } We now convert all the arithmetic in the [...] to unsigned int and reassociate it so that the "- 1" is applied last. We then assume that the overflow in this subtraction is well-defined and that the &x[...] might not be linear for the inner loop. Thanks, Richard