From: "amker at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can
Date: Thu, 29 Jan 2015 06:48:00 -0000
Message-ID: <bug-62173-4-RDtFxEuaVG@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-62173-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

--- Comment #30 from amker at gcc dot gnu.org ---
(In reply to Richard Biener from comment #17)
> I really wonder why IVOPTs calls convert_affine_scev with
> !use_overflow_semantics.
> 
> Note that for the original testcase 'i' may be negative or zero and thus 'd'
> may be zero.  We do a bad analysis here because IVOPTs follows complete
> peeling immediately...  but at least we have range information that looks
> useful:
> 
>   <bb 16>:
>   # RANGE [0, 10] NONZERO 15
>   # d_26 = PHI <i_6(D)(15), d_13(17)>
>   # RANGE [0, 9] NONZERO 15
>   d_13 = d_26 + -1;
>   _14 = A[d_26];
>   # RANGE [0, 255] NONZERO 255
>   _15 = (int) _14;
>   # USE = nonlocal
>   # CLB = nonlocal
>   foo (_15);
>   if (d_13 != 0)
>     goto <bb 17>;
>   else
>     goto <bb 3>;
> 
>   <bb 17>:
>   goto <bb 16>;
> 
> but unfortunately we expand the initial value of the IV for d all the way
> to i_6(D) so we don't see that i_6(D) is constrained by the range for d_26.
> 
> So when we are in idx_find_step before we replace *idx with iv->base
> we could check range-information on whether it wrapped.  Hmm, I think
> we can't really compute this.  But we can transfer range information
> (temporarily) from d_26 to iv->base i_6(D) and make use of that in
> scev_probably_wraps_p.  There we currently compute whether
> (unsigned) i_6(D) + 2147483648 (??) > 9 using fold_binary but with
> range information [0, 10] it would compute as false (huh, so what is it
> actually testing?!).  I think the computation of 'delta' should instead
> be adjusted to use range information - max for negative step and min
> for positive step.  Like the following:
> 
> Index: gcc/tree-ssa-loop-niter.c
> ===================================================================
> --- gcc/tree-ssa-loop-niter.c   (revision 220038)
> +++ gcc/tree-ssa-loop-niter.c   (working copy)
> @@ -3863,12 +3863,17 @@ scev_probably_wraps_p (tree base, tree s
>       bound of the type, and verify that the loop is exited before this
>       occurs.  */
>    unsigned_type = unsigned_type_for (type);
> -  base = fold_convert (unsigned_type, base);
> -
>    if (tree_int_cst_sign_bit (step))
>      {
>        tree extreme = fold_convert (unsigned_type,
>                                    lower_bound_in_type (type, type));
> +      wide_int min, max;
> +      if (TREE_CODE (base) == SSA_NAME
> +         && INTEGRAL_TYPE_P (TREE_TYPE (base))
> +         && get_range_info (base, &min, &max) == VR_RANGE)
> +       base = wide_int_to_tree (unsigned_type, max);
> +      else
> +       base = fold_convert (unsigned_type, base);
>        delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
>        step_abs = fold_build1 (NEGATE_EXPR, unsigned_type,
>                               fold_convert (unsigned_type, step));
> @@ -3877,6 +3882,13 @@ scev_probably_wraps_p (tree base, tree s
>      {
>        tree extreme = fold_convert (unsigned_type,
>                                    upper_bound_in_type (type, type));
> +      wide_int min, max;
> +      if (TREE_CODE (base) == SSA_NAME
> +         && INTEGRAL_TYPE_P (TREE_TYPE (base))
> +         && get_range_info (base, &min, &max) == VR_RANGE)
> +       base = wide_int_to_tree (unsigned_type, min);
> +      else
> +       base = fold_convert (unsigned_type, base);
>        delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
>        step_abs = fold_convert (unsigned_type, step);
>      }
> 
> doesn't really help this case unless i_6(D) gets range-information transferred
> temporarily as I said above, of course.

As you said, the range information for i_6(D) is flow-sensitive, so we can't
simply propagate range_info(d) to i_6(D) in general.  Even if we could, the
code change would not be natural, since the related code is scattered across
different components (ivopt/chrec/niter).
So I went the other way around, passing the IV's ssa_name into
scev_probably_wraps_p along the call sequence
"idx_find_step->convert_affine_scev->scev_probably_wraps_p".  Since the IV's
ssa_name is tagged with the right range information, we can use it to prove
that there is no overflow/wrap in the source scev.

This method works and GCC now recognizes the address-type iv use in A[d].

BUT the problem exists not only in the code for address-type ivs, but also in
the code for compare-type ivs.  The IVOPTs dump now looks like this:


  <bb 8>:
  _4 = (sizetype) i_6(D);
  _3 = &A + _4;
  ivtmp.11_17 = (unsigned long) _3;
  _1 = (sizetype) i_6(D);
  _2 = (unsigned int) i_6(D);
  _22 = _2 + 4294967295;
  _21 = (sizetype) _22;
  _20 = _1 - _21;
  _29 = _20 + 18446744073709551615;
  _30 = &A + _29;
  _31 = (unsigned long) _30;

  <bb 9>:
  # ivtmp.11_18 = PHI <ivtmp.11_17(8), ivtmp.11_8(11)>
  _5 = (void *) ivtmp.11_18;
  _14 = MEM[base: _5, offset: 0B];
  foo (_14);
  ivtmp.11_8 = ivtmp.11_18 - 1;
  if (ivtmp.11_8 != _31)
    goto <bb 11>;
  else
    goto <bb 10>;

  <bb 10>:
  goto <bb 3>;

  <bb 11>:
  goto <bb 9>;

The loop preheader is bloated by the computation of "cand_value_at (loop, cand,
use->stmt, desc->niter, &bnd);" in function "may_eliminate_iv", where we have:

(gdb) call debug_generic_expr(cand->iv->base)
(unsigned long) ((char *) &A + (sizetype) i_6(D))
(gdb) call debug_generic_expr(cand->iv->step)
18446744073709551615
(gdb) call debug_generic_expr(desc->niter)
(unsigned int) (i_6(D) + -1)

GCC is not aware of RANGE_INFO(i_6(D)): [1:10], so it doesn't know that the
condition below holds:
  (unsigned long)(unsigned int)(i_6(D) + -1)  == (unsigned long)(i_6(D) + -1)


The problem is that niter is computed in the unsigned version of the control
iv's type, which is unsigned int because the control biv (d) is of type int.
But the candidate is of the larger type "unsigned long" on 64-bit targets.

I really think LLVM's front end makes a better decision by promoting the index
of array_ref A[d] to a 64-bit signed type, while GCC keeps it in 32 bits.

