From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-388631-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 16108 invoked by alias); 9 Jan 2015 06:09:07 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 16097 invoked by uid 89); 9 Jan 2015 06:09:06 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Fri, 09 Jan 2015 06:09:04 +0000
Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t09692gH014305	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL);	Fri, 9 Jan 2015 01:09:02 -0500
Received: from [10.3.113.12] ([10.3.113.12])	by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t09691Vl009073;	Fri, 9 Jan 2015 01:09:01 -0500
Message-ID: <54AF707D.6080800@redhat.com>
Date: Fri, 09 Jan 2015 06:09:00 -0000
From: Jeff Law <law@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: "Zamyatin, Igor" <igor.zamyatin@intel.com>,        "GCC Patches (gcc-patches@gcc.gnu.org)" <gcc-patches@gcc.gnu.org>
CC: "ysrumyan@gmail.com" <ysrumyan@gmail.com>
Subject: Re: [PATCH] Fix for PR64081 in RTL loop unroller
References: <0EFAB2BDD0F67E4FB6CCC8B9F87D756969CF7BFC@IRSMSX101.ger.corp.intel.com>
In-Reply-To: <0EFAB2BDD0F67E4FB6CCC8B9F87D756969CF7BFC@IRSMSX101.ger.corp.intel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2015-01/txt/msg00453.txt.bz2

On 12/19/14 03:20, Zamyatin, Igor wrote:
> Hi!
>
> This is an attempt to extend RTL unroller to allow cases like mentioned in the PR -
> namely when loop has duplicated exit blocks and back edges.
>
> Bootstrapped and regtested on x86_64, also checking wide range of benchmarks - spec2K, spec2006, EEMBC
> Is it ok for trunk in case if no testing issues?
>
> Thanks,
> Igor
>
> Changelog:
>
> Gcc:
>
> 2014-12-19  Igor Zamyatin  <igor.zamyatin@intel.com>
>
> 	PR rtl-optimization/64081
> 	* loop-iv.c (def_pred_latch_p): New function.
> 	(latch_dominating_def): Allow specific cases with non-single
> 	definitions.
> 	(iv_get_reaching_def): Likewise.
> 	(check_complex_exit_p): New function.
> 	(check_simple_exit): Use check_complex_exit_p to allow certain cases
> 	with exits not executing on any iteration.
>
>
> Testsuite:
>
> 2014-12-19  Igor Zamyatin  <igor.zamyatin@intel.com>
>
> 	PR rtl-optimization/64081
> 	* gcc.dg/pr64081.c: New test.
>
>
> diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
> index f55cea2..d5d48f1 100644
> --- a/gcc/loop-iv.c
> +++ b/gcc/loop-iv.c
>   /* Finds the definition of REG that dominates loop latch and stores
>      it to DEF.  Returns false if there is not a single definition
> -   dominating the latch.  If REG has no definition in loop, DEF
> +   dominating the latch or all defs are same and they are on different
> +   predecessors of loop latch.  If REG has no definition in loop, DEF
>      is set to NULL and true is returned.  */
Is it really sufficient here to verify that all the defs are on latch
predecessors?  What about the case where there is a predecessor without
a def?  How do you guarantee domination in that case?

ISTM that, given the structure of the code you're writing, you'd want
to verify that, in the event of multiple definitions, all of them
appear on immediate predecessors of the latch *and* that each
immediate predecessor has a definition.
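
To make that condition concrete, here is a rough sketch on a toy model.
None of these names exist in GCC: struct toy_latch, its fields and
toy_defs_cover_latch_p are made up for illustration, and the real code
would walk the DF def chain and the latch's predecessor edges instead.
It also leaves out the "all defs are the same" pattern comparison and
only shows the coverage part of the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT GCC's data structures (all names are hypothetical).
   The latch has NUM_PREDS immediate predecessors; HAS_DEF[i] says
   whether predecessor I contains a reaching definition of the register.
   DEFS_OUTSIDE_PREDS counts reaching definitions located in any other
   block of the loop.  */
struct toy_latch
{
  size_t num_preds;
  const bool *has_def;
  size_t defs_outside_preds;
};

/* Return true only if every reaching definition sits on an immediate
   predecessor of the latch *and* every immediate predecessor has one.
   A latch with no predecessors is rejected.  */
static bool
toy_defs_cover_latch_p (const struct toy_latch *latch)
{
  /* Some definition is not on an immediate predecessor.  */
  if (latch->defs_outside_preds != 0)
    return false;

  /* Some immediate predecessor reaches the latch without a definition,
     so the set of definitions does not cover every path.  */
  for (size_t i = 0; i < latch->num_preds; i++)
    if (!latch->has_def[i])
      return false;

  return latch->num_preds != 0;
}
```

In this model, the case I'm worried about is the one where has_def[]
contains a false entry: every def is on a predecessor, yet one path
into the latch carries no def at all.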


>
>   static bool
>   latch_dominating_def (rtx reg, df_ref *def)
>   {
>     df_ref single_rd = NULL, adef;
> -  unsigned regno = REGNO (reg);
> +  unsigned regno = REGNO (reg), def_num = 0;
>     struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (current_loop->latch);
>
>     for (adef = DF_REG_DEF_CHAIN (regno); adef; adef = DF_REF_NEXT_REG (adef))
>       {
> +      /* Initialize this to true for the very first iteration when
> +	 SINGLE_RD is NULL.  */
> +      bool def_pred_latch = true;
> +
>         if (!bitmap_bit_p (df->blocks_to_analyze, DF_REF_BBNO (adef))
>   	  || !bitmap_bit_p (&bb_info->out, DF_REF_ID (adef)))
>   	continue;
>
> -      /* More than one reaching definition.  */
> +      /* More than one reaching definition is ok in case definitions are
> +	 in predecessors of latch block and those definitions are the same.
> +	 Probably this could be relaxed and check for sub-dominance instead
> +	 predecessor.  */
>         if (single_rd)
> -	return false;
> -
> -      if (!just_once_each_iteration_p (current_loop, DF_REF_BB (adef)))
> -	return false;
> +	{
> +	  def_num++;
> +	  if (!(def_pred_latch = def_pred_latch_p (adef))
> +	      || !rtx_equal_p( PATTERN (DF_REF_INSN (single_rd)),
Whitespace nit here.  Whitespace goes before the open paren for the 
function call, not after.


> @@ -351,10 +384,10 @@ latch_dominating_def (rtx reg, df_ref *def)
>   static enum iv_grd_result
>   iv_get_reaching_def (rtx_insn *insn, rtx reg, df_ref *def)
And in this routine, you appear to do both checks, i.e., each def is on
an immediate predecessor and each immediate predecessor has a def.  Is
there some reason why iv_get_reaching_def has the stronger check while
latch_dominating_def does not?

jeff