From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.sandiford@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id 9607E3858C53
 for <gcc-patches@gcc.gnu.org>; Fri,  5 Aug 2022 12:46:09 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9607E3858C53
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id ED42C106F;
 Fri,  5 Aug 2022 05:46:09 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.37])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C314D3F67D;
 Fri,  5 Aug 2022 05:46:08 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: "Roger Sayle" <roger@nextmovesoftware.com>
Mail-Followup-To: "Roger Sayle" <roger@nextmovesoftware.com>,
 <gcc-patches@gcc.gnu.org>, richard.sandiford@arm.com
Cc: <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] middle-end: Allow backend to expand/split double word
 compare to 0/-1.
References: <02ac01d8a733$653480e0$2f9d82a0$@nextmovesoftware.com>
Date: Fri, 05 Aug 2022 13:46:07 +0100
In-Reply-To: <02ac01d8a733$653480e0$2f9d82a0$@nextmovesoftware.com> (Roger
 Sayle's message of "Wed, 3 Aug 2022 13:20:17 +0100")
Message-ID: <mpta68in9io.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Status: No, score=-52.1 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE,
 SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 05 Aug 2022 12:46:11 -0000

"Roger Sayle" <roger@nextmovesoftware.com> writes:
> This patch to the middle-end's RTL expansion reorders the code in
> emit_store_flag_1 so that the backend has more control over how best
> to expand/split double word equality/inequality comparisons against
> zero or minus one.  With the current implementation, the middle-end
> always decides to lower this idiom during RTL expansion using SUBREGs
> and word mode instructions, without ever consulting the backend's
> machine description.  Hence on x86_64, a TImode comparison against zero
> is always expanded as:
>
> (parallel [
>   (set (reg:DI 91)
>        (ior:DI (subreg:DI (reg:TI 88) 0)
>                (subreg:DI (reg:TI 88) 8)))
>   (clobber (reg:CC 17 flags))])
>
> (set (reg:CCZ 17 flags)
>      (compare:CCZ (reg:DI 91)
>                   (const_int 0 [0])))
>
> This patch, which makes no changes to the code itself, simply reorders
> the clauses in emit_store_flag_1 so that the middle-end first attempts
> expansion using the target's doubleword mode cstore optab/expander,
> and only if this fails, falls back to lowering to word mode operations.
> On x86_64, this allows the expander to produce:
>
> (set (reg:CCZ 17 flags)
>      (compare:CCZ (reg:TI 88)
>                   (const_int 0 [0])))
>
> which is a candidate for scalar-to-vector transformations (and
> combine simplifications etc.).  On targets that don't define a cstore
> pattern for doubleword integer modes, there should be no change in
> behaviour.  For those that do, the current behaviour can be restored
> (if desired) by restricting the expander/insn to not apply when the
> comparison is EQ or NE, and operand[2] is either const0_rtx or
> constm1_rtx.
>
> This change just keeps RTL expansion more consistent (in philosophy).
> For other doubleword comparisons, such as with operators LT and GT,
> or with constants other than zero or -1, the wishes of the backend
> are respected, and only if the optab expansion fails are the default
> fall-back implementations using narrower integer mode operations
> (and conditional jumps) used.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures. I'm happy to help tweak any backends that notice
> a change in their generated code.  Ok for mainline?
>
> 2022-08-03  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * expmed.cc (emit_store_flag_1): Move code to expand double word
>         equality and inequality against zero or -1, using word operations,
>         to after trying to use the backend's cstore<mode>4 optab/expander.

LGTM.  I guess this raises the question of whether the shift conversion
should still come first.  But I think the reason for treating the two
cases differently is that the one that you're moving is still a cstore
operation, just in a different mode.  It makes sense to give the target
a go in the original mode before trying a smaller one.

Thanks,
Richard

> Thanks in advance,
> Roger
> --
>
> diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> index 9b01b5a..8d7418b 100644
> --- a/gcc/expmed.cc
> +++ b/gcc/expmed.cc
> @@ -5662,63 +5662,9 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1,
>        break;
>      }
>  
> -  /* If we are comparing a double-word integer with zero or -1, we can
> -     convert the comparison into one involving a single word.  */
> -  scalar_int_mode int_mode;
> -  if (is_int_mode (mode, &int_mode)
> -      && GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2
> -      && (!MEM_P (op0) || ! MEM_VOLATILE_P (op0)))
> -    {
> -      rtx tem;
> -      if ((code == EQ || code == NE)
> -	  && (op1 == const0_rtx || op1 == constm1_rtx))
> -	{
> -	  rtx op00, op01;
> -
> -	  /* Do a logical OR or AND of the two words and compare the
> -	     result.  */
> -	  op00 = simplify_gen_subreg (word_mode, op0, int_mode, 0);
> -	  op01 = simplify_gen_subreg (word_mode, op0, int_mode, UNITS_PER_WORD);
> -	  tem = expand_binop (word_mode,
> -			      op1 == const0_rtx ? ior_optab : and_optab,
> -			      op00, op01, NULL_RTX, unsignedp,
> -			      OPTAB_DIRECT);
> -
> -	  if (tem != 0)
> -	    tem = emit_store_flag (NULL_RTX, code, tem, op1, word_mode,
> -				   unsignedp, normalizep);
> -	}
> -      else if ((code == LT || code == GE) && op1 == const0_rtx)
> -	{
> -	  rtx op0h;
> -
> -	  /* If testing the sign bit, can just test on high word.  */
> -	  op0h = simplify_gen_subreg (word_mode, op0, int_mode,
> -				      subreg_highpart_offset (word_mode,
> -							      int_mode));
> -	  tem = emit_store_flag (NULL_RTX, code, op0h, op1, word_mode,
> -				 unsignedp, normalizep);
> -	}
> -      else
> -	tem = NULL_RTX;
> -
> -      if (tem)
> -	{
> -	  if (target_mode == VOIDmode || GET_MODE (tem) == target_mode)
> -	    return tem;
> -	  if (!target)
> -	    target = gen_reg_rtx (target_mode);
> -
> -	  convert_move (target, tem,
> -			!val_signbit_known_set_p (word_mode,
> -						  (normalizep ? normalizep
> -						   : STORE_FLAG_VALUE)));
> -	  return target;
> -	}
> -    }
> -
>    /* If this is A < 0 or A >= 0, we can do this by taking the ones
>       complement of A (for GE) and shifting the sign bit to the low bit.  */
> +  scalar_int_mode int_mode;
>    if (op1 == const0_rtx && (code == LT || code == GE)
>        && is_int_mode (mode, &int_mode)
>        && (normalizep || STORE_FLAG_VALUE == 1
> @@ -5764,6 +5710,7 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1,
>        return op0;
>      }
>  
> +  /* Next try expanding this via the backend's cstore<mode>4.  */
>    mclass = GET_MODE_CLASS (mode);
>    FOR_EACH_MODE_FROM (compare_mode, mode)
>      {
> @@ -5788,6 +5735,60 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1,
>  	}
>      }
>  
> +  /* If we are comparing a double-word integer with zero or -1, we can
> +     convert the comparison into one involving a single word.  */
> +  if (is_int_mode (mode, &int_mode)
> +      && GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2
> +      && (!MEM_P (op0) || ! MEM_VOLATILE_P (op0)))
> +    {
> +      rtx tem;
> +      if ((code == EQ || code == NE)
> +	  && (op1 == const0_rtx || op1 == constm1_rtx))
> +	{
> +	  rtx op00, op01;
> +
> +	  /* Do a logical OR or AND of the two words and compare the
> +	     result.  */
> +	  op00 = simplify_gen_subreg (word_mode, op0, int_mode, 0);
> +	  op01 = simplify_gen_subreg (word_mode, op0, int_mode, UNITS_PER_WORD);
> +	  tem = expand_binop (word_mode,
> +			      op1 == const0_rtx ? ior_optab : and_optab,
> +			      op00, op01, NULL_RTX, unsignedp,
> +			      OPTAB_DIRECT);
> +
> +	  if (tem != 0)
> +	    tem = emit_store_flag (NULL_RTX, code, tem, op1, word_mode,
> +				   unsignedp, normalizep);
> +	}
> +      else if ((code == LT || code == GE) && op1 == const0_rtx)
> +	{
> +	  rtx op0h;
> +
> +	  /* If testing the sign bit, can just test on high word.  */
> +	  op0h = simplify_gen_subreg (word_mode, op0, int_mode,
> +				      subreg_highpart_offset (word_mode,
> +							      int_mode));
> +	  tem = emit_store_flag (NULL_RTX, code, op0h, op1, word_mode,
> +				 unsignedp, normalizep);
> +	}
> +      else
> +	tem = NULL_RTX;
> +
> +      if (tem)
> +	{
> +	  if (target_mode == VOIDmode || GET_MODE (tem) == target_mode)
> +	    return tem;
> +	  if (!target)
> +	    target = gen_reg_rtx (target_mode);
> +
> +	  convert_move (target, tem,
> +			!val_signbit_known_set_p (word_mode,
> +						  (normalizep ? normalizep
> +						   : STORE_FLAG_VALUE)));
> +	  return target;
> +	}
> +    }
> +
>    return 0;
>  }
>