public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
@ 2023-08-30  1:48 juzhe.zhong
  0 siblings, 0 replies; 9+ messages in thread
From: juzhe.zhong @ 2023-08-30  1:48 UTC (permalink / raw)
  To: gcc-patches; +Cc: jeffreyalaw, stefansf

[-- Attachment #1: Type: text/plain, Size: 154 bytes --]

Ping. This patch also fixed issue occurred in RISC-V backend:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111171 

Thanks.


juzhe.zhong@rivai.ai

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
  2023-10-01 14:26   ` Stefan Schulze Frielinghaus
@ 2023-10-01 14:36     ` Jeff Law
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff Law @ 2023-10-01 14:36 UTC (permalink / raw)
  To: Stefan Schulze Frielinghaus, Eric Botcazou; +Cc: gcc-patches, Xi Ruoyao



On 10/1/23 08:26, Stefan Schulze Frielinghaus wrote:

>> FWIW, I should definitely have caught this hunk earlier -- we've gone the
>> rounds in this same space (GEN_INT vs gen_int_mode) elsewhere.
>>
>> Again, sorry for the long wait.
>>
>> jeff
> 
> No worries at all.  At least I have learned something new :)
> 
> Thanks Jeff and Eric for clarification.  This matches with my intuition,
> now, so I've pushed the patch.
BTW, in a completely different context I need to do some testing on the 
alpha port, which hasn't bootstrapped in a couple months.  I'd been 
assuming the failure was completely due to ongoing changes I'm making in 
the infrastructure I used to bootstrap QEMU emulated systems.

As it turns out, the infrastructure changes and this minor combine issue 
were both playing a role!

Jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
  2023-09-29 19:01 ` Jeff Law
@ 2023-10-01 14:26   ` Stefan Schulze Frielinghaus
  2023-10-01 14:36     ` Jeff Law
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Schulze Frielinghaus @ 2023-10-01 14:26 UTC (permalink / raw)
  To: Jeff Law, Eric Botcazou; +Cc: gcc-patches, Xi Ruoyao

On Fri, Sep 29, 2023 at 01:01:57PM -0600, Jeff Law wrote:
> 
> 
> On 8/10/23 07:04, Stefan Schulze Frielinghaus via Gcc-patches wrote:
> > In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> > completely missed the fact that the normal form of a generated constant for a
> > mode with fewer bits than in HOST_WIDE_INT is a sign extended version of the
> > actual constant.  This even holds true for unsigned constants.
> > 
> > Fixed by masking out the upper bits for the incoming constant and sign
> > extending the resulting unsigned constant.
> > 
> > Bootstrapped and regtested on x64 and s390x.  Ok for mainline?
> > 
> [ snip]
> 
> > 
> > gcc/ChangeLog:
> > 
> > 	* combine.cc (simplify_compare_const): Properly handle unsigned
> > 	constants while narrowing comparison of memory and constants.
> OK.
> 
> 
> > ---
> > @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> >   		HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
> >   		HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
> >   		GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
> > -		(unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
> > -		n);
> > +		(unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode),
> > +		GET_RTX_NAME (adjusted_code), n);
> >   	    }
> >   	  poly_int64 offset = (BYTES_BIG_ENDIAN
> >   			       ? 0
> >   			       : (GET_MODE_SIZE (int_mode)
> >   				  - GET_MODE_SIZE (narrow_mode_iter)));
> >   	  *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> > -	  *pop1 = GEN_INT (n);
> > +	  *pop1 = gen_int_mode (n, narrow_mode_iter);
> >   	  return adjusted_code;
> FWIW, I should definitely have caught this hunk earlier -- we've gone the
> rounds in this same space (GEN_INT vs gen_int_mode) elsewhere.
> 
> Again, sorry for the long wait.
> 
> jeff

No worries at all.  At least I have learned something new :)

Thanks Jeff and Eric for clarification.  This matches with my intuition,
now, so I've pushed the patch.

Cheers,
Stefan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
  2023-08-10 13:04 Stefan Schulze Frielinghaus
                   ` (2 preceding siblings ...)
  2023-08-29 10:24 ` Xi Ruoyao
@ 2023-09-29 19:01 ` Jeff Law
  2023-10-01 14:26   ` Stefan Schulze Frielinghaus
  3 siblings, 1 reply; 9+ messages in thread
From: Jeff Law @ 2023-09-29 19:01 UTC (permalink / raw)
  To: Stefan Schulze Frielinghaus, gcc-patches



On 8/10/23 07:04, Stefan Schulze Frielinghaus via Gcc-patches wrote:
> In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> completely missed the fact that the normal form of a generated constant for a
> mode with fewer bits than in HOST_WIDE_INT is a sign extended version of the
> actual constant.  This even holds true for unsigned constants.
> 
> Fixed by masking out the upper bits for the incoming constant and sign
> extending the resulting unsigned constant.
> 
> Bootstrapped and regtested on x64 and s390x.  Ok for mainline?
> 
[ snip]

> 
> gcc/ChangeLog:
> 
> 	* combine.cc (simplify_compare_const): Properly handle unsigned
> 	constants while narrowing comparison of memory and constants.
OK.


> ---
> @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>   		HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
>   		HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
>   		GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
> -		(unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
> -		n);
> +		(unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode),
> +		GET_RTX_NAME (adjusted_code), n);
>   	    }
>   	  poly_int64 offset = (BYTES_BIG_ENDIAN
>   			       ? 0
>   			       : (GET_MODE_SIZE (int_mode)
>   				  - GET_MODE_SIZE (narrow_mode_iter)));
>   	  *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> -	  *pop1 = GEN_INT (n);
> +	  *pop1 = gen_int_mode (n, narrow_mode_iter);
>   	  return adjusted_code;
FWIW, I should definitely have caught this hunk earlier -- we've gone 
the rounds in this same space (GEN_INT vs gen_int_mode) elsewhere.

Again, sorry for the long wait.

jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
  2023-08-10 13:04 Stefan Schulze Frielinghaus
  2023-08-12  1:04 ` Xi Ruoyao
  2023-08-18 11:04 ` Stefan Schulze Frielinghaus
@ 2023-08-29 10:24 ` Xi Ruoyao
  2023-09-29 19:01 ` Jeff Law
  3 siblings, 0 replies; 9+ messages in thread
From: Xi Ruoyao @ 2023-08-29 10:24 UTC (permalink / raw)
  To: Jeff Law; +Cc: Stefan Schulze Frielinghaus, gcc-patches

Hi Jeff,

Can you take a look at the patch?  It fixes a bootstrap failure on
LoongArch.  And in this month 3 related bugzilla tickets have been
created (110939, 111124, 111171).

On Thu, 2023-08-10 at 15:04 +0200, Stefan Schulze Frielinghaus via Gcc-
patches wrote:
> In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> completely missed the fact that the normal form of a generated constant for a
> mode with fewer bits than in HOST_WIDE_INT is a sign extended version of the
> actual constant.  This even holds true for unsigned constants.
> 
> Fixed by masking out the upper bits for the incoming constant and sign
> extending the resulting unsigned constant.
> 
> Bootstrapped and regtested on x64 and s390x.  Ok for mainline?
> 
> While reading existing optimizations in combine I stumbled across two
> optimizations where either my intuition about the representation of
> unsigned integers via a const_int rtx is wrong, which then in turn would
> probably also mean that this patch is wrong, or that the optimizations
> are missed sometimes.  In other words in the following I would assume
> that the upper bits are masked out:

/* removed the inlined patch to avoid confusion */

> For example, while bootstrapping on x64 the optimization is missed since
> a LTU comparison in QImode is done and the constant equals
> 0xffffffffffffff80.
> 
> Sorry for inlining another patch, but I would really like to make sure
> that my understanding is correct, now, before I come up with another
> patch.  Thus it would be great if someone could shed some light on this.
> 
> gcc/ChangeLog:
> 
>         * combine.cc (simplify_compare_const): Properly handle unsigned
>         constants while narrowing comparison of memory and constants.
> ---
>  gcc/combine.cc | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index e46d202d0a7..468b7fde911 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -12003,14 +12003,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>        && !MEM_VOLATILE_P (op0)
>        /* The optimization makes only sense for constants which are big enough
>          so that we have a chance to chop off something at all.  */
> -      && (unsigned HOST_WIDE_INT) const_op > 0xff
> -      /* Bail out, if the constant does not fit into INT_MODE.  */
> -      && (unsigned HOST_WIDE_INT) const_op
> -        < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 1) - 1)
> +      && ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode)) > 0xff
>        /* Ensure that we do not overflow during normalization.  */
> -      && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
> +      && (code != GTU
> +         || ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> +            < HOST_WIDE_INT_M1U)
> +      && trunc_int_for_mode (const_op, int_mode) == const_op)
>      {
> -      unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
> +      unsigned HOST_WIDE_INT n
> +       = (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode);
>        enum rtx_code adjusted_code;
>  
>        /* Normalize code to either LEU or GEU.  */
> @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>                 HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
>                 HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
>                 GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
> -               (unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
> -               n);
> +               (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode),
> +               GET_RTX_NAME (adjusted_code), n);
>             }
>           poly_int64 offset = (BYTES_BIG_ENDIAN
>                                ? 0
>                                : (GET_MODE_SIZE (int_mode)
>                                   - GET_MODE_SIZE (narrow_mode_iter)));
>           *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> -         *pop1 = GEN_INT (n);
> +         *pop1 = gen_int_mode (n, narrow_mode_iter);
>           return adjusted_code;
>         }
>      }

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
  2023-08-10 13:04 Stefan Schulze Frielinghaus
  2023-08-12  1:04 ` Xi Ruoyao
@ 2023-08-18 11:04 ` Stefan Schulze Frielinghaus
  2023-08-29 10:24 ` Xi Ruoyao
  2023-09-29 19:01 ` Jeff Law
  3 siblings, 0 replies; 9+ messages in thread
From: Stefan Schulze Frielinghaus @ 2023-08-18 11:04 UTC (permalink / raw)
  To: gcc-patches

Ping.  Since this fixes bootstrap problem PR110939 for Loongarch I'm
pingen this one earlier.

On Thu, Aug 10, 2023 at 03:04:03PM +0200, Stefan Schulze Frielinghaus wrote:
> In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> completely missed the fact that the normal form of a generated constant for a
> mode with fewer bits than in HOST_WIDE_INT is a sign extended version of the
> actual constant.  This even holds true for unsigned constants.
> 
> Fixed by masking out the upper bits for the incoming constant and sign
> extending the resulting unsigned constant.
> 
> Bootstrapped and regtested on x64 and s390x.  Ok for mainline?
> 
> While reading existing optimizations in combine I stumbled across two
> optimizations where either my intuition about the representation of
> unsigned integers via a const_int rtx is wrong, which then in turn would
> probably also mean that this patch is wrong, or that the optimizations
> are missed sometimes.  In other words in the following I would assume
> that the upper bits are masked out:
> 
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index 468b7fde911..80c4ff0fbaf 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -11923,7 +11923,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>        /* (unsigned) < 0x80000000 is equivalent to >= 0.  */
>        else if (is_a <scalar_int_mode> (mode, &int_mode)
>                && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> -              && ((unsigned HOST_WIDE_INT) const_op
> +              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
>                    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
>         {
>           const_op = 0;
> @@ -11962,7 +11962,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>        /* (unsigned) >= 0x80000000 is equivalent to < 0.  */
>        else if (is_a <scalar_int_mode> (mode, &int_mode)
>                && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> -              && ((unsigned HOST_WIDE_INT) const_op
> +              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
>                    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
>         {
>           const_op = 0;
> 
> For example, while bootstrapping on x64 the optimization is missed since
> a LTU comparison in QImode is done and the constant equals
> 0xffffffffffffff80.
> 
> Sorry for inlining another patch, but I would really like to make sure
> that my understanding is correct, now, before I come up with another
> patch.  Thus it would be great if someone could shed some light on this.
> 
> gcc/ChangeLog:
> 
> 	* combine.cc (simplify_compare_const): Properly handle unsigned
> 	constants while narrowing comparison of memory and constants.
> ---
>  gcc/combine.cc | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index e46d202d0a7..468b7fde911 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -12003,14 +12003,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>        && !MEM_VOLATILE_P (op0)
>        /* The optimization makes only sense for constants which are big enough
>  	 so that we have a chance to chop off something at all.  */
> -      && (unsigned HOST_WIDE_INT) const_op > 0xff
> -      /* Bail out, if the constant does not fit into INT_MODE.  */
> -      && (unsigned HOST_WIDE_INT) const_op
> -	 < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 1) - 1)
> +      && ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode)) > 0xff
>        /* Ensure that we do not overflow during normalization.  */
> -      && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
> +      && (code != GTU
> +	  || ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> +	     < HOST_WIDE_INT_M1U)
> +      && trunc_int_for_mode (const_op, int_mode) == const_op)
>      {
> -      unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
> +      unsigned HOST_WIDE_INT n
> +	= (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode);
>        enum rtx_code adjusted_code;
>  
>        /* Normalize code to either LEU or GEU.  */
> @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>  		HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
>  		HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
>  		GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
> -		(unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
> -		n);
> +		(unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode),
> +		GET_RTX_NAME (adjusted_code), n);
>  	    }
>  	  poly_int64 offset = (BYTES_BIG_ENDIAN
>  			       ? 0
>  			       : (GET_MODE_SIZE (int_mode)
>  				  - GET_MODE_SIZE (narrow_mode_iter)));
>  	  *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> -	  *pop1 = GEN_INT (n);
> +	  *pop1 = gen_int_mode (n, narrow_mode_iter);
>  	  return adjusted_code;
>  	}
>      }
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
  2023-08-12  1:04 ` Xi Ruoyao
@ 2023-08-14  6:39   ` Stefan Schulze Frielinghaus
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Schulze Frielinghaus @ 2023-08-14  6:39 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: gcc-patches

On Sat, Aug 12, 2023 at 09:04:19AM +0800, Xi Ruoyao wrote:
> On Thu, 2023-08-10 at 15:04 +0200, Stefan Schulze Frielinghaus via Gcc-
> patches wrote:
> > In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> > completely missed the fact that the normal form of a generated constant for a
> > mode with fewer bits than in HOST_WIDE_INT is a sign extended version of the
> > actual constant.  This even holds true for unsigned constants.
> > 
> > Fixed by masking out the upper bits for the incoming constant and sign
> > extending the resulting unsigned constant.
> > 
> > Bootstrapped and regtested on x64 and s390x.  Ok for mainline?
> 
> The patch fails to apply:
> 
> patching file gcc/combine.cc
> Hunk #1 FAILED at 11923.
> Hunk #2 FAILED at 11962.
> 
> It looks like some indents are tabs in the source file, but white spaces
> in the patch.

The patch itself applies cleanly.  This is due to my inlined diff in
order to raise some discussion, i.e., just remove the following from
the email and the patch applies:

> > diff --git a/gcc/combine.cc b/gcc/combine.cc
> > index 468b7fde911..80c4ff0fbaf 100644
> > --- a/gcc/combine.cc
> > +++ b/gcc/combine.cc
> > @@ -11923,7 +11923,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> >        /* (unsigned) < 0x80000000 is equivalent to >= 0.  */
> >        else if (is_a <scalar_int_mode> (mode, &int_mode)
> >                && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> > -              && ((unsigned HOST_WIDE_INT) const_op
> > +              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> >                    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
> >         {
> >           const_op = 0;
> > @@ -11962,7 +11962,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> >        /* (unsigned) >= 0x80000000 is equivalent to < 0.  */
> >        else if (is_a <scalar_int_mode> (mode, &int_mode)
> >                && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> > -              && ((unsigned HOST_WIDE_INT) const_op
> > +              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> >                    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
> >         {
> >           const_op = 0;

Looks like git am/apply is confused by that.

Cheers,
Stefan

> > 
> > For example, while bootstrapping on x64 the optimization is missed since
> > a LTU comparison in QImode is done and the constant equals
> > 0xffffffffffffff80.
> > 
> > Sorry for inlining another patch, but I would really like to make sure
> > that my understanding is correct, now, before I come up with another
> > patch.  Thus it would be great if someone could shed some light on this.
> > 
> > gcc/ChangeLog:
> > 
> >         * combine.cc (simplify_compare_const): Properly handle unsigned
> >         constants while narrowing comparison of memory and constants.
> > ---
> >  gcc/combine.cc | 19 ++++++++++---------
> >  1 file changed, 10 insertions(+), 9 deletions(-)
> > 
> > diff --git a/gcc/combine.cc b/gcc/combine.cc
> > index e46d202d0a7..468b7fde911 100644
> > --- a/gcc/combine.cc
> > +++ b/gcc/combine.cc
> > @@ -12003,14 +12003,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> >        && !MEM_VOLATILE_P (op0)
> >        /* The optimization makes only sense for constants which are big enough
> >          so that we have a chance to chop off something at all.  */
> > -      && (unsigned HOST_WIDE_INT) const_op > 0xff
> > -      /* Bail out, if the constant does not fit into INT_MODE.  */
> > -      && (unsigned HOST_WIDE_INT) const_op
> > -        < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 1) - 1)
> > +      && ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode)) > 0xff
> >        /* Ensure that we do not overflow during normalization.  */
> > -      && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
> > +      && (code != GTU
> > +         || ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> > +            < HOST_WIDE_INT_M1U)
> > +      && trunc_int_for_mode (const_op, int_mode) == const_op)
> >      {
> > -      unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
> > +      unsigned HOST_WIDE_INT n
> > +       = (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode);
> >        enum rtx_code adjusted_code;
> >  
> >        /* Normalize code to either LEU or GEU.  */
> > @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> >                 HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
> >                 HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
> >                 GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
> > -               (unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
> > -               n);
> > +               (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode),
> > +               GET_RTX_NAME (adjusted_code), n);
> >             }
> >           poly_int64 offset = (BYTES_BIG_ENDIAN
> >                                ? 0
> >                                : (GET_MODE_SIZE (int_mode)
> >                                   - GET_MODE_SIZE (narrow_mode_iter)));
> >           *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> > -         *pop1 = GEN_INT (n);
> > +         *pop1 = gen_int_mode (n, narrow_mode_iter);
> >           return adjusted_code;
> >         }
> >      }
> 
> -- 
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
  2023-08-10 13:04 Stefan Schulze Frielinghaus
@ 2023-08-12  1:04 ` Xi Ruoyao
  2023-08-14  6:39   ` Stefan Schulze Frielinghaus
  2023-08-18 11:04 ` Stefan Schulze Frielinghaus
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Xi Ruoyao @ 2023-08-12  1:04 UTC (permalink / raw)
  To: Stefan Schulze Frielinghaus, gcc-patches

On Thu, 2023-08-10 at 15:04 +0200, Stefan Schulze Frielinghaus via Gcc-
patches wrote:
> In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> completely missed the fact that the normal form of a generated constant for a
> mode with fewer bits than in HOST_WIDE_INT is a sign extended version of the
> actual constant.  This even holds true for unsigned constants.
> 
> Fixed by masking out the upper bits for the incoming constant and sign
> extending the resulting unsigned constant.
> 
> Bootstrapped and regtested on x64 and s390x.  Ok for mainline?

The patch fails to apply:

patching file gcc/combine.cc
Hunk #1 FAILED at 11923.
Hunk #2 FAILED at 11962.

It looks like some indents are tabs in the source file, but white spaces
in the patch.

> While reading existing optimizations in combine I stumbled across two
> optimizations where either my intuition about the representation of
> unsigned integers via a const_int rtx is wrong, which then in turn would
> probably also mean that this patch is wrong, or that the optimizations
> are missed sometimes.  In other words in the following I would assume
> that the upper bits are masked out:
> 
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index 468b7fde911..80c4ff0fbaf 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -11923,7 +11923,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>        /* (unsigned) < 0x80000000 is equivalent to >= 0.  */
>        else if (is_a <scalar_int_mode> (mode, &int_mode)
>                && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> -              && ((unsigned HOST_WIDE_INT) const_op
> +              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
>                    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
>         {
>           const_op = 0;
> @@ -11962,7 +11962,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>        /* (unsigned) >= 0x80000000 is equivalent to < 0.  */
>        else if (is_a <scalar_int_mode> (mode, &int_mode)
>                && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> -              && ((unsigned HOST_WIDE_INT) const_op
> +              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
>                    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
>         {
>           const_op = 0;
> 
> For example, while bootstrapping on x64 the optimization is missed since
> a LTU comparison in QImode is done and the constant equals
> 0xffffffffffffff80.
> 
> Sorry for inlining another patch, but I would really like to make sure
> that my understanding is correct, now, before I come up with another
> patch.  Thus it would be great if someone could shed some light on this.
> 
> gcc/ChangeLog:
> 
>         * combine.cc (simplify_compare_const): Properly handle unsigned
>         constants while narrowing comparison of memory and constants.
> ---
>  gcc/combine.cc | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index e46d202d0a7..468b7fde911 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -12003,14 +12003,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>        && !MEM_VOLATILE_P (op0)
>        /* The optimization makes only sense for constants which are big enough
>          so that we have a chance to chop off something at all.  */
> -      && (unsigned HOST_WIDE_INT) const_op > 0xff
> -      /* Bail out, if the constant does not fit into INT_MODE.  */
> -      && (unsigned HOST_WIDE_INT) const_op
> -        < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 1) - 1)
> +      && ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode)) > 0xff
>        /* Ensure that we do not overflow during normalization.  */
> -      && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
> +      && (code != GTU
> +         || ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> +            < HOST_WIDE_INT_M1U)
> +      && trunc_int_for_mode (const_op, int_mode) == const_op)
>      {
> -      unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
> +      unsigned HOST_WIDE_INT n
> +       = (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode);
>        enum rtx_code adjusted_code;
>  
>        /* Normalize code to either LEU or GEU.  */
> @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
>                 HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
>                 HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
>                 GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
> -               (unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
> -               n);
> +               (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode),
> +               GET_RTX_NAME (adjusted_code), n);
>             }
>           poly_int64 offset = (BYTES_BIG_ENDIAN
>                                ? 0
>                                : (GET_MODE_SIZE (int_mode)
>                                   - GET_MODE_SIZE (narrow_mode_iter)));
>           *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
> -         *pop1 = GEN_INT (n);
> +         *pop1 = gen_int_mode (n, narrow_mode_iter);
>           return adjusted_code;
>         }
>      }

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant
@ 2023-08-10 13:04 Stefan Schulze Frielinghaus
  2023-08-12  1:04 ` Xi Ruoyao
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Stefan Schulze Frielinghaus @ 2023-08-10 13:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: Stefan Schulze Frielinghaus

In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
completely missed the fact that the normal form of a generated constant for a
mode with fewer bits than in HOST_WIDE_INT is a sign extended version of the
actual constant.  This even holds true for unsigned constants.

Fixed by masking out the upper bits for the incoming constant and sign
extending the resulting unsigned constant.

Bootstrapped and regtested on x64 and s390x.  Ok for mainline?

While reading existing optimizations in combine I stumbled across two
optimizations where either my intuition about the representation of
unsigned integers via a const_int rtx is wrong, which then in turn would
probably also mean that this patch is wrong, or that the optimizations
are missed sometimes.  In other words in the following I would assume
that the upper bits are masked out:

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 468b7fde911..80c4ff0fbaf 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -11923,7 +11923,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
       /* (unsigned) < 0x80000000 is equivalent to >= 0.  */
       else if (is_a <scalar_int_mode> (mode, &int_mode)
               && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
-              && ((unsigned HOST_WIDE_INT) const_op
+              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
                   == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
        {
          const_op = 0;
@@ -11962,7 +11962,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
       /* (unsigned) >= 0x80000000 is equivalent to < 0.  */
       else if (is_a <scalar_int_mode> (mode, &int_mode)
               && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
-              && ((unsigned HOST_WIDE_INT) const_op
+              && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
                   == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
        {
          const_op = 0;

For example, while bootstrapping on x64 the optimization is missed since
a LTU comparison in QImode is done and the constant equals
0xffffffffffffff80.

Sorry for inlining another patch, but I would really like to make sure
that my understanding is correct, now, before I come up with another
patch.  Thus it would be great if someone could shed some light on this.

gcc/ChangeLog:

	* combine.cc (simplify_compare_const): Properly handle unsigned
	constants while narrowing comparison of memory and constants.
---
 gcc/combine.cc | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/gcc/combine.cc b/gcc/combine.cc
index e46d202d0a7..468b7fde911 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -12003,14 +12003,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
       && !MEM_VOLATILE_P (op0)
       /* The optimization makes only sense for constants which are big enough
 	 so that we have a chance to chop off something at all.  */
-      && (unsigned HOST_WIDE_INT) const_op > 0xff
-      /* Bail out, if the constant does not fit into INT_MODE.  */
-      && (unsigned HOST_WIDE_INT) const_op
-	 < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 1) - 1)
+      && ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode)) > 0xff
       /* Ensure that we do not overflow during normalization.  */
-      && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
+      && (code != GTU
+	  || ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
+	     < HOST_WIDE_INT_M1U)
+      && trunc_int_for_mode (const_op, int_mode) == const_op)
     {
-      unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
+      unsigned HOST_WIDE_INT n
+	= (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode);
       enum rtx_code adjusted_code;
 
       /* Normalize code to either LEU or GEU.  */
@@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
 		HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
 		HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
 		GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
-		(unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
-		n);
+		(unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode),
+		GET_RTX_NAME (adjusted_code), n);
 	    }
 	  poly_int64 offset = (BYTES_BIG_ENDIAN
 			       ? 0
 			       : (GET_MODE_SIZE (int_mode)
 				  - GET_MODE_SIZE (narrow_mode_iter)));
 	  *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
-	  *pop1 = GEN_INT (n);
+	  *pop1 = gen_int_mode (n, narrow_mode_iter);
 	  return adjusted_code;
 	}
     }
-- 
2.41.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-10-01 14:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-30  1:48 [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant juzhe.zhong
  -- strict thread matches above, loose matches on Subject: below --
2023-08-10 13:04 Stefan Schulze Frielinghaus
2023-08-12  1:04 ` Xi Ruoyao
2023-08-14  6:39   ` Stefan Schulze Frielinghaus
2023-08-18 11:04 ` Stefan Schulze Frielinghaus
2023-08-29 10:24 ` Xi Ruoyao
2023-09-29 19:01 ` Jeff Law
2023-10-01 14:26   ` Stefan Schulze Frielinghaus
2023-10-01 14:36     ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).