[PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
@ 2023-07-19  4:16 YunQiang Su
  2023-07-19  6:26 ` Richard Biener
  0 siblings, 1 reply; 19+ messages in thread
From: YunQiang Su @ 2023-07-19  4:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: pinskia, jeffreyalaw, ian, rguenther, YunQiang Su

PR #104914

When work with
  int val;
  ((unsigned char*)&val)[3] = *buf;
  if (val > 0) ...
The RTX mode is obtained from REG instead of SUBREG, which make
D<INS> is used instead of <INS>.  Thus something wrong happens
on sign-extend default architectures, like MIPS64.

Let's use str_rtx and mode of str_rtx as the parameters for
store_integral_bit_field if:
  modes of op0 and str_rtx are INT;
  length of op0 is greater than str_rtx.

This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
mips64el-linux-gnuabi64 without regression.

gcc/ChangeLog:
        PR: 104914.
        * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
	to store_integral_bit_field if the length of op0 is greater
	than str_rtx.

gcc/testsuite/ChangeLog:
        PR: 104914.
	* gcc.target/mips/pr104914.c: New testcase.
---
 gcc/expmed.cc                            | 20 +++++++++++++++++---
 gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
 2 files changed, 34 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index fbd4ce2d42f..5531c19e891 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
      since that case is valid for any mode.  The following cases are only
      valid for integral modes.  */
   opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
+  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
   scalar_int_mode imode;
   if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
     {
@@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
 	op0 = gen_lowpart (op0_mode.require (), op0);
     }
 
-  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
-				   bitregion_start, bitregion_end,
-				   fieldmode, value, reverse, fallback_p);
+  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
+     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
+     is an example.  For this case, we should use the mode of SUBREG, otherwise
+     bad code will generate for sign-extension ports, like MIPS.  */
+  bool use_str_mode = false;
+  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
+      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
+      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
+		   GET_MODE_SIZE (GET_MODE (str_rtx))))
+    use_str_mode = true;
+
+  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
+				   use_str_mode ? str_mode : op0_mode,
+				   ibitsize, ibitnum, bitregion_start,
+				   bitregion_end, fieldmode, value,
+				   reverse, fallback_p);
 }
 
 /* Subroutine of store_bit_field_1, with the same arguments, except
diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
new file mode 100644
index 00000000000..fd6ef6af446
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr104914.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-march=mips64r2 -mabi=64" } */
+
+/* { dg-final { scan-assembler-not "\tdins\t" } } */
+
+NOMIPS16 int test (const unsigned char *buf)
+{
+  int val;
+  ((unsigned char*)&val)[0] = *buf++;
+  ((unsigned char*)&val)[1] = *buf++;
+  ((unsigned char*)&val)[2] = *buf++;
+  ((unsigned char*)&val)[3] = *buf++;
+  if(val > 0)
+    return 1;
+  else
+    return 0;
+}
-- 
2.30.2


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  4:16 [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible YunQiang Su
@ 2023-07-19  6:26 ` Richard Biener
  2023-07-19  6:58   ` YunQiang Su
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Biener @ 2023-07-19  6:26 UTC (permalink / raw)
  To: YunQiang Su; +Cc: gcc-patches, pinskia, jeffreyalaw, ian

On Wed, 19 Jul 2023, YunQiang Su wrote:

> PR #104914
> 
> When work with
>   int val;
>   ((unsigned char*)&val)[3] = *buf;
>   if (val > 0) ...
> The RTX mode is obtained from REG instead of SUBREG, which make
> D<INS> is used instead of <INS>.  Thus something wrong happens
> on sign-extend default architectures, like MIPS64.
> 
> Let's use str_rtx and mode of str_rtx as the parameters for
> store_integral_bit_field if:
>   modes of op0 and str_rtx are INT;
>   length of op0 is greater than str_rtx.
> 
> This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> mips64el-linux-gnuabi64 without regression.

I still think you are "fixing" this in the wrong place.  The bugzilla
audit trail points to combine and later notes an eventual expansion
issue (but for another testcase/target).

You have to explain in more detail on what is wrong with the initial
RTL on mips.

Richard.

> gcc/ChangeLog:
>         PR: 104914.
>         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> 	to store_integral_bit_field if the length of op0 is greater
> 	than str_rtx.
> 
> gcc/testsuite/ChangeLog:
>         PR: 104914.
> 	* gcc.target/mips/pr104914.c: New testcase.
> ---
>  gcc/expmed.cc                            | 20 +++++++++++++++++---
>  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
>  2 files changed, 34 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> 
> diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> index fbd4ce2d42f..5531c19e891 100644
> --- a/gcc/expmed.cc
> +++ b/gcc/expmed.cc
> @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>       since that case is valid for any mode.  The following cases are only
>       valid for integral modes.  */
>    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
>    scalar_int_mode imode;
>    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
>      {
> @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>  	op0 = gen_lowpart (op0_mode.require (), op0);
>      }
>  
> -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> -				   bitregion_start, bitregion_end,
> -				   fieldmode, value, reverse, fallback_p);
> +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> +     bad code will generate for sign-extension ports, like MIPS.  */
> +  bool use_str_mode = false;
> +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> +		   GET_MODE_SIZE (GET_MODE (str_rtx))))
> +    use_str_mode = true;
> +
> +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> +				   use_str_mode ? str_mode : op0_mode,
> +				   ibitsize, ibitnum, bitregion_start,
> +				   bitregion_end, fieldmode, value,
> +				   reverse, fallback_p);
>  }
>  
>  /* Subroutine of store_bit_field_1, with the same arguments, except
> diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> new file mode 100644
> index 00000000000..fd6ef6af446
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=mips64r2 -mabi=64" } */
> +
> +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> +
> +NOMIPS16 int test (const unsigned char *buf)
> +{
> +  int val;
> +  ((unsigned char*)&val)[0] = *buf++;
> +  ((unsigned char*)&val)[1] = *buf++;
> +  ((unsigned char*)&val)[2] = *buf++;
> +  ((unsigned char*)&val)[3] = *buf++;
> +  if(val > 0)
> +    return 1;
> +  else
> +    return 0;
> +}
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  6:26 ` Richard Biener
@ 2023-07-19  6:58   ` YunQiang Su
  2023-07-19  7:22     ` Richard Biener
  0 siblings, 1 reply; 19+ messages in thread
From: YunQiang Su @ 2023-07-19  6:58 UTC (permalink / raw)
  To: Richard Biener; +Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian

Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年7月19日周三 14:27写道：
>
> On Wed, 19 Jul 2023, YunQiang Su wrote:
>
> > PR #104914
> >
> > When work with
> >   int val;
> >   ((unsigned char*)&val)[3] = *buf;
> >   if (val > 0) ...
> > The RTX mode is obtained from REG instead of SUBREG, which make
> > D<INS> is used instead of <INS>.  Thus something wrong happens
> > on sign-extend default architectures, like MIPS64.
> >
> > Let's use str_rtx and mode of str_rtx as the parameters for
> > store_integral_bit_field if:
> >   modes of op0 and str_rtx are INT;
> >   length of op0 is greater than str_rtx.
> >
> > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > mips64el-linux-gnuabi64 without regression.
>
> I still think you are "fixing" this in the wrong place.  The bugzilla
> audit trail points to combine and later notes an eventual expansion
> issue (but for another testcase/target).
>
> You have to explain in more detail on what is wrong with the initial
> RTL on mips.
>

In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like

(insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
            (const_int 8 [0x8])
            (const_int 0 [0]))
        (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
     (nil))

Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
instructions.
While in fact here, we expect an SImode operation, due to `val` in C
code is `int`.

With my patch, the RTX will be like:

(insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
            (const_int 8 [0x8])
            (const_int 0 [0]))
        (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
     (nil))

So the operation will be SImode, aka `INS` instruction for MIPS64.

The problem is based on 2 fact/root cause:
1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
    li $7, 0xff
    li $8, 0
    ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
The value of $8 will be 0xff ff ff ff ff 00 00 00.

    li $7, 0xff
    li $8, 0
    dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
The value of $8 will be 0x 00 00 00 00 ff 00 00 00.

2. Due to most of MIPS instructions work with 32bit value, aka instructions
without `d` as its first char (in fact with few exception), are sign-extension,
the MIPS backend just ignore `extendsidi2`, aka RTX

(insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
        (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
     (nil))



> Richard.
>
> > gcc/ChangeLog:
> >         PR: 104914.
> >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> >       to store_integral_bit_field if the length of op0 is greater
> >       than str_rtx.
> >
> > gcc/testsuite/ChangeLog:
> >         PR: 104914.
> >       * gcc.target/mips/pr104914.c: New testcase.
> > ---
> >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> >  2 files changed, 34 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> >
> > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > index fbd4ce2d42f..5531c19e891 100644
> > --- a/gcc/expmed.cc
> > +++ b/gcc/expmed.cc
> > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> >       since that case is valid for any mode.  The following cases are only
> >       valid for integral modes.  */
> >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> >    scalar_int_mode imode;
> >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> >      {
> > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> >       op0 = gen_lowpart (op0_mode.require (), op0);
> >      }
> >
> > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > -                                bitregion_start, bitregion_end,
> > -                                fieldmode, value, reverse, fallback_p);
> > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > +     bad code will generate for sign-extension ports, like MIPS.  */
> > +  bool use_str_mode = false;
> > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > +    use_str_mode = true;
> > +
> > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > +                                use_str_mode ? str_mode : op0_mode,
> > +                                ibitsize, ibitnum, bitregion_start,
> > +                                bitregion_end, fieldmode, value,
> > +                                reverse, fallback_p);
> >  }
> >
> >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > new file mode 100644
> > index 00000000000..fd6ef6af446
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > +
> > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > +
> > +NOMIPS16 int test (const unsigned char *buf)
> > +{
> > +  int val;
> > +  ((unsigned char*)&val)[0] = *buf++;
> > +  ((unsigned char*)&val)[1] = *buf++;
> > +  ((unsigned char*)&val)[2] = *buf++;
> > +  ((unsigned char*)&val)[3] = *buf++;
> > +  if(val > 0)
> > +    return 1;
> > +  else
> > +    return 0;
> > +}
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)



-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  6:58   ` YunQiang Su
@ 2023-07-19  7:22     ` Richard Biener
  2023-07-19  8:21       ` YunQiang Su
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Biener @ 2023-07-19  7:22 UTC (permalink / raw)
  To: YunQiang Su; +Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian

On Wed, 19 Jul 2023, YunQiang Su wrote:

> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> ?2023?7?19??? 14:27???
> >
> > On Wed, 19 Jul 2023, YunQiang Su wrote:
> >
> > > PR #104914
> > >
> > > When work with
> > >   int val;
> > >   ((unsigned char*)&val)[3] = *buf;
> > >   if (val > 0) ...
> > > The RTX mode is obtained from REG instead of SUBREG, which make
> > > D<INS> is used instead of <INS>.  Thus something wrong happens
> > > on sign-extend default architectures, like MIPS64.
> > >
> > > Let's use str_rtx and mode of str_rtx as the parameters for
> > > store_integral_bit_field if:
> > >   modes of op0 and str_rtx are INT;
> > >   length of op0 is greater than str_rtx.
> > >
> > > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > > mips64el-linux-gnuabi64 without regression.
> >
> > I still think you are "fixing" this in the wrong place.  The bugzilla
> > audit trail points to combine and later notes an eventual expansion
> > issue (but for another testcase/target).
> >
> > You have to explain in more detail on what is wrong with the initial
> > RTL on mips.
> >
> 
> In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like
> 
> (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
>             (const_int 8 [0x8])
>             (const_int 0 [0]))
>         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
>      (nil))
> 
> Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
> instructions.
> While in fact here, we expect an SImode operation, due to `val` in C
> code is `int`.
> 
> With my patch, the RTX will be like:
> 
> (insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
>             (const_int 8 [0x8])
>             (const_int 0 [0]))
>         (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
>      (nil))

But if this RTL is correct then the above with DImode is correct as
well and the issue is in the backend definition of the instruction
defining 'DINS'?

> So the operation will be SImode, aka `INS` instruction for MIPS64.
> 
> The problem is based on 2 fact/root cause:
> 1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
>     li $7, 0xff
>     li $8, 0
>     ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> The value of $8 will be 0xff ff ff ff ff 00 00 00.

Bit that's wrong.  (set (zero_extract:SI ...) should not affect
bits outside of the indicated range.

@findex zero_extract
@item (zero_extract:@var{m} @var{loc} @var{size} @var{pos})
Like @code{sign_extract} but refers to an unsigned or zero-extended
bit-field.  The same sequence of bits are extracted, but they
are filled to an entire word with zeros instead of by sign-extension.

Unlike @code{sign_extract}, this type of expressions can be lvalues
in RTL; they may appear on the left side of an assignment, indicating
insertion of a value into the specified bit-field.
@end table


>     li $7, 0xff
>     li $8, 0
>     dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> The value of $8 will be 0x 00 00 00 00 ff 00 00 00.

which isn't correct either.

If you look a few dumps further you'll see which instruction was
recognized, I suspect the machine description is simply wrong here?

> 2. Due to most of MIPS instructions work with 32bit value, aka instructions
> without `d` as its first char (in fact with few exception), are sign-extension,
> the MIPS backend just ignore `extendsidi2`, aka RTX
> 
> (insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
>      (nil))
> 
> 
> 
> > Richard.
> >
> > > gcc/ChangeLog:
> > >         PR: 104914.
> > >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> > >       to store_integral_bit_field if the length of op0 is greater
> > >       than str_rtx.
> > >
> > > gcc/testsuite/ChangeLog:
> > >         PR: 104914.
> > >       * gcc.target/mips/pr104914.c: New testcase.
> > > ---
> > >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> > >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> > >  2 files changed, 34 insertions(+), 3 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> > >
> > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > index fbd4ce2d42f..5531c19e891 100644
> > > --- a/gcc/expmed.cc
> > > +++ b/gcc/expmed.cc
> > > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > >       since that case is valid for any mode.  The following cases are only
> > >       valid for integral modes.  */
> > >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> > >    scalar_int_mode imode;
> > >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> > >      {
> > > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > >       op0 = gen_lowpart (op0_mode.require (), op0);
> > >      }
> > >
> > > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > > -                                bitregion_start, bitregion_end,
> > > -                                fieldmode, value, reverse, fallback_p);
> > > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > > +     bad code will generate for sign-extension ports, like MIPS.  */
> > > +  bool use_str_mode = false;
> > > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > > +    use_str_mode = true;
> > > +
> > > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > > +                                use_str_mode ? str_mode : op0_mode,
> > > +                                ibitsize, ibitnum, bitregion_start,
> > > +                                bitregion_end, fieldmode, value,
> > > +                                reverse, fallback_p);
> > >  }
> > >
> > >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > new file mode 100644
> > > index 00000000000..fd6ef6af446
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > @@ -0,0 +1,17 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > > +
> > > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > > +
> > > +NOMIPS16 int test (const unsigned char *buf)
> > > +{
> > > +  int val;
> > > +  ((unsigned char*)&val)[0] = *buf++;
> > > +  ((unsigned char*)&val)[1] = *buf++;
> > > +  ((unsigned char*)&val)[2] = *buf++;
> > > +  ((unsigned char*)&val)[3] = *buf++;
> > > +  if(val > 0)
> > > +    return 1;
> > > +  else
> > > +    return 0;
> > > +}
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > HRB 36809 (AG Nuernberg)
> 
> 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  7:22     ` Richard Biener
@ 2023-07-19  8:21       ` YunQiang Su
  2023-07-19  8:25         ` YunQiang Su
  2023-07-19  9:23         ` Richard Biener
  0 siblings, 2 replies; 19+ messages in thread
From: YunQiang Su @ 2023-07-19  8:21 UTC (permalink / raw)
  To: Richard Biener; +Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian

Richard Biener <rguenther@suse.de> 于2023年7月19日周三 15:22写道：
>
> On Wed, 19 Jul 2023, YunQiang Su wrote:
>
> > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> ?2023?7?19??? 14:27???
> > >
> > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > >
> > > > PR #104914
> > > >
> > > > When work with
> > > >   int val;
> > > >   ((unsigned char*)&val)[3] = *buf;
> > > >   if (val > 0) ...
> > > > The RTX mode is obtained from REG instead of SUBREG, which make
> > > > D<INS> is used instead of <INS>.  Thus something wrong happens
> > > > on sign-extend default architectures, like MIPS64.
> > > >
> > > > Let's use str_rtx and mode of str_rtx as the parameters for
> > > > store_integral_bit_field if:
> > > >   modes of op0 and str_rtx are INT;
> > > >   length of op0 is greater than str_rtx.
> > > >
> > > > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > > > mips64el-linux-gnuabi64 without regression.
> > >
> > > I still think you are "fixing" this in the wrong place.  The bugzilla
> > > audit trail points to combine and later notes an eventual expansion
> > > issue (but for another testcase/target).
> > >
> > > You have to explain in more detail on what is wrong with the initial
> > > RTL on mips.
> > >
> >
> > In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like
> >
> > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> >             (const_int 8 [0x8])
> >             (const_int 0 [0]))
> >         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> >      (nil))
> >
> > Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
> > instructions.
> > While in fact here, we expect an SImode operation, due to `val` in C
> > code is `int`.
> >
> > With my patch, the RTX will be like:
> >
> > (insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
> >             (const_int 8 [0x8])
> >             (const_int 0 [0]))
> >         (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
> >      (nil))
>
> But if this RTL is correct then the above with DImode is correct as
> well and the issue is in the backend definition of the instruction
> defining 'DINS'?
>

I don't think so.

(insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
                                                     ^^
             (const_int 8 [0x8])
             (const_int 0 [0]))
         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
      (nil))

This RTL has only info about DI. It doesn't has any info about the
real length of
`val`. For backend, it has no other choice instead of `DINS`.

> > So the operation will be SImode, aka `INS` instruction for MIPS64.
> >
> > The problem is based on 2 fact/root cause:
> > 1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
> >     li $7, 0xff
> >     li $8, 0
> >     ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > The value of $8 will be 0xff ff ff ff ff 00 00 00.
>
> Bit that's wrong.  (set (zero_extract:SI ...) should not affect
> bits outside of the indicated range.
>

In fact, it is how sign-extension arch work.
No matter wrong or right, the ISA was/is defined like this.

In fact, one MIPS 32 ABI, the same C code will generate the RTL like this,
and the 32bit object can still workable on 64bit CPU.
That's a smart (or brain-damaged) design.

> @findex zero_extract
> @item (zero_extract:@var{m} @var{loc} @var{size} @var{pos})
> Like @code{sign_extract} but refers to an unsigned or zero-extended
> bit-field.  The same sequence of bits are extracted, but they
> are filled to an entire word with zeros instead of by sign-extension.
>

That's depending on the definition of `word` here.
For `(zero_extract:SI`, I think that the word is limit to the low 32bit of
hardware register.
Anyway, it won't break ISA without sign-extension by default.

Due to the nature of sign-extension ISA, if we don't sign-extension the
`int` variable, it will make something wrong.

To make it clear: the word `sign extension` here means:
       the the value of 31bit will be copied to bits [32-63], and
       the value of bits[0-30] won't be copied.
Here is the examples:
    li $7, 0xff
    li $8, 0x00 00 ff 00
    ins $8,$7,16,8
                    ^^
The value of $8 will be: 0x 00 00 00 00 00 ff ff 00

    li $7, 0xff
    li $8, 0x00 00 ff 00
    ins $8,$7,24,8
                    ^^
The value of $8 will be: 0x ff ff ff ff ff 00 ff 00

> Unlike @code{sign_extract}, this type of expressions can be lvalues
> in RTL; they may appear on the left side of an assignment, indicating
> insertion of a value into the specified bit-field.
> @end table
>
>
> >     li $7, 0xff
> >     li $8, 0
> >     dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > The value of $8 will be 0x 00 00 00 00 ff 00 00 00.
>
> which isn't correct either.
>

It is not correct or not-correct: The ISA manual just state like this,
and the hardwares are working like this.

> If you look a few dumps further you'll see which instruction was
> recognized, I suspect the machine description is simply wrong here?
>

The design of initial RTL may has expect that the backend may expand

(insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
        (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0))) "xx.c":5:29 -1
     (nil))

to an `SLL` instruction, which can fix what `DINS` do, aka
     0x 00 00 00 00 ff 00 00 00 ---> 0x ff ff ff ff ff 00 00 00

I guess this is what you mean about the mistake of machine description.
While MIPS md believes that it's sign-extension by default, so it is
not needed at all.

> > 2. Due to most of MIPS instructions work with 32bit value, aka instructions
> > without `d` as its first char (in fact with few exception), are sign-extension,
> > the MIPS backend just ignore `extendsidi2`, aka RTX
> >
> > (insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
> >         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
> >      (nil))
> >
> >

This is just background info about MIPS:

On a MIPS32 hardware, the value -1 is  0x ff ff ff ff, which is same
with other arch.
On a MIPS64 hardware, the value of (int32_t)-1 is
     0x ff ff ff ff ff ff ff ff
which is same with (int64_t)-1.
So the single compare-and-branch instruction can work with both
int32_t and int64_t.

On none sign-extension arch, like ARM64, (int32_t)-1 is
   0x 00 00 00 00 ff ff ff ff
and (int64_t)-1 is
   0x ff ff ff ff ff ff ff ff
That's why the `CMP` instructions for X and W have different encoding:
the 31bit of the encoding: `sf` bit.


================================
For this problem, we have 2 choice to fix:
1. This patch
2.


> >
> > > Richard.
> > >
> > > > gcc/ChangeLog:
> > > >         PR: 104914.
> > > >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> > > >       to store_integral_bit_field if the length of op0 is greater
> > > >       than str_rtx.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >         PR: 104914.
> > > >       * gcc.target/mips/pr104914.c: New testcase.
> > > > ---
> > > >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> > > >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> > > >  2 files changed, 34 insertions(+), 3 deletions(-)
> > > >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> > > >
> > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > index fbd4ce2d42f..5531c19e891 100644
> > > > --- a/gcc/expmed.cc
> > > > +++ b/gcc/expmed.cc
> > > > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > >       since that case is valid for any mode.  The following cases are only
> > > >       valid for integral modes.  */
> > > >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > > > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> > > >    scalar_int_mode imode;
> > > >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> > > >      {
> > > > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > >       op0 = gen_lowpart (op0_mode.require (), op0);
> > > >      }
> > > >
> > > > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > > > -                                bitregion_start, bitregion_end,
> > > > -                                fieldmode, value, reverse, fallback_p);
> > > > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > > > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > > > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > > > +     bad code will generate for sign-extension ports, like MIPS.  */
> > > > +  bool use_str_mode = false;
> > > > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > > > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > > > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > > > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > > > +    use_str_mode = true;
> > > > +
> > > > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > > > +                                use_str_mode ? str_mode : op0_mode,
> > > > +                                ibitsize, ibitnum, bitregion_start,
> > > > +                                bitregion_end, fieldmode, value,
> > > > +                                reverse, fallback_p);
> > > >  }
> > > >
> > > >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > > > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > new file mode 100644
> > > > index 00000000000..fd6ef6af446
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > @@ -0,0 +1,17 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > > > +
> > > > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > > > +
> > > > +NOMIPS16 int test (const unsigned char *buf)
> > > > +{
> > > > +  int val;
> > > > +  ((unsigned char*)&val)[0] = *buf++;
> > > > +  ((unsigned char*)&val)[1] = *buf++;
> > > > +  ((unsigned char*)&val)[2] = *buf++;
> > > > +  ((unsigned char*)&val)[3] = *buf++;
> > > > +  if(val > 0)
> > > > +    return 1;
> > > > +  else
> > > > +    return 0;
> > > > +}
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > HRB 36809 (AG Nuernberg)
> >
> >
> >
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)



-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  8:21       ` YunQiang Su
@ 2023-07-19  8:25         ` YunQiang Su
  2023-07-19  8:50           ` YunQiang Su
  2023-07-19  9:23         ` Richard Biener
  1 sibling, 1 reply; 19+ messages in thread
From: YunQiang Su @ 2023-07-19  8:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian

YunQiang Su <wzssyqa@gmail.com> 于2023年7月19日周三 16:21写道：
>
> Richard Biener <rguenther@suse.de> 于2023年7月19日周三 15:22写道：
> >
> > On Wed, 19 Jul 2023, YunQiang Su wrote:
> >
> > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> ?2023?7?19??? 14:27???
> > > >
> > > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > > >
> > > > > PR #104914
> > > > >
> > > > > When work with
> > > > >   int val;
> > > > >   ((unsigned char*)&val)[3] = *buf;
> > > > >   if (val > 0) ...
> > > > > The RTX mode is obtained from REG instead of SUBREG, which make
> > > > > D<INS> is used instead of <INS>.  Thus something wrong happens
> > > > > on sign-extend default architectures, like MIPS64.
> > > > >
> > > > > Let's use str_rtx and mode of str_rtx as the parameters for
> > > > > store_integral_bit_field if:
> > > > >   modes of op0 and str_rtx are INT;
> > > > >   length of op0 is greater than str_rtx.
> > > > >
> > > > > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > > > > mips64el-linux-gnuabi64 without regression.
> > > >
> > > > I still think you are "fixing" this in the wrong place.  The bugzilla
> > > > audit trail points to combine and later notes an eventual expansion
> > > > issue (but for another testcase/target).
> > > >
> > > > You have to explain in more detail on what is wrong with the initial
> > > > RTL on mips.
> > > >
> > >
> > > In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like
> > >
> > > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> > >             (const_int 8 [0x8])
> > >             (const_int 0 [0]))
> > >         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> > >      (nil))
> > >
> > > Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
> > > instructions.
> > > While in fact here, we expect an SImode operation, due to `val` in C
> > > code is `int`.
> > >
> > > With my patch, the RTX will be like:
> > >
> > > (insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
> > >             (const_int 8 [0x8])
> > >             (const_int 0 [0]))
> > >         (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
> > >      (nil))
> >
> > But if this RTL is correct then the above with DImode is correct as
> > well and the issue is in the backend definition of the instruction
> > defining 'DINS'?
> >
>
> I don't think so.
>
> (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
>                                                      ^^
>              (const_int 8 [0x8])
>              (const_int 0 [0]))
>          (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
>       (nil))
>
> This RTL has only info about DI. It doesn't has any info about the
> real length of
> `val`. For backend, it has no other choice instead of `DINS`.
>
> > > So the operation will be SImode, aka `INS` instruction for MIPS64.
> > >
> > > The problem is based on 2 fact/root cause:
> > > 1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
> > >     li $7, 0xff
> > >     li $8, 0
> > >     ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > The value of $8 will be 0xff ff ff ff ff 00 00 00.
> >
> > Bit that's wrong.  (set (zero_extract:SI ...) should not affect
> > bits outside of the indicated range.
> >
>
> In fact, it is how sign-extension arch work.
> No matter wrong or right, the ISA was/is defined like this.
>
> In fact, one MIPS 32 ABI, the same C code will generate the RTL like this,
> and the 32bit object can still workable on 64bit CPU.
> That's a smart (or brain-damaged) design.
>
> > @findex zero_extract
> > @item (zero_extract:@var{m} @var{loc} @var{size} @var{pos})
> > Like @code{sign_extract} but refers to an unsigned or zero-extended
> > bit-field.  The same sequence of bits are extracted, but they
> > are filled to an entire word with zeros instead of by sign-extension.
> >
>
> That's depending on the definition of `word` here.
> For `(zero_extract:SI`, I think that the word is limit to the low 32bit of
> hardware register.
> Anyway, it won't break ISA without sign-extension by default.
>
> Due to the nature of sign-extension ISA, if we don't sign-extension the
> `int` variable, it will make something wrong.
>
> To make it clear: the word `sign extension` here means:
>        the the value of 31bit will be copied to bits [32-63], and
>        the value of bits[0-30] won't be copied.
> Here is the examples:
>     li $7, 0xff
>     li $8, 0x00 00 ff 00
>     ins $8,$7,16,8
>                     ^^
> The value of $8 will be: 0x 00 00 00 00 00 ff ff 00
>
>     li $7, 0xff
>     li $8, 0x00 00 ff 00
>     ins $8,$7,24,8
>                     ^^
> The value of $8 will be: 0x ff ff ff ff ff 00 ff 00
>
> > Unlike @code{sign_extract}, this type of expressions can be lvalues
> > in RTL; they may appear on the left side of an assignment, indicating
> > insertion of a value into the specified bit-field.
> > @end table
> >
> >
> > >     li $7, 0xff
> > >     li $8, 0
> > >     dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > The value of $8 will be 0x 00 00 00 00 ff 00 00 00.
> >
> > which isn't correct either.
> >
>
> It is not correct or not-correct: The ISA manual just state like this,
> and the hardwares are working like this.
>
> > If you look a few dumps further you'll see which instruction was
> > recognized, I suspect the machine description is simply wrong here?
> >
>
> The design of initial RTL may has expect that the backend may expand
>
> (insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
>         (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0))) "xx.c":5:29 -1
>      (nil))
>
> to an `SLL` instruction, which can fix what `DINS` do, aka
>      0x 00 00 00 00 ff 00 00 00 ---> 0x ff ff ff ff ff 00 00 00
>
> I guess this is what you mean about the mistake of machine description.
> While MIPS md believes that it's sign-extension by default, so it is
> not needed at all.
>
> > > 2. Due to most of MIPS instructions work with 32bit value, aka instructions
> > > without `d` as its first char (in fact with few exception), are sign-extension,
> > > the MIPS backend just ignore `extendsidi2`, aka RTX
> > >
> > > (insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
> > >         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
> > >      (nil))
> > >
> > >
>
> This is just background info about MIPS:
>
> On a MIPS32 hardware, the value -1 is  0x ff ff ff ff, which is same
> with other arch.
> On a MIPS64 hardware, the value of (int32_t)-1 is
>      0x ff ff ff ff ff ff ff ff
> which is same with (int64_t)-1.
> So the single compare-and-branch instruction can work with both
> int32_t and int64_t.
>
> On none sign-extension arch, like ARM64, (int32_t)-1 is
>    0x 00 00 00 00 ff ff ff ff
> and (int64_t)-1 is
>    0x ff ff ff ff ff ff ff ff
> That's why the `CMP` instructions for X and W have different encoding:
> the 31bit of the encoding: `sf` bit.
>
>
> ================================
> For this problem, we have 2 choice to fix:
> 1. This patch
> 2.
>

================================
For this problem, we have 2 choice to fix:
1. This patch
2. make MIPS backend expand the bellow RTL to `SLL` always
(insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
        (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0)))
"xx.c":5:29 238 {extendsidi2}
     (nil))


>
> > >
> > > > Richard.
> > > >
> > > > > gcc/ChangeLog:
> > > > >         PR: 104914.
> > > > >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> > > > >       to store_integral_bit_field if the length of op0 is greater
> > > > >       than str_rtx.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >         PR: 104914.
> > > > >       * gcc.target/mips/pr104914.c: New testcase.
> > > > > ---
> > > > >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> > > > >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> > > > >  2 files changed, 34 insertions(+), 3 deletions(-)
> > > > >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> > > > >
> > > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > > index fbd4ce2d42f..5531c19e891 100644
> > > > > --- a/gcc/expmed.cc
> > > > > +++ b/gcc/expmed.cc
> > > > > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > >       since that case is valid for any mode.  The following cases are only
> > > > >       valid for integral modes.  */
> > > > >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > > > > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> > > > >    scalar_int_mode imode;
> > > > >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> > > > >      {
> > > > > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > >       op0 = gen_lowpart (op0_mode.require (), op0);
> > > > >      }
> > > > >
> > > > > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > > > > -                                bitregion_start, bitregion_end,
> > > > > -                                fieldmode, value, reverse, fallback_p);
> > > > > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > > > > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > > > > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > > > > +     bad code will generate for sign-extension ports, like MIPS.  */
> > > > > +  bool use_str_mode = false;
> > > > > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > > > > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > > > > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > > > > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > > > > +    use_str_mode = true;
> > > > > +
> > > > > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > > > > +                                use_str_mode ? str_mode : op0_mode,
> > > > > +                                ibitsize, ibitnum, bitregion_start,
> > > > > +                                bitregion_end, fieldmode, value,
> > > > > +                                reverse, fallback_p);
> > > > >  }
> > > > >
> > > > >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > > > > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > new file mode 100644
> > > > > index 00000000000..fd6ef6af446
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > @@ -0,0 +1,17 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > > > > +
> > > > > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > > > > +
> > > > > +NOMIPS16 int test (const unsigned char *buf)
> > > > > +{
> > > > > +  int val;
> > > > > +  ((unsigned char*)&val)[0] = *buf++;
> > > > > +  ((unsigned char*)&val)[1] = *buf++;
> > > > > +  ((unsigned char*)&val)[2] = *buf++;
> > > > > +  ((unsigned char*)&val)[3] = *buf++;
> > > > > +  if(val > 0)
> > > > > +    return 1;
> > > > > +  else
> > > > > +    return 0;
> > > > > +}
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > > HRB 36809 (AG Nuernberg)
> > >
> > >
> > >
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > HRB 36809 (AG Nuernberg)
>
>
>
> --
> YunQiang Su



-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  8:25         ` YunQiang Su
@ 2023-07-19  8:50           ` YunQiang Su
  0 siblings, 0 replies; 19+ messages in thread
From: YunQiang Su @ 2023-07-19  8:50 UTC (permalink / raw)
  To: Richard Biener; +Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian

I am not sure this patch is best, while I think that I am sure the
initial RTL is not correct,
the initial RTL of ARM64 is like

(insn 8 7 9 2 (set (zero_extract:SI (reg/v:SI 98 [ val ])
                                                  ^^
            (const_int 8 [0x8])
            (const_int 0 [0]))
        (reg:SI 102)) "xx.c":3:29 -1
     (nil))


YunQiang Su <wzssyqa@gmail.com> 于2023年7月19日周三 16:25写道：
>
> YunQiang Su <wzssyqa@gmail.com> 于2023年7月19日周三 16:21写道：
> >
> > Richard Biener <rguenther@suse.de> 于2023年7月19日周三 15:22写道：
> > >
> > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > >
> > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> ?2023?7?19??? 14:27???
> > > > >
> > > > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > > > >
> > > > > > PR #104914
> > > > > >
> > > > > > When work with
> > > > > >   int val;
> > > > > >   ((unsigned char*)&val)[3] = *buf;
> > > > > >   if (val > 0) ...
> > > > > > The RTX mode is obtained from REG instead of SUBREG, which make
> > > > > > D<INS> is used instead of <INS>.  Thus something wrong happens
> > > > > > on sign-extend default architectures, like MIPS64.
> > > > > >
> > > > > > Let's use str_rtx and mode of str_rtx as the parameters for
> > > > > > store_integral_bit_field if:
> > > > > >   modes of op0 and str_rtx are INT;
> > > > > >   length of op0 is greater than str_rtx.
> > > > > >
> > > > > > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > > > > > mips64el-linux-gnuabi64 without regression.
> > > > >
> > > > > I still think you are "fixing" this in the wrong place.  The bugzilla
> > > > > audit trail points to combine and later notes an eventual expansion
> > > > > issue (but for another testcase/target).
> > > > >
> > > > > You have to explain in more detail on what is wrong with the initial
> > > > > RTL on mips.
> > > > >
> > > >
> > > > In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like
> > > >
> > > > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> > > >             (const_int 8 [0x8])
> > > >             (const_int 0 [0]))
> > > >         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> > > >      (nil))
> > > >
> > > > Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
> > > > instructions.
> > > > While in fact here, we expect an SImode operation, due to `val` in C
> > > > code is `int`.
> > > >
> > > > With my patch, the RTX will be like:
> > > >
> > > > (insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
> > > >             (const_int 8 [0x8])
> > > >             (const_int 0 [0]))
> > > >         (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
> > > >      (nil))
> > >
> > > But if this RTL is correct then the above with DImode is correct as
> > > well and the issue is in the backend definition of the instruction
> > > defining 'DINS'?
> > >
> >
> > I don't think so.
> >
> > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> >                                                      ^^
> >              (const_int 8 [0x8])
> >              (const_int 0 [0]))
> >          (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> >       (nil))
> >
> > This RTL has only info about DI. It doesn't has any info about the
> > real length of
> > `val`. For backend, it has no other choice instead of `DINS`.
> >
> > > > So the operation will be SImode, aka `INS` instruction for MIPS64.
> > > >
> > > > The problem is based on 2 fact/root cause:
> > > > 1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
> > > >     li $7, 0xff
> > > >     li $8, 0
> > > >     ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > > The value of $8 will be 0xff ff ff ff ff 00 00 00.
> > >
> > > Bit that's wrong.  (set (zero_extract:SI ...) should not affect
> > > bits outside of the indicated range.
> > >
> >
> > In fact, it is how sign-extension arch work.
> > No matter wrong or right, the ISA was/is defined like this.
> >
> > In fact, one MIPS 32 ABI, the same C code will generate the RTL like this,
> > and the 32bit object can still workable on 64bit CPU.
> > That's a smart (or brain-damaged) design.
> >
> > > @findex zero_extract
> > > @item (zero_extract:@var{m} @var{loc} @var{size} @var{pos})
> > > Like @code{sign_extract} but refers to an unsigned or zero-extended
> > > bit-field.  The same sequence of bits are extracted, but they
> > > are filled to an entire word with zeros instead of by sign-extension.
> > >
> >
> > That's depending on the definition of `word` here.
> > For `(zero_extract:SI`, I think that the word is limit to the low 32bit of
> > hardware register.
> > Anyway, it won't break ISA without sign-extension by default.
> >
> > Due to the nature of sign-extension ISA, if we don't sign-extension the
> > `int` variable, it will make something wrong.
> >
> > To make it clear: the word `sign extension` here means:
> >        the the value of 31bit will be copied to bits [32-63], and
> >        the value of bits[0-30] won't be copied.
> > Here is the examples:
> >     li $7, 0xff
> >     li $8, 0x00 00 ff 00
> >     ins $8,$7,16,8
> >                     ^^
> > The value of $8 will be: 0x 00 00 00 00 00 ff ff 00
> >
> >     li $7, 0xff
> >     li $8, 0x00 00 ff 00
> >     ins $8,$7,24,8
> >                     ^^
> > The value of $8 will be: 0x ff ff ff ff ff 00 ff 00
> >
> > > Unlike @code{sign_extract}, this type of expressions can be lvalues
> > > in RTL; they may appear on the left side of an assignment, indicating
> > > insertion of a value into the specified bit-field.
> > > @end table
> > >
> > >
> > > >     li $7, 0xff
> > > >     li $8, 0
> > > >     dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > > The value of $8 will be 0x 00 00 00 00 ff 00 00 00.
> > >
> > > which isn't correct either.
> > >
> >
> > It is not correct or not-correct: The ISA manual just state like this,
> > and the hardwares are working like this.
> >
> > > If you look a few dumps further you'll see which instruction was
> > > recognized, I suspect the machine description is simply wrong here?
> > >
> >
> > The design of initial RTL may has expect that the backend may expand
> >
> > (insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
> >         (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0))) "xx.c":5:29 -1
> >      (nil))
> >
> > to an `SLL` instruction, which can fix what `DINS` do, aka
> >      0x 00 00 00 00 ff 00 00 00 ---> 0x ff ff ff ff ff 00 00 00
> >
> > I guess this is what you mean about the mistake of machine description.
> > While MIPS md believes that it's sign-extension by default, so it is
> > not needed at all.
> >
> > > > 2. Due to most of MIPS instructions work with 32bit value, aka instructions
> > > > without `d` as its first char (in fact with few exception), are sign-extension,
> > > > the MIPS backend just ignore `extendsidi2`, aka RTX
> > > >
> > > > (insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
> > > >         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
> > > >      (nil))
> > > >
> > > >
> >
> > This is just background info about MIPS:
> >
> > On a MIPS32 hardware, the value -1 is  0x ff ff ff ff, which is same
> > with other arch.
> > On a MIPS64 hardware, the value of (int32_t)-1 is
> >      0x ff ff ff ff ff ff ff ff
> > which is same with (int64_t)-1.
> > So the single compare-and-branch instruction can work with both
> > int32_t and int64_t.
> >
> > On none sign-extension arch, like ARM64, (int32_t)-1 is
> >    0x 00 00 00 00 ff ff ff ff
> > and (int64_t)-1 is
> >    0x ff ff ff ff ff ff ff ff
> > That's why the `CMP` instructions for X and W have different encoding:
> > the 31bit of the encoding: `sf` bit.
> >
> >
> > ================================
> > For this problem, we have 2 choice to fix:
> > 1. This patch
> > 2.
> >
>
> ================================
> For this problem, we have 2 choice to fix:
> 1. This patch
> 2. make MIPS backend expand the bellow RTL to `SLL` always
> (insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
>         (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0)))
> "xx.c":5:29 238 {extendsidi2}
>      (nil))
>
>
> >
> > > >
> > > > > Richard.
> > > > >
> > > > > > gcc/ChangeLog:
> > > > > >         PR: 104914.
> > > > > >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> > > > > >       to store_integral_bit_field if the length of op0 is greater
> > > > > >       than str_rtx.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >         PR: 104914.
> > > > > >       * gcc.target/mips/pr104914.c: New testcase.
> > > > > > ---
> > > > > >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> > > > > >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> > > > > >  2 files changed, 34 insertions(+), 3 deletions(-)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > >
> > > > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > > > index fbd4ce2d42f..5531c19e891 100644
> > > > > > --- a/gcc/expmed.cc
> > > > > > +++ b/gcc/expmed.cc
> > > > > > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > >       since that case is valid for any mode.  The following cases are only
> > > > > >       valid for integral modes.  */
> > > > > >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > > > > > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> > > > > >    scalar_int_mode imode;
> > > > > >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> > > > > >      {
> > > > > > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > >       op0 = gen_lowpart (op0_mode.require (), op0);
> > > > > >      }
> > > > > >
> > > > > > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > > > > > -                                bitregion_start, bitregion_end,
> > > > > > -                                fieldmode, value, reverse, fallback_p);
> > > > > > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > > > > > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > > > > > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > > > > > +     bad code will generate for sign-extension ports, like MIPS.  */
> > > > > > +  bool use_str_mode = false;
> > > > > > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > > > > > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > > > > > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > > > > > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > > > > > +    use_str_mode = true;
> > > > > > +
> > > > > > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > > > > > +                                use_str_mode ? str_mode : op0_mode,
> > > > > > +                                ibitsize, ibitnum, bitregion_start,
> > > > > > +                                bitregion_end, fieldmode, value,
> > > > > > +                                reverse, fallback_p);
> > > > > >  }
> > > > > >
> > > > > >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > > > > > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > > new file mode 100644
> > > > > > index 00000000000..fd6ef6af446
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > > @@ -0,0 +1,17 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > > > > > +
> > > > > > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > > > > > +
> > > > > > +NOMIPS16 int test (const unsigned char *buf)
> > > > > > +{
> > > > > > +  int val;
> > > > > > +  ((unsigned char*)&val)[0] = *buf++;
> > > > > > +  ((unsigned char*)&val)[1] = *buf++;
> > > > > > +  ((unsigned char*)&val)[2] = *buf++;
> > > > > > +  ((unsigned char*)&val)[3] = *buf++;
> > > > > > +  if(val > 0)
> > > > > > +    return 1;
> > > > > > +  else
> > > > > > +    return 0;
> > > > > > +}
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > > > HRB 36809 (AG Nuernberg)
> > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > HRB 36809 (AG Nuernberg)
> >
> >
> >
> > --
> > YunQiang Su
>
>
>
> --
> YunQiang Su



-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  8:21       ` YunQiang Su
  2023-07-19  8:25         ` YunQiang Su
@ 2023-07-19  9:23         ` Richard Biener
  2023-07-19  9:27           ` Richard Biener
                             ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Richard Biener @ 2023-07-19  9:23 UTC (permalink / raw)
  To: YunQiang Su; +Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian

On Wed, 19 Jul 2023, YunQiang Su wrote:

> Richard Biener <rguenther@suse.de> ?2023?7?19??? 15:22???
> >
> > On Wed, 19 Jul 2023, YunQiang Su wrote:
> >
> > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> ?2023?7?19??? 14:27???
> > > >
> > > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > > >
> > > > > PR #104914
> > > > >
> > > > > When work with
> > > > >   int val;
> > > > >   ((unsigned char*)&val)[3] = *buf;
> > > > >   if (val > 0) ...
> > > > > The RTX mode is obtained from REG instead of SUBREG, which make
> > > > > D<INS> is used instead of <INS>.  Thus something wrong happens
> > > > > on sign-extend default architectures, like MIPS64.
> > > > >
> > > > > Let's use str_rtx and mode of str_rtx as the parameters for
> > > > > store_integral_bit_field if:
> > > > >   modes of op0 and str_rtx are INT;
> > > > >   length of op0 is greater than str_rtx.
> > > > >
> > > > > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > > > > mips64el-linux-gnuabi64 without regression.
> > > >
> > > > I still think you are "fixing" this in the wrong place.  The bugzilla
> > > > audit trail points to combine and later notes an eventual expansion
> > > > issue (but for another testcase/target).
> > > >
> > > > You have to explain in more detail on what is wrong with the initial
> > > > RTL on mips.
> > > >
> > >
> > > In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like
> > >
> > > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> > >             (const_int 8 [0x8])
> > >             (const_int 0 [0]))
> > >         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> > >      (nil))
> > >
> > > Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
> > > instructions.
> > > While in fact here, we expect an SImode operation, due to `val` in C
> > > code is `int`.
> > >
> > > With my patch, the RTX will be like:
> > >
> > > (insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
> > >             (const_int 8 [0x8])
> > >             (const_int 0 [0]))
> > >         (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
> > >      (nil))
> >
> > But if this RTL is correct then the above with DImode is correct as
> > well and the issue is in the backend definition of the instruction
> > defining 'DINS'?
> >
> 
> I don't think so.
> 
> (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
>                                                      ^^
>              (const_int 8 [0x8])
>              (const_int 0 [0]))
>          (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
>       (nil))
> 
> This RTL has only info about DI. It doesn't has any info about the
> real length of
> `val`. For backend, it has no other choice instead of `DINS`.
> 
> > > So the operation will be SImode, aka `INS` instruction for MIPS64.
> > >
> > > The problem is based on 2 fact/root cause:
> > > 1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
> > >     li $7, 0xff
> > >     li $8, 0
> > >     ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > The value of $8 will be 0xff ff ff ff ff 00 00 00.
> >
> > Bit that's wrong.  (set (zero_extract:SI ...) should not affect
> > bits outside of the indicated range.
> >
> 
> In fact, it is how sign-extension arch work.
> No matter wrong or right, the ISA was/is defined like this.
> 
> In fact, one MIPS 32 ABI, the same C code will generate the RTL like this,
> and the 32bit object can still workable on 64bit CPU.
> That's a smart (or brain-damaged) design.
> 
> > @findex zero_extract
> > @item (zero_extract:@var{m} @var{loc} @var{size} @var{pos})
> > Like @code{sign_extract} but refers to an unsigned or zero-extended
> > bit-field.  The same sequence of bits are extracted, but they
> > are filled to an entire word with zeros instead of by sign-extension.
> >
> 
> That's depending on the definition of `word` here.
> For `(zero_extract:SI`, I think that the word is limit to the low 32bit of
> hardware register.
> Anyway, it won't break ISA without sign-extension by default.
> 
> Due to the nature of sign-extension ISA, if we don't sign-extension the
> `int` variable, it will make something wrong.
> 
> To make it clear: the word `sign extension` here means:
>        the the value of 31bit will be copied to bits [32-63], and
>        the value of bits[0-30] won't be copied.
> Here is the examples:
>     li $7, 0xff
>     li $8, 0x00 00 ff 00
>     ins $8,$7,16,8
>                     ^^
> The value of $8 will be: 0x 00 00 00 00 00 ff ff 00
> 
>     li $7, 0xff
>     li $8, 0x00 00 ff 00
>     ins $8,$7,24,8
>                     ^^
> The value of $8 will be: 0x ff ff ff ff ff 00 ff 00

But that's INS.

> > Unlike @code{sign_extract}, this type of expressions can be lvalues
> > in RTL; they may appear on the left side of an assignment, indicating
> > insertion of a value into the specified bit-field.
> > @end table

Note ^^^ applies for (zero_extract ..) as a SET destination.  The
issue is probably that MIPS is WORD_REGISTER_OPERATIONS which has
proved "interesting" (and under-documented).  I suppose word_mode
is DImode which is likely why the bitfield operations work on it.

I see theres DINS, DINSM and DINSU in some mips ISA document
I found on the internet but I didn't see where that would
cause sign or zero extension off bit 31.

> >
> > >     li $7, 0xff
> > >     li $8, 0
> > >     dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > The value of $8 will be 0x 00 00 00 00 ff 00 00 00.
> >
> > which isn't correct either.
> >
>
> It is not correct or not-correct: The ISA manual just state like this,
> and the hardwares are working like this.

I don't see that.  That's definitely not what GCC expects here,
the left-most word of the doubleword should be unchanged.

Your testcase should be a dg-do-run and probably more like

NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
{
  int val;
  ((unsigned char*)&val)[0] = *buf++;
  ((unsigned char*)&val)[1] = *buf++;
  ((unsigned char*)&val)[2] = *buf++;
  ((unsigned char*)&val)[3] = *buf++;
  return val;
}
int main()
{
  int val = 0x01020304;
  val = test (&val);
  if (val != 0x01020304)
    abort ();
}

not sure if I got endianess correct.  Now, the question is what
WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
the MIPS ABI says for returning SImode.

Others might be able to answer this.

> > If you look a few dumps further you'll see which instruction was
> > recognized, I suspect the machine description is simply wrong here?
> >
> 
> The design of initial RTL may has expect that the backend may expand
> 
> (insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
>         (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0))) "xx.c":5:29 -1
>      (nil))
> 
> to an `SLL` instruction, which can fix what `DINS` do, aka
>      0x 00 00 00 00 ff 00 00 00 ---> 0x ff ff ff ff ff 00 00 00
> 
> I guess this is what you mean about the mistake of machine description.
> While MIPS md believes that it's sign-extension by default, so it is
> not needed at all.
> 
> > > 2. Due to most of MIPS instructions work with 32bit value, aka instructions
> > > without `d` as its first char (in fact with few exception), are sign-extension,
> > > the MIPS backend just ignore `extendsidi2`, aka RTX
> > >
> > > (insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
> > >         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
> > >      (nil))
> > >
> > >
> 
> This is just background info about MIPS:
> 
> On a MIPS32 hardware, the value -1 is  0x ff ff ff ff, which is same
> with other arch.
> On a MIPS64 hardware, the value of (int32_t)-1 is
>      0x ff ff ff ff ff ff ff ff
> which is same with (int64_t)-1.
> So the single compare-and-branch instruction can work with both
> int32_t and int64_t.
> 
> On none sign-extension arch, like ARM64, (int32_t)-1 is
>    0x 00 00 00 00 ff ff ff ff
> and (int64_t)-1 is
>    0x ff ff ff ff ff ff ff ff
> That's why the `CMP` instructions for X and W have different encoding:
> the 31bit of the encoding: `sf` bit.
> 
> 
> ================================
> For this problem, we have 2 choice to fix:
> 1. This patch
> 2.

2. disallow INS affecting bit 31 of a double-word - there's
mips_use_ins_ext_p that could be used (maybe also pass in whether it's
an INS).

Richard.

> 
> > >
> > > > Richard.
> > > >
> > > > > gcc/ChangeLog:
> > > > >         PR: 104914.
> > > > >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> > > > >       to store_integral_bit_field if the length of op0 is greater
> > > > >       than str_rtx.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >         PR: 104914.
> > > > >       * gcc.target/mips/pr104914.c: New testcase.
> > > > > ---
> > > > >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> > > > >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> > > > >  2 files changed, 34 insertions(+), 3 deletions(-)
> > > > >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> > > > >
> > > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > > index fbd4ce2d42f..5531c19e891 100644
> > > > > --- a/gcc/expmed.cc
> > > > > +++ b/gcc/expmed.cc
> > > > > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > >       since that case is valid for any mode.  The following cases are only
> > > > >       valid for integral modes.  */
> > > > >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > > > > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> > > > >    scalar_int_mode imode;
> > > > >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> > > > >      {
> > > > > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > >       op0 = gen_lowpart (op0_mode.require (), op0);
> > > > >      }
> > > > >
> > > > > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > > > > -                                bitregion_start, bitregion_end,
> > > > > -                                fieldmode, value, reverse, fallback_p);
> > > > > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > > > > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > > > > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > > > > +     bad code will generate for sign-extension ports, like MIPS.  */
> > > > > +  bool use_str_mode = false;
> > > > > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > > > > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > > > > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > > > > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > > > > +    use_str_mode = true;
> > > > > +
> > > > > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > > > > +                                use_str_mode ? str_mode : op0_mode,
> > > > > +                                ibitsize, ibitnum, bitregion_start,
> > > > > +                                bitregion_end, fieldmode, value,
> > > > > +                                reverse, fallback_p);
> > > > >  }
> > > > >
> > > > >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > > > > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > new file mode 100644
> > > > > index 00000000000..fd6ef6af446
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > @@ -0,0 +1,17 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > > > > +
> > > > > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > > > > +
> > > > > +NOMIPS16 int test (const unsigned char *buf)
> > > > > +{
> > > > > +  int val;
> > > > > +  ((unsigned char*)&val)[0] = *buf++;
> > > > > +  ((unsigned char*)&val)[1] = *buf++;
> > > > > +  ((unsigned char*)&val)[2] = *buf++;
> > > > > +  ((unsigned char*)&val)[3] = *buf++;
> > > > > +  if(val > 0)
> > > > > +    return 1;
> > > > > +  else
> > > > > +    return 0;
> > > > > +}
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > > HRB 36809 (AG Nuernberg)
> > >
> > >
> > >
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > HRB 36809 (AG Nuernberg)
> 
> 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  9:23         ` Richard Biener
@ 2023-07-19  9:27           ` Richard Biener
  2023-07-19  9:43           ` YunQiang Su
  2023-07-19  9:45           ` Eric Botcazou
  2 siblings, 0 replies; 19+ messages in thread
From: Richard Biener @ 2023-07-19  9:27 UTC (permalink / raw)
  To: YunQiang Su
  Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian, ebotcazou

On Wed, 19 Jul 2023, Richard Biener wrote:

> On Wed, 19 Jul 2023, YunQiang Su wrote:
> 
> > Richard Biener <rguenther@suse.de> ?2023?7?19??? 15:22???
> > >
> > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > >
> > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> ?2023?7?19??? 14:27???
> > > > >
> > > > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > > > >
> > > > > > PR #104914
> > > > > >
> > > > > > When work with
> > > > > >   int val;
> > > > > >   ((unsigned char*)&val)[3] = *buf;
> > > > > >   if (val > 0) ...
> > > > > > The RTX mode is obtained from REG instead of SUBREG, which make
> > > > > > D<INS> is used instead of <INS>.  Thus something wrong happens
> > > > > > on sign-extend default architectures, like MIPS64.
> > > > > >
> > > > > > Let's use str_rtx and mode of str_rtx as the parameters for
> > > > > > store_integral_bit_field if:
> > > > > >   modes of op0 and str_rtx are INT;
> > > > > >   length of op0 is greater than str_rtx.
> > > > > >
> > > > > > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > > > > > mips64el-linux-gnuabi64 without regression.
> > > > >
> > > > > I still think you are "fixing" this in the wrong place.  The bugzilla
> > > > > audit trail points to combine and later notes an eventual expansion
> > > > > issue (but for another testcase/target).
> > > > >
> > > > > You have to explain in more detail on what is wrong with the initial
> > > > > RTL on mips.
> > > > >
> > > >
> > > > In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like
> > > >
> > > > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> > > >             (const_int 8 [0x8])
> > > >             (const_int 0 [0]))
> > > >         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> > > >      (nil))
> > > >
> > > > Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
> > > > instructions.
> > > > While in fact here, we expect an SImode operation, due to `val` in C
> > > > code is `int`.
> > > >
> > > > With my patch, the RTX will be like:
> > > >
> > > > (insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
> > > >             (const_int 8 [0x8])
> > > >             (const_int 0 [0]))
> > > >         (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
> > > >      (nil))
> > >
> > > But if this RTL is correct then the above with DImode is correct as
> > > well and the issue is in the backend definition of the instruction
> > > defining 'DINS'?
> > >
> > 
> > I don't think so.
> > 
> > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> >                                                      ^^
> >              (const_int 8 [0x8])
> >              (const_int 0 [0]))
> >          (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> >       (nil))
> > 
> > This RTL has only info about DI. It doesn't has any info about the
> > real length of
> > `val`. For backend, it has no other choice instead of `DINS`.
> > 
> > > > So the operation will be SImode, aka `INS` instruction for MIPS64.
> > > >
> > > > The problem is based on 2 fact/root cause:
> > > > 1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
> > > >     li $7, 0xff
> > > >     li $8, 0
> > > >     ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > > The value of $8 will be 0xff ff ff ff ff 00 00 00.
> > >
> > > Bit that's wrong.  (set (zero_extract:SI ...) should not affect
> > > bits outside of the indicated range.
> > >
> > 
> > In fact, it is how sign-extension arch work.
> > No matter wrong or right, the ISA was/is defined like this.
> > 
> > In fact, one MIPS 32 ABI, the same C code will generate the RTL like this,
> > and the 32bit object can still workable on 64bit CPU.
> > That's a smart (or brain-damaged) design.
> > 
> > > @findex zero_extract
> > > @item (zero_extract:@var{m} @var{loc} @var{size} @var{pos})
> > > Like @code{sign_extract} but refers to an unsigned or zero-extended
> > > bit-field.  The same sequence of bits are extracted, but they
> > > are filled to an entire word with zeros instead of by sign-extension.
> > >
> > 
> > That's depending on the definition of `word` here.
> > For `(zero_extract:SI`, I think that the word is limit to the low 32bit of
> > hardware register.
> > Anyway, it won't break ISA without sign-extension by default.
> > 
> > Due to the nature of sign-extension ISA, if we don't sign-extension the
> > `int` variable, it will make something wrong.
> > 
> > To make it clear: the word `sign extension` here means:
> >        the the value of 31bit will be copied to bits [32-63], and
> >        the value of bits[0-30] won't be copied.
> > Here is the examples:
> >     li $7, 0xff
> >     li $8, 0x00 00 ff 00
> >     ins $8,$7,16,8
> >                     ^^
> > The value of $8 will be: 0x 00 00 00 00 00 ff ff 00
> > 
> >     li $7, 0xff
> >     li $8, 0x00 00 ff 00
> >     ins $8,$7,24,8
> >                     ^^
> > The value of $8 will be: 0x ff ff ff ff ff 00 ff 00
> 
> But that's INS.
> 
> > > Unlike @code{sign_extract}, this type of expressions can be lvalues
> > > in RTL; they may appear on the left side of an assignment, indicating
> > > insertion of a value into the specified bit-field.
> > > @end table
> 
> Note ^^^ applies for (zero_extract ..) as a SET destination.  The
> issue is probably that MIPS is WORD_REGISTER_OPERATIONS which has
> proved "interesting" (and under-documented).  I suppose word_mode
> is DImode which is likely why the bitfield operations work on it.
> 
> I see theres DINS, DINSM and DINSU in some mips ISA document
> I found on the internet but I didn't see where that would
> cause sign or zero extension off bit 31.
> 
> > >
> > > >     li $7, 0xff
> > > >     li $8, 0
> > > >     dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > > The value of $8 will be 0x 00 00 00 00 ff 00 00 00.
> > >
> > > which isn't correct either.
> > >
> >
> > It is not correct or not-correct: The ISA manual just state like this,
> > and the hardwares are working like this.
> 
> I don't see that.  That's definitely not what GCC expects here,
> the left-most word of the doubleword should be unchanged.
> 
> Your testcase should be a dg-do-run and probably more like
> 
> NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
> {
>   int val;
>   ((unsigned char*)&val)[0] = *buf++;
>   ((unsigned char*)&val)[1] = *buf++;
>   ((unsigned char*)&val)[2] = *buf++;
>   ((unsigned char*)&val)[3] = *buf++;
>   return val;
> }
> int main()
> {
>   int val = 0x01020304;
>   val = test (&val);
>   if (val != 0x01020304)
>     abort ();
> }
> 
> not sure if I got endianess correct.  Now, the question is what
> WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
> the MIPS ABI says for returning SImode.
> 
> Others might be able to answer this.

CCing Eric who's the only one to know given SPARC is 
WORD_REGISTER_OPERATIONS as well.

Richard.

> > > If you look a few dumps further you'll see which instruction was
> > > recognized, I suspect the machine description is simply wrong here?
> > >
> > 
> > The design of initial RTL may has expect that the backend may expand
> > 
> > (insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
> >         (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0))) "xx.c":5:29 -1
> >      (nil))
> > 
> > to an `SLL` instruction, which can fix what `DINS` do, aka
> >      0x 00 00 00 00 ff 00 00 00 ---> 0x ff ff ff ff ff 00 00 00
> > 
> > I guess this is what you mean about the mistake of machine description.
> > While MIPS md believes that it's sign-extension by default, so it is
> > not needed at all.
> > 
> > > > 2. Due to most of MIPS instructions work with 32bit value, aka instructions
> > > > without `d` as its first char (in fact with few exception), are sign-extension,
> > > > the MIPS backend just ignore `extendsidi2`, aka RTX
> > > >
> > > > (insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
> > > >         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
> > > >      (nil))
> > > >
> > > >
> > 
> > This is just background info about MIPS:
> > 
> > On a MIPS32 hardware, the value -1 is  0x ff ff ff ff, which is same
> > with other arch.
> > On a MIPS64 hardware, the value of (int32_t)-1 is
> >      0x ff ff ff ff ff ff ff ff
> > which is same with (int64_t)-1.
> > So the single compare-and-branch instruction can work with both
> > int32_t and int64_t.
> > 
> > On none sign-extension arch, like ARM64, (int32_t)-1 is
> >    0x 00 00 00 00 ff ff ff ff
> > and (int64_t)-1 is
> >    0x ff ff ff ff ff ff ff ff
> > That's why the `CMP` instructions for X and W have different encoding:
> > the 31bit of the encoding: `sf` bit.
> > 
> > 
> > ================================
> > For this problem, we have 2 choice to fix:
> > 1. This patch
> > 2.
> 
> 2. disallow INS affecting bit 31 of a double-word - there's
> mips_use_ins_ext_p that could be used (maybe also pass in whether it's
> an INS).
> 
> Richard.
> 
> > 
> > > >
> > > > > Richard.
> > > > >
> > > > > > gcc/ChangeLog:
> > > > > >         PR: 104914.
> > > > > >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> > > > > >       to store_integral_bit_field if the length of op0 is greater
> > > > > >       than str_rtx.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >         PR: 104914.
> > > > > >       * gcc.target/mips/pr104914.c: New testcase.
> > > > > > ---
> > > > > >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> > > > > >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> > > > > >  2 files changed, 34 insertions(+), 3 deletions(-)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > >
> > > > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > > > index fbd4ce2d42f..5531c19e891 100644
> > > > > > --- a/gcc/expmed.cc
> > > > > > +++ b/gcc/expmed.cc
> > > > > > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > >       since that case is valid for any mode.  The following cases are only
> > > > > >       valid for integral modes.  */
> > > > > >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > > > > > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> > > > > >    scalar_int_mode imode;
> > > > > >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> > > > > >      {
> > > > > > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > >       op0 = gen_lowpart (op0_mode.require (), op0);
> > > > > >      }
> > > > > >
> > > > > > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > > > > > -                                bitregion_start, bitregion_end,
> > > > > > -                                fieldmode, value, reverse, fallback_p);
> > > > > > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > > > > > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > > > > > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > > > > > +     bad code will generate for sign-extension ports, like MIPS.  */
> > > > > > +  bool use_str_mode = false;
> > > > > > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > > > > > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > > > > > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > > > > > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > > > > > +    use_str_mode = true;
> > > > > > +
> > > > > > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > > > > > +                                use_str_mode ? str_mode : op0_mode,
> > > > > > +                                ibitsize, ibitnum, bitregion_start,
> > > > > > +                                bitregion_end, fieldmode, value,
> > > > > > +                                reverse, fallback_p);
> > > > > >  }
> > > > > >
> > > > > >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > > > > > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > > new file mode 100644
> > > > > > index 00000000000..fd6ef6af446
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > > @@ -0,0 +1,17 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > > > > > +
> > > > > > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > > > > > +
> > > > > > +NOMIPS16 int test (const unsigned char *buf)
> > > > > > +{
> > > > > > +  int val;
> > > > > > +  ((unsigned char*)&val)[0] = *buf++;
> > > > > > +  ((unsigned char*)&val)[1] = *buf++;
> > > > > > +  ((unsigned char*)&val)[2] = *buf++;
> > > > > > +  ((unsigned char*)&val)[3] = *buf++;
> > > > > > +  if(val > 0)
> > > > > > +    return 1;
> > > > > > +  else
> > > > > > +    return 0;
> > > > > > +}
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > > > HRB 36809 (AG Nuernberg)
> > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > HRB 36809 (AG Nuernberg)
> > 
> > 
> > 
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  9:23         ` Richard Biener
  2023-07-19  9:27           ` Richard Biener
@ 2023-07-19  9:43           ` YunQiang Su
  2023-07-19  9:45           ` Eric Botcazou
  2 siblings, 0 replies; 19+ messages in thread
From: YunQiang Su @ 2023-07-19  9:43 UTC (permalink / raw)
  To: Richard Biener; +Cc: YunQiang Su, gcc-patches, pinskia, jeffreyalaw, ian

Richard Biener <rguenther@suse.de> 于2023年7月19日周三 17:23写道：
>
> On Wed, 19 Jul 2023, YunQiang Su wrote:
>
> > Richard Biener <rguenther@suse.de> ?2023?7?19??? 15:22???
> > >
> > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > >
> > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> ?2023?7?19??? 14:27???
> > > > >
> > > > > On Wed, 19 Jul 2023, YunQiang Su wrote:
> > > > >
> > > > > > PR #104914
> > > > > >
> > > > > > When work with
> > > > > >   int val;
> > > > > >   ((unsigned char*)&val)[3] = *buf;
> > > > > >   if (val > 0) ...
> > > > > > The RTX mode is obtained from REG instead of SUBREG, which make
> > > > > > D<INS> is used instead of <INS>.  Thus something wrong happens
> > > > > > on sign-extend default architectures, like MIPS64.
> > > > > >
> > > > > > Let's use str_rtx and mode of str_rtx as the parameters for
> > > > > > store_integral_bit_field if:
> > > > > >   modes of op0 and str_rtx are INT;
> > > > > >   length of op0 is greater than str_rtx.
> > > > > >
> > > > > > This patch has been tested on aarch64-linux-gnu, x86_64-linux-gnu,
> > > > > > mips64el-linux-gnuabi64 without regression.
> > > > >
> > > > > I still think you are "fixing" this in the wrong place.  The bugzilla
> > > > > audit trail points to combine and later notes an eventual expansion
> > > > > issue (but for another testcase/target).
> > > > >
> > > > > You have to explain in more detail on what is wrong with the initial
> > > > > RTL on mips.
> > > > >
> > > >
> > > > In the first RTL file, aka xx.c.256r.expand, the zero_extract RTX is like
> > > >
> > > > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> > > >             (const_int 8 [0x8])
> > > >             (const_int 0 [0]))
> > > >         (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> > > >      (nil))
> > > >
> > > > Not, all of the REG are in DImode. On MIPS64, it will expand to `DINS`
> > > > instructions.
> > > > While in fact here, we expect an SImode operation, due to `val` in C
> > > > code is `int`.
> > > >
> > > > With my patch, the RTX will be like:
> > > >
> > > > (insn 10 9 11 2 (set (zero_extract:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
> > > >             (const_int 8 [0x8])
> > > >             (const_int 0 [0]))
> > > >         (subreg:SI (reg:QI 202) 0)) "xx.c":4:29 -1
> > > >      (nil))
> > >
> > > But if this RTL is correct then the above with DImode is correct as
> > > well and the issue is in the backend definition of the instruction
> > > defining 'DINS'?
> > >
> >
> > I don't think so.
> >
> > (insn 10 9 11 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
> >                                                      ^^
> >              (const_int 8 [0x8])
> >              (const_int 0 [0]))
> >          (subreg:DI (reg:QI 202) 0)) "../xx.c":4:29 -1
> >       (nil))
> >
> > This RTL has only info about DI. It doesn't has any info about the
> > real length of
> > `val`. For backend, it has no other choice instead of `DINS`.
> >
> > > > So the operation will be SImode, aka `INS` instruction for MIPS64.
> > > >
> > > > The problem is based on 2 fact/root cause:
> > > > 1. MIPS's `INS` instruction will be always to sign-extension, while `DINS` won't
> > > >     li $7, 0xff
> > > >     li $8, 0
> > > >     ins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > > The value of $8 will be 0xff ff ff ff ff 00 00 00.
> > >
> > > Bit that's wrong.  (set (zero_extract:SI ...) should not affect
> > > bits outside of the indicated range.
> > >
> >
> > In fact, it is how sign-extension arch work.
> > No matter wrong or right, the ISA was/is defined like this.
> >
> > In fact, one MIPS 32 ABI, the same C code will generate the RTL like this,
> > and the 32bit object can still workable on 64bit CPU.
> > That's a smart (or brain-damaged) design.
> >
> > > @findex zero_extract
> > > @item (zero_extract:@var{m} @var{loc} @var{size} @var{pos})
> > > Like @code{sign_extract} but refers to an unsigned or zero-extended
> > > bit-field.  The same sequence of bits are extracted, but they
> > > are filled to an entire word with zeros instead of by sign-extension.
> > >
> >
> > That's depending on the definition of `word` here.
> > For `(zero_extract:SI`, I think that the word is limit to the low 32bit of
> > hardware register.
> > Anyway, it won't break ISA without sign-extension by default.
> >
> > Due to the nature of sign-extension ISA, if we don't sign-extension the
> > `int` variable, it will make something wrong.
> >
> > To make it clear: the word `sign extension` here means:
> >        the the value of 31bit will be copied to bits [32-63], and
> >        the value of bits[0-30] won't be copied.
> > Here is the examples:
> >     li $7, 0xff
> >     li $8, 0x00 00 ff 00
> >     ins $8,$7,16,8
> >                     ^^
> > The value of $8 will be: 0x 00 00 00 00 00 ff ff 00
> >
> >     li $7, 0xff
> >     li $8, 0x00 00 ff 00
> >     ins $8,$7,24,8
> >                     ^^
> > The value of $8 will be: 0x ff ff ff ff ff 00 ff 00
>
> But that's INS.
>
> > > Unlike @code{sign_extract}, this type of expressions can be lvalues
> > > in RTL; they may appear on the left side of an assignment, indicating
> > > insertion of a value into the specified bit-field.
> > > @end table
>
> Note ^^^ applies for (zero_extract ..) as a SET destination.  The
> issue is probably that MIPS is WORD_REGISTER_OPERATIONS which has
> proved "interesting" (and under-documented).  I suppose word_mode
> is DImode which is likely why the bitfield operations work on it.
>
> I see theres DINS, DINSM and DINSU in some mips ISA document
> I found on the internet but I didn't see where that would
> cause sign or zero extension off bit 31.
>

Yes. DINS/DINSM/DINSU is not sign-extension, as they worked on 64bit values.
If MIPS is extend to an 128bit arch, they will be sign-extension to 128bits ;)

INS is sign-extension. See
https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00087-2B-MIPS64BIS-AFP-6.06.pdf
Page 267,
    GPR[rt] <- sign_extend(GPR[rt]31..msb+1 || GPR[rs]msb-lsb..0 ||
GPR[rt]lsb-1..0)
For the definition about `sign_extend`,  please see Page 38.

> > >
> > > >     li $7, 0xff
> > > >     li $8, 0
> > > >     dins $8,$7,24,8  # set the 24-32 bits of $8 to 0xff.
> > > > The value of $8 will be 0x 00 00 00 00 ff 00 00 00.
> > >
> > > which isn't correct either.
> > >
> >
> > It is not correct or not-correct: The ISA manual just state like this,
> > and the hardwares are working like this.
>
> I don't see that.  That's definitely not what GCC expects here,
> the left-most word of the doubleword should be unchanged.
>

See the above documents, you can note that almost all instructions has
`sign_extend `.
Note, please read the MIPS64 documents instead of MIPS32 ones,
as MIPS32 assumes the the GPR is 32bit, so there is no `sign_extend`.

https://www.mips.com/products/architectures/mips64/

> Your testcase should be a dg-do-run and probably more like
>
> NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
> {
>   int val;
>   ((unsigned char*)&val)[0] = *buf++;
>   ((unsigned char*)&val)[1] = *buf++;
>   ((unsigned char*)&val)[2] = *buf++;
>   ((unsigned char*)&val)[3] = *buf++;
>   return val;
> }
> int main()
> {
>   int val = 0x01020304;
>   val = test (&val);
>   if (val != 0x01020304)
>     abort ();
> }
>

No, the reason is that you code will always work well, with or without my patch.

> not sure if I got endianess correct.  Now, the question is what
> WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
> the MIPS ABI says for returning SImode.
>
> Others might be able to answer this.
>
> > > If you look a few dumps further you'll see which instruction was
> > > recognized, I suspect the machine description is simply wrong here?
> > >
> >
> > The design of initial RTL may has expect that the backend may expand
> >
> > (insn 14 13 15 2 (set (reg/v:DI 201 [ val ])
> >         (sign_extend:DI (subreg:SI (reg/v:DI 201 [ val ]) 0))) "xx.c":5:29 -1
> >      (nil))
> >
> > to an `SLL` instruction, which can fix what `DINS` do, aka
> >      0x 00 00 00 00 ff 00 00 00 ---> 0x ff ff ff ff ff 00 00 00
> >
> > I guess this is what you mean about the mistake of machine description.
> > While MIPS md believes that it's sign-extension by default, so it is
> > not needed at all.
> >
> > > > 2. Due to most of MIPS instructions work with 32bit value, aka instructions
> > > > without `d` as its first char (in fact with few exception), are sign-extension,
> > > > the MIPS backend just ignore `extendsidi2`, aka RTX
> > > >
> > > > (insn 14 13 15 2 (set (reg/v:DI 200 [ val ])
> > > >         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "xx.c":5:29 -1
> > > >      (nil))
> > > >
> > > >
> >
> > This is just background info about MIPS:
> >
> > On a MIPS32 hardware, the value -1 is  0x ff ff ff ff, which is same
> > with other arch.
> > On a MIPS64 hardware, the value of (int32_t)-1 is
> >      0x ff ff ff ff ff ff ff ff
> > which is same with (int64_t)-1.
> > So the single compare-and-branch instruction can work with both
> > int32_t and int64_t.
> >
> > On none sign-extension arch, like ARM64, (int32_t)-1 is
> >    0x 00 00 00 00 ff ff ff ff
> > and (int64_t)-1 is
> >    0x ff ff ff ff ff ff ff ff
> > That's why the `CMP` instructions for X and W have different encoding:
> > the 31bit of the encoding: `sf` bit.
> >
> >
> > ================================
> > For this problem, we have 2 choice to fix:
> > 1. This patch
> > 2.
>
> 2. disallow INS affecting bit 31 of a double-word - there's
> mips_use_ins_ext_p that could be used (maybe also pass in whether it's
> an INS).
>
> Richard.
>
> >
> > > >
> > > > > Richard.
> > > > >
> > > > > > gcc/ChangeLog:
> > > > > >         PR: 104914.
> > > > > >         * expmed.cc(store_bit_field_1): Pass str_rtx and its mode
> > > > > >       to store_integral_bit_field if the length of op0 is greater
> > > > > >       than str_rtx.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >         PR: 104914.
> > > > > >       * gcc.target/mips/pr104914.c: New testcase.
> > > > > > ---
> > > > > >  gcc/expmed.cc                            | 20 +++++++++++++++++---
> > > > > >  gcc/testsuite/gcc.target/mips/pr104914.c | 17 +++++++++++++++++
> > > > > >  2 files changed, 34 insertions(+), 3 deletions(-)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > >
> > > > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > > > index fbd4ce2d42f..5531c19e891 100644
> > > > > > --- a/gcc/expmed.cc
> > > > > > +++ b/gcc/expmed.cc
> > > > > > @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > >       since that case is valid for any mode.  The following cases are only
> > > > > >       valid for integral modes.  */
> > > > > >    opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0));
> > > > > > +  opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx));
> > > > > >    scalar_int_mode imode;
> > > > > >    if (!op0_mode.exists (&imode) || imode != GET_MODE (op0))
> > > > > >      {
> > > > > > @@ -881,9 +882,22 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > >       op0 = gen_lowpart (op0_mode.require (), op0);
> > > > > >      }
> > > > > >
> > > > > > -  return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum,
> > > > > > -                                bitregion_start, bitregion_end,
> > > > > > -                                fieldmode, value, reverse, fallback_p);
> > > > > > +  /* If MODEs of str_rtx and op0 are INT, and the length of op0 is greater than
> > > > > > +     str_rtx, it means that str_rtx has a shorter SUBREG: int32 on 64 mach/ABI
> > > > > > +     is an example.  For this case, we should use the mode of SUBREG, otherwise
> > > > > > +     bad code will generate for sign-extension ports, like MIPS.  */
> > > > > > +  bool use_str_mode = false;
> > > > > > +  if (GET_MODE_CLASS (GET_MODE (str_rtx)) == MODE_INT
> > > > > > +      && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
> > > > > > +      && known_gt (GET_MODE_SIZE (GET_MODE (op0)),
> > > > > > +                GET_MODE_SIZE (GET_MODE (str_rtx))))
> > > > > > +    use_str_mode = true;
> > > > > > +
> > > > > > +  return store_integral_bit_field (use_str_mode ? str_rtx : op0,
> > > > > > +                                use_str_mode ? str_mode : op0_mode,
> > > > > > +                                ibitsize, ibitnum, bitregion_start,
> > > > > > +                                bitregion_end, fieldmode, value,
> > > > > > +                                reverse, fallback_p);
> > > > > >  }
> > > > > >
> > > > > >  /* Subroutine of store_bit_field_1, with the same arguments, except
> > > > > > diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > > new file mode 100644
> > > > > > index 00000000000..fd6ef6af446
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/mips/pr104914.c
> > > > > > @@ -0,0 +1,17 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-march=mips64r2 -mabi=64" } */
> > > > > > +
> > > > > > +/* { dg-final { scan-assembler-not "\tdins\t" } } */
> > > > > > +
> > > > > > +NOMIPS16 int test (const unsigned char *buf)
> > > > > > +{
> > > > > > +  int val;
> > > > > > +  ((unsigned char*)&val)[0] = *buf++;
> > > > > > +  ((unsigned char*)&val)[1] = *buf++;
> > > > > > +  ((unsigned char*)&val)[2] = *buf++;
> > > > > > +  ((unsigned char*)&val)[3] = *buf++;
> > > > > > +  if(val > 0)
> > > > > > +    return 1;
> > > > > > +  else
> > > > > > +    return 0;
> > > > > > +}
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > > > HRB 36809 (AG Nuernberg)
> > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > HRB 36809 (AG Nuernberg)
> >
> >
> >
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)



-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  9:23         ` Richard Biener
  2023-07-19  9:27           ` Richard Biener
  2023-07-19  9:43           ` YunQiang Su
@ 2023-07-19  9:45           ` Eric Botcazou
  2023-07-19 10:12             ` YunQiang Su
  2 siblings, 1 reply; 19+ messages in thread
From: Eric Botcazou @ 2023-07-19  9:45 UTC (permalink / raw)
  To: Richard Biener
  Cc: YunQiang Su, gcc-patches, YunQiang Su, gcc-patches, pinskia,
	jeffreyalaw, ian

> I don't see that.  That's definitely not what GCC expects here,
> the left-most word of the doubleword should be unchanged.
> 
> Your testcase should be a dg-do-run and probably more like
> 
> NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
> {
>   int val;
>   ((unsigned char*)&val)[0] = *buf++;
>   ((unsigned char*)&val)[1] = *buf++;
>   ((unsigned char*)&val)[2] = *buf++;
>   ((unsigned char*)&val)[3] = *buf++;
>   return val;
> }
> int main()
> {
>   int val = 0x01020304;
>   val = test (&val);
>   if (val != 0x01020304)
>     abort ();
> }
> 
> not sure if I got endianess correct.  Now, the question is what
> WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
> the MIPS ABI says for returning SImode.

WORD_REGISTER_OPERATIONS must *not* be taken account for bit-fields, see e;g. 
word_register_operation_p:

/* Return true if X is an operation that always operates on the full
   registers for WORD_REGISTER_OPERATIONS architectures.  */

inline bool
word_register_operation_p (const_rtx x)
{
  switch (GET_CODE (x))
    {
    case CONST_INT:
    case ROTATE:
    case ROTATERT:
    case SIGN_EXTRACT:
    case ZERO_EXTRACT:
      return false;
    
    default:
      return true;
    }
}

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19  9:45           ` Eric Botcazou
@ 2023-07-19 10:12             ` YunQiang Su
  2023-07-19 10:25               ` Richard Biener
  0 siblings, 1 reply; 19+ messages in thread
From: YunQiang Su @ 2023-07-19 10:12 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Richard Biener, gcc-patches, YunQiang Su, pinskia, jeffreyalaw, ian

Eric Botcazou <botcazou@adacore.com> 于2023年7月19日周三 17:45写道：
>
> > I don't see that.  That's definitely not what GCC expects here,
> > the left-most word of the doubleword should be unchanged.
> >
> > Your testcase should be a dg-do-run and probably more like
> >
> > NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
> > {
> >   int val;
> >   ((unsigned char*)&val)[0] = *buf++;
> >   ((unsigned char*)&val)[1] = *buf++;
> >   ((unsigned char*)&val)[2] = *buf++;
> >   ((unsigned char*)&val)[3] = *buf++;
> >   return val;
> > }
> > int main()
> > {
> >   int val = 0x01020304;
> >   val = test (&val);
> >   if (val != 0x01020304)
> >     abort ();
> > }
> >
> > not sure if I got endianess correct.  Now, the question is what
> > WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
> > the MIPS ABI says for returning SImode.
>

MIPS N64 ABI uses 2 GPR for integer return values.
If the return value is SImode, the first v0 register is used, and it
must be sign-extended,
aka the bits[64-31] are all same.

Yes, it is same for signed and unsigned int32.

https://irix7.com/techpubs/007-2816-004.pdf
Page 6:
32-bit integer (int) parameters are always sign-extended when passed
in registers,
whether of signed or unsigned type. [This issue does not arise in the
o32-bit ABI.]


> WORD_REGISTER_OPERATIONS must *not* be taken account for bit-fields, see e;g.
> word_register_operation_p:
>
> /* Return true if X is an operation that always operates on the full
>    registers for WORD_REGISTER_OPERATIONS architectures.  */
>
> inline bool
> word_register_operation_p (const_rtx x)
> {
>   switch (GET_CODE (x))
>     {
>     case CONST_INT:
>     case ROTATE:
>     case ROTATERT:
>     case SIGN_EXTRACT:
>     case ZERO_EXTRACT:
>       return false;
>
>     default:
>       return true;
>     }
> }
>
> --
> Eric Botcazou
>
>


-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19 10:12             ` YunQiang Su
@ 2023-07-19 10:25               ` Richard Biener
  2023-07-19 12:22                 ` Jeff Law
  2023-12-22  4:45                 ` YunQiang Su
  0 siblings, 2 replies; 19+ messages in thread
From: Richard Biener @ 2023-07-19 10:25 UTC (permalink / raw)
  To: YunQiang Su
  Cc: Eric Botcazou, gcc-patches, YunQiang Su, pinskia, jeffreyalaw, ian

On Wed, 19 Jul 2023, YunQiang Su wrote:

> Eric Botcazou <botcazou@adacore.com> ?2023?7?19??? 17:45???
> >
> > > I don't see that.  That's definitely not what GCC expects here,
> > > the left-most word of the doubleword should be unchanged.
> > >
> > > Your testcase should be a dg-do-run and probably more like
> > >
> > > NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
> > > {
> > >   int val;
> > >   ((unsigned char*)&val)[0] = *buf++;
> > >   ((unsigned char*)&val)[1] = *buf++;
> > >   ((unsigned char*)&val)[2] = *buf++;
> > >   ((unsigned char*)&val)[3] = *buf++;
> > >   return val;
> > > }
> > > int main()
> > > {
> > >   int val = 0x01020304;
> > >   val = test (&val);
> > >   if (val != 0x01020304)
> > >     abort ();
> > > }
> > >
> > > not sure if I got endianess correct.  Now, the question is what
> > > WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
> > > the MIPS ABI says for returning SImode.
> >
> 
> MIPS N64 ABI uses 2 GPR for integer return values.
> If the return value is SImode, the first v0 register is used, and it
> must be sign-extended,
> aka the bits[64-31] are all same.
> 
> Yes, it is same for signed and unsigned int32.
> 
> https://irix7.com/techpubs/007-2816-004.pdf
> Page 6:
> 32-bit integer (int) parameters are always sign-extended when passed
> in registers,
> whether of signed or unsigned type. [This issue does not arise in the
> o32-bit ABI.]

Note I think Andrews comment#7 in the PR is spot-on then, the issue
isn't the bitfield inserts but the compare where combine elides
the sign_extend in favor of a subreg.  That's likely some wrongdoing
in simplify-rtx in the context of WORD_REGISTER_OPERATIONS.

Richard.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19 10:25               ` Richard Biener
@ 2023-07-19 12:22                 ` Jeff Law
  2023-07-20  7:09                   ` Richard Sandiford
  2023-12-22  4:45                 ` YunQiang Su
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff Law @ 2023-07-19 12:22 UTC (permalink / raw)
  To: Richard Biener, YunQiang Su
  Cc: Eric Botcazou, gcc-patches, YunQiang Su, pinskia, ian



On 7/19/23 04:25, Richard Biener wrote:
> On Wed, 19 Jul 2023, YunQiang Su wrote:
> 
>> Eric Botcazou <botcazou@adacore.com> ?2023?7?19??? 17:45???
>>>
>>>> I don't see that.  That's definitely not what GCC expects here,
>>>> the left-most word of the doubleword should be unchanged.
>>>>
>>>> Your testcase should be a dg-do-run and probably more like
>>>>
>>>> NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
>>>> {
>>>>    int val;
>>>>    ((unsigned char*)&val)[0] = *buf++;
>>>>    ((unsigned char*)&val)[1] = *buf++;
>>>>    ((unsigned char*)&val)[2] = *buf++;
>>>>    ((unsigned char*)&val)[3] = *buf++;
>>>>    return val;
>>>> }
>>>> int main()
>>>> {
>>>>    int val = 0x01020304;
>>>>    val = test (&val);
>>>>    if (val != 0x01020304)
>>>>      abort ();
>>>> }
>>>>
>>>> not sure if I got endianess correct.  Now, the question is what
>>>> WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
>>>> the MIPS ABI says for returning SImode.
>>>
>>
>> MIPS N64 ABI uses 2 GPR for integer return values.
>> If the return value is SImode, the first v0 register is used, and it
>> must be sign-extended,
>> aka the bits[64-31] are all same.
>>
>> Yes, it is same for signed and unsigned int32.
>>
>> https://irix7.com/techpubs/007-2816-004.pdf
>> Page 6:
>> 32-bit integer (int) parameters are always sign-extended when passed
>> in registers,
>> whether of signed or unsigned type. [This issue does not arise in the
>> o32-bit ABI.]
> 
> Note I think Andrews comment#7 in the PR is spot-on then, the issue
> isn't the bitfield inserts but the compare where combine elides
> the sign_extend in favor of a subreg.  That's likely some wrongdoing
> in simplify-rtx in the context of WORD_REGISTER_OPERATIONS.
And I think it raises a real question about the use of GPR (which maps 
to SImode and DImode for 64bit MIPS targets) on the conditional 
branching patterns in mips.md.

So while this code works:

> (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ])
>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) "/app/example.cpp":7:29 -1
>      (nil))
> (jump_insn 23 20 24 2 (set (pc)
>         (if_then_else (le (subreg/s/u:SI (reg/v:DI 200 [ val+-4 ]) 4)
>                 (const_int 0 [0]))
>             (label_ref 32)
>             (pc))) "/app/example.cpp":8:5 -1
>      (int_list:REG_BR_PROB 440234148 (nil))
>  -> 32)


Normally the narrowing SUBREG in insn 23 would indicate we don't care 
about the bits outside SImode.  But on a W_R_O targets we very much care 
because the hardware is going to ultimately do the comparison in 64 bits.

As Andrew/Richi have indicated this very much points to combine as 
incorrectly eliminating the explict sign extension.  Most likely because 
something saw the SUBREG and concluded those upper bits set by insn 20 
were "don't care" bits.

But it may ultimately be be better for the MIPS port to not expose a 
SImode comparison.  Thus reducing the reliance on W_R_O and its 
under-specified semantics and ultimately having the RTL map more closely 
to what the hardware actually does/supports.

That's the model we're working towards on the RISC-V port as well.  I 
wouldn't be surprised if we eventually get to the point where we 
eliminate WORD_REGISTER_OPERATIONS entirely.

And yes, bitfield operations are one of the nasty sticking points.  The 
thinking for them is that we want to support bit manipulations where the 
bit position is variable.  To do that we will emit an explicit sign 
extension after such operations.  Then rely on improved REE to identify 
and remove those redundant extensions.

Jeff

Jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19 12:22                 ` Jeff Law
@ 2023-07-20  7:09                   ` Richard Sandiford
  2023-07-20  7:23                     ` Richard Biener
  2023-12-22  6:11                     ` YunQiang Su
  0 siblings, 2 replies; 19+ messages in thread
From: Richard Sandiford @ 2023-07-20  7:09 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches
  Cc: Richard Biener, YunQiang Su, Jeff Law, Eric Botcazou,
	YunQiang Su, pinskia, ian

Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On 7/19/23 04:25, Richard Biener wrote:
>> On Wed, 19 Jul 2023, YunQiang Su wrote:
>> 
>>> Eric Botcazou <botcazou@adacore.com> ?2023?7?19??? 17:45???
>>>>
>>>>> I don't see that.  That's definitely not what GCC expects here,
>>>>> the left-most word of the doubleword should be unchanged.
>>>>>
>>>>> Your testcase should be a dg-do-run and probably more like
>>>>>
>>>>> NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
>>>>> {
>>>>>    int val;
>>>>>    ((unsigned char*)&val)[0] = *buf++;
>>>>>    ((unsigned char*)&val)[1] = *buf++;
>>>>>    ((unsigned char*)&val)[2] = *buf++;
>>>>>    ((unsigned char*)&val)[3] = *buf++;
>>>>>    return val;
>>>>> }
>>>>> int main()
>>>>> {
>>>>>    int val = 0x01020304;
>>>>>    val = test (&val);
>>>>>    if (val != 0x01020304)
>>>>>      abort ();
>>>>> }
>>>>>
>>>>> not sure if I got endianess correct.  Now, the question is what
>>>>> WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
>>>>> the MIPS ABI says for returning SImode.
>>>>
>>>
>>> MIPS N64 ABI uses 2 GPR for integer return values.
>>> If the return value is SImode, the first v0 register is used, and it
>>> must be sign-extended,
>>> aka the bits[64-31] are all same.
>>>
>>> Yes, it is same for signed and unsigned int32.
>>>
>>> https://irix7.com/techpubs/007-2816-004.pdf
>>> Page 6:
>>> 32-bit integer (int) parameters are always sign-extended when passed
>>> in registers,
>>> whether of signed or unsigned type. [This issue does not arise in the
>>> o32-bit ABI.]
>> 
>> Note I think Andrews comment#7 in the PR is spot-on then, the issue
>> isn't the bitfield inserts but the compare where combine elides
>> the sign_extend in favor of a subreg.  That's likely some wrongdoing
>> in simplify-rtx in the context of WORD_REGISTER_OPERATIONS.
> And I think it raises a real question about the use of GPR (which maps 
> to SImode and DImode for 64bit MIPS targets) on the conditional 
> branching patterns in mips.md.
>
> So while this code works:
>
>> (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ])
>>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) "/app/example.cpp":7:29 -1
>>      (nil))

Haven't had chance to compile and look at it properly, but this subreg
seems suspicious for MIPS, given the definition of TRULY_NOOP_TRUNCATION.
We should instead use a truncdisi2 to narrow reg:DI 200 to an SI register,
and then sign_extend it.

This is easily missed in target-independent code because so few targets
define TRULY_NOOP_TRUNCATION.

Where is the subreg being generated?

Richard

>> (jump_insn 23 20 24 2 (set (pc)
>>         (if_then_else (le (subreg/s/u:SI (reg/v:DI 200 [ val+-4 ]) 4)
>>                 (const_int 0 [0]))
>>             (label_ref 32)
>>             (pc))) "/app/example.cpp":8:5 -1
>>      (int_list:REG_BR_PROB 440234148 (nil))
>>  -> 32)
>
>
> Normally the narrowing SUBREG in insn 23 would indicate we don't care 
> about the bits outside SImode.  But on a W_R_O targets we very much care 
> because the hardware is going to ultimately do the comparison in 64 bits.
>
> As Andrew/Richi have indicated this very much points to combine as 
> incorrectly eliminating the explict sign extension.  Most likely because 
> something saw the SUBREG and concluded those upper bits set by insn 20 
> were "don't care" bits.
>
> But it may ultimately be be better for the MIPS port to not expose a 
> SImode comparison.  Thus reducing the reliance on W_R_O and its 
> under-specified semantics and ultimately having the RTL map more closely 
> to what the hardware actually does/supports.
>
> That's the model we're working towards on the RISC-V port as well.  I 
> wouldn't be surprised if we eventually get to the point where we 
> eliminate WORD_REGISTER_OPERATIONS entirely.
>
> And yes, bitfield operations are one of the nasty sticking points.  The 
> thinking for them is that we want to support bit manipulations where the 
> bit position is variable.  To do that we will emit an explicit sign 
> extension after such operations.  Then rely on improved REE to identify 
> and remove those redundant extensions.
>
> Jeff
>
> Jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-20  7:09                   ` Richard Sandiford
@ 2023-07-20  7:23                     ` Richard Biener
  2023-07-20  9:22                       ` Richard Sandiford
  2023-12-22  6:11                     ` YunQiang Su
  1 sibling, 1 reply; 19+ messages in thread
From: Richard Biener @ 2023-07-20  7:23 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Jeff Law via Gcc-patches, YunQiang Su, Jeff Law, Eric Botcazou,
	YunQiang Su, pinskia, ian

On Thu, 20 Jul 2023, Richard Sandiford wrote:

> Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On 7/19/23 04:25, Richard Biener wrote:
> >> On Wed, 19 Jul 2023, YunQiang Su wrote:
> >> 
> >>> Eric Botcazou <botcazou@adacore.com> ?2023?7?19??? 17:45???
> >>>>
> >>>>> I don't see that.  That's definitely not what GCC expects here,
> >>>>> the left-most word of the doubleword should be unchanged.
> >>>>>
> >>>>> Your testcase should be a dg-do-run and probably more like
> >>>>>
> >>>>> NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
> >>>>> {
> >>>>>    int val;
> >>>>>    ((unsigned char*)&val)[0] = *buf++;
> >>>>>    ((unsigned char*)&val)[1] = *buf++;
> >>>>>    ((unsigned char*)&val)[2] = *buf++;
> >>>>>    ((unsigned char*)&val)[3] = *buf++;
> >>>>>    return val;
> >>>>> }
> >>>>> int main()
> >>>>> {
> >>>>>    int val = 0x01020304;
> >>>>>    val = test (&val);
> >>>>>    if (val != 0x01020304)
> >>>>>      abort ();
> >>>>> }
> >>>>>
> >>>>> not sure if I got endianess correct.  Now, the question is what
> >>>>> WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
> >>>>> the MIPS ABI says for returning SImode.
> >>>>
> >>>
> >>> MIPS N64 ABI uses 2 GPR for integer return values.
> >>> If the return value is SImode, the first v0 register is used, and it
> >>> must be sign-extended,
> >>> aka the bits[64-31] are all same.
> >>>
> >>> Yes, it is same for signed and unsigned int32.
> >>>
> >>> https://irix7.com/techpubs/007-2816-004.pdf
> >>> Page 6:
> >>> 32-bit integer (int) parameters are always sign-extended when passed
> >>> in registers,
> >>> whether of signed or unsigned type. [This issue does not arise in the
> >>> o32-bit ABI.]
> >> 
> >> Note I think Andrews comment#7 in the PR is spot-on then, the issue
> >> isn't the bitfield inserts but the compare where combine elides
> >> the sign_extend in favor of a subreg.  That's likely some wrongdoing
> >> in simplify-rtx in the context of WORD_REGISTER_OPERATIONS.
> > And I think it raises a real question about the use of GPR (which maps 
> > to SImode and DImode for 64bit MIPS targets) on the conditional 
> > branching patterns in mips.md.
> >
> > So while this code works:
> >
> >> (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ])
> >>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) "/app/example.cpp":7:29 -1
> >>      (nil))
> 
> Haven't had chance to compile and look at it properly, but this subreg
> seems suspicious for MIPS, given the definition of TRULY_NOOP_TRUNCATION.
> We should instead use a truncdisi2 to narrow reg:DI 200 to an SI register,
> and then sign_extend it.
> 
> This is easily missed in target-independent code because so few targets
> define TRULY_NOOP_TRUNCATION.

Can we easily get rid of it?

Richard.

> Where is the subreg being generated?
> 
> Richard
> 
> >> (jump_insn 23 20 24 2 (set (pc)
> >>         (if_then_else (le (subreg/s/u:SI (reg/v:DI 200 [ val+-4 ]) 4)
> >>                 (const_int 0 [0]))
> >>             (label_ref 32)
> >>             (pc))) "/app/example.cpp":8:5 -1
> >>      (int_list:REG_BR_PROB 440234148 (nil))
> >>  -> 32)
> >
> >
> > Normally the narrowing SUBREG in insn 23 would indicate we don't care 
> > about the bits outside SImode.  But on a W_R_O targets we very much care 
> > because the hardware is going to ultimately do the comparison in 64 bits.
> >
> > As Andrew/Richi have indicated this very much points to combine as 
> > incorrectly eliminating the explict sign extension.  Most likely because 
> > something saw the SUBREG and concluded those upper bits set by insn 20 
> > were "don't care" bits.
> >
> > But it may ultimately be be better for the MIPS port to not expose a 
> > SImode comparison.  Thus reducing the reliance on W_R_O and its 
> > under-specified semantics and ultimately having the RTL map more closely 
> > to what the hardware actually does/supports.
> >
> > That's the model we're working towards on the RISC-V port as well.  I 
> > wouldn't be surprised if we eventually get to the point where we 
> > eliminate WORD_REGISTER_OPERATIONS entirely.
> >
> > And yes, bitfield operations are one of the nasty sticking points.  The 
> > thinking for them is that we want to support bit manipulations where the 
> > bit position is variable.  To do that we will emit an explicit sign 
> > extension after such operations.  Then rely on improved REE to identify 
> > and remove those redundant extensions.
> >
> > Jeff
> >
> > Jeff
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-20  7:23                     ` Richard Biener
@ 2023-07-20  9:22                       ` Richard Sandiford
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Sandiford @ 2023-07-20  9:22 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law via Gcc-patches, YunQiang Su, Jeff Law, Eric Botcazou,
	YunQiang Su, pinskia, ian

Richard Biener <rguenther@suse.de> writes:
#> On Thu, 20 Jul 2023, Richard Sandiford wrote:
>
>> Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> > On 7/19/23 04:25, Richard Biener wrote:
>> >> On Wed, 19 Jul 2023, YunQiang Su wrote:
>> >> 
>> >>> Eric Botcazou <botcazou@adacore.com> ?2023?7?19??? 17:45???
>> >>>>
>> >>>>> I don't see that.  That's definitely not what GCC expects here,
>> >>>>> the left-most word of the doubleword should be unchanged.
>> >>>>>
>> >>>>> Your testcase should be a dg-do-run and probably more like
>> >>>>>
>> >>>>> NOMIPS16 int __attribute__((noipa)) test (const unsigned char *buf)
>> >>>>> {
>> >>>>>    int val;
>> >>>>>    ((unsigned char*)&val)[0] = *buf++;
>> >>>>>    ((unsigned char*)&val)[1] = *buf++;
>> >>>>>    ((unsigned char*)&val)[2] = *buf++;
>> >>>>>    ((unsigned char*)&val)[3] = *buf++;
>> >>>>>    return val;
>> >>>>> }
>> >>>>> int main()
>> >>>>> {
>> >>>>>    int val = 0x01020304;
>> >>>>>    val = test (&val);
>> >>>>>    if (val != 0x01020304)
>> >>>>>      abort ();
>> >>>>> }
>> >>>>>
>> >>>>> not sure if I got endianess correct.  Now, the question is what
>> >>>>> WORD_REGISTER_OPERATIONS implies for a bitfield insert and what
>> >>>>> the MIPS ABI says for returning SImode.
>> >>>>
>> >>>
>> >>> MIPS N64 ABI uses 2 GPR for integer return values.
>> >>> If the return value is SImode, the first v0 register is used, and it
>> >>> must be sign-extended,
>> >>> aka the bits[64-31] are all same.
>> >>>
>> >>> Yes, it is same for signed and unsigned int32.
>> >>>
>> >>> https://irix7.com/techpubs/007-2816-004.pdf
>> >>> Page 6:
>> >>> 32-bit integer (int) parameters are always sign-extended when passed
>> >>> in registers,
>> >>> whether of signed or unsigned type. [This issue does not arise in the
>> >>> o32-bit ABI.]
>> >> 
>> >> Note I think Andrews comment#7 in the PR is spot-on then, the issue
>> >> isn't the bitfield inserts but the compare where combine elides
>> >> the sign_extend in favor of a subreg.  That's likely some wrongdoing
>> >> in simplify-rtx in the context of WORD_REGISTER_OPERATIONS.
>> > And I think it raises a real question about the use of GPR (which maps 
>> > to SImode and DImode for 64bit MIPS targets) on the conditional 
>> > branching patterns in mips.md.
>> >
>> > So while this code works:
>> >
>> >> (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ])
>> >>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) "/app/example.cpp":7:29 -1
>> >>      (nil))
>> 
>> Haven't had chance to compile and look at it properly, but this subreg
>> seems suspicious for MIPS, given the definition of TRULY_NOOP_TRUNCATION.
>> We should instead use a truncdisi2 to narrow reg:DI 200 to an SI register,
>> and then sign_extend it.
>> 
>> This is easily missed in target-independent code because so few targets
>> define TRULY_NOOP_TRUNCATION.
>
> Can we easily get rid of it?

Not easily.  The problem is that the original 64-bit ISA says that the
behaviour of a 32-bit arithmetic instruction is undefined if the inputs
aren't in sign-extended form.  (A bit like GCC CONST_INTs :))  So:

  // $2 = 0x0_8000_0000
  addiu $3,$2,$2

has undefined behaviour, it must instead be:

  // $2 = 0xffff_ffff_8000_0000

The purpose of TRULY_NOOP_TRUNCATION is to ensure that we never form an
SImode register by taking a lowpart subreg of a DImode (or wider) register.

In other words, things are inverted to that truncating DI to SI is a
sign-extend operation (represented in RTL as a trunc) while sign-extending
from SI to DI is free (lowered to nothing after RA).

Richard

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-19 10:25               ` Richard Biener
  2023-07-19 12:22                 ` Jeff Law
@ 2023-12-22  4:45                 ` YunQiang Su
  1 sibling, 0 replies; 19+ messages in thread
From: YunQiang Su @ 2023-12-22  4:45 UTC (permalink / raw)
  To: Richard Biener
  Cc: Eric Botcazou, gcc-patches, YunQiang Su, pinskia, jeffreyalaw, ian

>
> Note I think Andrews comment#7 in the PR is spot-on then, the issue
> isn't the bitfield inserts but the compare where combine elides
> the sign_extend in favor of a subreg.  That's likely some wrongdoing
> in simplify-rtx in the context of WORD_REGISTER_OPERATIONS.
>

Yes. There are 2 problems here. Any one of them can make this problem.
1) jump_insn eats sign_extend (and truncate) in
      /* Simplify X, an IF_THEN_ELSE expression.  Return the new expression.  */
      static rtx simplify_if_then_else (rtx x)

2) MIPS claims sign_extend to be deleted.
(define_insn_and_split "extendsidi2"
  [(set (match_operand:DI 0 "register_operand" "=d,l,d")
        (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,0,m")))]
  "TARGET_64BIT"
  "@
   #
   #
   lw\t%0,%1"
  "&& reload_completed && register_operand (operands[1], VOIDmode)"
  [(const_int 0)]
{
  emit_note (NOTE_INSN_DELETED);
  DONE;
}
  [(set_attr "move_type" "move,move,load")
   (set_attr "mode" "DI")])

-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible
  2023-07-20  7:09                   ` Richard Sandiford
  2023-07-20  7:23                     ` Richard Biener
@ 2023-12-22  6:11                     ` YunQiang Su
  1 sibling, 0 replies; 19+ messages in thread
From: YunQiang Su @ 2023-12-22  6:11 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches, Richard Biener, YunQiang Su, Jeff Law,
	Eric Botcazou, YunQiang Su, pinskia, ian, richard.sandiford

> >
> >> (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ])
> >>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) "/app/example.cpp":7:29 -1
> >>      (nil))
>
> Haven't had chance to compile and look at it properly, but this subreg
> seems suspicious for MIPS, given the definition of TRULY_NOOP_TRUNCATION.
> We should instead use a truncdisi2 to narrow reg:DI 200 to an SI register,
> and then sign_extend it.
>
> This is easily missed in target-independent code because so few targets
> define TRULY_NOOP_TRUNCATION.
>
> Where is the subreg being generated?
>

It's from expand_assignment(tree to, tree from, bool nontemporal) in expr.cc.
    to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE);




-- 
YunQiang Su

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-12-22  6:11 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-19  4:16 [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible YunQiang Su
2023-07-19  6:26 ` Richard Biener
2023-07-19  6:58   ` YunQiang Su
2023-07-19  7:22     ` Richard Biener
2023-07-19  8:21       ` YunQiang Su
2023-07-19  8:25         ` YunQiang Su
2023-07-19  8:50           ` YunQiang Su
2023-07-19  9:23         ` Richard Biener
2023-07-19  9:27           ` Richard Biener
2023-07-19  9:43           ` YunQiang Su
2023-07-19  9:45           ` Eric Botcazou
2023-07-19 10:12             ` YunQiang Su
2023-07-19 10:25               ` Richard Biener
2023-07-19 12:22                 ` Jeff Law
2023-07-20  7:09                   ` Richard Sandiford
2023-07-20  7:23                     ` Richard Biener
2023-07-20  9:22                       ` Richard Sandiford
2023-12-22  6:11                     ` YunQiang Su
2023-12-22  4:45                 ` YunQiang Su

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).