public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, ivopt] Try aligned offset when get_address_cost
@ 2014-08-04  6:30 Zhenqiang Chen
  2014-08-04  8:40 ` Bin.Cheng
  2015-02-04 10:35 ` Eric Botcazou
  0 siblings, 2 replies; 8+ messages in thread
From: Zhenqiang Chen @ 2014-08-04  6:30 UTC (permalink / raw)
  To: gcc-patches

Hi,

For some TARGET, like ARM THUMB1, the offset in load/store should be nature
aligned. But in function get_address_cost, when computing max_offset, it
only tries byte-aligned offsets:

  ((unsigned HOST_WIDE_INT) 1 << i) - 1

which can not meet thumb_legitimate_offset_p check called from
thumb1_legitimate_address_p for HImode and SImode.

The patch adds additional try for aligned offset:

  ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).

Bootstrap and no make check regression on X86-64.
No make check regression on qemu for Cortex-m0 and Cortex-m3.
For Cortex-m0, no performance changes with coremark and dhrystone. Coremark
code size is ~0.44 smaller. And eembcv2 code size is ~0.22 smaller. CSiBE
code size is ~0.05% smaller.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog
2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>

	* tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset.

testsuite/ChangeLog:
2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>

	* gcc.target/arm/get_address_cost_aligned_max_offset.c: New test.

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 3b4a6cd..562122a 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool
var_present,
 	  XEXP (addr, 1) = gen_int_mode (off, address_mode);
 	  if (memory_address_addr_space_p (mem_mode, addr, as))
 	    break;
+	  /* For some TARGET, like ARM THUMB1, the offset should be nature
+	     aligned.  Try an aligned offset if address_mode is not QImode.
*/
+	  off = (address_mode == QImode)
+		? 0
+		: ((unsigned HOST_WIDE_INT) 1 << i)
+		    - GET_MODE_SIZE (address_mode);
+	  if (off > 0)
+	    {
+	      XEXP (addr, 1) = gen_int_mode (off, address_mode);
+	      if (memory_address_addr_space_p (mem_mode, addr, as))
+		break;
+	    }
 	}
       if (i == -1)
         off = 0;
diff --git
a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
new file mode 100644
index 0000000..cc3e2f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb -O2" }  */
+/* { dg-require-effective-target arm_thumb1_ok } */
+
+unsigned int
+test (const short p16[6 * 64])
+{
+  unsigned int i = 6;
+  unsigned int ret = 0;
+
+  do
+    {
+      unsigned long long *p64 = (unsigned long long*) p16;
+      unsigned int *p32 = (unsigned int*) p16;
+      ret += ret;
+      if (p16[1] || p32[1])
+	ret++;
+      else if (p64[1] | p64[2] | p64[3])
+	ret++;
+      p16 += 64;
+      i--;
+    } while (i != 0);
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler-not "#22" } } */
+/* { dg-final { scan-assembler-not "#14" } } */



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH, ivopt] Try aligned offset when get_address_cost
  2014-08-04  6:30 [PATCH, ivopt] Try aligned offset when get_address_cost Zhenqiang Chen
@ 2014-08-04  8:40 ` Bin.Cheng
  2014-08-04  9:10   ` Zhenqiang Chen
  2015-02-04 10:35 ` Eric Botcazou
  1 sibling, 1 reply; 8+ messages in thread
From: Bin.Cheng @ 2014-08-04  8:40 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: gcc-patches List

On Mon, Aug 4, 2014 at 2:28 PM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
> Hi,
>
> For some TARGET, like ARM THUMB1, the offset in load/store should be nature
> aligned. But in function get_address_cost, when computing max_offset, it
> only tries byte-aligned offsets:
>
>   ((unsigned HOST_WIDE_INT) 1 << i) - 1
>
> which can not meet thumb_legitimate_offset_p check called from
> thumb1_legitimate_address_p for HImode and SImode.
>
> The patch adds additional try for aligned offset:
>
>   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).
>
> Bootstrap and no make check regression on X86-64.
> No make check regression on qemu for Cortex-m0 and Cortex-m3.
> For Cortex-m0, no performance changes with coremark and dhrystone. Coremark
> code size is ~0.44 smaller. And eembcv2 code size is ~0.22 smaller. CSiBE
> code size is ~0.05% smaller.
>
> OK for trunk?
>
> Thanks!
> -Zhenqiang
>
> ChangeLog
> 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>         * tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset.
>
> testsuite/ChangeLog:
> 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>         * gcc.target/arm/get_address_cost_aligned_max_offset.c: New test.
>
> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> index 3b4a6cd..562122a 100644
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool
> var_present,
>           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>           if (memory_address_addr_space_p (mem_mode, addr, as))
>             break;
> +         /* For some TARGET, like ARM THUMB1, the offset should be nature
> +            aligned.  Try an aligned offset if address_mode is not QImode.
> */
> +         off = (address_mode == QImode)
> +               ? 0
> +               : ((unsigned HOST_WIDE_INT) 1 << i)
> +                   - GET_MODE_SIZE (address_mode);
> +         if (off > 0)
> +           {
> +             XEXP (addr, 1) = gen_int_mode (off, address_mode);
> +             if (memory_address_addr_space_p (mem_mode, addr, as))
> +               break;
> +           }
Hi, Why not just check "address_mode != QImode"? Set off to 0 then
check it seems unnecessary.

Thanks,
bin
>         }
>        if (i == -1)
>          off = 0;
> diff --git
> a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
> b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
> new file mode 100644
> index 0000000..cc3e2f7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mthumb -O2" }  */
> +/* { dg-require-effective-target arm_thumb1_ok } */
> +
> +unsigned int
> +test (const short p16[6 * 64])
> +{
> +  unsigned int i = 6;
> +  unsigned int ret = 0;
> +
> +  do
> +    {
> +      unsigned long long *p64 = (unsigned long long*) p16;
> +      unsigned int *p32 = (unsigned int*) p16;
> +      ret += ret;
> +      if (p16[1] || p32[1])
> +       ret++;
> +      else if (p64[1] | p64[2] | p64[3])
> +       ret++;
> +      p16 += 64;
> +      i--;
> +    } while (i != 0);
> +
> +  return ret;
> +}
> +
> +/* { dg-final { scan-assembler-not "#22" } } */
> +/* { dg-final { scan-assembler-not "#14" } } */
>
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH, ivopt] Try aligned offset when get_address_cost
  2014-08-04  8:40 ` Bin.Cheng
@ 2014-08-04  9:10   ` Zhenqiang Chen
  2014-08-05 14:00     ` Richard Biener
  0 siblings, 1 reply; 8+ messages in thread
From: Zhenqiang Chen @ 2014-08-04  9:10 UTC (permalink / raw)
  To: 'Bin.Cheng'; +Cc: gcc-patches List



> -----Original Message-----
> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
> Sent: Monday, August 04, 2014 4:41 PM
> To: Zhenqiang Chen
> Cc: gcc-patches List
> Subject: Re: [PATCH, ivopt] Try aligned offset when get_address_cost
> 
> On Mon, Aug 4, 2014 at 2:28 PM, Zhenqiang Chen
> <zhenqiang.chen@arm.com> wrote:
> > Hi,
> >
> > For some TARGET, like ARM THUMB1, the offset in load/store should be
> > nature aligned. But in function get_address_cost, when computing
> > max_offset, it only tries byte-aligned offsets:
> >
> >   ((unsigned HOST_WIDE_INT) 1 << i) - 1
> >
> > which can not meet thumb_legitimate_offset_p check called from
> > thumb1_legitimate_address_p for HImode and SImode.
> >
> > The patch adds additional try for aligned offset:
> >
> >   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).
> >
> > Bootstrap and no make check regression on X86-64.
> > No make check regression on qemu for Cortex-m0 and Cortex-m3.
> > For Cortex-m0, no performance changes with coremark and dhrystone.
> > Coremark code size is ~0.44 smaller. And eembcv2 code size is ~0.22
> > smaller. CSiBE code size is ~0.05% smaller.
> >
> > OK for trunk?
> >
> > Thanks!
> > -Zhenqiang
> >
> > ChangeLog
> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >         * tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset.
> >
> > testsuite/ChangeLog:
> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >         * gcc.target/arm/get_address_cost_aligned_max_offset.c: New
test.
> >
> > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> > index 3b4a6cd..562122a 100644
> > --- a/gcc/tree-ssa-loop-ivopts.c
> > +++ b/gcc/tree-ssa-loop-ivopts.c
> > @@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool
> > var_present,
> >           XEXP (addr, 1) = gen_int_mode (off, address_mode);
> >           if (memory_address_addr_space_p (mem_mode, addr, as))
> >             break;
> > +         /* For some TARGET, like ARM THUMB1, the offset should be
nature
> > +            aligned.  Try an aligned offset if address_mode is not
QImode.
> > */
> > +         off = (address_mode == QImode)
> > +               ? 0
> > +               : ((unsigned HOST_WIDE_INT) 1 << i)
> > +                   - GET_MODE_SIZE (address_mode);
> > +         if (off > 0)
> > +           {
> > +             XEXP (addr, 1) = gen_int_mode (off, address_mode);
> > +             if (memory_address_addr_space_p (mem_mode, addr, as))
> > +               break;
> > +           }
> Hi, Why not just check "address_mode != QImode"? Set off to 0 then check
it
> seems unnecessary.

Thanks for the comments.

((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode) might be a
negative value except QImode. A negative value can not be max_offset. So we
do not need to check it. 

For QImode, "((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE
(address_mode)" == "((unsigned HOST_WIDE_INT) 1 << i) - 1". It is already
checked. So no need to check it again.

I think the compiler can optimize the patch like

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 3b4a6cd..213598a 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -3308,6 +3308,19 @@ get_address_cost (bool symbol_present, bool
var_present,
 	  XEXP (addr, 1) = gen_int_mode (off, address_mode);
 	  if (memory_address_addr_space_p (mem_mode, addr, as))
 	    break;
+	  /* For some TARGET, like ARM THUMB1, the offset should be nature
+	     aligned.  Try an aligned offset if address_mode is not QImode.
*/
+	  if (address_mode != QImode)
+	    {
+	      off = ((unsigned HOST_WIDE_INT) 1 << i)
+		      - GET_MODE_SIZE (address_mode);
+	      if (off > 0)
+		{
+		  XEXP (addr, 1) = gen_int_mode (off, address_mode);
+		  if (memory_address_addr_space_p (mem_mode, addr, as))
+		    break;
+		}
+	    }
 	}
       if (i == -1)
         off = 0;

> Thanks,
> bin
> >         }
> >        if (i == -1)
> >          off = 0;
> > diff --git
> > a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
> > b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
> > new file mode 100644
> > index 0000000..cc3e2f7
> > --- /dev/null
> > +++
> b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset
> > +++ .c
> > @@ -0,0 +1,28 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-mthumb -O2" }  */
> > +/* { dg-require-effective-target arm_thumb1_ok } */
> > +
> > +unsigned int
> > +test (const short p16[6 * 64])
> > +{
> > +  unsigned int i = 6;
> > +  unsigned int ret = 0;
> > +
> > +  do
> > +    {
> > +      unsigned long long *p64 = (unsigned long long*) p16;
> > +      unsigned int *p32 = (unsigned int*) p16;
> > +      ret += ret;
> > +      if (p16[1] || p32[1])
> > +       ret++;
> > +      else if (p64[1] | p64[2] | p64[3])
> > +       ret++;
> > +      p16 += 64;
> > +      i--;
> > +    } while (i != 0);
> > +
> > +  return ret;
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "#22" } } */
> > +/* { dg-final { scan-assembler-not "#14" } } */
> >
> >
> >




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH, ivopt] Try aligned offset when get_address_cost
  2014-08-04  9:10   ` Zhenqiang Chen
@ 2014-08-05 14:00     ` Richard Biener
  2014-08-06  6:34       ` Zhenqiang Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Biener @ 2014-08-05 14:00 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: Bin.Cheng, gcc-patches List

On Mon, Aug 4, 2014 at 11:09 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
>
>
>> -----Original Message-----
>> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
>> Sent: Monday, August 04, 2014 4:41 PM
>> To: Zhenqiang Chen
>> Cc: gcc-patches List
>> Subject: Re: [PATCH, ivopt] Try aligned offset when get_address_cost
>>
>> On Mon, Aug 4, 2014 at 2:28 PM, Zhenqiang Chen
>> <zhenqiang.chen@arm.com> wrote:
>> > Hi,
>> >
>> > For some TARGET, like ARM THUMB1, the offset in load/store should be
>> > nature aligned. But in function get_address_cost, when computing
>> > max_offset, it only tries byte-aligned offsets:
>> >
>> >   ((unsigned HOST_WIDE_INT) 1 << i) - 1
>> >
>> > which can not meet thumb_legitimate_offset_p check called from
>> > thumb1_legitimate_address_p for HImode and SImode.
>> >
>> > The patch adds additional try for aligned offset:
>> >
>> >   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).
>> >
>> > Bootstrap and no make check regression on X86-64.
>> > No make check regression on qemu for Cortex-m0 and Cortex-m3.
>> > For Cortex-m0, no performance changes with coremark and dhrystone.
>> > Coremark code size is ~0.44 smaller. And eembcv2 code size is ~0.22
>> > smaller. CSiBE code size is ~0.05% smaller.
>> >
>> > OK for trunk?
>> >
>> > Thanks!
>> > -Zhenqiang
>> >
>> > ChangeLog
>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>> >
>> >         * tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset.
>> >
>> > testsuite/ChangeLog:
>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>> >
>> >         * gcc.target/arm/get_address_cost_aligned_max_offset.c: New
> test.
>> >
>> > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>> > index 3b4a6cd..562122a 100644
>> > --- a/gcc/tree-ssa-loop-ivopts.c
>> > +++ b/gcc/tree-ssa-loop-ivopts.c
>> > @@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool
>> > var_present,
>> >           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>> >           if (memory_address_addr_space_p (mem_mode, addr, as))
>> >             break;
>> > +         /* For some TARGET, like ARM THUMB1, the offset should be
> nature
>> > +            aligned.  Try an aligned offset if address_mode is not
> QImode.
>> > */
>> > +         off = (address_mode == QImode)
>> > +               ? 0
>> > +               : ((unsigned HOST_WIDE_INT) 1 << i)
>> > +                   - GET_MODE_SIZE (address_mode);
>> > +         if (off > 0)
>> > +           {
>> > +             XEXP (addr, 1) = gen_int_mode (off, address_mode);
>> > +             if (memory_address_addr_space_p (mem_mode, addr, as))
>> > +               break;
>> > +           }
>> Hi, Why not just check "address_mode != QImode"? Set off to 0 then check
> it
>> seems unnecessary.
>
> Thanks for the comments.
>
> ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode) might be a
> negative value except QImode. A negative value can not be max_offset. So we
> do not need to check it.
>
> For QImode, "((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE
> (address_mode)" == "((unsigned HOST_WIDE_INT) 1 << i) - 1". It is already
> checked. So no need to check it again.
>
> I think the compiler can optimize the patch like
>
> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> index 3b4a6cd..213598a 100644
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -3308,6 +3308,19 @@ get_address_cost (bool symbol_present, bool
> var_present,
>           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>           if (memory_address_addr_space_p (mem_mode, addr, as))
>             break;
> +         /* For some TARGET, like ARM THUMB1, the offset should be nature
> +            aligned.  Try an aligned offset if address_mode is not QImode.
> */
> +         if (address_mode != QImode)
> +           {
> +             off = ((unsigned HOST_WIDE_INT) 1 << i)
> +                     - GET_MODE_SIZE (address_mode);
> +             if (off > 0)
> +               {
> +                 XEXP (addr, 1) = gen_int_mode (off, address_mode);
> +                 if (memory_address_addr_space_p (mem_mode, addr, as))
> +                   break;
> +               }
> +           }
>         }
>        if (i == -1)
>          off = 0;

But is off now guaranteed to be the max value? (1 << (i-1) ) - 1 for
small i is larger than (1 << i) - GET_MODE_SIZE (address_mode).

That is, I think you want to guard this with 1 << (i - 1) >
GET_MODE_SIZE (address_mode)?

You don't adjust the negative offset side - why?

Richard.

>
>> Thanks,
>> bin
>> >         }
>> >        if (i == -1)
>> >          off = 0;
>> > diff --git
>> > a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>> > b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>> > new file mode 100644
>> > index 0000000..cc3e2f7
>> > --- /dev/null
>> > +++
>> b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset
>> > +++ .c
>> > @@ -0,0 +1,28 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-mthumb -O2" }  */
>> > +/* { dg-require-effective-target arm_thumb1_ok } */
>> > +
>> > +unsigned int
>> > +test (const short p16[6 * 64])
>> > +{
>> > +  unsigned int i = 6;
>> > +  unsigned int ret = 0;
>> > +
>> > +  do
>> > +    {
>> > +      unsigned long long *p64 = (unsigned long long*) p16;
>> > +      unsigned int *p32 = (unsigned int*) p16;
>> > +      ret += ret;
>> > +      if (p16[1] || p32[1])
>> > +       ret++;
>> > +      else if (p64[1] | p64[2] | p64[3])
>> > +       ret++;
>> > +      p16 += 64;
>> > +      i--;
>> > +    } while (i != 0);
>> > +
>> > +  return ret;
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-not "#22" } } */
>> > +/* { dg-final { scan-assembler-not "#14" } } */
>> >
>> >
>> >
>
>
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH, ivopt] Try aligned offset when get_address_cost
  2014-08-05 14:00     ` Richard Biener
@ 2014-08-06  6:34       ` Zhenqiang Chen
  2014-08-06 12:25         ` Richard Biener
  0 siblings, 1 reply; 8+ messages in thread
From: Zhenqiang Chen @ 2014-08-06  6:34 UTC (permalink / raw)
  To: Richard Biener; +Cc: Zhenqiang Chen, Bin.Cheng, gcc-patches List

On 5 August 2014 21:59, Richard Biener <richard.guenther@gmail.com> wrote:
> On Mon, Aug 4, 2014 at 11:09 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
>>
>>
>>> -----Original Message-----
>>> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
>>> Sent: Monday, August 04, 2014 4:41 PM
>>> To: Zhenqiang Chen
>>> Cc: gcc-patches List
>>> Subject: Re: [PATCH, ivopt] Try aligned offset when get_address_cost
>>>
>>> On Mon, Aug 4, 2014 at 2:28 PM, Zhenqiang Chen
>>> <zhenqiang.chen@arm.com> wrote:
>>> > Hi,
>>> >
>>> > For some TARGET, like ARM THUMB1, the offset in load/store should be
>>> > nature aligned. But in function get_address_cost, when computing
>>> > max_offset, it only tries byte-aligned offsets:
>>> >
>>> >   ((unsigned HOST_WIDE_INT) 1 << i) - 1
>>> >
>>> > which can not meet thumb_legitimate_offset_p check called from
>>> > thumb1_legitimate_address_p for HImode and SImode.
>>> >
>>> > The patch adds additional try for aligned offset:
>>> >
>>> >   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).
>>> >
>>> > Bootstrap and no make check regression on X86-64.
>>> > No make check regression on qemu for Cortex-m0 and Cortex-m3.
>>> > For Cortex-m0, no performance changes with coremark and dhrystone.
>>> > Coremark code size is ~0.44 smaller. And eembcv2 code size is ~0.22
>>> > smaller. CSiBE code size is ~0.05% smaller.
>>> >
>>> > OK for trunk?
>>> >
>>> > Thanks!
>>> > -Zhenqiang
>>> >
>>> > ChangeLog
>>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>>> >
>>> >         * tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset.
>>> >
>>> > testsuite/ChangeLog:
>>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>>> >
>>> >         * gcc.target/arm/get_address_cost_aligned_max_offset.c: New
>> test.
>>> >
>>> > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>>> > index 3b4a6cd..562122a 100644
>>> > --- a/gcc/tree-ssa-loop-ivopts.c
>>> > +++ b/gcc/tree-ssa-loop-ivopts.c
>>> > @@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool
>>> > var_present,
>>> >           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>> >           if (memory_address_addr_space_p (mem_mode, addr, as))
>>> >             break;
>>> > +         /* For some TARGET, like ARM THUMB1, the offset should be
>> nature
>>> > +            aligned.  Try an aligned offset if address_mode is not
>> QImode.
>>> > */
>>> > +         off = (address_mode == QImode)
>>> > +               ? 0
>>> > +               : ((unsigned HOST_WIDE_INT) 1 << i)
>>> > +                   - GET_MODE_SIZE (address_mode);
>>> > +         if (off > 0)
>>> > +           {
>>> > +             XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>> > +             if (memory_address_addr_space_p (mem_mode, addr, as))
>>> > +               break;
>>> > +           }
>>> Hi, Why not just check "address_mode != QImode"? Set off to 0 then check
>> it
>>> seems unnecessary.
>>
>> Thanks for the comments.
>>
>> ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode) might be a
>> negative value except QImode. A negative value can not be max_offset. So we
>> do not need to check it.
>>
>> For QImode, "((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE
>> (address_mode)" == "((unsigned HOST_WIDE_INT) 1 << i) - 1". It is already
>> checked. So no need to check it again.
>>
>> I think the compiler can optimize the patch like
>>
>> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>> index 3b4a6cd..213598a 100644
>> --- a/gcc/tree-ssa-loop-ivopts.c
>> +++ b/gcc/tree-ssa-loop-ivopts.c
>> @@ -3308,6 +3308,19 @@ get_address_cost (bool symbol_present, bool
>> var_present,
>>           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>           if (memory_address_addr_space_p (mem_mode, addr, as))
>>             break;
>> +         /* For some TARGET, like ARM THUMB1, the offset should be nature
>> +            aligned.  Try an aligned offset if address_mode is not QImode.
>> */
>> +         if (address_mode != QImode)
>> +           {
>> +             off = ((unsigned HOST_WIDE_INT) 1 << i)
>> +                     - GET_MODE_SIZE (address_mode);
>> +             if (off > 0)
>> +               {
>> +                 XEXP (addr, 1) = gen_int_mode (off, address_mode);
>> +                 if (memory_address_addr_space_p (mem_mode, addr, as))
>> +                   break;
>> +               }
>> +           }
>>         }
>>        if (i == -1)
>>          off = 0;
>
> But is off now guaranteed to be the max value? (1 << (i-1) ) - 1 for
> small i is larger than (1 << i) - GET_MODE_SIZE (address_mode).
>
> That is, I think you want to guard this with 1 << (i - 1) >
> GET_MODE_SIZE (address_mode)?

Yes. Without off > 0, it can not guarantee the off is the max value.
With off > 0, it can guarantee that

(1 << i) - GET_MODE_SIZE (address_mode) is greater than (1 << (i-1) ) - 1.

> You don't adjust the negative offset side - why?

-((unsigned HOST_WIDE_INT) 1 << i) is already the min aligned offset.

Thanks!
-Zhenqiang

> Richard.
>
>>
>>> Thanks,
>>> bin
>>> >         }
>>> >        if (i == -1)
>>> >          off = 0;
>>> > diff --git
>>> > a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>>> > b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>>> > new file mode 100644
>>> > index 0000000..cc3e2f7
>>> > --- /dev/null
>>> > +++
>>> b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset
>>> > +++ .c
>>> > @@ -0,0 +1,28 @@
>>> > +/* { dg-do compile } */
>>> > +/* { dg-options "-mthumb -O2" }  */
>>> > +/* { dg-require-effective-target arm_thumb1_ok } */
>>> > +
>>> > +unsigned int
>>> > +test (const short p16[6 * 64])
>>> > +{
>>> > +  unsigned int i = 6;
>>> > +  unsigned int ret = 0;
>>> > +
>>> > +  do
>>> > +    {
>>> > +      unsigned long long *p64 = (unsigned long long*) p16;
>>> > +      unsigned int *p32 = (unsigned int*) p16;
>>> > +      ret += ret;
>>> > +      if (p16[1] || p32[1])
>>> > +       ret++;
>>> > +      else if (p64[1] | p64[2] | p64[3])
>>> > +       ret++;
>>> > +      p16 += 64;
>>> > +      i--;
>>> > +    } while (i != 0);
>>> > +
>>> > +  return ret;
>>> > +}
>>> > +
>>> > +/* { dg-final { scan-assembler-not "#22" } } */
>>> > +/* { dg-final { scan-assembler-not "#14" } } */
>>> >
>>> >
>>> >
>>
>>
>>
>>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH, ivopt] Try aligned offset when get_address_cost
  2014-08-06  6:34       ` Zhenqiang Chen
@ 2014-08-06 12:25         ` Richard Biener
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Biener @ 2014-08-06 12:25 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: Zhenqiang Chen, Bin.Cheng, gcc-patches List

On Wed, Aug 6, 2014 at 8:34 AM, Zhenqiang Chen
<zhenqiang.chen@linaro.org> wrote:
> On 5 August 2014 21:59, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Mon, Aug 4, 2014 at 11:09 AM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
>>>> Sent: Monday, August 04, 2014 4:41 PM
>>>> To: Zhenqiang Chen
>>>> Cc: gcc-patches List
>>>> Subject: Re: [PATCH, ivopt] Try aligned offset when get_address_cost
>>>>
>>>> On Mon, Aug 4, 2014 at 2:28 PM, Zhenqiang Chen
>>>> <zhenqiang.chen@arm.com> wrote:
>>>> > Hi,
>>>> >
>>>> > For some TARGET, like ARM THUMB1, the offset in load/store should be
>>>> > nature aligned. But in function get_address_cost, when computing
>>>> > max_offset, it only tries byte-aligned offsets:
>>>> >
>>>> >   ((unsigned HOST_WIDE_INT) 1 << i) - 1
>>>> >
>>>> > which can not meet thumb_legitimate_offset_p check called from
>>>> > thumb1_legitimate_address_p for HImode and SImode.
>>>> >
>>>> > The patch adds additional try for aligned offset:
>>>> >
>>>> >   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).
>>>> >
>>>> > Bootstrap and no make check regression on X86-64.
>>>> > No make check regression on qemu for Cortex-m0 and Cortex-m3.
>>>> > For Cortex-m0, no performance changes with coremark and dhrystone.
>>>> > Coremark code size is ~0.44 smaller. And eembcv2 code size is ~0.22
>>>> > smaller. CSiBE code size is ~0.05% smaller.
>>>> >
>>>> > OK for trunk?
>>>> >
>>>> > Thanks!
>>>> > -Zhenqiang
>>>> >
>>>> > ChangeLog
>>>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>>>> >
>>>> >         * tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset.
>>>> >
>>>> > testsuite/ChangeLog:
>>>> > 2014-08-04  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>>>> >
>>>> >         * gcc.target/arm/get_address_cost_aligned_max_offset.c: New
>>> test.
>>>> >
>>>> > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>>>> > index 3b4a6cd..562122a 100644
>>>> > --- a/gcc/tree-ssa-loop-ivopts.c
>>>> > +++ b/gcc/tree-ssa-loop-ivopts.c
>>>> > @@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool
>>>> > var_present,
>>>> >           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>>> >           if (memory_address_addr_space_p (mem_mode, addr, as))
>>>> >             break;
>>>> > +         /* For some TARGET, like ARM THUMB1, the offset should be
>>> nature
>>>> > +            aligned.  Try an aligned offset if address_mode is not
>>> QImode.
>>>> > */
>>>> > +         off = (address_mode == QImode)
>>>> > +               ? 0
>>>> > +               : ((unsigned HOST_WIDE_INT) 1 << i)
>>>> > +                   - GET_MODE_SIZE (address_mode);
>>>> > +         if (off > 0)
>>>> > +           {
>>>> > +             XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>>> > +             if (memory_address_addr_space_p (mem_mode, addr, as))
>>>> > +               break;
>>>> > +           }
>>>> Hi, Why not just check "address_mode != QImode"? Set off to 0 then check
>>> it
>>>> seems unnecessary.
>>>
>>> Thanks for the comments.
>>>
>>> ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode) might be a
>>> negative value except QImode. A negative value can not be max_offset. So we
>>> do not need to check it.
>>>
>>> For QImode, "((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE
>>> (address_mode)" == "((unsigned HOST_WIDE_INT) 1 << i) - 1". It is already
>>> checked. So no need to check it again.
>>>
>>> I think the compiler can optimize the patch like
>>>
>>> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
>>> index 3b4a6cd..213598a 100644
>>> --- a/gcc/tree-ssa-loop-ivopts.c
>>> +++ b/gcc/tree-ssa-loop-ivopts.c
>>> @@ -3308,6 +3308,19 @@ get_address_cost (bool symbol_present, bool
>>> var_present,
>>>           XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>>           if (memory_address_addr_space_p (mem_mode, addr, as))
>>>             break;
>>> +         /* For some TARGET, like ARM THUMB1, the offset should be nature
>>> +            aligned.  Try an aligned offset if address_mode is not QImode.
>>> */
>>> +         if (address_mode != QImode)
>>> +           {
>>> +             off = ((unsigned HOST_WIDE_INT) 1 << i)
>>> +                     - GET_MODE_SIZE (address_mode);
>>> +             if (off > 0)
>>> +               {
>>> +                 XEXP (addr, 1) = gen_int_mode (off, address_mode);
>>> +                 if (memory_address_addr_space_p (mem_mode, addr, as))
>>> +                   break;
>>> +               }
>>> +           }
>>>         }
>>>        if (i == -1)
>>>          off = 0;
>>
>> But is off now guaranteed to be the max value? (1 << (i-1) ) - 1 for
>> small i is larger than (1 << i) - GET_MODE_SIZE (address_mode).
>>
>> That is, I think you want to guard this with 1 << (i - 1) >
>> GET_MODE_SIZE (address_mode)?
>
> Yes. Without off > 0, it can not guarantee the off is the max value.
> With off > 0, it can guarantee that
>
> (1 << i) - GET_MODE_SIZE (address_mode) is greater than (1 << (i-1) ) - 1.
>
>> You don't adjust the negative offset side - why?
>
> -((unsigned HOST_WIDE_INT) 1 << i) is already the min aligned offset.

Ok.

The patch is ok then.

Thanks,
Richard.

> Thanks!
> -Zhenqiang
>
>> Richard.
>>
>>>
>>>> Thanks,
>>>> bin
>>>> >         }
>>>> >        if (i == -1)
>>>> >          off = 0;
>>>> > diff --git
>>>> > a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>>>> > b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c
>>>> > new file mode 100644
>>>> > index 0000000..cc3e2f7
>>>> > --- /dev/null
>>>> > +++
>>>> b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset
>>>> > +++ .c
>>>> > @@ -0,0 +1,28 @@
>>>> > +/* { dg-do compile } */
>>>> > +/* { dg-options "-mthumb -O2" }  */
>>>> > +/* { dg-require-effective-target arm_thumb1_ok } */
>>>> > +
>>>> > +unsigned int
>>>> > +test (const short p16[6 * 64])
>>>> > +{
>>>> > +  unsigned int i = 6;
>>>> > +  unsigned int ret = 0;
>>>> > +
>>>> > +  do
>>>> > +    {
>>>> > +      unsigned long long *p64 = (unsigned long long*) p16;
>>>> > +      unsigned int *p32 = (unsigned int*) p16;
>>>> > +      ret += ret;
>>>> > +      if (p16[1] || p32[1])
>>>> > +       ret++;
>>>> > +      else if (p64[1] | p64[2] | p64[3])
>>>> > +       ret++;
>>>> > +      p16 += 64;
>>>> > +      i--;
>>>> > +    } while (i != 0);
>>>> > +
>>>> > +  return ret;
>>>> > +}
>>>> > +
>>>> > +/* { dg-final { scan-assembler-not "#22" } } */
>>>> > +/* { dg-final { scan-assembler-not "#14" } } */
>>>> >
>>>> >
>>>> >
>>>
>>>
>>>
>>>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH, ivopt] Try aligned offset when get_address_cost
  2014-08-04  6:30 [PATCH, ivopt] Try aligned offset when get_address_cost Zhenqiang Chen
  2014-08-04  8:40 ` Bin.Cheng
@ 2015-02-04 10:35 ` Eric Botcazou
  2015-02-04 12:47   ` Richard Biener
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2015-02-04 10:35 UTC (permalink / raw)
  To: Zhenqiang Chen; +Cc: gcc-patches

> For some TARGET, like ARM THUMB1, the offset in load/store should be nature
> aligned. But in function get_address_cost, when computing max_offset, it
> only tries byte-aligned offsets:
> 
>   ((unsigned HOST_WIDE_INT) 1 << i) - 1
> 
> which can not meet thumb_legitimate_offset_p check called from
> thumb1_legitimate_address_p for HImode and SImode.
> 
> The patch adds additional try for aligned offset:
> 
>   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).

I think that the change is not fully correct, this must be mem_mode instead of 
address_mode since the alignment of the MEM is given by mem_mode:

Index: tree-ssa-loop-ivopts.c
===================================================================
--- tree-ssa-loop-ivopts.c      (revision 220343)
+++ tree-ssa-loop-ivopts.c      (working copy)
@@ -3324,12 +3324,12 @@ get_address_cost (bool symbol_present, b
          XEXP (addr, 1) = gen_int_mode (off, address_mode);
          if (memory_address_addr_space_p (mem_mode, addr, as))
            break;
-         /* For some TARGET, like ARM THUMB1, the offset should be nature
-            aligned. Try an aligned offset if address_mode is not QImode.  */
-         off = (address_mode == QImode)
+         /* For some strict-alignment targets, the offset must be naturally
+            aligned.  Try an aligned offset if mem_mode is not QImode.  */
+         off = mem_mode == QImode
                ? 0
                : ((unsigned HOST_WIDE_INT) 1 << i)
-                   - GET_MODE_SIZE (address_mode);
+                   - GET_MODE_SIZE (mem_mode);
          if (off > 0)
            {
              XEXP (addr, 1) = gen_int_mode (off, address_mode);


This fixes unexpected differences in the result of the function between SPARC 
and SPARC64 for the same source code.  OK for mainline after testing?


	* tree-ssa-loop-ivopts.c (get_address_cost): Use proper mode for offset.


-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH, ivopt] Try aligned offset when get_address_cost
  2015-02-04 10:35 ` Eric Botcazou
@ 2015-02-04 12:47   ` Richard Biener
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Biener @ 2015-02-04 12:47 UTC (permalink / raw)
  To: Eric Botcazou, Zhenqiang Chen; +Cc: gcc-patches

On February 4, 2015 11:32:54 AM CET, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> For some TARGET, like ARM THUMB1, the offset in load/store should be
>nature
>> aligned. But in function get_address_cost, when computing max_offset,
>it
>> only tries byte-aligned offsets:
>> 
>>   ((unsigned HOST_WIDE_INT) 1 << i) - 1
>> 
>> which can not meet thumb_legitimate_offset_p check called from
>> thumb1_legitimate_address_p for HImode and SImode.
>> 
>> The patch adds additional try for aligned offset:
>> 
>>   ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode).
>
>I think that the change is not fully correct, this must be mem_mode
>instead of 
>address_mode since the alignment of the MEM is given by mem_mode:
>
>Index: tree-ssa-loop-ivopts.c
>===================================================================
>--- tree-ssa-loop-ivopts.c      (revision 220343)
>+++ tree-ssa-loop-ivopts.c      (working copy)
>@@ -3324,12 +3324,12 @@ get_address_cost (bool symbol_present, b
>          XEXP (addr, 1) = gen_int_mode (off, address_mode);
>          if (memory_address_addr_space_p (mem_mode, addr, as))
>            break;
>-         /* For some TARGET, like ARM THUMB1, the offset should be
>nature
>-            aligned. Try an aligned offset if address_mode is not
>QImode.  */
>-         off = (address_mode == QImode)
>+         /* For some strict-alignment targets, the offset must be
>naturally
>+            aligned.  Try an aligned offset if mem_mode is not QImode.
> */
>+         off = mem_mode == QImode
>                ? 0
>                : ((unsigned HOST_WIDE_INT) 1 << i)
>-                   - GET_MODE_SIZE (address_mode);
>+                   - GET_MODE_SIZE (mem_mode);
>          if (off > 0)
>            {
>              XEXP (addr, 1) = gen_int_mode (off, address_mode);
>
>
>This fixes unexpected differences in the result of the function between
>SPARC 
>and SPARC64 for the same source code.  OK for mainline after testing?

OK.

Thanks,
Richard.

>
>	* tree-ssa-loop-ivopts.c (get_address_cost): Use proper mode for
>offset.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-02-04 12:47 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-04  6:30 [PATCH, ivopt] Try aligned offset when get_address_cost Zhenqiang Chen
2014-08-04  8:40 ` Bin.Cheng
2014-08-04  9:10   ` Zhenqiang Chen
2014-08-05 14:00     ` Richard Biener
2014-08-06  6:34       ` Zhenqiang Chen
2014-08-06 12:25         ` Richard Biener
2015-02-04 10:35 ` Eric Botcazou
2015-02-04 12:47   ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).