public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][ARM] Enable arm_legitimize_address for Thumb-2
@ 2019-09-09 17:03 Wilco Dijkstra
  2019-10-10 17:42 ` Wilco Dijkstra
  2019-10-10 22:47 ` Ramana Radhakrishnan
  0 siblings, 2 replies; 5+ messages in thread
From: Wilco Dijkstra @ 2019-09-09 17:03 UTC (permalink / raw)
  To: GCC Patches, Kyrylo Tkachov, Richard Earnshaw; +Cc: nd

Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting in
inefficient code.  Since Thumb-2 supports similar address offsets use the Arm
legitimization code for Thumb-2 to get significant codesize and performance
gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 0.2%.

Bootstrap OK, OK for commit?

ChangeLog:
2019-09-09  Wilco Dijkstra  <wdijkstr@arm.com>

	* config/arm/arm.c (arm_legitimize_address): Remove Thumb-2 bailout.

--

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a5a6a0fab1b4b7ef07931522e7d47e59842d7f27..2601708e7e0716e4668b79e015e366d2164562fd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8652,13 +8652,8 @@ arm_legitimize_address (rtx x, rtx orig_x, machine_mode mode)
 	return x;
     }
 
-  if (!TARGET_ARM)
-    {
-      /* TODO: legitimize_address for Thumb2.  */
-      if (TARGET_THUMB2)
-        return x;
-      return thumb_legitimize_address (x, orig_x, mode);
-    }
+  if (TARGET_THUMB1)
+    return thumb_legitimize_address (x, orig_x, mode);
 
   if (GET_CODE (x) == PLUS)
     {

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2
  2019-09-09 17:03 [PATCH][ARM] Enable arm_legitimize_address for Thumb-2 Wilco Dijkstra
@ 2019-10-10 17:42 ` Wilco Dijkstra
  2019-10-10 22:47 ` Ramana Radhakrishnan
  1 sibling, 0 replies; 5+ messages in thread
From: Wilco Dijkstra @ 2019-10-10 17:42 UTC (permalink / raw)
  To: GCC Patches, Kyrylo Tkachov, Richard Earnshaw, Richard Sandiford; +Cc: nd

ping

Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting in
inefficient code.  Since Thumb-2 supports similar address offsets use the Arm
legitimization code for Thumb-2 to get significant codesize and performance
gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 0.2%.

Bootstrap OK, OK for commit?

ChangeLog:
2019-09-09  Wilco Dijkstra  <wdijkstr@arm.com>

        * config/arm/arm.c (arm_legitimize_address): Remove Thumb-2 bailout.

--

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a5a6a0fab1b4b7ef07931522e7d47e59842d7f27..2601708e7e0716e4668b79e015e366d2164562fd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8652,13 +8652,8 @@ arm_legitimize_address (rtx x, rtx orig_x, machine_mode mode)
         return x;
     }
 
-  if (!TARGET_ARM)
-    {
-      /* TODO: legitimize_address for Thumb2.  */
-      if (TARGET_THUMB2)
-        return x;
-      return thumb_legitimize_address (x, orig_x, mode);
-    }
+  if (TARGET_THUMB1)
+    return thumb_legitimize_address (x, orig_x, mode);
 
   if (GET_CODE (x) == PLUS)
     {

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2
  2019-09-09 17:03 [PATCH][ARM] Enable arm_legitimize_address for Thumb-2 Wilco Dijkstra
  2019-10-10 17:42 ` Wilco Dijkstra
@ 2019-10-10 22:47 ` Ramana Radhakrishnan
  2019-10-11 16:25   ` Wilco Dijkstra
  1 sibling, 1 reply; 5+ messages in thread
From: Ramana Radhakrishnan @ 2019-10-10 22:47 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: GCC Patches, Kyrylo Tkachov, Richard Earnshaw, nd

On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting in
> inefficient code.  Since Thumb-2 supports similar address offsets use the Arm
> legitimization code for Thumb-2 to get significant codesize and performance
> gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 0.2%.
>
What were the sort of code size gains  ? It did end up piquing my
curiosity as to how we missed something so basic.  For instance ldr
r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if
there was an illegitimate address given here, would we end up
producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out.
Scratching my head a bit , it's late at night.

Orthogonally it looks like you can clean up the MINUS handling here
and in legitimate_address_p , I'm not sure what the status of LRA with
MINUS is either and thus we should now look to clean this up or look
to turn this on and see what happens. However that's a subject of a
future patch.

> Bootstrap OK, OK for commit?
>

For the record, bootstrap with Thumb2  presumably and the testruns were clean ?

regards
Ramana



Ramana


> ChangeLog:
> 2019-09-09  Wilco Dijkstra  <wdijkstr@arm.com>
>
>         * config/arm/arm.c (arm_legitimize_address): Remove Thumb-2 bailout.
>
> --
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index a5a6a0fab1b4b7ef07931522e7d47e59842d7f27..2601708e7e0716e4668b79e015e366d2164562fd 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -8652,13 +8652,8 @@ arm_legitimize_address (rtx x, rtx orig_x, machine_mode mode)
>         return x;
>      }
>
> -  if (!TARGET_ARM)
> -    {
> -      /* TODO: legitimize_address for Thumb2.  */
> -      if (TARGET_THUMB2)
> -        return x;
> -      return thumb_legitimize_address (x, orig_x, mode);
> -    }
> +  if (TARGET_THUMB1)
> +    return thumb_legitimize_address (x, orig_x, mode);
>
>    if (GET_CODE (x) == PLUS)
>      {

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2
  2019-10-10 22:47 ` Ramana Radhakrishnan
@ 2019-10-11 16:25   ` Wilco Dijkstra
  2019-10-11 18:58     ` Ramana Radhakrishnan
  0 siblings, 1 reply; 5+ messages in thread
From: Wilco Dijkstra @ 2019-10-11 16:25 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: GCC Patches, Kyrylo Tkachov, Richard Earnshaw, nd

Hi Ramana,

>On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>>
>> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting in
>> inefficient code.  Since Thumb-2 supports similar address offsets use the Arm
>> legitimization code for Thumb-2 to get significant codesize and performance
>> gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 0.2%.
>
> What were the sort of code size gains  ? It did end up piquing my
> curiosity as to how we missed something so basic.  For instance ldr

h264ref is 1% smaller, various other benchmarks a few hundred bytes to 1KB
smaller (~0.1%).

> r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if
>  there was an illegitimate address given here, would we end up
> producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out.
> Scratching my head a bit , it's late at night.

If the proposed offsets are not legal, GCC tries something different that is
legal, hence there is no requirement to only propose legal splits. In any
case we don't optimize the negative range even on Arm - the offsets are
always split into 4KB ranges (not 8KB as would be possible).

The negative offset splitting doesn't appear to have changed on Thumb-2
so there is apparently a different mechanism that deals with negative offsets.
So only positive offsets are improved with my patch:

int f(int *p) { return p[1025] + p[1026]; }

before:
	movw	r3, #4104
	movw	r2, #4100
	ldr	r2, [r0, r2]
	ldr	r0, [r0, r3]
	add	r0, r0, r2
	bx	lr

after:
	add	r3, r0, #4096
	ldrd	r0, r3, [r3, #4]
	add	r0, r0, r3
	bx	lr

> Orthogonally it looks like you can clean up the MINUS handling here
> and in legitimate_address_p , I'm not sure what the status of LRA with
> MINUS is either and thus we should now look to clean this up or look
> to turn this on and see what happens. However that's a subject of a
> future patch.

Well there are lots of cases that aren't handled correctly or optimally
yet - I'd copy the way I wrote aarch64_legitimize_address_displacement,
but that's for a future patch indeed.

> For the record, bootstrap with Thumb2  presumably and the testruns were clean ?

Yes, at the time I ran them.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2
  2019-10-11 16:25   ` Wilco Dijkstra
@ 2019-10-11 18:58     ` Ramana Radhakrishnan
  0 siblings, 0 replies; 5+ messages in thread
From: Ramana Radhakrishnan @ 2019-10-11 18:58 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: GCC Patches, Kyrylo Tkachov, Richard Earnshaw, nd

On Fri, Oct 11, 2019 at 5:17 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
> Hi Ramana,
>
> >On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> >>
> >> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting in
> >> inefficient code.  Since Thumb-2 supports similar address offsets use the Arm
> >> legitimization code for Thumb-2 to get significant codesize and performance
> >> gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 0.2%.
> >
> > What were the sort of code size gains  ? It did end up piquing my
> > curiosity as to how we missed something so basic.  For instance ldr
>
> h264ref is 1% smaller, various other benchmarks a few hundred bytes to 1KB
> smaller (~0.1%).
>
> > r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if
> >  there was an illegitimate address given here, would we end up
> > producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out.
> > Scratching my head a bit , it's late at night.
>
> If the proposed offsets are not legal, GCC tries something different that is
> legal, hence there is no requirement to only propose legal splits. In any
> case we don't optimize the negative range even on Arm - the offsets are
> always split into 4KB ranges (not 8KB as would be possible).
>
> The negative offset splitting doesn't appear to have changed on Thumb-2
> so there is apparently a different mechanism that deals with negative offsets.
> So only positive offsets are improved with my patch:
>
> int f(int *p) { return p[1025] + p[1026]; }
>
> before:
>         movw    r3, #4104
>         movw    r2, #4100
>         ldr     r2, [r0, r2]
>         ldr     r0, [r0, r3]
>         add     r0, r0, r2
>         bx      lr
>
> after:
>         add     r3, r0, #4096
>         ldrd    r0, r3, [r3, #4]
>         add     r0, r0, r3
>         bx      lr
>
> > Orthogonally it looks like you can clean up the MINUS handling here
> > and in legitimate_address_p , I'm not sure what the status of LRA with
> > MINUS is either and thus we should now look to clean this up or look
> > to turn this on and see what happens. However that's a subject of a
> > future patch.
>
> Well there are lots of cases that aren't handled correctly or optimally
> yet - I'd copy the way I wrote aarch64_legitimize_address_displacement,
> but that's for a future patch indeed.
>
> > For the record, bootstrap with Thumb2  presumably and the testruns were clean ?
>
> Yes, at the time I ran them.

Yeah it dropped about 2660 bytes out of a Thumb2 bootstrap - so it
seems worth while.

Ok please apply but as usual watch out for any fallout from this.

regards
Ramana
>
> Cheers,
> Wilco

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-10-11 18:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-09 17:03 [PATCH][ARM] Enable arm_legitimize_address for Thumb-2 Wilco Dijkstra
2019-10-10 17:42 ` Wilco Dijkstra
2019-10-10 22:47 ` Ramana Radhakrishnan
2019-10-11 16:25   ` Wilco Dijkstra
2019-10-11 18:58     ` Ramana Radhakrishnan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).