public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly
@ 2014-04-29  5:08 bin.cheng
  2014-05-01 14:03 ` Richard Earnshaw
  0 siblings, 1 reply; 7+ messages in thread
From: bin.cheng @ 2014-04-29  5:08 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 605 bytes --]

Hi,
Function output_move_neon now generates vld1.64 for memory ref like "dx <-
[r1:SI]", this is bogus because it requires at least 64-bit alignment for
32-bit aligned memory ref.  It works now because GCC doesn't generate such
insns in the first place, but things are going to change if memset/memcpy
calls are inlined by using neon instructions.

This patch fixes the issue by generating ldr for such instructions.

Bootstrapped on cortex-a15 with neon.
Is it OK?

Thanks,
bin


2014-04-29  Bin Cheng  <bin.cheng@arm.com>

	* config/arm/arm.c (output_move_neon): Handle REG explicitly.

[-- Attachment #2: output_move_neon-20140429.txt --]
[-- Type: text/plain, Size: 1085 bytes --]

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 209852)
+++ gcc/config/arm/arm.c	(working copy)
@@ -18427,6 +18453,20 @@ output_move_neon (rtx *operands)
       /* FIXME: Not currently enabled in neon_vector_mem_operand.  */
       gcc_unreachable ();
 
+    case REG:
+      /* We have to use vldm / vstm for too-large modes.  */
+      if (nregs > 1)
+	{
+	  if (nregs > 4)
+	    templ = "v%smia%%?\t%%m0, %%h1";
+	  else
+	    templ = "v%s1.64\t%%h1, %%A0";
+
+	  ops[0] = mem;
+	  ops[1] = reg;
+	  break;
+	}
+      /* Fall through.  */
     case LABEL_REF:
     case PLUS:
       {
@@ -18460,14 +18500,7 @@ output_move_neon (rtx *operands)
       }
 
     default:
-      /* We have to use vldm / vstm for too-large modes.  */
-      if (nregs > 4)
-	templ = "v%smia%%?\t%%m0, %%h1";
-      else
-	templ = "v%s1.64\t%%h1, %%A0";
-
-      ops[0] = mem;
-      ops[1] = reg;
+      gcc_unreachable ();
     }
 
   sprintf (buff, templ, load ? "ld" : "st");

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly
  2014-04-29  5:08 [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly bin.cheng
@ 2014-05-01 14:03 ` Richard Earnshaw
  2014-05-05  7:21   ` bin.cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Earnshaw @ 2014-05-01 14:03 UTC (permalink / raw)
  To: bin.cheng; +Cc: gcc-patches

On 29/04/14 04:02, bin.cheng wrote:
> Hi,
> Function output_move_neon now generates vld1.64 for memory ref like "dx <-
> [r1:SI]", this is bogus because it requires at least 64-bit alignment for
> 32-bit aligned memory ref.  It works now because GCC doesn't generate such
> insns in the first place, but things are going to change if memset/memcpy
> calls are inlined by using neon instructions.
> 

V[LD/ST]1.64 only need to be 64-bit aligned if strict alignment is
enabled.  We normally assume that not to be the case.  The exception to
this is when an explicit alignment check is used in the address
expression (the :64 suffix), which causes the address to be checked for
strict alignment at all times.

Do you have a testcase?

R.

> This patch fixes the issue by generating ldr for such instructions.
> 
> Bootstrapped on cortex-a15 with neon.
> Is it OK?
> 
> Thanks,
> bin
> 
> 
> 2014-04-29  Bin Cheng  <bin.cheng@arm.com>
> 
> 	* config/arm/arm.c (output_move_neon): Handle REG explicitly.
> 
> 
> output_move_neon-20140429.txt
> 
> 
> Index: gcc/config/arm/arm.c
> ===================================================================
> --- gcc/config/arm/arm.c	(revision 209852)
> +++ gcc/config/arm/arm.c	(working copy)
> @@ -18427,6 +18453,20 @@ output_move_neon (rtx *operands)
>        /* FIXME: Not currently enabled in neon_vector_mem_operand.  */
>        gcc_unreachable ();
>  
> +    case REG:
> +      /* We have to use vldm / vstm for too-large modes.  */
> +      if (nregs > 1)
> +	{
> +	  if (nregs > 4)
> +	    templ = "v%smia%%?\t%%m0, %%h1";
> +	  else
> +	    templ = "v%s1.64\t%%h1, %%A0";
> +
> +	  ops[0] = mem;
> +	  ops[1] = reg;
> +	  break;
> +	}
> +      /* Fall through.  */
>      case LABEL_REF:
>      case PLUS:
>        {
> @@ -18460,14 +18500,7 @@ output_move_neon (rtx *operands)
>        }
>  
>      default:
> -      /* We have to use vldm / vstm for too-large modes.  */
> -      if (nregs > 4)
> -	templ = "v%smia%%?\t%%m0, %%h1";
> -      else
> -	templ = "v%s1.64\t%%h1, %%A0";
> -
> -      ops[0] = mem;
> -      ops[1] = reg;
> +      gcc_unreachable ();
>      }
>  
>    sprintf (buff, templ, load ? "ld" : "st");
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly
  2014-05-01 14:03 ` Richard Earnshaw
@ 2014-05-05  7:21   ` bin.cheng
  2014-07-02  6:48     ` Ramana Radhakrishnan
  0 siblings, 1 reply; 7+ messages in thread
From: bin.cheng @ 2014-05-05  7:21 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches



> -----Original Message-----
> From: Richard Earnshaw
> Sent: Thursday, May 01, 2014 10:03 PM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH ARM]Handle REG addressing mode in
> output_move_neon explicitly
> 
> On 29/04/14 04:02, bin.cheng wrote:
> > Hi,
> > Function output_move_neon now generates vld1.64 for memory ref like
> > "dx <- [r1:SI]", this is bogus because it requires at least 64-bit
> > alignment for 32-bit aligned memory ref.  It works now because GCC
> > doesn't generate such insns in the first place, but things are going
> > to change if memset/memcpy calls are inlined by using neon instructions.
> >
> 
> V[LD/ST]1.64 only need to be 64-bit aligned if strict alignment is
enabled.  We
> normally assume that not to be the case.  The exception to this is when an
theoretically, this doesn't make the problem go away, right?

> explicit alignment check is used in the address expression (the :64
suffix),
> which causes the address to be checked for strict alignment at all times.
> 
> Do you have a testcase?
I can't provide a test case without the memset inlining patch.

Thanks,
bin




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly
  2014-05-05  7:21   ` bin.cheng
@ 2014-07-02  6:48     ` Ramana Radhakrishnan
  2014-07-02 12:46       ` Bin.Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Ramana Radhakrishnan @ 2014-07-02  6:48 UTC (permalink / raw)
  To: bin.cheng; +Cc: Richard Earnshaw, gcc-patches

On Mon, May 5, 2014 at 8:21 AM, bin.cheng <bin.cheng@arm.com> wrote:
>
>
>> -----Original Message-----
>> From: Richard Earnshaw
>> Sent: Thursday, May 01, 2014 10:03 PM
>> To: Bin Cheng
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH ARM]Handle REG addressing mode in
>> output_move_neon explicitly
>>
>> On 29/04/14 04:02, bin.cheng wrote:
>> > Hi,
>> > Function output_move_neon now generates vld1.64 for memory ref like
>> > "dx <- [r1:SI]", this is bogus because it requires at least 64-bit
>> > alignment for 32-bit aligned memory ref.  It works now because GCC
>> > doesn't generate such insns in the first place, but things are going
>> > to change if memset/memcpy calls are inlined by using neon instructions.
>> >
>>
>> V[LD/ST]1.64 only need to be 64-bit aligned if strict alignment is
> enabled.  We
>> normally assume that not to be the case.  The exception to this is when an
> theoretically, this doesn't make the problem go away, right?
>
>> explicit alignment check is used in the address expression (the :64
> suffix),
>> which causes the address to be checked for strict alignment at all times.
>>
>> Do you have a testcase?
> I can't provide a test case without the memset inlining patch.
>
Are the tests in the memset inlining patch now sufficient to expose
the problem or do we need another test ?

Ramana
> Thanks,
> bin
>
>
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly
  2014-07-02  6:48     ` Ramana Radhakrishnan
@ 2014-07-02 12:46       ` Bin.Cheng
  2014-07-04 12:20         ` Bin Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Bin.Cheng @ 2014-07-02 12:46 UTC (permalink / raw)
  To: ramrad01; +Cc: bin.cheng, Richard Earnshaw, gcc-patches

On Wed, Jul 2, 2014 at 7:48 AM, Ramana Radhakrishnan
<ramana.gcc@googlemail.com> wrote:
> On Mon, May 5, 2014 at 8:21 AM, bin.cheng <bin.cheng@arm.com> wrote:
>>
>>
>>> -----Original Message-----
>>> From: Richard Earnshaw
>>> Sent: Thursday, May 01, 2014 10:03 PM
>>> To: Bin Cheng
>>> Cc: gcc-patches@gcc.gnu.org
>>> Subject: Re: [PATCH ARM]Handle REG addressing mode in
>>> output_move_neon explicitly
>>>
>>> On 29/04/14 04:02, bin.cheng wrote:
>>> > Hi,
>>> > Function output_move_neon now generates vld1.64 for memory ref like
>>> > "dx <- [r1:SI]", this is bogus because it requires at least 64-bit
>>> > alignment for 32-bit aligned memory ref.  It works now because GCC
>>> > doesn't generate such insns in the first place, but things are going
>>> > to change if memset/memcpy calls are inlined by using neon instructions.
>>> >
>>>
>>> V[LD/ST]1.64 only need to be 64-bit aligned if strict alignment is
>> enabled.  We
>>> normally assume that not to be the case.  The exception to this is when an
>> theoretically, this doesn't make the problem go away, right?
>>
>>> explicit alignment check is used in the address expression (the :64
>> suffix),
>>> which causes the address to be checked for strict alignment at all times.
>>>
>>> Do you have a testcase?
>> I can't provide a test case without the memset inlining patch.
>>
> Are the tests in the memset inlining patch now sufficient to expose
> the problem or do we need another test ?

Yes, it's covered by the 4th/7th test cases in memset inlining patch.

Thanks,
bin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly
  2014-07-02 12:46       ` Bin.Cheng
@ 2014-07-04 12:20         ` Bin Cheng
  2014-07-04 13:16           ` Ramana Radhakrishnan
  0 siblings, 1 reply; 7+ messages in thread
From: Bin Cheng @ 2014-07-04 12:20 UTC (permalink / raw)
  To: 'Bin.Cheng', Ramana Radhakrishnan; +Cc: Richard Earnshaw, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2061 bytes --]



> -----Original Message-----
> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
> Sent: Wednesday, July 02, 2014 1:46 PM
> To: Ramana Radhakrishnan
> Cc: Bin Cheng; Richard Earnshaw; gcc-patches
> Subject: Re: [PATCH ARM]Handle REG addressing mode in
> output_move_neon explicitly
> 
> On Wed, Jul 2, 2014 at 7:48 AM, Ramana Radhakrishnan
> <ramana.gcc@googlemail.com> wrote:
> > On Mon, May 5, 2014 at 8:21 AM, bin.cheng <bin.cheng@arm.com> wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: Richard Earnshaw
> >>> Sent: Thursday, May 01, 2014 10:03 PM
> >>> To: Bin Cheng
> >>> Cc: gcc-patches@gcc.gnu.org
> >>> Subject: Re: [PATCH ARM]Handle REG addressing mode in
> >>> output_move_neon explicitly
> >>>
> >>> On 29/04/14 04:02, bin.cheng wrote:
> >>> > Hi,
> >>> > Function output_move_neon now generates vld1.64 for memory ref
> >>> > like "dx <- [r1:SI]", this is bogus because it requires at least
> >>> > 64-bit alignment for 32-bit aligned memory ref.  It works now
> >>> > because GCC doesn't generate such insns in the first place, but
> >>> > things are going to change if memset/memcpy calls are inlined by
using
> neon instructions.
> >>> >
> >>>
> >>> V[LD/ST]1.64 only need to be 64-bit aligned if strict alignment is
> >> enabled.  We
> >>> normally assume that not to be the case.  The exception to this is
> >>> when an
> >> theoretically, this doesn't make the problem go away, right?
> >>
> >>> explicit alignment check is used in the address expression (the :64
> >> suffix),
> >>> which causes the address to be checked for strict alignment at all
times.
> >>>
> >>> Do you have a testcase?
> >> I can't provide a test case without the memset inlining patch.
> >>
> > Are the tests in the memset inlining patch now sufficient to expose
> > the problem or do we need another test ?
> 
> Yes, it's covered by the 4th/7th test cases in memset inlining patch.
> 
This is the rebased patch, though the original one doesn't conflict with
latest trunk.  Is it OK?

Thanks,
bin

[-- Attachment #2: output_move_neon-20140704.txt --]
[-- Type: text/plain, Size: 1085 bytes --]

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 212295)
+++ gcc/config/arm/arm.c	(working copy)
@@ -18454,6 +18498,20 @@ output_move_neon (rtx *operands)
       /* FIXME: Not currently enabled in neon_vector_mem_operand.  */
       gcc_unreachable ();
 
+    case REG:
+      /* We have to use vldm / vstm for too-large modes.  */
+      if (nregs > 1)
+	{
+	  if (nregs > 4)
+	    templ = "v%smia%%?\t%%m0, %%h1";
+	  else
+	    templ = "v%s1.64\t%%h1, %%A0";
+
+	  ops[0] = mem;
+	  ops[1] = reg;
+	  break;
+	}
+      /* Fall through.  */
     case LABEL_REF:
     case PLUS:
       {
@@ -18487,14 +18545,7 @@ output_move_neon (rtx *operands)
       }
 
     default:
-      /* We have to use vldm / vstm for too-large modes.  */
-      if (nregs > 4)
-	templ = "v%smia%%?\t%%m0, %%h1";
-      else
-	templ = "v%s1.64\t%%h1, %%A0";
-
-      ops[0] = mem;
-      ops[1] = reg;
+      gcc_unreachable ();
     }
 
   sprintf (buff, templ, load ? "ld" : "st");

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly
  2014-07-04 12:20         ` Bin Cheng
@ 2014-07-04 13:16           ` Ramana Radhakrishnan
  0 siblings, 0 replies; 7+ messages in thread
From: Ramana Radhakrishnan @ 2014-07-04 13:16 UTC (permalink / raw)
  To: Bin Cheng; +Cc: Bin.Cheng, Ramana Radhakrishnan, Richard Earnshaw, gcc-patches

>>
> This is the rebased patch, though the original one doesn't conflict with
> latest trunk.  Is it OK?

Ok.

Ramana
>
> Thanks,
> bin

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-07-04 13:16 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-29  5:08 [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly bin.cheng
2014-05-01 14:03 ` Richard Earnshaw
2014-05-05  7:21   ` bin.cheng
2014-07-02  6:48     ` Ramana Radhakrishnan
2014-07-02 12:46       ` Bin.Cheng
2014-07-04 12:20         ` Bin Cheng
2014-07-04 13:16           ` Ramana Radhakrishnan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).