public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
@ 2015-01-29 15:51 Uros Bizjak
  2015-01-29 16:06 ` Jakub Jelinek
  2015-01-30  9:21 ` Jeff Law
  0 siblings, 2 replies; 16+ messages in thread
From: Uros Bizjak @ 2015-01-29 15:51 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeff Law

Hello!

> So here's the updated patch which handles all 4 testcases from the PR as well as a couple of my own.

@@ -0,0 +1,33 @@
+/* PR 15184 first two tests, plus two addition ones.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -m32 -march=pentiumpro" } */

No, we don't want -m32 in dg-options. Please write this part as:

/* { dg-do compile { target ia32 } } */
/* { dg-options "-O2 -march=pentiumpro" } */

Uros.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-29 15:51 [PATCH][PR target/15184] Fix for direct byte access on x86 Uros Bizjak
@ 2015-01-29 16:06 ` Jakub Jelinek
  2015-01-29 16:08   ` Uros Bizjak
  2015-01-30  9:21 ` Jeff Law
  1 sibling, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2015-01-29 16:06 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Jeff Law

On Thu, Jan 29, 2015 at 03:44:25PM +0100, Uros Bizjak wrote:
> Hello!
> 
> > So here's the updated patch which handles all 4 testcases from the PR as well as a couple of my own.
> 
> @@ -0,0 +1,33 @@
> +/* PR 15184 first two tests, plus two addition ones.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -m32 -march=pentiumpro" } */
> 
> No, we don't want -m32 in dg-options. Please write this part as:
> 
> /* { dg-do compile { target ia32 } } */
> /* { dg-options "-O2 -march=pentiumpro" } */

Or
/* { dg-do compile } */
/* { dg-options "-O2" } */
/* { dg-additional-options "-march=pentiumpro" { target ia32 } } */
if there is nothing ia32 specific on the testcase, just that you want
to use that -march option if you are compiling it as 32-bit.

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-29 16:06 ` Jakub Jelinek
@ 2015-01-29 16:08   ` Uros Bizjak
  2015-01-29 16:23     ` Jakub Jelinek
  0 siblings, 1 reply; 16+ messages in thread
From: Uros Bizjak @ 2015-01-29 16:08 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Jeff Law

On Thu, Jan 29, 2015 at 3:47 PM, Jakub Jelinek <jakub@redhat.com> wrote:

>> > So here's the updated patch which handles all 4 testcases from the PR as well as a couple of my own.
>>
>> @@ -0,0 +1,33 @@
>> +/* PR 15184 first two tests, plus two addition ones.  */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -m32 -march=pentiumpro" } */
>>
>> No, we don't want -m32 in dg-options. Please write this part as:
>>
>> /* { dg-do compile { target ia32 } } */
>> /* { dg-options "-O2 -march=pentiumpro" } */
>
> Or
> /* { dg-do compile } */
> /* { dg-options "-O2" } */
> /* { dg-additional-options "-march=pentiumpro" { target ia32 } } */
> if there is nothing ia32 specific on the testcase, just that you want
> to use that -march option if you are compiling it as 32-bit.

In this case, you will need to add -mregparm=3 to
dg-additional-options to avoid "attribute ... ignored" awarning. There
are some examples in dg.target/i386/ testsuite directory.

Uros.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-29 16:08   ` Uros Bizjak
@ 2015-01-29 16:23     ` Jakub Jelinek
  2015-01-29 21:42       ` H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2015-01-29 16:23 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Jeff Law

On Thu, Jan 29, 2015 at 03:54:34PM +0100, Uros Bizjak wrote:
> On Thu, Jan 29, 2015 at 3:47 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> >> > So here's the updated patch which handles all 4 testcases from the PR as well as a couple of my own.
> >>
> >> @@ -0,0 +1,33 @@
> >> +/* PR 15184 first two tests, plus two addition ones.  */
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -m32 -march=pentiumpro" } */
> >>
> >> No, we don't want -m32 in dg-options. Please write this part as:
> >>
> >> /* { dg-do compile { target ia32 } } */
> >> /* { dg-options "-O2 -march=pentiumpro" } */
> >
> > Or
> > /* { dg-do compile } */
> > /* { dg-options "-O2" } */
> > /* { dg-additional-options "-march=pentiumpro" { target ia32 } } */
> > if there is nothing ia32 specific on the testcase, just that you want
> > to use that -march option if you are compiling it as 32-bit.
> 
> In this case, you will need to add -mregparm=3 to
> dg-additional-options to avoid "attribute ... ignored" awarning. There
> are some examples in dg.target/i386/ testsuite directory.

Ah, then better limit that to just ia32 in dg-do.

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-29 16:23     ` Jakub Jelinek
@ 2015-01-29 21:42       ` H.J. Lu
  0 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2015-01-29 21:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Uros Bizjak, gcc-patches, Jeff Law

On Thu, Jan 29, 2015 at 6:56 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Jan 29, 2015 at 03:54:34PM +0100, Uros Bizjak wrote:
>> On Thu, Jan 29, 2015 at 3:47 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>
>> >> > So here's the updated patch which handles all 4 testcases from the PR as well as a couple of my own.
>> >>
>> >> @@ -0,0 +1,33 @@
>> >> +/* PR 15184 first two tests, plus two addition ones.  */
>> >> +/* { dg-do compile } */
>> >> +/* { dg-options "-O2 -m32 -march=pentiumpro" } */
>> >>
>> >> No, we don't want -m32 in dg-options. Please write this part as:
>> >>
>> >> /* { dg-do compile { target ia32 } } */
>> >> /* { dg-options "-O2 -march=pentiumpro" } */
>> >
>> > Or
>> > /* { dg-do compile } */
>> > /* { dg-options "-O2" } */
>> > /* { dg-additional-options "-march=pentiumpro" { target ia32 } } */
>> > if there is nothing ia32 specific on the testcase, just that you want
>> > to use that -march option if you are compiling it as 32-bit.
>>
>> In this case, you will need to add -mregparm=3 to
>> dg-additional-options to avoid "attribute ... ignored" awarning. There
>> are some examples in dg.target/i386/ testsuite directory.
>
> Ah, then better limit that to just ia32 in dg-do.
>

Yea, please.  I have seen:

New failures:
FAIL: gcc.target/i386/pr15184-1.c (test for excess errors)
FAIL: gcc.target/i386/pr15184-2.c (test for excess errors)

for -mx32.

-- 
H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-29 15:51 [PATCH][PR target/15184] Fix for direct byte access on x86 Uros Bizjak
  2015-01-29 16:06 ` Jakub Jelinek
@ 2015-01-30  9:21 ` Jeff Law
  2015-01-30 11:02   ` Uros Bizjak
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff Law @ 2015-01-30  9:21 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches

On 01/29/15 07:44, Uros Bizjak wrote:
> Hello!
>
>> So here's the updated patch which handles all 4 testcases from the PR as well as a couple of my own.
>
> @@ -0,0 +1,33 @@
> +/* PR 15184 first two tests, plus two addition ones.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -m32 -march=pentiumpro" } */
>
> No, we don't want -m32 in dg-options. Please write this part as:
>
> /* { dg-do compile { target ia32 } } */
> /* { dg-options "-O2 -march=pentiumpro" } */
Will update after sniff testing.  Sorry for the noise.

jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-30  9:21 ` Jeff Law
@ 2015-01-30 11:02   ` Uros Bizjak
  2015-01-30 11:21     ` Jakub Jelinek
  2015-01-30 20:37     ` Jeff Law
  0 siblings, 2 replies; 16+ messages in thread
From: Uros Bizjak @ 2015-01-30 11:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeff Law, Jakub Jelinek, Dominique Dhumieres

[-- Attachment #1: Type: text/plain, Size: 911 bytes --]

On Fri, Jan 30, 2015 at 7:12 AM, Jeff Law <law@redhat.com> wrote:

>> Hello!
>>
>>> So here's the updated patch which handles all 4 testcases from the PR as
>>> well as a couple of my own.
>>
>>
>> @@ -0,0 +1,33 @@
>> +/* PR 15184 first two tests, plus two addition ones.  */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -m32 -march=pentiumpro" } */
>>
>> No, we don't want -m32 in dg-options. Please write this part as:
>>
>> /* { dg-do compile { target ia32 } } */
>> /* { dg-options "-O2 -march=pentiumpro" } */
>
> Will update after sniff testing.  Sorry for the noise.

I'll commit the attached patch that fixes this and -fpic problem as
soon as regtest finishes.

2015-01-30  Uros Bizjak  <ubizjak@gmail.com>

    * gcc.target/i386/pr15184-1.c: Compile for ia32 target only.
    (dg-options): Remove -m32.
    (dg-final): Scan for "movb %al" only.
    * gcc.target/i386/pr15184-2.c: Ditto.

Uros.

[-- Attachment #2: t.diff.txt --]
[-- Type: text/plain, Size: 1852 bytes --]

Index: gcc.target/i386/pr15184-1.c
===================================================================
--- gcc.target/i386/pr15184-1.c	(revision 220273)
+++ gcc.target/i386/pr15184-1.c	(working copy)
@@ -1,11 +1,10 @@
 /* PR 15184 first two tests, plus two addition ones.  */
-/* { dg-do compile } */
-/* { dg-options "-O2 -m32 -march=pentiumpro" } */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=pentiumpro" } */
 
-#define regparm __attribute__((__regparm__(3)))
+#define regparm __attribute__((__regparm__(1)))
 
 extern unsigned int x;
-extern unsigned short y;
 
 void regparm f0(unsigned char c)
 {
@@ -29,5 +28,5 @@
 
 /* Each function should compile down to a byte move from
    the input register into x, possibly at an offset within x.  */
-/* { dg-final { scan-assembler-times "\tmovb\t%al, x" 4 } } */
+/* { dg-final { scan-assembler-times "movb\[ \\t\]+%al" 4 } } */
 
Index: gcc.target/i386/pr15184-2.c
===================================================================
--- gcc.target/i386/pr15184-2.c	(revision 220273)
+++ gcc.target/i386/pr15184-2.c	(working copy)
@@ -1,10 +1,9 @@
 /* PR 15184 second two tests
-/* { dg-do compile } */
-/* { dg-options "-O2 -m32 -march=pentiumpro" } */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=pentiumpro" } */
 
-#define regparm __attribute__((__regparm__(3)))
+#define regparm __attribute__((__regparm__(1)))
 
-extern unsigned int x;
 extern unsigned short y;
 
 void regparm g0(unsigned char c)
@@ -18,6 +17,6 @@
 }
 
 /* Each function should compile down to a byte move from
-   the input register into x, possibly at an offset within x.  */
-/* { dg-final { scan-assembler-times "\tmovb\t%al, y" 2 } } */
+   the input register into y, possibly at an offset within y.  */
+/* { dg-final { scan-assembler-times "movb\[ \\t\]+%al" 2 } } */
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-30 11:21     ` Jakub Jelinek
@ 2015-01-30 11:21       ` Uros Bizjak
  2015-01-30 11:46         ` Jakub Jelinek
  0 siblings, 1 reply; 16+ messages in thread
From: Uros Bizjak @ 2015-01-30 11:21 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Jeff Law, Dominique Dhumieres

On Fri, Jan 30, 2015 at 11:13 AM, Jakub Jelinek <jakub@redhat.com> wrote:

>>  /* Each function should compile down to a byte move from
>>     the input register into x, possibly at an offset within x.  */
>> -/* { dg-final { scan-assembler-times "\tmovb\t%al, x" 4 } } */
>> +/* { dg-final { scan-assembler-times "movb\[ \\t\]+%al" 4 } } */
>
> Shouldn't that better be movb\[^\n\r\]+%al, so that it doesn't
> fail with -masm=intel ?

Unfortunately, -masm=intel emits "mov ..., al". And since there is
already plenty of scan-asms for mov[lq], I just took the easy shortcut
;)

Uros.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-30 11:02   ` Uros Bizjak
@ 2015-01-30 11:21     ` Jakub Jelinek
  2015-01-30 11:21       ` Uros Bizjak
  2015-01-30 20:37     ` Jeff Law
  1 sibling, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2015-01-30 11:21 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Jeff Law, Dominique Dhumieres

On Fri, Jan 30, 2015 at 11:07:45AM +0100, Uros Bizjak wrote:
> --- gcc.target/i386/pr15184-1.c	(revision 220273)
> +++ gcc.target/i386/pr15184-1.c	(working copy)
> @@ -1,11 +1,10 @@
>  /* PR 15184 first two tests, plus two addition ones.  */
> -/* { dg-do compile } */
> -/* { dg-options "-O2 -m32 -march=pentiumpro" } */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=pentiumpro" } */
>  
> -#define regparm __attribute__((__regparm__(3)))
> +#define regparm __attribute__((__regparm__(1)))
>  
>  extern unsigned int x;
> -extern unsigned short y;
>  
>  void regparm f0(unsigned char c)
>  {
> @@ -29,5 +28,5 @@
>  
>  /* Each function should compile down to a byte move from
>     the input register into x, possibly at an offset within x.  */
> -/* { dg-final { scan-assembler-times "\tmovb\t%al, x" 4 } } */
> +/* { dg-final { scan-assembler-times "movb\[ \\t\]+%al" 4 } } */

Shouldn't that better be movb\[^\n\r\]+%al, so that it doesn't
fail with -masm=intel ?

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-30 11:21       ` Uros Bizjak
@ 2015-01-30 11:46         ` Jakub Jelinek
  2015-01-30 20:36           ` Jeff Law
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2015-01-30 11:46 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Jeff Law, Dominique Dhumieres

On Fri, Jan 30, 2015 at 11:23:38AM +0100, Uros Bizjak wrote:
> On Fri, Jan 30, 2015 at 11:13 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> >>  /* Each function should compile down to a byte move from
> >>     the input register into x, possibly at an offset within x.  */
> >> -/* { dg-final { scan-assembler-times "\tmovb\t%al, x" 4 } } */
> >> +/* { dg-final { scan-assembler-times "movb\[ \\t\]+%al" 4 } } */
> >
> > Shouldn't that better be movb\[^\n\r\]+%al, so that it doesn't
> > fail with -masm=intel ?
> 
> Unfortunately, -masm=intel emits "mov ..., al". And since there is
> already plenty of scan-asms for mov[lq], I just took the easy shortcut
> ;)

Ok, let's consider --target_board=unix/-masm=intel as unsupportable then.

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-30 11:46         ` Jakub Jelinek
@ 2015-01-30 20:36           ` Jeff Law
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Law @ 2015-01-30 20:36 UTC (permalink / raw)
  To: Jakub Jelinek, Uros Bizjak; +Cc: gcc-patches, Dominique Dhumieres

On 01/30/15 03:24, Jakub Jelinek wrote:
> On Fri, Jan 30, 2015 at 11:23:38AM +0100, Uros Bizjak wrote:
>> On Fri, Jan 30, 2015 at 11:13 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>
>>>>   /* Each function should compile down to a byte move from
>>>>      the input register into x, possibly at an offset within x.  */
>>>> -/* { dg-final { scan-assembler-times "\tmovb\t%al, x" 4 } } */
>>>> +/* { dg-final { scan-assembler-times "movb\[ \\t\]+%al" 4 } } */
>>>
>>> Shouldn't that better be movb\[^\n\r\]+%al, so that it doesn't
>>> fail with -masm=intel ?
>>
>> Unfortunately, -masm=intel emits "mov ..., al". And since there is
>> already plenty of scan-asms for mov[lq], I just took the easy shortcut
>> ;)
>
> Ok, let's consider --target_board=unix/-masm=intel as unsupportable then.
Yes, definitely.

jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-30 11:02   ` Uros Bizjak
  2015-01-30 11:21     ` Jakub Jelinek
@ 2015-01-30 20:37     ` Jeff Law
  1 sibling, 0 replies; 16+ messages in thread
From: Jeff Law @ 2015-01-30 20:37 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches; +Cc: Jakub Jelinek, Dominique Dhumieres

On 01/30/15 03:07, Uros Bizjak wrote:
> On Fri, Jan 30, 2015 at 7:12 AM, Jeff Law <law@redhat.com> wrote:
>
>>> Hello!
>>>
>>>> So here's the updated patch which handles all 4 testcases from the PR as
>>>> well as a couple of my own.
>>>
>>>
>>> @@ -0,0 +1,33 @@
>>> +/* PR 15184 first two tests, plus two addition ones.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -m32 -march=pentiumpro" } */
>>>
>>> No, we don't want -m32 in dg-options. Please write this part as:
>>>
>>> /* { dg-do compile { target ia32 } } */
>>> /* { dg-options "-O2 -march=pentiumpro" } */
>>
>> Will update after sniff testing.  Sorry for the noise.
>
> I'll commit the attached patch that fixes this and -fpic problem as
> soon as regtest finishes.
>
> 2015-01-30  Uros Bizjak  <ubizjak@gmail.com>
>
>      * gcc.target/i386/pr15184-1.c: Compile for ia32 target only.
>      (dg-options): Remove -m32.
>      (dg-final): Scan for "movb %al" only.
>      * gcc.target/i386/pr15184-2.c: Ditto.
Again, sorry for making work for you.  There's clearly more to the x86 
target tests than meets the eye.  I'll run further additions through you 
until I think I've got my brain wrapped around all things that have to 
be considered for x86 specific tests.

jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
@ 2015-01-30 11:01 Dominique Dhumieres
  0 siblings, 0 replies; 16+ messages in thread
From: Dominique Dhumieres @ 2015-01-30 11:01 UTC (permalink / raw)
  To: law; +Cc: gcc-patches

Hi Jeff,

The tests gcc.target/i386/pr15184-[12].c fail on darwin. Grepping for movb, I get

	movb	%al, (%edx)
	movb	%al, 1(%edx)
	movb	%al, 2(%edx)
	movb	%al, 3(%edx)

TIA

Dominique

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-29 20:50 ` Segher Boessenkool
@ 2015-01-30  8:57   ` Jeff Law
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Law @ 2015-01-30  8:57 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches@gcc.gnu.org >> gcc-patches

On 01/29/15 13:13, Segher Boessenkool wrote:
> On Thu, Jan 29, 2015 at 07:30:31AM -0700, Jeff Law wrote:
>> @@ -2643,6 +2644,34 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>>   		   || GET_CODE (src) == LSHIFTRT)
>>   	    nshift++;
>>   	}
>> +
>> +      /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
>> +	 are likely manipulating its value.  Ideally we'll be able to combine
>> +	 all four insns into a bitfield insertion of some kind.
>
> "I1 and I2"?
>
> Very simple patch, nice :-)
Thanks.  Fixed in the obvious way.

I'm also looking at 15596 which is in the same area that affects PPC.  I 
haven't come up with anything good for that one yet.  But it feels like 
it ought to have a similar flow (make_field_assignment creates some 
interesting RTL, we choose an appropriate split point and the right 
things should just happen).



Jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][PR target/15184] Fix for direct byte access on x86
  2015-01-29 15:21 Jeff Law
@ 2015-01-29 20:50 ` Segher Boessenkool
  2015-01-30  8:57   ` Jeff Law
  0 siblings, 1 reply; 16+ messages in thread
From: Segher Boessenkool @ 2015-01-29 20:50 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches@gcc.gnu.org >> gcc-patches

On Thu, Jan 29, 2015 at 07:30:31AM -0700, Jeff Law wrote:
> @@ -2643,6 +2644,34 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>  		   || GET_CODE (src) == LSHIFTRT)
>  	    nshift++;
>  	}
> +
> +      /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
> +	 are likely manipulating its value.  Ideally we'll be able to combine
> +	 all four insns into a bitfield insertion of some kind. 

"I1 and I2"?

Very simple patch, nice :-)


Segher

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH][PR target/15184] Fix for direct byte access on x86
@ 2015-01-29 15:21 Jeff Law
  2015-01-29 20:50 ` Segher Boessenkool
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Law @ 2015-01-29 15:21 UTC (permalink / raw)
  To: gcc-patches@gcc.gnu.org >> gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1307 bytes --]


So here's the updated patch which handles all 4 testcases from the PR as 
well as a couple of my own.

As discussed earlier this week, we have to twiddle the heuristic for 
when to enable 4 insn combinations to make 4-insn combos where the first 
insn is a load and the last a store profitable.  And we still need to 
handle the MEMs being different modes.

The new part of the patch is extending make_field_assignment to handle 
cases where SUBREGs appear in the source of the potential field assignment.

In particular make_field_assignment has some nice code to detect a field 
assignment that looks like

(set (x) (ior (and (x) (const_int C)) y)))

There's obviously a variety of tests, but that's the overall structure 
of the RTL.

To handle the cases in PR 15184, we just need to be able to handle a 
couple annoying SUBREGs:

(set (x)
  (subreg (ior (and (subreg (x)) (const_int C) y)

The outer subreg is going to be a narrowing subreg to narrow the 
logicals to whatever mode the destination uses.  The inner subreg widens 
the source to whatever mode is needed for the logicals.

Once the subregs are handled, the everything "just works" as I outlined 
in my earlier message.

Bootstrapped & regression tested on x86_64-unknown-linux and 
powerpc64-unknown-linux.

Applying to the trunk.

Jeff


[-- Attachment #2: P --]
[-- Type: text/plain, Size: 7606 bytes --]

commit 90e5056d53d680fcc47693a118bfd75f7af0b435
Author: Jeff Law <law@redhat.com>
Date:   Tue Jan 27 16:32:06 2015 -0700

    	PR target/15184
    	* combine.c (try_combine): If I0 is a memory load and I3 a store
    	to a related address, increase the "goodness" of doing a 4-insn
    	combination with I0-I3.
    	(make_field_assignment): Handle SUBREGs in the ior+and case.
    
    	PR target/15184
    	* gcc.target/i386/pr15184-1.c: New test.
    	* gcc.target/i386/pr15184-2.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7424d9f..0c02d43 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2015-01-29  Jeff Law  <law@redhat.com>
+
+	PR target/15184 
+	* combine.c (try_combine): If I0 is a memory load and I3 a store
+	to a related address, increase the "goodness" of doing a 4-insn
+	combination with I0-I3.
+	(make_field_assignment): Handle SUBREGs in the ior+and case.
+
 2015-01-29  Yuri Rumyantsev  <ysrumyan@gmail.com>
 
 	PR tree-optimization/64746
diff --git a/gcc/combine.c b/gcc/combine.c
index 5c763b4..24418df 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2620,6 +2620,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
       int i;
       int ngood = 0;
       int nshift = 0;
+      rtx set0, set3;
 
       if (!flag_expensive_optimizations)
 	return 0;
@@ -2643,6 +2644,34 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 		   || GET_CODE (src) == LSHIFTRT)
 	    nshift++;
 	}
+
+      /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
+	 are likely manipulating its value.  Ideally we'll be able to combine
+	 all four insns into a bitfield insertion of some kind. 
+
+	 Note the source in I0 might be inside a sign/zero extension and the
+	 memory modes in I0 and I3 might be different.  So extract the address
+	 from the destination of I3 and search for it in the source of I0.
+
+	 In the event that there's a match but the source/dest do not actually
+	 refer to the same memory, the worst that happens is we try some
+	 combinations that we wouldn't have otherwise.  */
+      if ((set0 = single_set (i0))
+	  /* Ensure the source of SET0 is a MEM, possibly buried inside
+	     an extension.  */
+	  && (GET_CODE (SET_SRC (set0)) == MEM
+	      || ((GET_CODE (SET_SRC (set0)) == ZERO_EXTEND
+		   || GET_CODE (SET_SRC (set0)) == SIGN_EXTEND)
+		  && GET_CODE (XEXP (SET_SRC (set0), 0)) == MEM))
+	  && (set3 = single_set (i3))
+	  /* Ensure the destination of SET3 is a MEM.  */
+	  && GET_CODE (SET_DEST (set3)) == MEM
+	  /* Would it be better to extract the base address for the MEM
+	     in SET3 and look for that?  I don't have cases where it matters
+	     but I could envision such cases.  */
+	  && rtx_referenced_p (XEXP (SET_DEST (set3), 0), SET_SRC (set0)))
+	ngood += 2;
+
       if (ngood < 2 && nshift < 2)
 	return 0;
     }
@@ -9272,6 +9301,13 @@ make_field_assignment (rtx x)
      to the appropriate position, force it to the required mode, and
      make the extraction.  Check for the AND in both operands.  */
 
+  /* One or more SUBREGs might obscure the constant-position field
+     assignment.  The first one we are likely to encounter is an outer
+     narrowing SUBREG, which we can just strip for the purposes of
+     identifying the constant-field assignment.  */
+  if (GET_CODE (src) == SUBREG && subreg_lowpart_p (src))
+    src = SUBREG_REG (src);
+
   if (GET_CODE (src) != IOR && GET_CODE (src) != XOR)
     return x;
 
@@ -9282,10 +9318,38 @@ make_field_assignment (rtx x)
       && CONST_INT_P (XEXP (rhs, 1))
       && rtx_equal_for_field_assignment_p (XEXP (rhs, 0), dest))
     c1 = INTVAL (XEXP (rhs, 1)), other = lhs;
+  /* The second SUBREG that might get in the way is a paradoxical
+     SUBREG around the first operand of the AND.  We want to 
+     pretend the operand is as wide as the destination here.   We
+     do this by creating a new MEM in the wider mode for the sole
+     purpose of the call to rtx_equal_for_field_assignment_p.   Also
+     note this trick only works for MEMs.  */
+  else if (GET_CODE (rhs) == AND
+	   && paradoxical_subreg_p (XEXP (rhs, 0))
+	   && GET_CODE (SUBREG_REG (XEXP (rhs, 0))) == MEM
+	   && CONST_INT_P (XEXP (rhs, 1))
+	   && rtx_equal_for_field_assignment_p (gen_rtx_MEM (GET_MODE (dest),
+							     XEXP (SUBREG_REG (XEXP (rhs, 0)), 0)),
+						dest))
+    c1 = INTVAL (XEXP (rhs, 1)), other = lhs;
   else if (GET_CODE (lhs) == AND
 	   && CONST_INT_P (XEXP (lhs, 1))
 	   && rtx_equal_for_field_assignment_p (XEXP (lhs, 0), dest))
     c1 = INTVAL (XEXP (lhs, 1)), other = rhs;
+  /* The second SUBREG that might get in the way is a paradoxical
+     SUBREG around the first operand of the AND.  We want to 
+     pretend the operand is as wide as the destination here.   We
+     do this by creating a new MEM in the wider mode for the sole
+     purpose of the call to rtx_equal_for_field_assignment_p.   Also
+     note this trick only works for MEMs.  */
+  else if (GET_CODE (lhs) == AND
+	   && paradoxical_subreg_p (XEXP (lhs, 0))
+	   && GET_CODE (SUBREG_REG (XEXP (lhs, 0))) == MEM
+	   && CONST_INT_P (XEXP (lhs, 1))
+	   && rtx_equal_for_field_assignment_p (gen_rtx_MEM (GET_MODE (dest),
+							     XEXP (SUBREG_REG (XEXP (lhs, 0)), 0)),
+						dest))
+    c1 = INTVAL (XEXP (lhs, 1)), other = rhs;
   else
     return x;
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 69488ec..c60230f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2015-01-29  Jeff Law  <law@redhat.com>
+
+	PR target/15184
+	* gcc.target/i386/pr15184-1.c: New test.
+	* gcc.target/i386/pr15184-2.c: New test.
+
 2015-01-29  Yuri Rumyantsev  <ysrumyan@gmail.com>
 
 	PR tree-optimization/64746
diff --git a/gcc/testsuite/gcc.target/i386/pr15184-1.c b/gcc/testsuite/gcc.target/i386/pr15184-1.c
new file mode 100644
index 0000000..9eb544c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr15184-1.c
@@ -0,0 +1,33 @@
+/* PR 15184 first two tests, plus two addition ones.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -m32 -march=pentiumpro" } */
+
+#define regparm __attribute__((__regparm__(3)))
+
+extern unsigned int x;
+extern unsigned short y;
+
+void regparm f0(unsigned char c)
+{
+       x = (x & 0xFFFFFF00) | (unsigned int)c;
+}
+
+void regparm f1(unsigned char c)
+{
+     x = (x & 0xFFFF00FF) | ((unsigned int)c << 8);
+}
+
+void regparm f2(unsigned char c)
+{
+     x = (x & 0xFF00FFFF) | ((unsigned int)c << 16);
+}
+void regparm f3(unsigned char c)
+{
+     x = (x & 0x00FFFFFF) | ((unsigned int)c << 24);
+}
+
+
+/* Each function should compile down to a byte move from
+   the input register into x, possibly at an offset within x.  */
+/* { dg-final { scan-assembler-times "\tmovb\t%al, x" 4 } } */
+
diff --git a/gcc/testsuite/gcc.target/i386/pr15184-2.c b/gcc/testsuite/gcc.target/i386/pr15184-2.c
new file mode 100644
index 0000000..99fdbc8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr15184-2.c
@@ -0,0 +1,23 @@
+/* PR 15184 second two tests
+/* { dg-do compile } */
+/* { dg-options "-O2 -m32 -march=pentiumpro" } */
+
+#define regparm __attribute__((__regparm__(3)))
+
+extern unsigned int x;
+extern unsigned short y;
+
+void regparm g0(unsigned char c)
+{
+        y = (y & 0xFF00) | (unsigned short)c;
+}
+
+void regparm g1(unsigned char c)
+{
+        y = (y & 0x00FF) | ((unsigned short)c << 8);
+}
+
+/* Each function should compile down to a byte move from
+   the input register into x, possibly at an offset within x.  */
+/* { dg-final { scan-assembler-times "\tmovb\t%al, y" 2 } } */
+

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-01-30 19:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-29 15:51 [PATCH][PR target/15184] Fix for direct byte access on x86 Uros Bizjak
2015-01-29 16:06 ` Jakub Jelinek
2015-01-29 16:08   ` Uros Bizjak
2015-01-29 16:23     ` Jakub Jelinek
2015-01-29 21:42       ` H.J. Lu
2015-01-30  9:21 ` Jeff Law
2015-01-30 11:02   ` Uros Bizjak
2015-01-30 11:21     ` Jakub Jelinek
2015-01-30 11:21       ` Uros Bizjak
2015-01-30 11:46         ` Jakub Jelinek
2015-01-30 20:36           ` Jeff Law
2015-01-30 20:37     ` Jeff Law
  -- strict thread matches above, loose matches on Subject: below --
2015-01-30 11:01 Dominique Dhumieres
2015-01-29 15:21 Jeff Law
2015-01-29 20:50 ` Segher Boessenkool
2015-01-30  8:57   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).