Improve things for PR71724, in combine/if_then_else

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Improve things for PR71724, in combine/if_then_else_cond
@ 2017-01-20 12:36 Bernd Schmidt
  2017-01-20 19:30 ` Segher Boessenkool
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-20 12:36 UTC (permalink / raw)
  To: GCC Patches; +Cc: Segher Boessenkool

[-- Attachment #1: Type: text/plain, Size: 1137 bytes --]

The PR is about infinite recursion in combine_simplify_rtx, because 
if_then_else_cond does strange things to an expression, and we end up 
simplifying something to itself. The patch below tries to address this 
by improving that function a little. As stated in the PR, the situation 
is that we have

(plus:SI (if_then_else:SI (eq (reg:CC 185)
             (const_int 0 [0]))
         (reg:SI 165)
         (reg:SI 174 [ t9.0_1+4 ]))
     (reg:SI 165))

Reg 165 is known to be zero or one, so it gets turned into a condition, 
and we have two different conditions on the operands. That causes us to 
fail to make the fairly obvious transformation to

  cond = reg:CC 185
  true_rtx = (plus r165 r165)
  false_rtx = (plus r174 r165)

So, when looking for situations where we have only one condition, we can 
try to undo the conversion of a plain REG into a condition, on the 
grounds that this is probably less helpful.

This seems to cure the testcase, but Segher also has a patch in the PR 
that looks like a good and more direct approach. IMO both should be 
applied. This one was bootstrapped and tested on x86_64-linux. Ok?


Bernd

[-- Attachment #2: 71724-a.diff --]
[-- Type: text/x-patch, Size: 1518 bytes --]

	PR rtl-optimization/71724
	* combine.c (if_then_else_cond): Look for situations where it is
	beneficial to undo the work of one of the recursive calls.

Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(revision 244528)
+++ gcc/combine.c	(working copy)
@@ -9044,11 +9044,31 @@ if_then_else_cond (rtx x, rtx *ptrue, rt
      the same value, compute the new true and false values.  */
   else if (BINARY_P (x))
     {
-      cond0 = if_then_else_cond (XEXP (x, 0), &true0, &false0);
-      cond1 = if_then_else_cond (XEXP (x, 1), &true1, &false1);
+      rtx op0 = XEXP (x, 0);
+      rtx op1 = XEXP (x, 1);
+      cond0 = if_then_else_cond (op0, &true0, &false0);
+      cond1 = if_then_else_cond (op1, &true1, &false1);
+
+      if ((cond0 != 0 && cond1 != 0 && !rtx_equal_p (cond0, cond1))
+	  && (REG_P (op0) || REG_P (op1)))
+	{
+	  /* Try to enable a simplification by undoing work done by
+	     if_then_else_cond if it converted a REG into something more
+	     complex.  */
+	  if (REG_P (op0))
+	    {
+	      cond0 = 0;
+	      true0 = false0 = op0;
+	    }
+	  else
+	    {
+	      cond1 = 0;
+	      true1 = false1 = op1;
+	    }
+	}
 
       if ((cond0 != 0 || cond1 != 0)
-	  && ! (cond0 != 0 && cond1 != 0 && ! rtx_equal_p (cond0, cond1)))
+	  && ! (cond0 != 0 && cond1 != 0 && !rtx_equal_p (cond0, cond1)))
 	{
 	  /* If if_then_else_cond returned zero, then true/false are the
 	     same rtl.  We must copy one of them to prevent invalid rtl

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-20 12:36 Improve things for PR71724, in combine/if_then_else_cond Bernd Schmidt
@ 2017-01-20 19:30 ` Segher Boessenkool
  2017-01-24  9:38   ` Christophe Lyon
  2017-01-25 20:54   ` Segher Boessenkool
  0 siblings, 2 replies; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-20 19:30 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

Hi Bernd,

On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote:
> So, when looking for situations where we have only one condition, we can 
> try to undo the conversion of a plain REG into a condition, on the 
> grounds that this is probably less helpful.
> 
> This seems to cure the testcase, but Segher also has a patch in the PR 
> that looks like a good and more direct approach. IMO both should be 
> applied. This one was bootstrapped and tested on x86_64-linux. Ok?

My patch does not cure all problems, it simply simplifies things a bit
better; but the same is true for your patch if I read it correctly.

Okay for trunk, and I'll do my half as well.  Thanks,


Segher


> 	PR rtl-optimization/71724
> 	* combine.c (if_then_else_cond): Look for situations where it is
> 	beneficial to undo the work of one of the recursive calls.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-20 19:30 ` Segher Boessenkool
@ 2017-01-24  9:38   ` Christophe Lyon
  2017-01-24 12:52     ` Bernd Schmidt
  2017-01-25 20:54   ` Segher Boessenkool
  1 sibling, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-24  9:38 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches

On 20 January 2017 at 20:24, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> Hi Bernd,
>
> On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote:
>> So, when looking for situations where we have only one condition, we can
>> try to undo the conversion of a plain REG into a condition, on the
>> grounds that this is probably less helpful.
>>
>> This seems to cure the testcase, but Segher also has a patch in the PR
>> that looks like a good and more direct approach. IMO both should be
>> applied. This one was bootstrapped and tested on x86_64-linux. Ok?
>
> My patch does not cure all problems, it simply simplifies things a bit
> better; but the same is true for your patch if I read it correctly.
>
> Okay for trunk, and I'll do my half as well.  Thanks,
>
>

Hi,

It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
--with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:

  gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
  gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
  gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
  gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1

Maybe the upcoming patch from Segher intends to address this?

Thanks,

Christophe



> Segher
>
>
>>       PR rtl-optimization/71724
>>       * combine.c (if_then_else_cond): Look for situations where it is
>>       beneficial to undo the work of one of the recursive calls.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24  9:38   ` Christophe Lyon
@ 2017-01-24 12:52     ` Bernd Schmidt
  2017-01-24 15:30       ` Segher Boessenkool
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 12:52 UTC (permalink / raw)
  To: Christophe Lyon, Segher Boessenkool; +Cc: GCC Patches

On 01/24/2017 09:38 AM, Christophe Lyon wrote:
> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>
>   gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>   gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>   gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>   gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>
> Maybe the upcoming patch from Segher intends to address this?

Not really. How exactly do I reproduce this - can you give me a command 
line and expected assembly output (I'm having some trouble figuring out 
the right set of switches)?


Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 12:52     ` Bernd Schmidt
@ 2017-01-24 15:30       ` Segher Boessenkool
  2017-01-24 16:28         ` Kyrill Tkachov
  0 siblings, 1 reply; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-24 15:30 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Christophe Lyon, GCC Patches

On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
> >It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
> >--with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
> >
> >  gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
> >  gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
> >  gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
> >  gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
> >
> >Maybe the upcoming patch from Segher intends to address this?
> 
> Not really.

Not at all, even, first I hear of this.  Please open a PR.

> How exactly do I reproduce this - can you give me a command 
> line and expected assembly output (I'm having some trouble figuring out 
> the right set of switches)?

Same here.


Segher

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 15:30       ` Segher Boessenkool
@ 2017-01-24 16:28         ` Kyrill Tkachov
  2017-01-24 16:28           ` Christophe Lyon
  0 siblings, 1 reply; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-24 16:28 UTC (permalink / raw)
  To: Segher Boessenkool, Bernd Schmidt; +Cc: Christophe Lyon, GCC Patches


On 24/01/17 15:21, Segher Boessenkool wrote:
> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
>> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>>>
>>>   gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>   gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>   gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>   gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>
>>> Maybe the upcoming patch from Segher intends to address this?
>> Not really.
> Not at all, even, first I hear of this.  Please open a PR.
>
>> How exactly do I reproduce this - can you give me a command
>> line and expected assembly output (I'm having some trouble figuring out
>> the right set of switches)?
> Same here.

I haven't looked at these in detail but these are the ARMv8 FP conditional select tests
and they would be affected by if_then_else generation.

So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 would reproduce this.
Maybe the Cortex-A5 BRANCH_COST affects things...

Kyrill



>
> Segher

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 16:28         ` Kyrill Tkachov
@ 2017-01-24 16:28           ` Christophe Lyon
  2017-01-24 16:36             ` Kyrill Tkachov
  0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-24 16:28 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Segher Boessenkool, Bernd Schmidt, GCC Patches

On 24 January 2017 at 17:02, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>
> On 24/01/17 15:21, Segher Boessenkool wrote:
>>
>> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
>>>
>>> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
>>>>
>>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
>>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>>>>
>>>>   gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>>   gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>>   gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>>   gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>>
>>>> Maybe the upcoming patch from Segher intends to address this?
>>>
>>> Not really.
>>
>> Not at all, even, first I hear of this.  Please open a PR.
>>
>>> How exactly do I reproduce this - can you give me a command
>>> line and expected assembly output (I'm having some trouble figuring out
>>> the right set of switches)?
>>
>> Same here.
>
>
> I haven't looked at these in detail but these are the ARMv8 FP conditional
> select tests
> and they would be affected by if_then_else generation.
>
> So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5
> would reproduce this.
> Maybe the Cortex-A5 BRANCH_COST affects things...
>

Thanks Kyrill, and sorry for not replying earlier.
Yes something like that.

The configuration in question is configured with:
-target=arm-none-linux-gnueabihf --with-float=hard
--enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5
--with-fpu=vfpv3-d16-fp16

If you can't easily rebuild the whole toolchain, you can use an
arm-linux-gnueabihf toolchain and
use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard.

It's not the first time that Cortex-A5's BRANCH_COST implies
changes on a testcase.

Christophe

> Kyrill
>
>
>
>>
>> Segher
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 16:28           ` Christophe Lyon
@ 2017-01-24 16:36             ` Kyrill Tkachov
  2017-01-24 16:36               ` Bernd Schmidt
  0 siblings, 1 reply; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-24 16:36 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Segher Boessenkool, Bernd Schmidt, GCC Patches


On 24/01/17 16:28, Christophe Lyon wrote:
> On 24 January 2017 at 17:02, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>> On 24/01/17 15:21, Segher Boessenkool wrote:
>>> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
>>>> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
>>>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
>>>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>>>>>
>>>>>    gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>>>    gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>>>    gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>>>    gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>>>
>>>>> Maybe the upcoming patch from Segher intends to address this?
>>>> Not really.
>>> Not at all, even, first I hear of this.  Please open a PR.
>>>
>>>> How exactly do I reproduce this - can you give me a command
>>>> line and expected assembly output (I'm having some trouble figuring out
>>>> the right set of switches)?
>>> Same here.
>>
>> I haven't looked at these in detail but these are the ARMv8 FP conditional
>> select tests
>> and they would be affected by if_then_else generation.
>>
>> So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5
>> would reproduce this.
>> Maybe the Cortex-A5 BRANCH_COST affects things...
>>
> Thanks Kyrill, and sorry for not replying earlier.
> Yes something like that.
>
> The configuration in question is configured with:
> -target=arm-none-linux-gnueabihf --with-float=hard
> --enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5
> --with-fpu=vfpv3-d16-fp16
>
> If you can't easily rebuild the whole toolchain, you can use an
> arm-linux-gnueabihf toolchain and
> use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard.

The -mfpu is overridden in the testcase to add the ARMv8 instructions.
So to reproduce the compilation in that testcase you'd want -mfpu=fp-armv8 or
something equivalent rather than vfpv3-d16-fp16.

Thanks,
Kyrill

>
> It's not the first time that Cortex-A5's BRANCH_COST implies
> changes on a testcase.
>
> Christophe
>
>> Kyrill
>>
>>
>>
>>> Segher
>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 16:36             ` Kyrill Tkachov
@ 2017-01-24 16:36               ` Bernd Schmidt
  2017-01-24 16:55                 ` Kyrill Tkachov
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 16:36 UTC (permalink / raw)
  To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches

On 01/24/2017 05:30 PM, Kyrill Tkachov wrote:
>
> The -mfpu is overridden in the testcase to add the ARMv8 instructions.
> So to reproduce the compilation in that testcase you'd want
> -mfpu=fp-armv8 or
> something equivalent rather than vfpv3-d16-fp16.

Exact steps please. No one who's not well-versed in all the ARM variants 
will be able to figure this out. I've been able to generate identical 
before/after code, both with and without vselvs.f64 instructions, after 
trying out a number of switch combinations, but I've not been able to 
find a way to show where the patch makes a difference.

Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 16:36               ` Bernd Schmidt
@ 2017-01-24 16:55                 ` Kyrill Tkachov
  2017-01-24 17:03                   ` Bernd Schmidt
  0 siblings, 1 reply; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-24 16:55 UTC (permalink / raw)
  To: Bernd Schmidt, Kyrill Tkachov, Christophe Lyon
  Cc: Segher Boessenkool, GCC Patches


On 24/01/17 16:36, Bernd Schmidt wrote:
> On 01/24/2017 05:30 PM, Kyrill Tkachov wrote:
>>
>> The -mfpu is overridden in the testcase to add the ARMv8 instructions.
>> So to reproduce the compilation in that testcase you'd want
>> -mfpu=fp-armv8 or
>> something equivalent rather than vfpv3-d16-fp16.
>
> Exact steps please. No one who's not well-versed in all the ARM variants will be able to figure this out. I've been able to generate identical before/after code, both with and without vselvs.f64 instructions, after trying out a number of 
> switch combinations, but I've not been able to find a way to show where the patch makes a difference.
>

I was just off-hand trying to give the options that would be
expected to be exercised in this testcase.

Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get
the test failing before and after the patch. The code generated is
         vcmp.f64        d0, d1
         vmrs    APSR_nzcv, FPSCR
         vmovvs.f64      d0, d1
         bx      lr

whereas the desired (e.g. with -mcpu=cortex-a57) is:
         vcmp.f64        d0, d1
         vmrs    APSR_nzcv, FPSCR
         vselvs.f64      d0, d1, d0
         bx      lr

Given that VSEL is an ARMv8-A instruction and Cortex-A5 is an ARMv7-A cpu it doesn't make much sense to try getting
it to generate that VSEL. So maybe we should just include an explicit -mtune=cortex-a57 to the testcases.

Thanks,
Kyrill

>
> Bernd
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 16:55                 ` Kyrill Tkachov
@ 2017-01-24 17:03                   ` Bernd Schmidt
  2017-01-24 17:08                     ` Christophe Lyon
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 17:03 UTC (permalink / raw)
  To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches

On 01/24/2017 05:50 PM, Kyrill Tkachov wrote:
>
> Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S
> -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get
> the test failing before and after the patch. The code generated is
>         vcmp.f64        d0, d1
>         vmrs    APSR_nzcv, FPSCR
>         vmovvs.f64      d0, d1
>         bx      lr
>
> whereas the desired (e.g. with -mcpu=cortex-a57) is:
>         vcmp.f64        d0, d1
>         vmrs    APSR_nzcv, FPSCR
>         vselvs.f64      d0, d1, d0
>         bx      lr

Yes, I've seen both of these generated with different options, but the 
patch did not make a difference here either.

For the moment I'll assume this was a false alarm, i.e. Christophe 
misidentified the patch and something else went wrong.


Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 17:03                   ` Bernd Schmidt
@ 2017-01-24 17:08                     ` Christophe Lyon
  2017-01-24 17:22                       ` Bernd Schmidt
  0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-24 17:08 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches

On 24 January 2017 at 17:55, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 01/24/2017 05:50 PM, Kyrill Tkachov wrote:
>>
>>
>> Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S
>> -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get
>> the test failing before and after the patch. The code generated is
>>         vcmp.f64        d0, d1
>>         vmrs    APSR_nzcv, FPSCR
>>         vmovvs.f64      d0, d1
>>         bx      lr
>>
>> whereas the desired (e.g. with -mcpu=cortex-a57) is:
>>         vcmp.f64        d0, d1
>>         vmrs    APSR_nzcv, FPSCR
>>         vselvs.f64      d0, d1, d0
>>         bx      lr
>
>
> Yes, I've seen both of these generated with different options, but the patch
> did not make a difference here either.
>
> For the moment I'll assume this was a false alarm, i.e. Christophe
> misidentified the patch and something else went wrong.
>
Ha... the regression occurred between r 244818  and r 244816,
and I read r 244816 ChangeLog too quickly and did not notice
it was modifying ifcvt.c in addition to x86-only files.

So it's likely that it's your other patch for pr78634
that caused the regression I mentioned. Does it make
more sense?

Sorry for the probably wrong identification.

It always takes some time for me to reproduce regressions
manually because I can only keep logs/results of
validations, the toolchains actually built are deleted
once validation completes.

Christophe

>
> Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 17:08                     ` Christophe Lyon
@ 2017-01-24 17:22                       ` Bernd Schmidt
  2017-01-25  9:06                         ` Christophe Lyon
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 17:22 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches

On 01/24/2017 06:03 PM, Christophe Lyon wrote:
> Ha... the regression occurred between r 244818  and r 244816,
> and I read r 244816 ChangeLog too quickly and did not notice
> it was modifying ifcvt.c in addition to x86-only files.
>
> So it's likely that it's your other patch for pr78634
> that caused the regression I mentioned. Does it make
> more sense?

That's possible. That added a missing cost check, so the question 
becomes - is the change in generated assembly sensible, given the 
selected CPU type?


Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-24 17:22                       ` Bernd Schmidt
@ 2017-01-25  9:06                         ` Christophe Lyon
  2017-01-25  9:21                           ` Kyrill Tkachov
  0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-25  9:06 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches

On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>
>> Ha... the regression occurred between r 244818  and r 244816,
>> and I read r 244816 ChangeLog too quickly and did not notice
>> it was modifying ifcvt.c in addition to x86-only files.
>>
>> So it's likely that it's your other patch for pr78634
>> that caused the regression I mentioned. Does it make
>> more sense?
>
>
> That's possible. That added a missing cost check, so the question becomes -
> is the change in generated assembly sensible, given the selected CPU type?
>

I can now confirm that the change is indeed caused by r244816 (pr78634).
The difference in the generated asm is:
-       vmov    d17, r0, r1
-       vmov    d16, r2, r3
-       vcmp.f64        d17, d16
+       vmov    d16, r0, r1
+       vmov    d17, r2, r3
+       vcmp.f64        d16, d17
        vmrs    APSR_nzcv, FPSCR
-       vselvs.f64      d16, d16, d17
+       vmovvs.f64      d16, d17

which, besides swapping d16 and d17, summarizes to
-       vselvs.f64      d16, d16, d17
+       vmovvs.f64      d16, d17

I'm not sure if there is a "best" one ?

>
> Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-25  9:06                         ` Christophe Lyon
@ 2017-01-25  9:21                           ` Kyrill Tkachov
  2017-01-25 10:01                             ` Christophe Lyon
  2017-01-25 14:58                             ` Bernd Schmidt
  0 siblings, 2 replies; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-25  9:21 UTC (permalink / raw)
  To: Christophe Lyon, Bernd Schmidt; +Cc: Segher Boessenkool, GCC Patches


On 25/01/17 08:53, Christophe Lyon wrote:
> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>> Ha... the regression occurred between r 244818  and r 244816,
>>> and I read r 244816 ChangeLog too quickly and did not notice
>>> it was modifying ifcvt.c in addition to x86-only files.
>>>
>>> So it's likely that it's your other patch for pr78634
>>> that caused the regression I mentioned. Does it make
>>> more sense?
>>
>> That's possible. That added a missing cost check, so the question becomes -
>> is the change in generated assembly sensible, given the selected CPU type?
>>
> I can now confirm that the change is indeed caused by r244816 (pr78634).
> The difference in the generated asm is:
> -       vmov    d17, r0, r1
> -       vmov    d16, r2, r3
> -       vcmp.f64        d17, d16
> +       vmov    d16, r0, r1
> +       vmov    d17, r2, r3
> +       vcmp.f64        d16, d17
>          vmrs    APSR_nzcv, FPSCR
> -       vselvs.f64      d16, d16, d17
> +       vmovvs.f64      d16, d17
>
> which, besides swapping d16 and d17, summarizes to
> -       vselvs.f64      d16, d16, d17
> +       vmovvs.f64      d16, d17
>
> I'm not sure if there is a "best" one ?
>

The test is supposed to test the generation of the vsel instruction.
I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
VSEL isn't actually available on Cortex-A5, it's just enabled by the -mfpu=fp-armv8 option.
A more realistic configuration would target an ARMv8-A CPU like the Cortex-A57.

Thanks,
Kyrill

>> Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-25  9:21                           ` Kyrill Tkachov
@ 2017-01-25 10:01                             ` Christophe Lyon
  2017-01-25 10:34                               ` Richard Earnshaw (lists)
  2017-01-25 14:58                             ` Bernd Schmidt
  1 sibling, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-25 10:01 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Bernd Schmidt, Segher Boessenkool, GCC Patches

On 25 January 2017 at 10:18, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>
> On 25/01/17 08:53, Christophe Lyon wrote:
>>
>> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
>>>
>>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>>>
>>>> Ha... the regression occurred between r 244818  and r 244816,
>>>> and I read r 244816 ChangeLog too quickly and did not notice
>>>> it was modifying ifcvt.c in addition to x86-only files.
>>>>
>>>> So it's likely that it's your other patch for pr78634
>>>> that caused the regression I mentioned. Does it make
>>>> more sense?
>>>
>>>
>>> That's possible. That added a missing cost check, so the question becomes
>>> -
>>> is the change in generated assembly sensible, given the selected CPU
>>> type?
>>>
>> I can now confirm that the change is indeed caused by r244816 (pr78634).
>> The difference in the generated asm is:
>> -       vmov    d17, r0, r1
>> -       vmov    d16, r2, r3
>> -       vcmp.f64        d17, d16
>> +       vmov    d16, r0, r1
>> +       vmov    d17, r2, r3
>> +       vcmp.f64        d16, d17
>>          vmrs    APSR_nzcv, FPSCR
>> -       vselvs.f64      d16, d16, d17
>> +       vmovvs.f64      d16, d17
>>
>> which, besides swapping d16 and d17, summarizes to
>> -       vselvs.f64      d16, d16, d17
>> +       vmovvs.f64      d16, d17
>>
>> I'm not sure if there is a "best" one ?
>>
>
> The test is supposed to test the generation of the vsel instruction.
> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
> VSEL isn't actually available on Cortex-A5, it's just enabled by the
> -mfpu=fp-armv8 option.
> A more realistic configuration would target an ARMv8-A CPU like the
> Cortex-A57.
>

Yes indeed, it's always confusing to be able to provide "incompatible"
-mcpu and -mfpu flags (as in: no such combination actually exists).


> Thanks,
> Kyrill
>
>>> Bernd
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-25 10:01                             ` Christophe Lyon
@ 2017-01-25 10:34                               ` Richard Earnshaw (lists)
  0 siblings, 0 replies; 23+ messages in thread
From: Richard Earnshaw (lists) @ 2017-01-25 10:34 UTC (permalink / raw)
  To: Christophe Lyon, Kyrill Tkachov
  Cc: Bernd Schmidt, Segher Boessenkool, GCC Patches

On 25/01/17 09:29, Christophe Lyon wrote:
> On 25 January 2017 at 10:18, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>>
>> On 25/01/17 08:53, Christophe Lyon wrote:
>>>
>>> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
>>>>
>>>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>>>>
>>>>> Ha... the regression occurred between r 244818  and r 244816,
>>>>> and I read r 244816 ChangeLog too quickly and did not notice
>>>>> it was modifying ifcvt.c in addition to x86-only files.
>>>>>
>>>>> So it's likely that it's your other patch for pr78634
>>>>> that caused the regression I mentioned. Does it make
>>>>> more sense?
>>>>
>>>>
>>>> That's possible. That added a missing cost check, so the question becomes
>>>> -
>>>> is the change in generated assembly sensible, given the selected CPU
>>>> type?
>>>>
>>> I can now confirm that the change is indeed caused by r244816 (pr78634).
>>> The difference in the generated asm is:
>>> -       vmov    d17, r0, r1
>>> -       vmov    d16, r2, r3
>>> -       vcmp.f64        d17, d16
>>> +       vmov    d16, r0, r1
>>> +       vmov    d17, r2, r3
>>> +       vcmp.f64        d16, d17
>>>          vmrs    APSR_nzcv, FPSCR
>>> -       vselvs.f64      d16, d16, d17
>>> +       vmovvs.f64      d16, d17
>>>
>>> which, besides swapping d16 and d17, summarizes to
>>> -       vselvs.f64      d16, d16, d17
>>> +       vmovvs.f64      d16, d17
>>>
>>> I'm not sure if there is a "best" one ?
>>>
>>
>> The test is supposed to test the generation of the vsel instruction.
>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>> -mfpu=fp-armv8 option.
>> A more realistic configuration would target an ARMv8-A CPU like the
>> Cortex-A57.
>>
> 
> Yes indeed, it's always confusing to be able to provide "incompatible"
> -mcpu and -mfpu flags (as in: no such combination actually exists).
> 

As discussed at the Cauldron, I'm working on fixing that, but it won't
be until GCC-8 now.

R.

> 
>> Thanks,
>> Kyrill
>>
>>>> Bernd
>>
>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-25  9:21                           ` Kyrill Tkachov
  2017-01-25 10:01                             ` Christophe Lyon
@ 2017-01-25 14:58                             ` Bernd Schmidt
  2017-01-25 15:25                               ` Christophe Lyon
  1 sibling, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-25 14:58 UTC (permalink / raw)
  To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches

On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:
> The test is supposed to test the generation of the vsel instruction.
> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
> VSEL isn't actually available on Cortex-A5, it's just enabled by the
> -mfpu=fp-armv8 option.
> A more realistic configuration would target an ARMv8-A CPU like the
> Cortex-A57.

Ok, let me know if there's anything else you need from my side.


Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-25 14:58                             ` Bernd Schmidt
@ 2017-01-25 15:25                               ` Christophe Lyon
  2017-01-25 15:43                                 ` Kyrill Tkachov
  0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-25 15:25 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1149 bytes --]

On 25 January 2017 at 15:55, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:
>>
>> The test is supposed to test the generation of the vsel instruction.
>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>> -mfpu=fp-armv8 option.
>> A more realistic configuration would target an ARMv8-A CPU like the
>> Cortex-A57.
>
>
> Ok, let me know if there's anything else you need from my side.
>
Kyrill,

How about the attached patch?

I've added dg-require-effective-target arm_arch_v8a_ok to make sure
it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57
to make sure the intended instructions are present (in case at some
point add-options-for-arm-arch-v8a activates costs/arch variant that
would imply not generating vsel anymore).

I've noticed there are other tests adding arm_v8_vfp and not making
sure to select an appriopriate cpu. As a follow-up patch?

And I checked that my patch makes the tests pass again even
when configuring --with-cpu=cortex-a5.

Thanks,

Christophe

>
> Bernd
>

[-- Attachment #2: vsel.chlog.txt --]
[-- Type: text/plain, Size: 764 bytes --]

gcc/testsuite/ChangeLog:

2017-01-25  Christophe Lyon  <christophe.lyon@linaro.org>

	* gcc.target/arm/vseleqdf.c: Require arm_arch_v8a_ok, add
	-mcpu=cortex-a57.
	* gcc.target/arm/vseleqsf.c: Likewise.
	* gcc.target/arm/vselgedf.c: Likewise.
	* gcc.target/arm/vselgesf.c: Likewise.
	* gcc.target/arm/vselgtdf.c: Likewise.
	* gcc.target/arm/vselgtsf.c: Likewise.
	* gcc.target/arm/vselledf.c: Likewise.
	* gcc.target/arm/vsellesf.c: Likewise.
	* gcc.target/arm/vselltdf.c: Likewise.
	* gcc.target/arm/vselltsf.c: Likewise.
	* gcc.target/arm/vselnedf.c: Likewise.
	* gcc.target/arm/vselnesf.c: Likewise.
	* gcc.target/arm/vselvcdf.c: Likewise.
	* gcc.target/arm/vselvcsf.c: Likewise.
	* gcc.target/arm/vselvsdf.c: Likewise.
	* gcc.target/arm/vselvssf.c: Likewise.


[-- Attachment #3: vsel.patch.txt --]
[-- Type: text/plain, Size: 7736 bytes --]

diff --git a/gcc/testsuite/gcc.target/arm/vseleqdf.c b/gcc/testsuite/gcc.target/arm/vseleqdf.c
index 86e147b..64d5784 100644
--- a/gcc/testsuite/gcc.target/arm/vseleqdf.c
+++ b/gcc/testsuite/gcc.target/arm/vseleqdf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vseleqsf.c b/gcc/testsuite/gcc.target/arm/vseleqsf.c
index 120f44b..b052704 100644
--- a/gcc/testsuite/gcc.target/arm/vseleqsf.c
+++ b/gcc/testsuite/gcc.target/arm/vseleqsf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselgedf.c b/gcc/testsuite/gcc.target/arm/vselgedf.c
index cea08d1..e10508f 100644
--- a/gcc/testsuite/gcc.target/arm/vselgedf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgedf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselgesf.c b/gcc/testsuite/gcc.target/arm/vselgesf.c
index 86f2a04..645cf5d 100644
--- a/gcc/testsuite/gcc.target/arm/vselgesf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgesf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselgtdf.c b/gcc/testsuite/gcc.target/arm/vselgtdf.c
index 2c4a6ba..741b9a8 100644
--- a/gcc/testsuite/gcc.target/arm/vselgtdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgtdf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselgtsf.c b/gcc/testsuite/gcc.target/arm/vselgtsf.c
index 388e74c..3042c5b 100644
--- a/gcc/testsuite/gcc.target/arm/vselgtsf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgtsf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselledf.c b/gcc/testsuite/gcc.target/arm/vselledf.c
index 088dc04..dcf46a3 100644
--- a/gcc/testsuite/gcc.target/arm/vselledf.c
+++ b/gcc/testsuite/gcc.target/arm/vselledf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vsellesf.c b/gcc/testsuite/gcc.target/arm/vsellesf.c
index d0afdbc..38b06eb 100644
--- a/gcc/testsuite/gcc.target/arm/vsellesf.c
+++ b/gcc/testsuite/gcc.target/arm/vsellesf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselltdf.c b/gcc/testsuite/gcc.target/arm/vselltdf.c
index fbcb9ea..4d4d91c 100644
--- a/gcc/testsuite/gcc.target/arm/vselltdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselltdf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselltsf.c b/gcc/testsuite/gcc.target/arm/vselltsf.c
index 959dab7..ab3f77f 100644
--- a/gcc/testsuite/gcc.target/arm/vselltsf.c
+++ b/gcc/testsuite/gcc.target/arm/vselltsf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselnedf.c b/gcc/testsuite/gcc.target/arm/vselnedf.c
index cf67f29..4b0fa5e 100644
--- a/gcc/testsuite/gcc.target/arm/vselnedf.c
+++ b/gcc/testsuite/gcc.target/arm/vselnedf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselnesf.c b/gcc/testsuite/gcc.target/arm/vselnesf.c
index 2e16423..4fcd5a0 100644
--- a/gcc/testsuite/gcc.target/arm/vselnesf.c
+++ b/gcc/testsuite/gcc.target/arm/vselnesf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselvcdf.c b/gcc/testsuite/gcc.target/arm/vselvcdf.c
index 7f30270..9701e11 100644
--- a/gcc/testsuite/gcc.target/arm/vselvcdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvcdf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselvcsf.c b/gcc/testsuite/gcc.target/arm/vselvcsf.c
index 1bb7369..2d47f30 100644
--- a/gcc/testsuite/gcc.target/arm/vselvcsf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvcsf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselvsdf.c b/gcc/testsuite/gcc.target/arm/vselvsdf.c
index 83ad5bf..54555a2 100644
--- a/gcc/testsuite/gcc.target/arm/vselvsdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvsdf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselvssf.c b/gcc/testsuite/gcc.target/arm/vselvssf.c
index 7d76289..a37305d 100644
--- a/gcc/testsuite/gcc.target/arm/vselvssf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvssf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-25 15:25                               ` Christophe Lyon
@ 2017-01-25 15:43                                 ` Kyrill Tkachov
  0 siblings, 0 replies; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-25 15:43 UTC (permalink / raw)
  To: Christophe Lyon, Bernd Schmidt; +Cc: Segher Boessenkool, GCC Patches


On 25/01/17 15:20, Christophe Lyon wrote:
> On 25 January 2017 at 15:55, Bernd Schmidt <bschmidt@redhat.com> wrote:
>> On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:
>>> The test is supposed to test the generation of the vsel instruction.
>>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>>> -mfpu=fp-armv8 option.
>>> A more realistic configuration would target an ARMv8-A CPU like the
>>> Cortex-A57.
>>
>> Ok, let me know if there's anything else you need from my side.
>>
> Kyrill,
>
> How about the attached patch?

Yes, thanks Christophe.

> I've added dg-require-effective-target arm_arch_v8a_ok to make sure
> it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57
> to make sure the intended instructions are present (in case at some
> point add-options-for-arm-arch-v8a activates costs/arch variant that
> would imply not generating vsel anymore).
>
> I've noticed there are other tests adding arm_v8_vfp and not making
> sure to select an appriopriate cpu. As a follow-up patch?

I wouldn't want to do that too much in the testsuite.
In the VSEL tests we have a C-level idiom (?: construct) that we expect
the optimisers to transform into a conditional select instruction that may or may not
be a win on some cores.

In some of those other tests I suspect we want to generate the instruction for all tunings.
I.e. I'd expect the rounding tests (__builtin_floor/trunc etc) to always generate VRINT*
when the -mfpu allows it, regardless of the CPU tuning.

Kyrill

> And I checked that my patch makes the tests pass again even
> when configuring --with-cpu=cortex-a5.


> Thanks,
>
> Christophe
>
>> Bernd
>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-20 19:30 ` Segher Boessenkool
  2017-01-24  9:38   ` Christophe Lyon
@ 2017-01-25 20:54   ` Segher Boessenkool
  2017-01-26 13:04     ` Bernd Schmidt
  1 sibling, 1 reply; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-25 20:54 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Fri, Jan 20, 2017 at 01:24:15PM -0600, Segher Boessenkool wrote:
> On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote:
> > So, when looking for situations where we have only one condition, we can 
> > try to undo the conversion of a plain REG into a condition, on the 
> > grounds that this is probably less helpful.
> > 
> > This seems to cure the testcase, but Segher also has a patch in the PR 
> > that looks like a good and more direct approach. IMO both should be 
> > applied. This one was bootstrapped and tested on x86_64-linux. Ok?
> 
> My patch does not cure all problems, it simply simplifies things a bit
> better; but the same is true for your patch if I read it correctly.
> 
> Okay for trunk, and I'll do my half as well.  Thanks,

It turns out my patch (see the PR) causes (or at least triggers)
miscompilations on tilegx.  I will drop it for now.

Longer term we will have to fix this whole if_then_else_cond business,
maybe make it less expensive and/or more effective as well.


Segher

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-25 20:54   ` Segher Boessenkool
@ 2017-01-26 13:04     ` Bernd Schmidt
  2017-01-26 19:05       ` Segher Boessenkool
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-26 13:04 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches

On 01/25/2017 08:46 PM, Segher Boessenkool wrote:
>
> It turns out my patch (see the PR) causes (or at least triggers)
> miscompilations on tilegx.  I will drop it for now.

Curious, it looked very reasonable. What's needed to reproduce this?


Bernd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Improve things for PR71724, in combine/if_then_else_cond
  2017-01-26 13:04     ` Bernd Schmidt
@ 2017-01-26 19:05       ` Segher Boessenkool
  0 siblings, 0 replies; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-26 19:05 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Thu, Jan 26, 2017 at 01:57:02PM +0100, Bernd Schmidt wrote:
> On 01/25/2017 08:46 PM, Segher Boessenkool wrote:
> >
> >It turns out my patch (see the PR) causes (or at least triggers)
> >miscompilations on tilegx.  I will drop it for now.
> 
> Curious, it looked very reasonable. What's needed to reproduce this?

Yeah, quite curious.

I just build a cross compiler (and binutils) targeting tilegx-linux,
and build a Linux kernel (ARCH=tile, defconfig) with that.  In many
places where there are two independent conditional jumps the generated
code before the patch combines the two conditions and does one jump;
the code after the patch completely loses the second condition:

-{ cmpeqi r10, r3, 14 ; cmpeqi r11, r3, -1 }
-{ or r10, r10, r11 }
-{ beqzt r10, fffffff700031ba8 <KBacktraceIterator_restart+0x4f0> }
+{ cmpeqi r10, r3, 14 }
+{ beqzt r10, fffffff700031ba0 <KBacktraceIterator_restart+0x4e8> }

This happens quite a few times.  Sometimes a lot of code is deleted.

It may well be a target problem, but it is stage 4, and I do not
currently have the bandwidth to investigate this.  If you do, please
do :-)

Segher

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-01-26 18:57 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-20 12:36 Improve things for PR71724, in combine/if_then_else_cond Bernd Schmidt
2017-01-20 19:30 ` Segher Boessenkool
2017-01-24  9:38   ` Christophe Lyon
2017-01-24 12:52     ` Bernd Schmidt
2017-01-24 15:30       ` Segher Boessenkool
2017-01-24 16:28         ` Kyrill Tkachov
2017-01-24 16:28           ` Christophe Lyon
2017-01-24 16:36             ` Kyrill Tkachov
2017-01-24 16:36               ` Bernd Schmidt
2017-01-24 16:55                 ` Kyrill Tkachov
2017-01-24 17:03                   ` Bernd Schmidt
2017-01-24 17:08                     ` Christophe Lyon
2017-01-24 17:22                       ` Bernd Schmidt
2017-01-25  9:06                         ` Christophe Lyon
2017-01-25  9:21                           ` Kyrill Tkachov
2017-01-25 10:01                             ` Christophe Lyon
2017-01-25 10:34                               ` Richard Earnshaw (lists)
2017-01-25 14:58                             ` Bernd Schmidt
2017-01-25 15:25                               ` Christophe Lyon
2017-01-25 15:43                                 ` Kyrill Tkachov
2017-01-25 20:54   ` Segher Boessenkool
2017-01-26 13:04     ` Bernd Schmidt
2017-01-26 19:05       ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).