* Improve things for PR71724, in combine/if_then_else_cond
@ 2017-01-20 12:36 Bernd Schmidt
2017-01-20 19:30 ` Segher Boessenkool
0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-20 12:36 UTC (permalink / raw)
To: GCC Patches; +Cc: Segher Boessenkool
[-- Attachment #1: Type: text/plain, Size: 1137 bytes --]
The PR is about infinite recursion in combine_simplify_rtx, because
if_then_else_cond does strange things to an expression, and we end up
simplifying something to itself. The patch below tries to address this
by improving that function a little. As stated in the PR, the situation
is that we have
(plus:SI (if_then_else:SI (eq (reg:CC 185)
(const_int 0 [0]))
(reg:SI 165)
(reg:SI 174 [ t9.0_1+4 ]))
(reg:SI 165))
Reg 165 is known to be zero or one, so it gets turned into a condition,
and we have two different conditions on the operands. That causes us to
fail to make the fairly obvious transformation to
cond = reg:CC 185
true_rtx = (plus r165 r165)
false_rtx = (plus r174 r165)
So, when looking for situations where we have only one condition, we can
try to undo the conversion of a plain REG into a condition, on the
grounds that this is probably less helpful.
This seems to cure the testcase, but Segher also has a patch in the PR
that looks like a good and more direct approach. IMO both should be
applied. This one was bootstrapped and tested on x86_64-linux. Ok?
Bernd
[-- Attachment #2: 71724-a.diff --]
[-- Type: text/x-patch, Size: 1518 bytes --]
PR rtl-optimization/71724
* combine.c (if_then_else_cond): Look for situations where it is
beneficial to undo the work of one of the recursive calls.
Index: gcc/combine.c
===================================================================
--- gcc/combine.c (revision 244528)
+++ gcc/combine.c (working copy)
@@ -9044,11 +9044,31 @@ if_then_else_cond (rtx x, rtx *ptrue, rt
the same value, compute the new true and false values. */
else if (BINARY_P (x))
{
- cond0 = if_then_else_cond (XEXP (x, 0), &true0, &false0);
- cond1 = if_then_else_cond (XEXP (x, 1), &true1, &false1);
+ rtx op0 = XEXP (x, 0);
+ rtx op1 = XEXP (x, 1);
+ cond0 = if_then_else_cond (op0, &true0, &false0);
+ cond1 = if_then_else_cond (op1, &true1, &false1);
+
+ if ((cond0 != 0 && cond1 != 0 && !rtx_equal_p (cond0, cond1))
+ && (REG_P (op0) || REG_P (op1)))
+ {
+ /* Try to enable a simplification by undoing work done by
+ if_then_else_cond if it converted a REG into something more
+ complex. */
+ if (REG_P (op0))
+ {
+ cond0 = 0;
+ true0 = false0 = op0;
+ }
+ else
+ {
+ cond1 = 0;
+ true1 = false1 = op1;
+ }
+ }
if ((cond0 != 0 || cond1 != 0)
- && ! (cond0 != 0 && cond1 != 0 && ! rtx_equal_p (cond0, cond1)))
+ && ! (cond0 != 0 && cond1 != 0 && !rtx_equal_p (cond0, cond1)))
{
/* If if_then_else_cond returned zero, then true/false are the
same rtl. We must copy one of them to prevent invalid rtl
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-20 12:36 Improve things for PR71724, in combine/if_then_else_cond Bernd Schmidt
@ 2017-01-20 19:30 ` Segher Boessenkool
2017-01-24 9:38 ` Christophe Lyon
2017-01-25 20:54 ` Segher Boessenkool
0 siblings, 2 replies; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-20 19:30 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
Hi Bernd,
On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote:
> So, when looking for situations where we have only one condition, we can
> try to undo the conversion of a plain REG into a condition, on the
> grounds that this is probably less helpful.
>
> This seems to cure the testcase, but Segher also has a patch in the PR
> that looks like a good and more direct approach. IMO both should be
> applied. This one was bootstrapped and tested on x86_64-linux. Ok?
My patch does not cure all problems, it simply simplifies things a bit
better; but the same is true for your patch if I read it correctly.
Okay for trunk, and I'll do my half as well. Thanks,
Segher
> PR rtl-optimization/71724
> * combine.c (if_then_else_cond): Look for situations where it is
> beneficial to undo the work of one of the recursive calls.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-20 19:30 ` Segher Boessenkool
@ 2017-01-24 9:38 ` Christophe Lyon
2017-01-24 12:52 ` Bernd Schmidt
2017-01-25 20:54 ` Segher Boessenkool
1 sibling, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-24 9:38 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches
On 20 January 2017 at 20:24, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> Hi Bernd,
>
> On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote:
>> So, when looking for situations where we have only one condition, we can
>> try to undo the conversion of a plain REG into a condition, on the
>> grounds that this is probably less helpful.
>>
>> This seems to cure the testcase, but Segher also has a patch in the PR
>> that looks like a good and more direct approach. IMO both should be
>> applied. This one was bootstrapped and tested on x86_64-linux. Ok?
>
> My patch does not cure all problems, it simply simplifies things a bit
> better; but the same is true for your patch if I read it correctly.
>
> Okay for trunk, and I'll do my half as well. Thanks,
>
>
Hi,
It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
--with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
Maybe the upcoming patch from Segher intends to address this?
Thanks,
Christophe
> Segher
>
>
>> PR rtl-optimization/71724
>> * combine.c (if_then_else_cond): Look for situations where it is
>> beneficial to undo the work of one of the recursive calls.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 9:38 ` Christophe Lyon
@ 2017-01-24 12:52 ` Bernd Schmidt
2017-01-24 15:30 ` Segher Boessenkool
0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 12:52 UTC (permalink / raw)
To: Christophe Lyon, Segher Boessenkool; +Cc: GCC Patches
On 01/24/2017 09:38 AM, Christophe Lyon wrote:
> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>
> gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
> gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
> gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
> gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>
> Maybe the upcoming patch from Segher intends to address this?
Not really. How exactly do I reproduce this - can you give me a command
line and expected assembly output (I'm having some trouble figuring out
the right set of switches)?
Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 12:52 ` Bernd Schmidt
@ 2017-01-24 15:30 ` Segher Boessenkool
2017-01-24 16:28 ` Kyrill Tkachov
0 siblings, 1 reply; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-24 15:30 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: Christophe Lyon, GCC Patches
On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
> >It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
> >--with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
> >
> > gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
> > gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
> > gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
> > gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
> >
> >Maybe the upcoming patch from Segher intends to address this?
>
> Not really.
Not at all, even, first I hear of this. Please open a PR.
> How exactly do I reproduce this - can you give me a command
> line and expected assembly output (I'm having some trouble figuring out
> the right set of switches)?
Same here.
Segher
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 16:28 ` Kyrill Tkachov
@ 2017-01-24 16:28 ` Christophe Lyon
2017-01-24 16:36 ` Kyrill Tkachov
0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-24 16:28 UTC (permalink / raw)
To: Kyrill Tkachov; +Cc: Segher Boessenkool, Bernd Schmidt, GCC Patches
On 24 January 2017 at 17:02, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>
> On 24/01/17 15:21, Segher Boessenkool wrote:
>>
>> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
>>>
>>> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
>>>>
>>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
>>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>>>>
>>>> gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>> gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>> gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>> gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>>
>>>> Maybe the upcoming patch from Segher intends to address this?
>>>
>>> Not really.
>>
>> Not at all, even, first I hear of this. Please open a PR.
>>
>>> How exactly do I reproduce this - can you give me a command
>>> line and expected assembly output (I'm having some trouble figuring out
>>> the right set of switches)?
>>
>> Same here.
>
>
> I haven't looked at these in detail but these are the ARMv8 FP conditional
> select tests
> and they would be affected by if_then_else generation.
>
> So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5
> would reproduce this.
> Maybe the Cortex-A5 BRANCH_COST affects things...
>
Thanks Kyrill, and sorry for not replying earlier.
Yes something like that.
The configuration in question is configured with:
-target=arm-none-linux-gnueabihf --with-float=hard
--enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5
--with-fpu=vfpv3-d16-fp16
If you can't easily rebuild the whole toolchain, you can use an
arm-linux-gnueabihf toolchain and
use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard.
It's not the first time that Cortex-A5's BRANCH_COST implies
changes on a testcase.
Christophe
> Kyrill
>
>
>
>>
>> Segher
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 15:30 ` Segher Boessenkool
@ 2017-01-24 16:28 ` Kyrill Tkachov
2017-01-24 16:28 ` Christophe Lyon
0 siblings, 1 reply; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-24 16:28 UTC (permalink / raw)
To: Segher Boessenkool, Bernd Schmidt; +Cc: Christophe Lyon, GCC Patches
On 24/01/17 15:21, Segher Boessenkool wrote:
> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
>> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>>>
>>> gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>> gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>> gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>> gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>
>>> Maybe the upcoming patch from Segher intends to address this?
>> Not really.
> Not at all, even, first I hear of this. Please open a PR.
>
>> How exactly do I reproduce this - can you give me a command
>> line and expected assembly output (I'm having some trouble figuring out
>> the right set of switches)?
> Same here.
I haven't looked at these in detail but these are the ARMv8 FP conditional select tests
and they would be affected by if_then_else generation.
So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 would reproduce this.
Maybe the Cortex-A5 BRANCH_COST affects things...
Kyrill
>
> Segher
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 16:28 ` Christophe Lyon
@ 2017-01-24 16:36 ` Kyrill Tkachov
2017-01-24 16:36 ` Bernd Schmidt
0 siblings, 1 reply; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-24 16:36 UTC (permalink / raw)
To: Christophe Lyon; +Cc: Segher Boessenkool, Bernd Schmidt, GCC Patches
On 24/01/17 16:28, Christophe Lyon wrote:
> On 24 January 2017 at 17:02, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>> On 24/01/17 15:21, Segher Boessenkool wrote:
>>> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote:
>>>> On 01/24/2017 09:38 AM, Christophe Lyon wrote:
>>>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf
>>>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16:
>>>>>
>>>>> gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>>> gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>>> gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1
>>>>> gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1
>>>>>
>>>>> Maybe the upcoming patch from Segher intends to address this?
>>>> Not really.
>>> Not at all, even, first I hear of this. Please open a PR.
>>>
>>>> How exactly do I reproduce this - can you give me a command
>>>> line and expected assembly output (I'm having some trouble figuring out
>>>> the right set of switches)?
>>> Same here.
>>
>> I haven't looked at these in detail but these are the ARMv8 FP conditional
>> select tests
>> and they would be affected by if_then_else generation.
>>
>> So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5
>> would reproduce this.
>> Maybe the Cortex-A5 BRANCH_COST affects things...
>>
> Thanks Kyrill, and sorry for not replying earlier.
> Yes something like that.
>
> The configuration in question is configured with:
> -target=arm-none-linux-gnueabihf --with-float=hard
> --enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5
> --with-fpu=vfpv3-d16-fp16
>
> If you can't easily rebuild the whole toolchain, you can use an
> arm-linux-gnueabihf toolchain and
> use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard.
The -mfpu is overridden in the testcase to add the ARMv8 instructions.
So to reproduce the compilation in that testcase you'd want -mfpu=fp-armv8 or
something equivalent rather than vfpv3-d16-fp16.
Thanks,
Kyrill
>
> It's not the first time that Cortex-A5's BRANCH_COST implies
> changes on a testcase.
>
> Christophe
>
>> Kyrill
>>
>>
>>
>>> Segher
>>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 16:36 ` Kyrill Tkachov
@ 2017-01-24 16:36 ` Bernd Schmidt
2017-01-24 16:55 ` Kyrill Tkachov
0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 16:36 UTC (permalink / raw)
To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches
On 01/24/2017 05:30 PM, Kyrill Tkachov wrote:
>
> The -mfpu is overridden in the testcase to add the ARMv8 instructions.
> So to reproduce the compilation in that testcase you'd want
> -mfpu=fp-armv8 or
> something equivalent rather than vfpv3-d16-fp16.
Exact steps please. No one who's not well-versed in all the ARM variants
will be able to figure this out. I've been able to generate identical
before/after code, both with and without vselvs.f64 instructions, after
trying out a number of switch combinations, but I've not been able to
find a way to show where the patch makes a difference.
Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 16:36 ` Bernd Schmidt
@ 2017-01-24 16:55 ` Kyrill Tkachov
2017-01-24 17:03 ` Bernd Schmidt
0 siblings, 1 reply; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-24 16:55 UTC (permalink / raw)
To: Bernd Schmidt, Kyrill Tkachov, Christophe Lyon
Cc: Segher Boessenkool, GCC Patches
On 24/01/17 16:36, Bernd Schmidt wrote:
> On 01/24/2017 05:30 PM, Kyrill Tkachov wrote:
>>
>> The -mfpu is overridden in the testcase to add the ARMv8 instructions.
>> So to reproduce the compilation in that testcase you'd want
>> -mfpu=fp-armv8 or
>> something equivalent rather than vfpv3-d16-fp16.
>
> Exact steps please. No one who's not well-versed in all the ARM variants will be able to figure this out. I've been able to generate identical before/after code, both with and without vselvs.f64 instructions, after trying out a number of
> switch combinations, but I've not been able to find a way to show where the patch makes a difference.
>
I was just off-hand trying to give the options that would be
expected to be exercised in this testcase.
Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get
the test failing before and after the patch. The code generated is
vcmp.f64 d0, d1
vmrs APSR_nzcv, FPSCR
vmovvs.f64 d0, d1
bx lr
whereas the desired (e.g. with -mcpu=cortex-a57) is:
vcmp.f64 d0, d1
vmrs APSR_nzcv, FPSCR
vselvs.f64 d0, d1, d0
bx lr
Given that VSEL is an ARMv8-A instruction and Cortex-A5 is an ARMv7-A cpu it doesn't make much sense to try getting
it to generate that VSEL. So maybe we should just include an explicit -mtune=cortex-a57 to the testcases.
Thanks,
Kyrill
>
> Bernd
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 16:55 ` Kyrill Tkachov
@ 2017-01-24 17:03 ` Bernd Schmidt
2017-01-24 17:08 ` Christophe Lyon
0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 17:03 UTC (permalink / raw)
To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches
On 01/24/2017 05:50 PM, Kyrill Tkachov wrote:
>
> Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S
> -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get
> the test failing before and after the patch. The code generated is
> vcmp.f64 d0, d1
> vmrs APSR_nzcv, FPSCR
> vmovvs.f64 d0, d1
> bx lr
>
> whereas the desired (e.g. with -mcpu=cortex-a57) is:
> vcmp.f64 d0, d1
> vmrs APSR_nzcv, FPSCR
> vselvs.f64 d0, d1, d0
> bx lr
Yes, I've seen both of these generated with different options, but the
patch did not make a difference here either.
For the moment I'll assume this was a false alarm, i.e. Christophe
misidentified the patch and something else went wrong.
Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 17:03 ` Bernd Schmidt
@ 2017-01-24 17:08 ` Christophe Lyon
2017-01-24 17:22 ` Bernd Schmidt
0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-24 17:08 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches
On 24 January 2017 at 17:55, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 01/24/2017 05:50 PM, Kyrill Tkachov wrote:
>>
>>
>> Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S
>> -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get
>> the test failing before and after the patch. The code generated is
>> vcmp.f64 d0, d1
>> vmrs APSR_nzcv, FPSCR
>> vmovvs.f64 d0, d1
>> bx lr
>>
>> whereas the desired (e.g. with -mcpu=cortex-a57) is:
>> vcmp.f64 d0, d1
>> vmrs APSR_nzcv, FPSCR
>> vselvs.f64 d0, d1, d0
>> bx lr
>
>
> Yes, I've seen both of these generated with different options, but the patch
> did not make a difference here either.
>
> For the moment I'll assume this was a false alarm, i.e. Christophe
> misidentified the patch and something else went wrong.
>
Ha... the regression occurred between r 244818 and r 244816,
and I read r 244816 ChangeLog too quickly and did not notice
it was modifying ifcvt.c in addition to x86-only files.
So it's likely that it's your other patch for pr78634
that caused the regression I mentioned. Does it make
more sense?
Sorry for the probably wrong identification.
It always takes some time for me to reproduce regressions
manually because I can only keep logs/results of
validations, the toolchains actually built are deleted
once validation completes.
Christophe
>
> Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 17:08 ` Christophe Lyon
@ 2017-01-24 17:22 ` Bernd Schmidt
2017-01-25 9:06 ` Christophe Lyon
0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-24 17:22 UTC (permalink / raw)
To: Christophe Lyon; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches
On 01/24/2017 06:03 PM, Christophe Lyon wrote:
> Ha... the regression occurred between r 244818 and r 244816,
> and I read r 244816 ChangeLog too quickly and did not notice
> it was modifying ifcvt.c in addition to x86-only files.
>
> So it's likely that it's your other patch for pr78634
> that caused the regression I mentioned. Does it make
> more sense?
That's possible. That added a missing cost check, so the question
becomes - is the change in generated assembly sensible, given the
selected CPU type?
Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-24 17:22 ` Bernd Schmidt
@ 2017-01-25 9:06 ` Christophe Lyon
2017-01-25 9:21 ` Kyrill Tkachov
0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-25 9:06 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches
On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>
>> Ha... the regression occurred between r 244818 and r 244816,
>> and I read r 244816 ChangeLog too quickly and did not notice
>> it was modifying ifcvt.c in addition to x86-only files.
>>
>> So it's likely that it's your other patch for pr78634
>> that caused the regression I mentioned. Does it make
>> more sense?
>
>
> That's possible. That added a missing cost check, so the question becomes -
> is the change in generated assembly sensible, given the selected CPU type?
>
I can now confirm that the change is indeed caused by r244816 (pr78634).
The difference in the generated asm is:
- vmov d17, r0, r1
- vmov d16, r2, r3
- vcmp.f64 d17, d16
+ vmov d16, r0, r1
+ vmov d17, r2, r3
+ vcmp.f64 d16, d17
vmrs APSR_nzcv, FPSCR
- vselvs.f64 d16, d16, d17
+ vmovvs.f64 d16, d17
which, besides swapping d16 and d17, summarizes to
- vselvs.f64 d16, d16, d17
+ vmovvs.f64 d16, d17
I'm not sure if there is a "best" one ?
>
> Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-25 9:06 ` Christophe Lyon
@ 2017-01-25 9:21 ` Kyrill Tkachov
2017-01-25 10:01 ` Christophe Lyon
2017-01-25 14:58 ` Bernd Schmidt
0 siblings, 2 replies; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-25 9:21 UTC (permalink / raw)
To: Christophe Lyon, Bernd Schmidt; +Cc: Segher Boessenkool, GCC Patches
On 25/01/17 08:53, Christophe Lyon wrote:
> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>> Ha... the regression occurred between r 244818 and r 244816,
>>> and I read r 244816 ChangeLog too quickly and did not notice
>>> it was modifying ifcvt.c in addition to x86-only files.
>>>
>>> So it's likely that it's your other patch for pr78634
>>> that caused the regression I mentioned. Does it make
>>> more sense?
>>
>> That's possible. That added a missing cost check, so the question becomes -
>> is the change in generated assembly sensible, given the selected CPU type?
>>
> I can now confirm that the change is indeed caused by r244816 (pr78634).
> The difference in the generated asm is:
> - vmov d17, r0, r1
> - vmov d16, r2, r3
> - vcmp.f64 d17, d16
> + vmov d16, r0, r1
> + vmov d17, r2, r3
> + vcmp.f64 d16, d17
> vmrs APSR_nzcv, FPSCR
> - vselvs.f64 d16, d16, d17
> + vmovvs.f64 d16, d17
>
> which, besides swapping d16 and d17, summarizes to
> - vselvs.f64 d16, d16, d17
> + vmovvs.f64 d16, d17
>
> I'm not sure if there is a "best" one ?
>
The test is supposed to test the generation of the vsel instruction.
I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
VSEL isn't actually available on Cortex-A5, it's just enabled by the -mfpu=fp-armv8 option.
A more realistic configuration would target an ARMv8-A CPU like the Cortex-A57.
Thanks,
Kyrill
>> Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-25 9:21 ` Kyrill Tkachov
@ 2017-01-25 10:01 ` Christophe Lyon
2017-01-25 10:34 ` Richard Earnshaw (lists)
2017-01-25 14:58 ` Bernd Schmidt
1 sibling, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-25 10:01 UTC (permalink / raw)
To: Kyrill Tkachov; +Cc: Bernd Schmidt, Segher Boessenkool, GCC Patches
On 25 January 2017 at 10:18, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>
> On 25/01/17 08:53, Christophe Lyon wrote:
>>
>> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
>>>
>>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>>>
>>>> Ha... the regression occurred between r 244818 and r 244816,
>>>> and I read r 244816 ChangeLog too quickly and did not notice
>>>> it was modifying ifcvt.c in addition to x86-only files.
>>>>
>>>> So it's likely that it's your other patch for pr78634
>>>> that caused the regression I mentioned. Does it make
>>>> more sense?
>>>
>>>
>>> That's possible. That added a missing cost check, so the question becomes
>>> -
>>> is the change in generated assembly sensible, given the selected CPU
>>> type?
>>>
>> I can now confirm that the change is indeed caused by r244816 (pr78634).
>> The difference in the generated asm is:
>> - vmov d17, r0, r1
>> - vmov d16, r2, r3
>> - vcmp.f64 d17, d16
>> + vmov d16, r0, r1
>> + vmov d17, r2, r3
>> + vcmp.f64 d16, d17
>> vmrs APSR_nzcv, FPSCR
>> - vselvs.f64 d16, d16, d17
>> + vmovvs.f64 d16, d17
>>
>> which, besides swapping d16 and d17, summarizes to
>> - vselvs.f64 d16, d16, d17
>> + vmovvs.f64 d16, d17
>>
>> I'm not sure if there is a "best" one ?
>>
>
> The test is supposed to test the generation of the vsel instruction.
> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
> VSEL isn't actually available on Cortex-A5, it's just enabled by the
> -mfpu=fp-armv8 option.
> A more realistic configuration would target an ARMv8-A CPU like the
> Cortex-A57.
>
Yes indeed, it's always confusing to be able to provide "incompatible"
-mcpu and -mfpu flags (as in: no such combination actually exists).
> Thanks,
> Kyrill
>
>>> Bernd
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-25 10:01 ` Christophe Lyon
@ 2017-01-25 10:34 ` Richard Earnshaw (lists)
0 siblings, 0 replies; 23+ messages in thread
From: Richard Earnshaw (lists) @ 2017-01-25 10:34 UTC (permalink / raw)
To: Christophe Lyon, Kyrill Tkachov
Cc: Bernd Schmidt, Segher Boessenkool, GCC Patches
On 25/01/17 09:29, Christophe Lyon wrote:
> On 25 January 2017 at 10:18, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>>
>> On 25/01/17 08:53, Christophe Lyon wrote:
>>>
>>> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote:
>>>>
>>>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>>>>
>>>>> Ha... the regression occurred between r 244818 and r 244816,
>>>>> and I read r 244816 ChangeLog too quickly and did not notice
>>>>> it was modifying ifcvt.c in addition to x86-only files.
>>>>>
>>>>> So it's likely that it's your other patch for pr78634
>>>>> that caused the regression I mentioned. Does it make
>>>>> more sense?
>>>>
>>>>
>>>> That's possible. That added a missing cost check, so the question becomes
>>>> -
>>>> is the change in generated assembly sensible, given the selected CPU
>>>> type?
>>>>
>>> I can now confirm that the change is indeed caused by r244816 (pr78634).
>>> The difference in the generated asm is:
>>> - vmov d17, r0, r1
>>> - vmov d16, r2, r3
>>> - vcmp.f64 d17, d16
>>> + vmov d16, r0, r1
>>> + vmov d17, r2, r3
>>> + vcmp.f64 d16, d17
>>> vmrs APSR_nzcv, FPSCR
>>> - vselvs.f64 d16, d16, d17
>>> + vmovvs.f64 d16, d17
>>>
>>> which, besides swapping d16 and d17, summarizes to
>>> - vselvs.f64 d16, d16, d17
>>> + vmovvs.f64 d16, d17
>>>
>>> I'm not sure if there is a "best" one ?
>>>
>>
>> The test is supposed to test the generation of the vsel instruction.
>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>> -mfpu=fp-armv8 option.
>> A more realistic configuration would target an ARMv8-A CPU like the
>> Cortex-A57.
>>
>
> Yes indeed, it's always confusing to be able to provide "incompatible"
> -mcpu and -mfpu flags (as in: no such combination actually exists).
>
As discussed at the Cauldron, I'm working on fixing that, but it won't
be until GCC-8 now.
R.
>
>> Thanks,
>> Kyrill
>>
>>>> Bernd
>>
>>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-25 9:21 ` Kyrill Tkachov
2017-01-25 10:01 ` Christophe Lyon
@ 2017-01-25 14:58 ` Bernd Schmidt
2017-01-25 15:25 ` Christophe Lyon
1 sibling, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-25 14:58 UTC (permalink / raw)
To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches
On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:
> The test is supposed to test the generation of the vsel instruction.
> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
> VSEL isn't actually available on Cortex-A5, it's just enabled by the
> -mfpu=fp-armv8 option.
> A more realistic configuration would target an ARMv8-A CPU like the
> Cortex-A57.
Ok, let me know if there's anything else you need from my side.
Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-25 14:58 ` Bernd Schmidt
@ 2017-01-25 15:25 ` Christophe Lyon
2017-01-25 15:43 ` Kyrill Tkachov
0 siblings, 1 reply; 23+ messages in thread
From: Christophe Lyon @ 2017-01-25 15:25 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches
[-- Attachment #1: Type: text/plain, Size: 1149 bytes --]
On 25 January 2017 at 15:55, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:
>>
>> The test is supposed to test the generation of the vsel instruction.
>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>> -mfpu=fp-armv8 option.
>> A more realistic configuration would target an ARMv8-A CPU like the
>> Cortex-A57.
>
>
> Ok, let me know if there's anything else you need from my side.
>
Kyrill,
How about the attached patch?
I've added dg-require-effective-target arm_arch_v8a_ok to make sure
it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57
to make sure the intended instructions are present (in case at some
point add-options-for-arm-arch-v8a activates costs/arch variant that
would imply not generating vsel anymore).
I've noticed there are other tests adding arm_v8_vfp and not making
sure to select an appriopriate cpu. As a follow-up patch?
And I checked that my patch makes the tests pass again even
when configuring --with-cpu=cortex-a5.
Thanks,
Christophe
>
> Bernd
>
[-- Attachment #2: vsel.chlog.txt --]
[-- Type: text/plain, Size: 764 bytes --]
gcc/testsuite/ChangeLog:
2017-01-25 Christophe Lyon <christophe.lyon@linaro.org>
* gcc.target/arm/vseleqdf.c: Require arm_arch_v8a_ok, add
-mcpu=cortex-a57.
* gcc.target/arm/vseleqsf.c: Likewise.
* gcc.target/arm/vselgedf.c: Likewise.
* gcc.target/arm/vselgesf.c: Likewise.
* gcc.target/arm/vselgtdf.c: Likewise.
* gcc.target/arm/vselgtsf.c: Likewise.
* gcc.target/arm/vselledf.c: Likewise.
* gcc.target/arm/vsellesf.c: Likewise.
* gcc.target/arm/vselltdf.c: Likewise.
* gcc.target/arm/vselltsf.c: Likewise.
* gcc.target/arm/vselnedf.c: Likewise.
* gcc.target/arm/vselnesf.c: Likewise.
* gcc.target/arm/vselvcdf.c: Likewise.
* gcc.target/arm/vselvcsf.c: Likewise.
* gcc.target/arm/vselvsdf.c: Likewise.
* gcc.target/arm/vselvssf.c: Likewise.
[-- Attachment #3: vsel.patch.txt --]
[-- Type: text/plain, Size: 7736 bytes --]
diff --git a/gcc/testsuite/gcc.target/arm/vseleqdf.c b/gcc/testsuite/gcc.target/arm/vseleqdf.c
index 86e147b..64d5784 100644
--- a/gcc/testsuite/gcc.target/arm/vseleqdf.c
+++ b/gcc/testsuite/gcc.target/arm/vseleqdf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vseleqsf.c b/gcc/testsuite/gcc.target/arm/vseleqsf.c
index 120f44b..b052704 100644
--- a/gcc/testsuite/gcc.target/arm/vseleqsf.c
+++ b/gcc/testsuite/gcc.target/arm/vseleqsf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
diff --git a/gcc/testsuite/gcc.target/arm/vselgedf.c b/gcc/testsuite/gcc.target/arm/vselgedf.c
index cea08d1..e10508f 100644
--- a/gcc/testsuite/gcc.target/arm/vselgedf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgedf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vselgesf.c b/gcc/testsuite/gcc.target/arm/vselgesf.c
index 86f2a04..645cf5d 100644
--- a/gcc/testsuite/gcc.target/arm/vselgesf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgesf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
diff --git a/gcc/testsuite/gcc.target/arm/vselgtdf.c b/gcc/testsuite/gcc.target/arm/vselgtdf.c
index 2c4a6ba..741b9a8 100644
--- a/gcc/testsuite/gcc.target/arm/vselgtdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgtdf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vselgtsf.c b/gcc/testsuite/gcc.target/arm/vselgtsf.c
index 388e74c..3042c5b 100644
--- a/gcc/testsuite/gcc.target/arm/vselgtsf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgtsf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
diff --git a/gcc/testsuite/gcc.target/arm/vselledf.c b/gcc/testsuite/gcc.target/arm/vselledf.c
index 088dc04..dcf46a3 100644
--- a/gcc/testsuite/gcc.target/arm/vselledf.c
+++ b/gcc/testsuite/gcc.target/arm/vselledf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vsellesf.c b/gcc/testsuite/gcc.target/arm/vsellesf.c
index d0afdbc..38b06eb 100644
--- a/gcc/testsuite/gcc.target/arm/vsellesf.c
+++ b/gcc/testsuite/gcc.target/arm/vsellesf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
diff --git a/gcc/testsuite/gcc.target/arm/vselltdf.c b/gcc/testsuite/gcc.target/arm/vselltdf.c
index fbcb9ea..4d4d91c 100644
--- a/gcc/testsuite/gcc.target/arm/vselltdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselltdf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vselltsf.c b/gcc/testsuite/gcc.target/arm/vselltsf.c
index 959dab7..ab3f77f 100644
--- a/gcc/testsuite/gcc.target/arm/vselltsf.c
+++ b/gcc/testsuite/gcc.target/arm/vselltsf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
diff --git a/gcc/testsuite/gcc.target/arm/vselnedf.c b/gcc/testsuite/gcc.target/arm/vselnedf.c
index cf67f29..4b0fa5e 100644
--- a/gcc/testsuite/gcc.target/arm/vselnedf.c
+++ b/gcc/testsuite/gcc.target/arm/vselnedf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vselnesf.c b/gcc/testsuite/gcc.target/arm/vselnesf.c
index 2e16423..4fcd5a0 100644
--- a/gcc/testsuite/gcc.target/arm/vselnesf.c
+++ b/gcc/testsuite/gcc.target/arm/vselnesf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
diff --git a/gcc/testsuite/gcc.target/arm/vselvcdf.c b/gcc/testsuite/gcc.target/arm/vselvcdf.c
index 7f30270..9701e11 100644
--- a/gcc/testsuite/gcc.target/arm/vselvcdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvcdf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vselvcsf.c b/gcc/testsuite/gcc.target/arm/vselvcsf.c
index 1bb7369..2d47f30 100644
--- a/gcc/testsuite/gcc.target/arm/vselvcsf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvcsf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
diff --git a/gcc/testsuite/gcc.target/arm/vselvsdf.c b/gcc/testsuite/gcc.target/arm/vselvsdf.c
index 83ad5bf..54555a2 100644
--- a/gcc/testsuite/gcc.target/arm/vselvsdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvsdf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
double
diff --git a/gcc/testsuite/gcc.target/arm/vselvssf.c b/gcc/testsuite/gcc.target/arm/vselvssf.c
index 7d76289..a37305d 100644
--- a/gcc/testsuite/gcc.target/arm/vselvssf.c
+++ b/gcc/testsuite/gcc.target/arm/vselvssf.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok */
/* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
/* { dg-add-options arm_v8_vfp } */
float
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-25 15:25 ` Christophe Lyon
@ 2017-01-25 15:43 ` Kyrill Tkachov
0 siblings, 0 replies; 23+ messages in thread
From: Kyrill Tkachov @ 2017-01-25 15:43 UTC (permalink / raw)
To: Christophe Lyon, Bernd Schmidt; +Cc: Segher Boessenkool, GCC Patches
On 25/01/17 15:20, Christophe Lyon wrote:
> On 25 January 2017 at 15:55, Bernd Schmidt <bschmidt@redhat.com> wrote:
>> On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:
>>> The test is supposed to test the generation of the vsel instruction.
>>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>>> -mfpu=fp-armv8 option.
>>> A more realistic configuration would target an ARMv8-A CPU like the
>>> Cortex-A57.
>>
>> Ok, let me know if there's anything else you need from my side.
>>
> Kyrill,
>
> How about the attached patch?
Yes, thanks Christophe.
> I've added dg-require-effective-target arm_arch_v8a_ok to make sure
> it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57
> to make sure the intended instructions are present (in case at some
> point add-options-for-arm-arch-v8a activates costs/arch variant that
> would imply not generating vsel anymore).
>
> I've noticed there are other tests adding arm_v8_vfp and not making
> sure to select an appriopriate cpu. As a follow-up patch?
I wouldn't want to do that too much in the testsuite.
In the VSEL tests we have a C-level idiom (?: construct) that we expect
the optimisers to transform into a conditional select instruction that may or may not
be a win on some cores.
In some of those other tests I suspect we want to generate the instruction for all tunings.
I.e. I'd expect the rounding tests (__builtin_floor/trunc etc) to always generate VRINT*
when the -mfpu allows it, regardless of the CPU tuning.
Kyrill
> And I checked that my patch makes the tests pass again even
> when configuring --with-cpu=cortex-a5.
> Thanks,
>
> Christophe
>
>> Bernd
>>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-20 19:30 ` Segher Boessenkool
2017-01-24 9:38 ` Christophe Lyon
@ 2017-01-25 20:54 ` Segher Boessenkool
2017-01-26 13:04 ` Bernd Schmidt
1 sibling, 1 reply; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-25 20:54 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
On Fri, Jan 20, 2017 at 01:24:15PM -0600, Segher Boessenkool wrote:
> On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote:
> > So, when looking for situations where we have only one condition, we can
> > try to undo the conversion of a plain REG into a condition, on the
> > grounds that this is probably less helpful.
> >
> > This seems to cure the testcase, but Segher also has a patch in the PR
> > that looks like a good and more direct approach. IMO both should be
> > applied. This one was bootstrapped and tested on x86_64-linux. Ok?
>
> My patch does not cure all problems, it simply simplifies things a bit
> better; but the same is true for your patch if I read it correctly.
>
> Okay for trunk, and I'll do my half as well. Thanks,
It turns out my patch (see the PR) causes (or at least triggers)
miscompilations on tilegx. I will drop it for now.
Longer term we will have to fix this whole if_then_else_cond business,
maybe make it less expensive and/or more effective as well.
Segher
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-25 20:54 ` Segher Boessenkool
@ 2017-01-26 13:04 ` Bernd Schmidt
2017-01-26 19:05 ` Segher Boessenkool
0 siblings, 1 reply; 23+ messages in thread
From: Bernd Schmidt @ 2017-01-26 13:04 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: GCC Patches
On 01/25/2017 08:46 PM, Segher Boessenkool wrote:
>
> It turns out my patch (see the PR) causes (or at least triggers)
> miscompilations on tilegx. I will drop it for now.
Curious, it looked very reasonable. What's needed to reproduce this?
Bernd
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond
2017-01-26 13:04 ` Bernd Schmidt
@ 2017-01-26 19:05 ` Segher Boessenkool
0 siblings, 0 replies; 23+ messages in thread
From: Segher Boessenkool @ 2017-01-26 19:05 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
On Thu, Jan 26, 2017 at 01:57:02PM +0100, Bernd Schmidt wrote:
> On 01/25/2017 08:46 PM, Segher Boessenkool wrote:
> >
> >It turns out my patch (see the PR) causes (or at least triggers)
> >miscompilations on tilegx. I will drop it for now.
>
> Curious, it looked very reasonable. What's needed to reproduce this?
Yeah, quite curious.
I just build a cross compiler (and binutils) targeting tilegx-linux,
and build a Linux kernel (ARCH=tile, defconfig) with that. In many
places where there are two independent conditional jumps the generated
code before the patch combines the two conditions and does one jump;
the code after the patch completely loses the second condition:
-{ cmpeqi r10, r3, 14 ; cmpeqi r11, r3, -1 }
-{ or r10, r10, r11 }
-{ beqzt r10, fffffff700031ba8 <KBacktraceIterator_restart+0x4f0> }
+{ cmpeqi r10, r3, 14 }
+{ beqzt r10, fffffff700031ba0 <KBacktraceIterator_restart+0x4e8> }
This happens quite a few times. Sometimes a lot of code is deleted.
It may well be a target problem, but it is stage 4, and I do not
currently have the bandwidth to investigate this. If you do, please
do :-)
Segher
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2017-01-26 18:57 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-20 12:36 Improve things for PR71724, in combine/if_then_else_cond Bernd Schmidt
2017-01-20 19:30 ` Segher Boessenkool
2017-01-24 9:38 ` Christophe Lyon
2017-01-24 12:52 ` Bernd Schmidt
2017-01-24 15:30 ` Segher Boessenkool
2017-01-24 16:28 ` Kyrill Tkachov
2017-01-24 16:28 ` Christophe Lyon
2017-01-24 16:36 ` Kyrill Tkachov
2017-01-24 16:36 ` Bernd Schmidt
2017-01-24 16:55 ` Kyrill Tkachov
2017-01-24 17:03 ` Bernd Schmidt
2017-01-24 17:08 ` Christophe Lyon
2017-01-24 17:22 ` Bernd Schmidt
2017-01-25 9:06 ` Christophe Lyon
2017-01-25 9:21 ` Kyrill Tkachov
2017-01-25 10:01 ` Christophe Lyon
2017-01-25 10:34 ` Richard Earnshaw (lists)
2017-01-25 14:58 ` Bernd Schmidt
2017-01-25 15:25 ` Christophe Lyon
2017-01-25 15:43 ` Kyrill Tkachov
2017-01-25 20:54 ` Segher Boessenkool
2017-01-26 13:04 ` Bernd Schmidt
2017-01-26 19:05 ` Segher Boessenkool
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).