* Improve things for PR71724, in combine/if_then_else_cond @ 2017-01-20 12:36 Bernd Schmidt 2017-01-20 19:30 ` Segher Boessenkool 0 siblings, 1 reply; 23+ messages in thread From: Bernd Schmidt @ 2017-01-20 12:36 UTC (permalink / raw) To: GCC Patches; +Cc: Segher Boessenkool [-- Attachment #1: Type: text/plain, Size: 1137 bytes --] The PR is about infinite recursion in combine_simplify_rtx, because if_then_else_cond does strange things to an expression, and we end up simplifying something to itself. The patch below tries to address this by improving that function a little. As stated in the PR, the situation is that we have (plus:SI (if_then_else:SI (eq (reg:CC 185) (const_int 0 [0])) (reg:SI 165) (reg:SI 174 [ t9.0_1+4 ])) (reg:SI 165)) Reg 165 is known to be zero or one, so it gets turned into a condition, and we have two different conditions on the operands. That causes us to fail to make the fairly obvious transformation to cond = reg:CC 185 true_rtx = (plus r165 r165) false_rtx = (plus r174 r165) So, when looking for situations where we have only one condition, we can try to undo the conversion of a plain REG into a condition, on the grounds that this is probably less helpful. This seems to cure the testcase, but Segher also has a patch in the PR that looks like a good and more direct approach. IMO both should be applied. This one was bootstrapped and tested on x86_64-linux. Ok? Bernd [-- Attachment #2: 71724-a.diff --] [-- Type: text/x-patch, Size: 1518 bytes --] PR rtl-optimization/71724 * combine.c (if_then_else_cond): Look for situations where it is beneficial to undo the work of one of the recursive calls. Index: gcc/combine.c =================================================================== --- gcc/combine.c (revision 244528) +++ gcc/combine.c (working copy) @@ -9044,11 +9044,31 @@ if_then_else_cond (rtx x, rtx *ptrue, rt the same value, compute the new true and false values. */ else if (BINARY_P (x)) { - cond0 = if_then_else_cond (XEXP (x, 0), &true0, &false0); - cond1 = if_then_else_cond (XEXP (x, 1), &true1, &false1); + rtx op0 = XEXP (x, 0); + rtx op1 = XEXP (x, 1); + cond0 = if_then_else_cond (op0, &true0, &false0); + cond1 = if_then_else_cond (op1, &true1, &false1); + + if ((cond0 != 0 && cond1 != 0 && !rtx_equal_p (cond0, cond1)) + && (REG_P (op0) || REG_P (op1))) + { + /* Try to enable a simplification by undoing work done by + if_then_else_cond if it converted a REG into something more + complex. */ + if (REG_P (op0)) + { + cond0 = 0; + true0 = false0 = op0; + } + else + { + cond1 = 0; + true1 = false1 = op1; + } + } if ((cond0 != 0 || cond1 != 0) - && ! (cond0 != 0 && cond1 != 0 && ! rtx_equal_p (cond0, cond1))) + && ! (cond0 != 0 && cond1 != 0 && !rtx_equal_p (cond0, cond1))) { /* If if_then_else_cond returned zero, then true/false are the same rtl. We must copy one of them to prevent invalid rtl ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-20 12:36 Improve things for PR71724, in combine/if_then_else_cond Bernd Schmidt @ 2017-01-20 19:30 ` Segher Boessenkool 2017-01-24 9:38 ` Christophe Lyon 2017-01-25 20:54 ` Segher Boessenkool 0 siblings, 2 replies; 23+ messages in thread From: Segher Boessenkool @ 2017-01-20 19:30 UTC (permalink / raw) To: Bernd Schmidt; +Cc: GCC Patches Hi Bernd, On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote: > So, when looking for situations where we have only one condition, we can > try to undo the conversion of a plain REG into a condition, on the > grounds that this is probably less helpful. > > This seems to cure the testcase, but Segher also has a patch in the PR > that looks like a good and more direct approach. IMO both should be > applied. This one was bootstrapped and tested on x86_64-linux. Ok? My patch does not cure all problems, it simply simplifies things a bit better; but the same is true for your patch if I read it correctly. Okay for trunk, and I'll do my half as well. Thanks, Segher > PR rtl-optimization/71724 > * combine.c (if_then_else_cond): Look for situations where it is > beneficial to undo the work of one of the recursive calls. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-20 19:30 ` Segher Boessenkool @ 2017-01-24 9:38 ` Christophe Lyon 2017-01-24 12:52 ` Bernd Schmidt 2017-01-25 20:54 ` Segher Boessenkool 1 sibling, 1 reply; 23+ messages in thread From: Christophe Lyon @ 2017-01-24 9:38 UTC (permalink / raw) To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches On 20 January 2017 at 20:24, Segher Boessenkool <segher@kernel.crashing.org> wrote: > Hi Bernd, > > On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote: >> So, when looking for situations where we have only one condition, we can >> try to undo the conversion of a plain REG into a condition, on the >> grounds that this is probably less helpful. >> >> This seems to cure the testcase, but Segher also has a patch in the PR >> that looks like a good and more direct approach. IMO both should be >> applied. This one was bootstrapped and tested on x86_64-linux. Ok? > > My patch does not cure all problems, it simply simplifies things a bit > better; but the same is true for your patch if I read it correctly. > > Okay for trunk, and I'll do my half as well. Thanks, > > Hi, It seems that Bernd's patch causes regressions on arm-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 Maybe the upcoming patch from Segher intends to address this? Thanks, Christophe > Segher > > >> PR rtl-optimization/71724 >> * combine.c (if_then_else_cond): Look for situations where it is >> beneficial to undo the work of one of the recursive calls. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 9:38 ` Christophe Lyon @ 2017-01-24 12:52 ` Bernd Schmidt 2017-01-24 15:30 ` Segher Boessenkool 0 siblings, 1 reply; 23+ messages in thread From: Bernd Schmidt @ 2017-01-24 12:52 UTC (permalink / raw) To: Christophe Lyon, Segher Boessenkool; +Cc: GCC Patches On 01/24/2017 09:38 AM, Christophe Lyon wrote: > It seems that Bernd's patch causes regressions on arm-linux-gnueabihf > --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: > > gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 > gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 > gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 > gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 > > Maybe the upcoming patch from Segher intends to address this? Not really. How exactly do I reproduce this - can you give me a command line and expected assembly output (I'm having some trouble figuring out the right set of switches)? Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 12:52 ` Bernd Schmidt @ 2017-01-24 15:30 ` Segher Boessenkool 2017-01-24 16:28 ` Kyrill Tkachov 0 siblings, 1 reply; 23+ messages in thread From: Segher Boessenkool @ 2017-01-24 15:30 UTC (permalink / raw) To: Bernd Schmidt; +Cc: Christophe Lyon, GCC Patches On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: > On 01/24/2017 09:38 AM, Christophe Lyon wrote: > >It seems that Bernd's patch causes regressions on arm-linux-gnueabihf > >--with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: > > > > gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 > > gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 > > gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 > > gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 > > > >Maybe the upcoming patch from Segher intends to address this? > > Not really. Not at all, even, first I hear of this. Please open a PR. > How exactly do I reproduce this - can you give me a command > line and expected assembly output (I'm having some trouble figuring out > the right set of switches)? Same here. Segher ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 15:30 ` Segher Boessenkool @ 2017-01-24 16:28 ` Kyrill Tkachov 2017-01-24 16:28 ` Christophe Lyon 0 siblings, 1 reply; 23+ messages in thread From: Kyrill Tkachov @ 2017-01-24 16:28 UTC (permalink / raw) To: Segher Boessenkool, Bernd Schmidt; +Cc: Christophe Lyon, GCC Patches On 24/01/17 15:21, Segher Boessenkool wrote: > On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: >> On 01/24/2017 09:38 AM, Christophe Lyon wrote: >>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf >>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: >>> >>> gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 >>> gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 >>> gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 >>> gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 >>> >>> Maybe the upcoming patch from Segher intends to address this? >> Not really. > Not at all, even, first I hear of this. Please open a PR. > >> How exactly do I reproduce this - can you give me a command >> line and expected assembly output (I'm having some trouble figuring out >> the right set of switches)? > Same here. I haven't looked at these in detail but these are the ARMv8 FP conditional select tests and they would be affected by if_then_else generation. So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 would reproduce this. Maybe the Cortex-A5 BRANCH_COST affects things... Kyrill > > Segher ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 16:28 ` Kyrill Tkachov @ 2017-01-24 16:28 ` Christophe Lyon 2017-01-24 16:36 ` Kyrill Tkachov 0 siblings, 1 reply; 23+ messages in thread From: Christophe Lyon @ 2017-01-24 16:28 UTC (permalink / raw) To: Kyrill Tkachov; +Cc: Segher Boessenkool, Bernd Schmidt, GCC Patches On 24 January 2017 at 17:02, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > > On 24/01/17 15:21, Segher Boessenkool wrote: >> >> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: >>> >>> On 01/24/2017 09:38 AM, Christophe Lyon wrote: >>>> >>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf >>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: >>>> >>>> gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 >>>> gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 >>>> gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 >>>> gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 >>>> >>>> Maybe the upcoming patch from Segher intends to address this? >>> >>> Not really. >> >> Not at all, even, first I hear of this. Please open a PR. >> >>> How exactly do I reproduce this - can you give me a command >>> line and expected assembly output (I'm having some trouble figuring out >>> the right set of switches)? >> >> Same here. > > > I haven't looked at these in detail but these are the ARMv8 FP conditional > select tests > and they would be affected by if_then_else generation. > > So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 > would reproduce this. > Maybe the Cortex-A5 BRANCH_COST affects things... > Thanks Kyrill, and sorry for not replying earlier. Yes something like that. The configuration in question is configured with: -target=arm-none-linux-gnueabihf --with-float=hard --enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16 If you can't easily rebuild the whole toolchain, you can use an arm-linux-gnueabihf toolchain and use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard. It's not the first time that Cortex-A5's BRANCH_COST implies changes on a testcase. Christophe > Kyrill > > > >> >> Segher > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 16:28 ` Christophe Lyon @ 2017-01-24 16:36 ` Kyrill Tkachov 2017-01-24 16:36 ` Bernd Schmidt 0 siblings, 1 reply; 23+ messages in thread From: Kyrill Tkachov @ 2017-01-24 16:36 UTC (permalink / raw) To: Christophe Lyon; +Cc: Segher Boessenkool, Bernd Schmidt, GCC Patches On 24/01/17 16:28, Christophe Lyon wrote: > On 24 January 2017 at 17:02, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: >> On 24/01/17 15:21, Segher Boessenkool wrote: >>> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: >>>> On 01/24/2017 09:38 AM, Christophe Lyon wrote: >>>>> It seems that Bernd's patch causes regressions on arm-linux-gnueabihf >>>>> --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: >>>>> >>>>> gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 >>>>> gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 >>>>> gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 >>>>> gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 >>>>> >>>>> Maybe the upcoming patch from Segher intends to address this? >>>> Not really. >>> Not at all, even, first I hear of this. Please open a PR. >>> >>>> How exactly do I reproduce this - can you give me a command >>>> line and expected assembly output (I'm having some trouble figuring out >>>> the right set of switches)? >>> Same here. >> >> I haven't looked at these in detail but these are the ARMv8 FP conditional >> select tests >> and they would be affected by if_then_else generation. >> >> So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 >> would reproduce this. >> Maybe the Cortex-A5 BRANCH_COST affects things... >> > Thanks Kyrill, and sorry for not replying earlier. > Yes something like that. > > The configuration in question is configured with: > -target=arm-none-linux-gnueabihf --with-float=hard > --enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5 > --with-fpu=vfpv3-d16-fp16 > > If you can't easily rebuild the whole toolchain, you can use an > arm-linux-gnueabihf toolchain and > use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard. The -mfpu is overridden in the testcase to add the ARMv8 instructions. So to reproduce the compilation in that testcase you'd want -mfpu=fp-armv8 or something equivalent rather than vfpv3-d16-fp16. Thanks, Kyrill > > It's not the first time that Cortex-A5's BRANCH_COST implies > changes on a testcase. > > Christophe > >> Kyrill >> >> >> >>> Segher >> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 16:36 ` Kyrill Tkachov @ 2017-01-24 16:36 ` Bernd Schmidt 2017-01-24 16:55 ` Kyrill Tkachov 0 siblings, 1 reply; 23+ messages in thread From: Bernd Schmidt @ 2017-01-24 16:36 UTC (permalink / raw) To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches On 01/24/2017 05:30 PM, Kyrill Tkachov wrote: > > The -mfpu is overridden in the testcase to add the ARMv8 instructions. > So to reproduce the compilation in that testcase you'd want > -mfpu=fp-armv8 or > something equivalent rather than vfpv3-d16-fp16. Exact steps please. No one who's not well-versed in all the ARM variants will be able to figure this out. I've been able to generate identical before/after code, both with and without vselvs.f64 instructions, after trying out a number of switch combinations, but I've not been able to find a way to show where the patch makes a difference. Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 16:36 ` Bernd Schmidt @ 2017-01-24 16:55 ` Kyrill Tkachov 2017-01-24 17:03 ` Bernd Schmidt 0 siblings, 1 reply; 23+ messages in thread From: Kyrill Tkachov @ 2017-01-24 16:55 UTC (permalink / raw) To: Bernd Schmidt, Kyrill Tkachov, Christophe Lyon Cc: Segher Boessenkool, GCC Patches On 24/01/17 16:36, Bernd Schmidt wrote: > On 01/24/2017 05:30 PM, Kyrill Tkachov wrote: >> >> The -mfpu is overridden in the testcase to add the ARMv8 instructions. >> So to reproduce the compilation in that testcase you'd want >> -mfpu=fp-armv8 or >> something equivalent rather than vfpv3-d16-fp16. > > Exact steps please. No one who's not well-versed in all the ARM variants will be able to figure this out. I've been able to generate identical before/after code, both with and without vselvs.f64 instructions, after trying out a number of > switch combinations, but I've not been able to find a way to show where the patch makes a difference. > I was just off-hand trying to give the options that would be expected to be exercised in this testcase. Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get the test failing before and after the patch. The code generated is vcmp.f64 d0, d1 vmrs APSR_nzcv, FPSCR vmovvs.f64 d0, d1 bx lr whereas the desired (e.g. with -mcpu=cortex-a57) is: vcmp.f64 d0, d1 vmrs APSR_nzcv, FPSCR vselvs.f64 d0, d1, d0 bx lr Given that VSEL is an ARMv8-A instruction and Cortex-A5 is an ARMv7-A cpu it doesn't make much sense to try getting it to generate that VSEL. So maybe we should just include an explicit -mtune=cortex-a57 to the testcases. Thanks, Kyrill > > Bernd > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 16:55 ` Kyrill Tkachov @ 2017-01-24 17:03 ` Bernd Schmidt 2017-01-24 17:08 ` Christophe Lyon 0 siblings, 1 reply; 23+ messages in thread From: Bernd Schmidt @ 2017-01-24 17:03 UTC (permalink / raw) To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches On 01/24/2017 05:50 PM, Kyrill Tkachov wrote: > > Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S > -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get > the test failing before and after the patch. The code generated is > vcmp.f64 d0, d1 > vmrs APSR_nzcv, FPSCR > vmovvs.f64 d0, d1 > bx lr > > whereas the desired (e.g. with -mcpu=cortex-a57) is: > vcmp.f64 d0, d1 > vmrs APSR_nzcv, FPSCR > vselvs.f64 d0, d1, d0 > bx lr Yes, I've seen both of these generated with different options, but the patch did not make a difference here either. For the moment I'll assume this was a false alarm, i.e. Christophe misidentified the patch and something else went wrong. Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 17:03 ` Bernd Schmidt @ 2017-01-24 17:08 ` Christophe Lyon 2017-01-24 17:22 ` Bernd Schmidt 0 siblings, 1 reply; 23+ messages in thread From: Christophe Lyon @ 2017-01-24 17:08 UTC (permalink / raw) To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches On 24 January 2017 at 17:55, Bernd Schmidt <bschmidt@redhat.com> wrote: > On 01/24/2017 05:50 PM, Kyrill Tkachov wrote: >> >> >> Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S >> -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get >> the test failing before and after the patch. The code generated is >> vcmp.f64 d0, d1 >> vmrs APSR_nzcv, FPSCR >> vmovvs.f64 d0, d1 >> bx lr >> >> whereas the desired (e.g. with -mcpu=cortex-a57) is: >> vcmp.f64 d0, d1 >> vmrs APSR_nzcv, FPSCR >> vselvs.f64 d0, d1, d0 >> bx lr > > > Yes, I've seen both of these generated with different options, but the patch > did not make a difference here either. > > For the moment I'll assume this was a false alarm, i.e. Christophe > misidentified the patch and something else went wrong. > Ha... the regression occurred between r 244818 and r 244816, and I read r 244816 ChangeLog too quickly and did not notice it was modifying ifcvt.c in addition to x86-only files. So it's likely that it's your other patch for pr78634 that caused the regression I mentioned. Does it make more sense? Sorry for the probably wrong identification. It always takes some time for me to reproduce regressions manually because I can only keep logs/results of validations, the toolchains actually built are deleted once validation completes. Christophe > > Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 17:08 ` Christophe Lyon @ 2017-01-24 17:22 ` Bernd Schmidt 2017-01-25 9:06 ` Christophe Lyon 0 siblings, 1 reply; 23+ messages in thread From: Bernd Schmidt @ 2017-01-24 17:22 UTC (permalink / raw) To: Christophe Lyon; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches On 01/24/2017 06:03 PM, Christophe Lyon wrote: > Ha... the regression occurred between r 244818 and r 244816, > and I read r 244816 ChangeLog too quickly and did not notice > it was modifying ifcvt.c in addition to x86-only files. > > So it's likely that it's your other patch for pr78634 > that caused the regression I mentioned. Does it make > more sense? That's possible. That added a missing cost check, so the question becomes - is the change in generated assembly sensible, given the selected CPU type? Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-24 17:22 ` Bernd Schmidt @ 2017-01-25 9:06 ` Christophe Lyon 2017-01-25 9:21 ` Kyrill Tkachov 0 siblings, 1 reply; 23+ messages in thread From: Christophe Lyon @ 2017-01-25 9:06 UTC (permalink / raw) To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote: > On 01/24/2017 06:03 PM, Christophe Lyon wrote: >> >> Ha... the regression occurred between r 244818 and r 244816, >> and I read r 244816 ChangeLog too quickly and did not notice >> it was modifying ifcvt.c in addition to x86-only files. >> >> So it's likely that it's your other patch for pr78634 >> that caused the regression I mentioned. Does it make >> more sense? > > > That's possible. That added a missing cost check, so the question becomes - > is the change in generated assembly sensible, given the selected CPU type? > I can now confirm that the change is indeed caused by r244816 (pr78634). The difference in the generated asm is: - vmov d17, r0, r1 - vmov d16, r2, r3 - vcmp.f64 d17, d16 + vmov d16, r0, r1 + vmov d17, r2, r3 + vcmp.f64 d16, d17 vmrs APSR_nzcv, FPSCR - vselvs.f64 d16, d16, d17 + vmovvs.f64 d16, d17 which, besides swapping d16 and d17, summarizes to - vselvs.f64 d16, d16, d17 + vmovvs.f64 d16, d17 I'm not sure if there is a "best" one ? > > Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-25 9:06 ` Christophe Lyon @ 2017-01-25 9:21 ` Kyrill Tkachov 2017-01-25 10:01 ` Christophe Lyon 2017-01-25 14:58 ` Bernd Schmidt 0 siblings, 2 replies; 23+ messages in thread From: Kyrill Tkachov @ 2017-01-25 9:21 UTC (permalink / raw) To: Christophe Lyon, Bernd Schmidt; +Cc: Segher Boessenkool, GCC Patches On 25/01/17 08:53, Christophe Lyon wrote: > On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote: >> On 01/24/2017 06:03 PM, Christophe Lyon wrote: >>> Ha... the regression occurred between r 244818 and r 244816, >>> and I read r 244816 ChangeLog too quickly and did not notice >>> it was modifying ifcvt.c in addition to x86-only files. >>> >>> So it's likely that it's your other patch for pr78634 >>> that caused the regression I mentioned. Does it make >>> more sense? >> >> That's possible. That added a missing cost check, so the question becomes - >> is the change in generated assembly sensible, given the selected CPU type? >> > I can now confirm that the change is indeed caused by r244816 (pr78634). > The difference in the generated asm is: > - vmov d17, r0, r1 > - vmov d16, r2, r3 > - vcmp.f64 d17, d16 > + vmov d16, r0, r1 > + vmov d17, r2, r3 > + vcmp.f64 d16, d17 > vmrs APSR_nzcv, FPSCR > - vselvs.f64 d16, d16, d17 > + vmovvs.f64 d16, d17 > > which, besides swapping d16 and d17, summarizes to > - vselvs.f64 d16, d16, d17 > + vmovvs.f64 d16, d17 > > I'm not sure if there is a "best" one ? > The test is supposed to test the generation of the vsel instruction. I believe adding an -mcpu=cortex-a57 to the testcases would be best, as VSEL isn't actually available on Cortex-A5, it's just enabled by the -mfpu=fp-armv8 option. A more realistic configuration would target an ARMv8-A CPU like the Cortex-A57. Thanks, Kyrill >> Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-25 9:21 ` Kyrill Tkachov @ 2017-01-25 10:01 ` Christophe Lyon 2017-01-25 10:34 ` Richard Earnshaw (lists) 2017-01-25 14:58 ` Bernd Schmidt 1 sibling, 1 reply; 23+ messages in thread From: Christophe Lyon @ 2017-01-25 10:01 UTC (permalink / raw) To: Kyrill Tkachov; +Cc: Bernd Schmidt, Segher Boessenkool, GCC Patches On 25 January 2017 at 10:18, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > > On 25/01/17 08:53, Christophe Lyon wrote: >> >> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote: >>> >>> On 01/24/2017 06:03 PM, Christophe Lyon wrote: >>>> >>>> Ha... the regression occurred between r 244818 and r 244816, >>>> and I read r 244816 ChangeLog too quickly and did not notice >>>> it was modifying ifcvt.c in addition to x86-only files. >>>> >>>> So it's likely that it's your other patch for pr78634 >>>> that caused the regression I mentioned. Does it make >>>> more sense? >>> >>> >>> That's possible. That added a missing cost check, so the question becomes >>> - >>> is the change in generated assembly sensible, given the selected CPU >>> type? >>> >> I can now confirm that the change is indeed caused by r244816 (pr78634). >> The difference in the generated asm is: >> - vmov d17, r0, r1 >> - vmov d16, r2, r3 >> - vcmp.f64 d17, d16 >> + vmov d16, r0, r1 >> + vmov d17, r2, r3 >> + vcmp.f64 d16, d17 >> vmrs APSR_nzcv, FPSCR >> - vselvs.f64 d16, d16, d17 >> + vmovvs.f64 d16, d17 >> >> which, besides swapping d16 and d17, summarizes to >> - vselvs.f64 d16, d16, d17 >> + vmovvs.f64 d16, d17 >> >> I'm not sure if there is a "best" one ? >> > > The test is supposed to test the generation of the vsel instruction. > I believe adding an -mcpu=cortex-a57 to the testcases would be best, as > VSEL isn't actually available on Cortex-A5, it's just enabled by the > -mfpu=fp-armv8 option. > A more realistic configuration would target an ARMv8-A CPU like the > Cortex-A57. > Yes indeed, it's always confusing to be able to provide "incompatible" -mcpu and -mfpu flags (as in: no such combination actually exists). > Thanks, > Kyrill > >>> Bernd > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-25 10:01 ` Christophe Lyon @ 2017-01-25 10:34 ` Richard Earnshaw (lists) 0 siblings, 0 replies; 23+ messages in thread From: Richard Earnshaw (lists) @ 2017-01-25 10:34 UTC (permalink / raw) To: Christophe Lyon, Kyrill Tkachov Cc: Bernd Schmidt, Segher Boessenkool, GCC Patches On 25/01/17 09:29, Christophe Lyon wrote: > On 25 January 2017 at 10:18, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: >> >> On 25/01/17 08:53, Christophe Lyon wrote: >>> >>> On 24 January 2017 at 18:15, Bernd Schmidt <bschmidt@redhat.com> wrote: >>>> >>>> On 01/24/2017 06:03 PM, Christophe Lyon wrote: >>>>> >>>>> Ha... the regression occurred between r 244818 and r 244816, >>>>> and I read r 244816 ChangeLog too quickly and did not notice >>>>> it was modifying ifcvt.c in addition to x86-only files. >>>>> >>>>> So it's likely that it's your other patch for pr78634 >>>>> that caused the regression I mentioned. Does it make >>>>> more sense? >>>> >>>> >>>> That's possible. That added a missing cost check, so the question becomes >>>> - >>>> is the change in generated assembly sensible, given the selected CPU >>>> type? >>>> >>> I can now confirm that the change is indeed caused by r244816 (pr78634). >>> The difference in the generated asm is: >>> - vmov d17, r0, r1 >>> - vmov d16, r2, r3 >>> - vcmp.f64 d17, d16 >>> + vmov d16, r0, r1 >>> + vmov d17, r2, r3 >>> + vcmp.f64 d16, d17 >>> vmrs APSR_nzcv, FPSCR >>> - vselvs.f64 d16, d16, d17 >>> + vmovvs.f64 d16, d17 >>> >>> which, besides swapping d16 and d17, summarizes to >>> - vselvs.f64 d16, d16, d17 >>> + vmovvs.f64 d16, d17 >>> >>> I'm not sure if there is a "best" one ? >>> >> >> The test is supposed to test the generation of the vsel instruction. >> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as >> VSEL isn't actually available on Cortex-A5, it's just enabled by the >> -mfpu=fp-armv8 option. >> A more realistic configuration would target an ARMv8-A CPU like the >> Cortex-A57. >> > > Yes indeed, it's always confusing to be able to provide "incompatible" > -mcpu and -mfpu flags (as in: no such combination actually exists). > As discussed at the Cauldron, I'm working on fixing that, but it won't be until GCC-8 now. R. > >> Thanks, >> Kyrill >> >>>> Bernd >> >> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-25 9:21 ` Kyrill Tkachov 2017-01-25 10:01 ` Christophe Lyon @ 2017-01-25 14:58 ` Bernd Schmidt 2017-01-25 15:25 ` Christophe Lyon 1 sibling, 1 reply; 23+ messages in thread From: Bernd Schmidt @ 2017-01-25 14:58 UTC (permalink / raw) To: Kyrill Tkachov, Christophe Lyon; +Cc: Segher Boessenkool, GCC Patches On 01/25/2017 10:18 AM, Kyrill Tkachov wrote: > The test is supposed to test the generation of the vsel instruction. > I believe adding an -mcpu=cortex-a57 to the testcases would be best, as > VSEL isn't actually available on Cortex-A5, it's just enabled by the > -mfpu=fp-armv8 option. > A more realistic configuration would target an ARMv8-A CPU like the > Cortex-A57. Ok, let me know if there's anything else you need from my side. Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-25 14:58 ` Bernd Schmidt @ 2017-01-25 15:25 ` Christophe Lyon 2017-01-25 15:43 ` Kyrill Tkachov 0 siblings, 1 reply; 23+ messages in thread From: Christophe Lyon @ 2017-01-25 15:25 UTC (permalink / raw) To: Bernd Schmidt; +Cc: Kyrill Tkachov, Segher Boessenkool, GCC Patches [-- Attachment #1: Type: text/plain, Size: 1149 bytes --] On 25 January 2017 at 15:55, Bernd Schmidt <bschmidt@redhat.com> wrote: > On 01/25/2017 10:18 AM, Kyrill Tkachov wrote: >> >> The test is supposed to test the generation of the vsel instruction. >> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as >> VSEL isn't actually available on Cortex-A5, it's just enabled by the >> -mfpu=fp-armv8 option. >> A more realistic configuration would target an ARMv8-A CPU like the >> Cortex-A57. > > > Ok, let me know if there's anything else you need from my side. > Kyrill, How about the attached patch? I've added dg-require-effective-target arm_arch_v8a_ok to make sure it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57 to make sure the intended instructions are present (in case at some point add-options-for-arm-arch-v8a activates costs/arch variant that would imply not generating vsel anymore). I've noticed there are other tests adding arm_v8_vfp and not making sure to select an appriopriate cpu. As a follow-up patch? And I checked that my patch makes the tests pass again even when configuring --with-cpu=cortex-a5. Thanks, Christophe > > Bernd > [-- Attachment #2: vsel.chlog.txt --] [-- Type: text/plain, Size: 764 bytes --] gcc/testsuite/ChangeLog: 2017-01-25 Christophe Lyon <christophe.lyon@linaro.org> * gcc.target/arm/vseleqdf.c: Require arm_arch_v8a_ok, add -mcpu=cortex-a57. * gcc.target/arm/vseleqsf.c: Likewise. * gcc.target/arm/vselgedf.c: Likewise. * gcc.target/arm/vselgesf.c: Likewise. * gcc.target/arm/vselgtdf.c: Likewise. * gcc.target/arm/vselgtsf.c: Likewise. * gcc.target/arm/vselledf.c: Likewise. * gcc.target/arm/vsellesf.c: Likewise. * gcc.target/arm/vselltdf.c: Likewise. * gcc.target/arm/vselltsf.c: Likewise. * gcc.target/arm/vselnedf.c: Likewise. * gcc.target/arm/vselnesf.c: Likewise. * gcc.target/arm/vselvcdf.c: Likewise. * gcc.target/arm/vselvcsf.c: Likewise. * gcc.target/arm/vselvsdf.c: Likewise. * gcc.target/arm/vselvssf.c: Likewise. [-- Attachment #3: vsel.patch.txt --] [-- Type: text/plain, Size: 7736 bytes --] diff --git a/gcc/testsuite/gcc.target/arm/vseleqdf.c b/gcc/testsuite/gcc.target/arm/vseleqdf.c index 86e147b..64d5784 100644 --- a/gcc/testsuite/gcc.target/arm/vseleqdf.c +++ b/gcc/testsuite/gcc.target/arm/vseleqdf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vseleqsf.c b/gcc/testsuite/gcc.target/arm/vseleqsf.c index 120f44b..b052704 100644 --- a/gcc/testsuite/gcc.target/arm/vseleqsf.c +++ b/gcc/testsuite/gcc.target/arm/vseleqsf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float diff --git a/gcc/testsuite/gcc.target/arm/vselgedf.c b/gcc/testsuite/gcc.target/arm/vselgedf.c index cea08d1..e10508f 100644 --- a/gcc/testsuite/gcc.target/arm/vselgedf.c +++ b/gcc/testsuite/gcc.target/arm/vselgedf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vselgesf.c b/gcc/testsuite/gcc.target/arm/vselgesf.c index 86f2a04..645cf5d 100644 --- a/gcc/testsuite/gcc.target/arm/vselgesf.c +++ b/gcc/testsuite/gcc.target/arm/vselgesf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float diff --git a/gcc/testsuite/gcc.target/arm/vselgtdf.c b/gcc/testsuite/gcc.target/arm/vselgtdf.c index 2c4a6ba..741b9a8 100644 --- a/gcc/testsuite/gcc.target/arm/vselgtdf.c +++ b/gcc/testsuite/gcc.target/arm/vselgtdf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vselgtsf.c b/gcc/testsuite/gcc.target/arm/vselgtsf.c index 388e74c..3042c5b 100644 --- a/gcc/testsuite/gcc.target/arm/vselgtsf.c +++ b/gcc/testsuite/gcc.target/arm/vselgtsf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float diff --git a/gcc/testsuite/gcc.target/arm/vselledf.c b/gcc/testsuite/gcc.target/arm/vselledf.c index 088dc04..dcf46a3 100644 --- a/gcc/testsuite/gcc.target/arm/vselledf.c +++ b/gcc/testsuite/gcc.target/arm/vselledf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vsellesf.c b/gcc/testsuite/gcc.target/arm/vsellesf.c index d0afdbc..38b06eb 100644 --- a/gcc/testsuite/gcc.target/arm/vsellesf.c +++ b/gcc/testsuite/gcc.target/arm/vsellesf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float diff --git a/gcc/testsuite/gcc.target/arm/vselltdf.c b/gcc/testsuite/gcc.target/arm/vselltdf.c index fbcb9ea..4d4d91c 100644 --- a/gcc/testsuite/gcc.target/arm/vselltdf.c +++ b/gcc/testsuite/gcc.target/arm/vselltdf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vselltsf.c b/gcc/testsuite/gcc.target/arm/vselltsf.c index 959dab7..ab3f77f 100644 --- a/gcc/testsuite/gcc.target/arm/vselltsf.c +++ b/gcc/testsuite/gcc.target/arm/vselltsf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float diff --git a/gcc/testsuite/gcc.target/arm/vselnedf.c b/gcc/testsuite/gcc.target/arm/vselnedf.c index cf67f29..4b0fa5e 100644 --- a/gcc/testsuite/gcc.target/arm/vselnedf.c +++ b/gcc/testsuite/gcc.target/arm/vselnedf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vselnesf.c b/gcc/testsuite/gcc.target/arm/vselnesf.c index 2e16423..4fcd5a0 100644 --- a/gcc/testsuite/gcc.target/arm/vselnesf.c +++ b/gcc/testsuite/gcc.target/arm/vselnesf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float diff --git a/gcc/testsuite/gcc.target/arm/vselvcdf.c b/gcc/testsuite/gcc.target/arm/vselvcdf.c index 7f30270..9701e11 100644 --- a/gcc/testsuite/gcc.target/arm/vselvcdf.c +++ b/gcc/testsuite/gcc.target/arm/vselvcdf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vselvcsf.c b/gcc/testsuite/gcc.target/arm/vselvcsf.c index 1bb7369..2d47f30 100644 --- a/gcc/testsuite/gcc.target/arm/vselvcsf.c +++ b/gcc/testsuite/gcc.target/arm/vselvcsf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float diff --git a/gcc/testsuite/gcc.target/arm/vselvsdf.c b/gcc/testsuite/gcc.target/arm/vselvsdf.c index 83ad5bf..54555a2 100644 --- a/gcc/testsuite/gcc.target/arm/vselvsdf.c +++ b/gcc/testsuite/gcc.target/arm/vselvsdf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ double diff --git a/gcc/testsuite/gcc.target/arm/vselvssf.c b/gcc/testsuite/gcc.target/arm/vselvssf.c index 7d76289..a37305d 100644 --- a/gcc/testsuite/gcc.target/arm/vselvssf.c +++ b/gcc/testsuite/gcc.target/arm/vselvssf.c @@ -1,6 +1,7 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok */ /* { dg-require-effective-target arm_v8_vfp_ok } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mcpu=cortex-a57" } */ /* { dg-add-options arm_v8_vfp } */ float ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-25 15:25 ` Christophe Lyon @ 2017-01-25 15:43 ` Kyrill Tkachov 0 siblings, 0 replies; 23+ messages in thread From: Kyrill Tkachov @ 2017-01-25 15:43 UTC (permalink / raw) To: Christophe Lyon, Bernd Schmidt; +Cc: Segher Boessenkool, GCC Patches On 25/01/17 15:20, Christophe Lyon wrote: > On 25 January 2017 at 15:55, Bernd Schmidt <bschmidt@redhat.com> wrote: >> On 01/25/2017 10:18 AM, Kyrill Tkachov wrote: >>> The test is supposed to test the generation of the vsel instruction. >>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as >>> VSEL isn't actually available on Cortex-A5, it's just enabled by the >>> -mfpu=fp-armv8 option. >>> A more realistic configuration would target an ARMv8-A CPU like the >>> Cortex-A57. >> >> Ok, let me know if there's anything else you need from my side. >> > Kyrill, > > How about the attached patch? Yes, thanks Christophe. > I've added dg-require-effective-target arm_arch_v8a_ok to make sure > it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57 > to make sure the intended instructions are present (in case at some > point add-options-for-arm-arch-v8a activates costs/arch variant that > would imply not generating vsel anymore). > > I've noticed there are other tests adding arm_v8_vfp and not making > sure to select an appriopriate cpu. As a follow-up patch? I wouldn't want to do that too much in the testsuite. In the VSEL tests we have a C-level idiom (?: construct) that we expect the optimisers to transform into a conditional select instruction that may or may not be a win on some cores. In some of those other tests I suspect we want to generate the instruction for all tunings. I.e. I'd expect the rounding tests (__builtin_floor/trunc etc) to always generate VRINT* when the -mfpu allows it, regardless of the CPU tuning. Kyrill > And I checked that my patch makes the tests pass again even > when configuring --with-cpu=cortex-a5. > Thanks, > > Christophe > >> Bernd >> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-20 19:30 ` Segher Boessenkool 2017-01-24 9:38 ` Christophe Lyon @ 2017-01-25 20:54 ` Segher Boessenkool 2017-01-26 13:04 ` Bernd Schmidt 1 sibling, 1 reply; 23+ messages in thread From: Segher Boessenkool @ 2017-01-25 20:54 UTC (permalink / raw) To: Bernd Schmidt; +Cc: GCC Patches On Fri, Jan 20, 2017 at 01:24:15PM -0600, Segher Boessenkool wrote: > On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote: > > So, when looking for situations where we have only one condition, we can > > try to undo the conversion of a plain REG into a condition, on the > > grounds that this is probably less helpful. > > > > This seems to cure the testcase, but Segher also has a patch in the PR > > that looks like a good and more direct approach. IMO both should be > > applied. This one was bootstrapped and tested on x86_64-linux. Ok? > > My patch does not cure all problems, it simply simplifies things a bit > better; but the same is true for your patch if I read it correctly. > > Okay for trunk, and I'll do my half as well. Thanks, It turns out my patch (see the PR) causes (or at least triggers) miscompilations on tilegx. I will drop it for now. Longer term we will have to fix this whole if_then_else_cond business, maybe make it less expensive and/or more effective as well. Segher ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-25 20:54 ` Segher Boessenkool @ 2017-01-26 13:04 ` Bernd Schmidt 2017-01-26 19:05 ` Segher Boessenkool 0 siblings, 1 reply; 23+ messages in thread From: Bernd Schmidt @ 2017-01-26 13:04 UTC (permalink / raw) To: Segher Boessenkool; +Cc: GCC Patches On 01/25/2017 08:46 PM, Segher Boessenkool wrote: > > It turns out my patch (see the PR) causes (or at least triggers) > miscompilations on tilegx. I will drop it for now. Curious, it looked very reasonable. What's needed to reproduce this? Bernd ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Improve things for PR71724, in combine/if_then_else_cond 2017-01-26 13:04 ` Bernd Schmidt @ 2017-01-26 19:05 ` Segher Boessenkool 0 siblings, 0 replies; 23+ messages in thread From: Segher Boessenkool @ 2017-01-26 19:05 UTC (permalink / raw) To: Bernd Schmidt; +Cc: GCC Patches On Thu, Jan 26, 2017 at 01:57:02PM +0100, Bernd Schmidt wrote: > On 01/25/2017 08:46 PM, Segher Boessenkool wrote: > > > >It turns out my patch (see the PR) causes (or at least triggers) > >miscompilations on tilegx. I will drop it for now. > > Curious, it looked very reasonable. What's needed to reproduce this? Yeah, quite curious. I just build a cross compiler (and binutils) targeting tilegx-linux, and build a Linux kernel (ARCH=tile, defconfig) with that. In many places where there are two independent conditional jumps the generated code before the patch combines the two conditions and does one jump; the code after the patch completely loses the second condition: -{ cmpeqi r10, r3, 14 ; cmpeqi r11, r3, -1 } -{ or r10, r10, r11 } -{ beqzt r10, fffffff700031ba8 <KBacktraceIterator_restart+0x4f0> } +{ cmpeqi r10, r3, 14 } +{ beqzt r10, fffffff700031ba0 <KBacktraceIterator_restart+0x4e8> } This happens quite a few times. Sometimes a lot of code is deleted. It may well be a target problem, but it is stage 4, and I do not currently have the bandwidth to investigate this. If you do, please do :-) Segher ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2017-01-26 18:57 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-01-20 12:36 Improve things for PR71724, in combine/if_then_else_cond Bernd Schmidt 2017-01-20 19:30 ` Segher Boessenkool 2017-01-24 9:38 ` Christophe Lyon 2017-01-24 12:52 ` Bernd Schmidt 2017-01-24 15:30 ` Segher Boessenkool 2017-01-24 16:28 ` Kyrill Tkachov 2017-01-24 16:28 ` Christophe Lyon 2017-01-24 16:36 ` Kyrill Tkachov 2017-01-24 16:36 ` Bernd Schmidt 2017-01-24 16:55 ` Kyrill Tkachov 2017-01-24 17:03 ` Bernd Schmidt 2017-01-24 17:08 ` Christophe Lyon 2017-01-24 17:22 ` Bernd Schmidt 2017-01-25 9:06 ` Christophe Lyon 2017-01-25 9:21 ` Kyrill Tkachov 2017-01-25 10:01 ` Christophe Lyon 2017-01-25 10:34 ` Richard Earnshaw (lists) 2017-01-25 14:58 ` Bernd Schmidt 2017-01-25 15:25 ` Christophe Lyon 2017-01-25 15:43 ` Kyrill Tkachov 2017-01-25 20:54 ` Segher Boessenkool 2017-01-26 13:04 ` Bernd Schmidt 2017-01-26 19:05 ` Segher Boessenkool
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).