* [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion @ 2015-10-02 19:04 Peter Bergner 2015-10-02 20:25 ` David Edelsohn ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Peter Bergner @ 2015-10-02 19:04 UTC (permalink / raw) To: GCC Patches; +Cc: David Edelsohn, Michael Meissner, Segher Boessenkool PR67808 exposes a problem with the constraints in the *extenddftf2_internal pattern, in that it allows TFmode operands to occupy Altivec registers which they are not allowed to do. Reload was able to work around the problem, but LRA is more pedantic and it caused it to go into an infinite spill loop until it ICEd. The following patch from Mike changes the TFmode output operand to use the "d" constraint instead of "ws". It also allows using the "ws" constraint for the two input operands, since that is allowed for DFmode operands. This passed bootstraps (with reload on by default and lra on by default) and shows no testsuite regressions. Is this ok for trunk? The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for that too, assuming my bootstrap/regtesting there are clean? Peter gcc/ PR target/67808 * config/rs6000/rs6000.md (*extenddftf2_internal): Fix constraints. gcc/testsuite/ * gcc.target/powerpc/pr67808.c: New test. Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 228387) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -6511,9 +6511,9 @@ }) (define_insn_and_split "*extenddftf2_internal" - [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,ws,d,&d") - (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,md"))) - (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,d"))] + [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,d,d,&d") + (float_extend:TF (match_operand:DF 1 "input_operand" "ws,r,md,md,md"))) + (use (match_operand:DF 2 "zero_reg_mem_operand" "ws,r,j,m,ws"))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LONG_DOUBLE_128" Index: gcc/testsuite/gcc.target/powerpc/pr67808.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr67808.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr67808.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O1 -mvsx -mlra" } */ + +/* PR 67808: LRA ICEs on simple double to long double conversion test case */ + +void +foo (long double *ldb1, double *db1) +{ + *ldb1 = *db1; +} ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion 2015-10-02 19:04 [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion Peter Bergner @ 2015-10-02 20:25 ` David Edelsohn 2015-10-05 17:12 ` Michael Meissner 2015-11-21 1:21 ` [PATCH, rs6000, gcc 5 backport] " Michael Meissner 2 siblings, 0 replies; 8+ messages in thread From: David Edelsohn @ 2015-10-02 20:25 UTC (permalink / raw) To: Peter Bergner; +Cc: GCC Patches, Michael Meissner, Segher Boessenkool On Fri, Oct 2, 2015 at 3:04 PM, Peter Bergner <bergner@vnet.ibm.com> wrote: > PR67808 exposes a problem with the constraints in the *extenddftf2_internal > pattern, in that it allows TFmode operands to occupy Altivec registers > which they are not allowed to do. Reload was able to work around the > problem, but LRA is more pedantic and it caused it to go into an infinite > spill loop until it ICEd. The following patch from Mike changes the TFmode > output operand to use the "d" constraint instead of "ws". It also allows > using the "ws" constraint for the two input operands, since that is allowed > for DFmode operands. > > This passed bootstraps (with reload on by default and lra on by default) > and shows no testsuite regressions. Is this ok for trunk? > > The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for > that too, assuming my bootstrap/regtesting there are clean? > > Peter > > > gcc/ > PR target/67808 > * config/rs6000/rs6000.md (*extenddftf2_internal): Fix constraints. > > gcc/testsuite/ > > * gcc.target/powerpc/pr67808.c: New test. Okay. Thanks, David ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion 2015-10-02 19:04 [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion Peter Bergner 2015-10-02 20:25 ` David Edelsohn @ 2015-10-05 17:12 ` Michael Meissner 2015-10-05 17:39 ` Peter Bergner 2015-11-21 1:21 ` [PATCH, rs6000, gcc 5 backport] " Michael Meissner 2 siblings, 1 reply; 8+ messages in thread From: Michael Meissner @ 2015-10-05 17:12 UTC (permalink / raw) To: Peter Bergner Cc: GCC Patches, David Edelsohn, Michael Meissner, Segher Boessenkool [-- Attachment #1: Type: text/plain, Size: 3134 bytes --] On Fri, Oct 02, 2015 at 02:04:48PM -0500, Peter Bergner wrote: > PR67808 exposes a problem with the constraints in the *extenddftf2_internal > pattern, in that it allows TFmode operands to occupy Altivec registers > which they are not allowed to do. Reload was able to work around the > problem, but LRA is more pedantic and it caused it to go into an infinite > spill loop until it ICEd. The following patch from Mike changes the TFmode > output operand to use the "d" constraint instead of "ws". It also allows > using the "ws" constraint for the two input operands, since that is allowed > for DFmode operands. > > This passed bootstraps (with reload on by default and lra on by default) > and shows no testsuite regressions. Is this ok for trunk? > > The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for > that too, assuming my bootstrap/regtesting there are clean? > > Peter > > > gcc/ > PR target/67808 > * config/rs6000/rs6000.md (*extenddftf2_internal): Fix constraints. > > gcc/testsuite/ > > * gcc.target/powerpc/pr67808.c: New test. > In looking at the constraints in more detail, after the patch we have the following alternatives: #1: op0 = m, op1 = ws, op2 = ws #2: op0 = Y, op1 = r, op2 = r #3: op0 = d, op1 = md, op2 = j #4: op0 = d, op1 = md, op2 = m #5: op0 = &d, op1 = md, op2 = ws I.e. #1: Store result, input in VSX register, 0.0 in VSX register (VSX only) #2: Store result, input in GPR register, 0.0 in GPR register #3: Result in FPR register, input in FPR or memory, 0.0 direct (VSX only) #4: Result in FPR register, input in FPR or memory, 0.0 in memory #5: Result in FPR reg (no overlap), input in FPR/memory, 0.0 in VSX reg So, the non-VSX case (were ws is NO_REGS) only deals with alternatives #2 and #4. I think (but I don't have a test case) that alternative #1 is potentially a problem if the input register is ever allocated to an Altivec register and the address mode is reg+offset (in which case we would not be able to form the address after the insn is split post-reload. I have attached a better version of the patch. This gives the constraints: #1: op0 = m, op1 = d, op2 = d #2: op0 = Y, op1 = r, op2 = r #3: op0 = d, op1 = ws, op2 = j #4: op0 = d, op1 = md, op2 = m #5: op0 = &d, op1 = m, op2 = md I.e. #1: Store result, input in FPR register, 0.0 in FPR register #2: Store result, input in GPR register, 0.0 in GPR register #3: Result in FPR reg, input in VSX reg, 0.0 direct (VSX only) #4: Result in FPR reg, input in FPR/memory, 0.0 in memory #5: Result in FPR reg, input in FPR/memory, 0.0 in FPR/memory (no overlap) [gcc] 2015-10-05 Peter Bergner <bergner@vnet.ibm.com> Michael Meissner <meissner@linux.vnet.ibm.com> PR target/67808 * config/rs6000/rs6000.md (extenddftf2_internal): Fix up constraints. [gcc/testsuite] 2015-10-05 Peter Bergner <bergner@vnet.ibm.com> PR target/67808 * gcc.target/powerpc/pr67808.c: New test. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797 [-- Attachment #2: pr67808.patch01b --] [-- Type: text/plain, Size: 1471 bytes --] Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 228495) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -6505,9 +6505,9 @@ (define_expand "extenddftf2_fprs" }) (define_insn_and_split "*extenddftf2_internal" - [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,ws,d,&d") - (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,md"))) - (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,d"))] + [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,d,d,&d") + (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,ws,md,md"))) + (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,md"))] "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LONG_DOUBLE_128" Index: gcc/testsuite/gcc.target/powerpc/pr67808.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr67808.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr67808.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O1 -mvsx -mlra" } */ + +/* PR 67808: LRA ICEs on simple double to long double conversion test case */ + +void +foo (long double *ldb1, double *db1) +{ + *ldb1 = *db1; +} ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion 2015-10-05 17:12 ` Michael Meissner @ 2015-10-05 17:39 ` Peter Bergner 2015-10-05 22:36 ` Michael Meissner 0 siblings, 1 reply; 8+ messages in thread From: Peter Bergner @ 2015-10-05 17:39 UTC (permalink / raw) To: Michael Meissner; +Cc: GCC Patches, David Edelsohn, Segher Boessenkool On Mon, 2015-10-05 at 13:12 -0400, Michael Meissner wrote: > I have attached a better version of the patch. I'll note that I have not committed the earlier patch and will hold off while we sort out what is best here. > This gives the constraints: > > #1: op0 = m, op1 = d, op2 = d > #2: op0 = Y, op1 = r, op2 = r > #3: op0 = d, op1 = ws, op2 = j > #4: op0 = d, op1 = md, op2 = m > #5: op0 = &d, op1 = m, op2 = md > > I.e. > > #1: Store result, input in FPR register, 0.0 in FPR register > #2: Store result, input in GPR register, 0.0 in GPR register > #3: Result in FPR reg, input in VSX reg, 0.0 direct (VSX only) > #4: Result in FPR reg, input in FPR/memory, 0.0 in memory > #5: Result in FPR reg, input in FPR/memory, 0.0 in FPR/memory (no overlap) As we discussed on IRC, this alt #3 now does not accept memory as an input operand and that is the only alternative that allows us to generate a xxlxor to create a zero fp value. Either we can change #3's opt1 from "ws" to "mws" or just create another alternative. I'll let you decide what works best. Since this test is testing whether we ICE when -mlra -mvsx is enabled, how about if we verify we're also getting the xxlxor too, with the addition of: /* { dg-final { scan-assembler-times "xxlxor" 1 } } */ to the test case? Peter ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion 2015-10-05 17:39 ` Peter Bergner @ 2015-10-05 22:36 ` Michael Meissner 2015-10-06 14:33 ` David Edelsohn 0 siblings, 1 reply; 8+ messages in thread From: Michael Meissner @ 2015-10-05 22:36 UTC (permalink / raw) To: Peter Bergner Cc: Michael Meissner, GCC Patches, David Edelsohn, Segher Boessenkool [-- Attachment #1: Type: text/plain, Size: 3089 bytes --] Ok, after spending the day on going down the rabbit hole of trying to optimize just about every, here are my patches. Note, I simplified the constraints to eliminate some rare possibilities, like optimizing converting from double to long double if the double happened to be in a GPR register and the long double value was stored in memory (but there never was an optimization for having the double value be in a GPR and the long double value to also be a GPR). I also separated the VSX case from the non-VSX case. This is to simplify things at the RTL level (non-VSX must load up 0.0 to put into the lower word, while VSX can use the XXLXOR instruction to clear the register). I dropped support in the insns for extending the DFmode value to TFmode that is located in memory directly. Now, the compiler builds the whole value in FPRs and then does the store. This simplifies the code somewhat, and SPE/ieeequad paths require the value to be in registers, which might lead to other lra bugs. In the case of just doing one conversion in straight line code, it just changes the register allocation somewhat (allocate 1 TFmode pseudo instead of 1-2 DFmode psuedos). I have bootstrapped the compiler on little endian power8 with no regressions. I have built the test case with various options (-mlra vs. no -mlra, 32-bit, 64-bit, power5/power6/power7/power8), and it all builds correctly. Is this patch ok to apply to the trunk? I would like to apply this patch to GCC 5.x as well. However, in doing the patch, this patch touches areas that I've been working on for IEEE 128-bit floating point support, and so the patch will need to be reworked for GCC 5.x. Is it ok to install in the trunk? In addition, I will need to modify this area again with the next IEEE 128-bit floating point support patch, but I wanted to separate this patch out so that it could be considered by itself, and back ported to GCC 5.x. [gcc] 2015-10-05 Michael Meissner <meissner@linux.vnet.ibm.com> Peter Bergner <bergner@vnet.ibm.com> PR target/67808 * config/rs6000/rs6000.md (extenddftf2): In the expander, only allow registers, but provide insns for the combiner to create for loads from memory. Separate VSX code from non-VSX code. For non-VSX code, combine extenddftf2_fprs into extenddftf2 and rename externaldftf2_internal to externaldftf2_fprs. Reorder constraints so that registers come before memory operations. Drop support from converting DFmode to TFmode, if the DFmode value is in a GPR register, and the TFmode is in memory. (extenddftf2_fprs): Likewise. (extenddftf2_internal): Likewise. (extenddftf2_vsx): Likewise. (extendsftf2): In the expander, only allow registers, but provide insns for the combiner to create for stores and loads. [gcc/testsuite] 2015-10-05 Michael Meissner <meissner@linux.vnet.ibm.com> Peter Bergner <bergner@vnet.ibm.com> PR target/67808 * gcc.target/powerpc/pr67808.c: New test. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797 [-- Attachment #2: pr67808.patch02b --] [-- Type: text/plain, Size: 5394 bytes --] Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 228496) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -6471,8 +6471,8 @@ (define_insn_and_split "*mov<mode>_softf [(set_attr "length" "20,20,16")]) (define_expand "extenddftf2" - [(set (match_operand:TF 0 "nonimmediate_operand" "") - (float_extend:TF (match_operand:DF 1 "input_operand" "")))] + [(set (match_operand:TF 0 "gpc_reg_operand" "") + (float_extend:TF (match_operand:DF 1 "gpc_reg_operand" "")))] "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE) && TARGET_LONG_DOUBLE_128" { @@ -6480,52 +6480,55 @@ (define_expand "extenddftf2" rs6000_expand_float128_convert (operands[0], operands[1], false); else if (TARGET_E500_DOUBLE) emit_insn (gen_spe_extenddftf2 (operands[0], operands[1])); + else if (TARGET_VSX) + emit_insn (gen_extenddftf2_vsx (operands[0], operands[1])); else - emit_insn (gen_extenddftf2_fprs (operands[0], operands[1])); + { + rtx zero = gen_reg_rtx (DFmode); + rs6000_emit_move (zero, CONST0_RTX (DFmode), DFmode); + emit_insn (gen_extenddftf2_fprs (operands[0], operands[1], zero)); + } DONE; }) -(define_expand "extenddftf2_fprs" - [(parallel [(set (match_operand:TF 0 "nonimmediate_operand" "") - (float_extend:TF (match_operand:DF 1 "input_operand" ""))) - (use (match_dup 2))])] - "!TARGET_IEEEQUAD - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT - && TARGET_LONG_DOUBLE_128" +;; Allow memory operands for the source to be created by the combiner. +(define_insn_and_split "extenddftf2_fprs" + [(set (match_operand:TF 0 "gpc_reg_operand" "=d,d,&d") + (float_extend:TF (match_operand:DF 1 "nonimmediate_operand" "d,m,d"))) + (use (match_operand:DF 2 "nonimmediate_operand" "m,m,d"))] + "!TARGET_VSX && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT + && TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD" + "#" + "&& reload_completed" + [(set (match_dup 3) (match_dup 1)) + (set (match_dup 4) (match_dup 2))] { - /* VSX can create 0.0 directly, otherwise let rs6000_emit_move create - the proper constant. */ - if (TARGET_VSX) - operands[2] = CONST0_RTX (DFmode); - else - { - operands[2] = gen_reg_rtx (DFmode); - rs6000_emit_move (operands[2], CONST0_RTX (DFmode), DFmode); - } + const int lo_word = LONG_DOUBLE_LARGE_FIRST ? GET_MODE_SIZE (DFmode) : 0; + const int hi_word = LONG_DOUBLE_LARGE_FIRST ? 0 : GET_MODE_SIZE (DFmode); + + operands[3] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word); + operands[4] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word); }) -(define_insn_and_split "*extenddftf2_internal" - [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,ws,d,&d") - (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,md"))) - (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,d"))] - "!TARGET_IEEEQUAD - && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT - && TARGET_LONG_DOUBLE_128" +(define_insn_and_split "extenddftf2_vsx" + [(set (match_operand:TF 0 "gpc_reg_operand" "=d,d") + (float_extend:TF (match_operand:DF 1 "nonimmediate_operand" "ws,m")))] + "TARGET_LONG_DOUBLE_128 && TARGET_VSX && !TARGET_IEEEQUAD" "#" "&& reload_completed" - [(pc)] + [(set (match_dup 2) (match_dup 1)) + (set (match_dup 3) (match_dup 4))] { const int lo_word = LONG_DOUBLE_LARGE_FIRST ? GET_MODE_SIZE (DFmode) : 0; const int hi_word = LONG_DOUBLE_LARGE_FIRST ? 0 : GET_MODE_SIZE (DFmode); - emit_move_insn (simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word), - operands[1]); - emit_move_insn (simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word), - operands[2]); - DONE; + + operands[2] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word); + operands[3] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word); + operands[4] = CONST0_RTX (DFmode); }) (define_expand "extendsftf2" - [(set (match_operand:TF 0 "nonimmediate_operand" "") + [(set (match_operand:TF 0 "gpc_reg_operand" "") (float_extend:TF (match_operand:SF 1 "gpc_reg_operand" "")))] "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE) Index: gcc/testsuite/gcc.target/powerpc/pr67808.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr67808.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr67808.c (revision 0) @@ -0,0 +1,46 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */ +/* { dg-options "-O1 -mvsx -mlra -mcpu=power7" } */ + +/* PR 67808: LRA ICEs on simple double to long double conversion test case */ + +void +dfoo (long double *ldb1, double *db1) +{ + *ldb1 = *db1; +} + +long double +dfoo2 (double *db1) +{ + return *db1; +} + +long double +dfoo3 (double x) +{ + return x; +} + +void +ffoo (long double *ldb1, float *db1) +{ + *ldb1 = *db1; +} + +long double +ffoo2 (float *db1) +{ + return *db1; +} + +long double +ffoo3 (float x) +{ + return x; +} + +/* { dg-final { scan-assembler "xxlxor" } } */ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion 2015-10-05 22:36 ` Michael Meissner @ 2015-10-06 14:33 ` David Edelsohn 0 siblings, 0 replies; 8+ messages in thread From: David Edelsohn @ 2015-10-06 14:33 UTC (permalink / raw) To: Michael Meissner, Peter Bergner, GCC Patches, David Edelsohn, Segher Boessenkool On Mon, Oct 5, 2015 at 6:36 PM, Michael Meissner <meissner@linux.vnet.ibm.com> wrote: > Ok, after spending the day on going down the rabbit hole of trying to optimize > just about every, here are my patches. > > Note, I simplified the constraints to eliminate some rare possibilities, like > optimizing converting from double to long double if the double happened to be > in a GPR register and the long double value was stored in memory (but there > never was an optimization for having the double value be in a GPR and the long > double value to also be a GPR). > > I also separated the VSX case from the non-VSX case. This is to simplify things > at the RTL level (non-VSX must load up 0.0 to put into the lower word, while > VSX can use the XXLXOR instruction to clear the register). > > I dropped support in the insns for extending the DFmode value to TFmode that is > located in memory directly. Now, the compiler builds the whole value in FPRs > and then does the store. This simplifies the code somewhat, and SPE/ieeequad > paths require the value to be in registers, which might lead to other lra > bugs. In the case of just doing one conversion in straight line code, it just > changes the register allocation somewhat (allocate 1 TFmode pseudo instead of > 1-2 DFmode psuedos). > > I have bootstrapped the compiler on little endian power8 with no regressions. I > have built the test case with various options (-mlra vs. no -mlra, 32-bit, > 64-bit, power5/power6/power7/power8), and it all builds correctly. > > Is this patch ok to apply to the trunk? > > I would like to apply this patch to GCC 5.x as well. However, in doing the > patch, this patch touches areas that I've been working on for IEEE 128-bit > floating point support, and so the patch will need to be reworked for GCC > 5.x. Is it ok to install in the trunk? > > In addition, I will need to modify this area again with the next IEEE 128-bit > floating point support patch, but I wanted to separate this patch out so that > it could be considered by itself, and back ported to GCC 5.x. > > [gcc] > 2015-10-05 Michael Meissner <meissner@linux.vnet.ibm.com> > Peter Bergner <bergner@vnet.ibm.com> > > PR target/67808 > * config/rs6000/rs6000.md (extenddftf2): In the expander, only > allow registers, but provide insns for the combiner to create for > loads from memory. Separate VSX code from non-VSX code. For > non-VSX code, combine extenddftf2_fprs into extenddftf2 and rename > externaldftf2_internal to externaldftf2_fprs. Reorder constraints > so that registers come before memory operations. Drop support from > converting DFmode to TFmode, if the DFmode value is in a GPR > register, and the TFmode is in memory. > (extenddftf2_fprs): Likewise. > (extenddftf2_internal): Likewise. > (extenddftf2_vsx): Likewise. > (extendsftf2): In the expander, only allow registers, but provide > insns for the combiner to create for stores and loads. > > [gcc/testsuite] > 2015-10-05 Michael Meissner <meissner@linux.vnet.ibm.com> > Peter Bergner <bergner@vnet.ibm.com> > > PR target/67808 > * gcc.target/powerpc/pr67808.c: New test. This is okay for trunk, but can we hold off for GCC 5 and allow things to settle? Thanks, David ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, rs6000, gcc 5 backport] Fix PR target/67808, LRA ICE on double to long double conversion 2015-10-02 19:04 [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion Peter Bergner 2015-10-02 20:25 ` David Edelsohn 2015-10-05 17:12 ` Michael Meissner @ 2015-11-21 1:21 ` Michael Meissner 2015-11-21 18:50 ` David Edelsohn 2 siblings, 1 reply; 8+ messages in thread From: Michael Meissner @ 2015-11-21 1:21 UTC (permalink / raw) To: Peter Bergner, GCC Patches, David Edelsohn, Michael Meissner, Segher Boessenkool On Fri, Oct 02, 2015 at 02:04:48PM -0500, Peter Bergner wrote: > PR67808 exposes a problem with the constraints in the *extenddftf2_internal > pattern, in that it allows TFmode operands to occupy Altivec registers > which they are not allowed to do. Reload was able to work around the > problem, but LRA is more pedantic and it caused it to go into an infinite > spill loop until it ICEd. The following patch from Mike changes the TFmode > output operand to use the "d" constraint instead of "ws". It also allows > using the "ws" constraint for the two input operands, since that is allowed > for DFmode operands. > > This passed bootstraps (with reload on by default and lra on by default) > and shows no testsuite regressions. Is this ok for trunk? > > The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for > that too, assuming my bootstrap/regtesting there are clean? The following patch backports the fix to GCC 5.x. There were no regressions in doing the bootstrap/make check on both a big endian power7 system and a little endian power8 system. Is it ok to apply the patch to the gcc-5 branch? 2015-10-20 Michael Meissner <meissner@linux.vnet.ibm.com> Back port from trunk: 2015-10-05 Michael Meissner <meissner@linux.vnet.ibm.com> Peter Bergner <bergner@vnet.ibm.com> PR target/67808 * config/rs6000/rs6000.md (extenddftf2): In the expander, only allow registers, but provide insns for the combiner to create for loads from memory. Separate VSX code from non-VSX code. For non-VSX code, combine extenddftf2_fprs into extenddftf2 and rename externaldftf2_internal to externaldftf2_fprs. Reorder constraints so that registers come before memory operations. Drop support from converting DFmode to TFmode, if the DFmode value is in a GPR register. (extenddftf2_fprs): Likewise. (extenddftf2_internal): Likewise. (extenddftf2_vsx): Likewise. (extendsftf2): In the expander, only allow registers, but provide insns for the combiner to create for stores and loads. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, rs6000, gcc 5 backport] Fix PR target/67808, LRA ICE on double to long double conversion 2015-11-21 1:21 ` [PATCH, rs6000, gcc 5 backport] " Michael Meissner @ 2015-11-21 18:50 ` David Edelsohn 0 siblings, 0 replies; 8+ messages in thread From: David Edelsohn @ 2015-11-21 18:50 UTC (permalink / raw) To: Michael Meissner, Peter Bergner, GCC Patches, Segher Boessenkool On Fri, Nov 20, 2015 at 6:47 PM, Michael Meissner <meissner@linux.vnet.ibm.com> wrote: > On Fri, Oct 02, 2015 at 02:04:48PM -0500, Peter Bergner wrote: >> PR67808 exposes a problem with the constraints in the *extenddftf2_internal >> pattern, in that it allows TFmode operands to occupy Altivec registers >> which they are not allowed to do. Reload was able to work around the >> problem, but LRA is more pedantic and it caused it to go into an infinite >> spill loop until it ICEd. The following patch from Mike changes the TFmode >> output operand to use the "d" constraint instead of "ws". It also allows >> using the "ws" constraint for the two input operands, since that is allowed >> for DFmode operands. >> >> This passed bootstraps (with reload on by default and lra on by default) >> and shows no testsuite regressions. Is this ok for trunk? >> >> The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for >> that too, assuming my bootstrap/regtesting there are clean? > > The following patch backports the fix to GCC 5.x. There were no regressions in > doing the bootstrap/make check on both a big endian power7 system and a little > endian power8 system. Is it ok to apply the patch to the gcc-5 branch? > > 2015-10-20 Michael Meissner <meissner@linux.vnet.ibm.com> > > Back port from trunk: > 2015-10-05 Michael Meissner <meissner@linux.vnet.ibm.com> > Peter Bergner <bergner@vnet.ibm.com> > > PR target/67808 > * config/rs6000/rs6000.md (extenddftf2): In the expander, only > allow registers, but provide insns for the combiner to create for > loads from memory. Separate VSX code from non-VSX code. For > non-VSX code, combine extenddftf2_fprs into extenddftf2 and rename > externaldftf2_internal to externaldftf2_fprs. Reorder constraints > so that registers come before memory operations. Drop support from > converting DFmode to TFmode, if the DFmode value is in a GPR > register. > (extenddftf2_fprs): Likewise. > (extenddftf2_internal): Likewise. > (extenddftf2_vsx): Likewise. > (extendsftf2): In the expander, only allow registers, but provide > insns for the combiner to create for stores and loads. Okay. Thanks, David ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-11-21 18:49 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-10-02 19:04 [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion Peter Bergner 2015-10-02 20:25 ` David Edelsohn 2015-10-05 17:12 ` Michael Meissner 2015-10-05 17:39 ` Peter Bergner 2015-10-05 22:36 ` Michael Meissner 2015-10-06 14:33 ` David Edelsohn 2015-11-21 1:21 ` [PATCH, rs6000, gcc 5 backport] " Michael Meissner 2015-11-21 18:50 ` David Edelsohn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).