* [PATCH 0/6] [ARC] Various fixes
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC
To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

Hi,

This series of 6 patches fixes a number of small issues found over time
with our compiler.

Patch 1 fixes the problem of using drsub* instructions when compiling
with double assist instruction support.

Patch 2 fixes big-endian emitted code when using FPX extension
instructions.

Patch 3 passes the mfpuda option to the assembler whenever we use double
assist instructions in our compilation.

Patch 4 fixes the floating point optimized equality routine to handle
NaNs emitted by the FPX extension.

Patch 5 fixes the case when the combiner matches a sign-extended 16-bit
number with the umulhisi3_imm pattern.

Patch 6 fixes various instruction patterns.

OK to apply?
Claudiu

Claudiu Zissulescu (6):
  [ARC] Don't use drsub* instructions when selecting fpuda.
  [ARC] Fix FPX/FPUDA code gen when compiling for big-endian.
  [ARC] Pass mfpuda to assembler.
  [ARC] Handle FPX NaN within optimized floating point library.
  [ARC] Fix unwanted match for sign extend 16-bit constant.
  [ARC] Various instruction pattern fixes

 gcc/config/arc/arc.c                       |  12 ++-
 gcc/config/arc/arc.h                       |   2 +-
 gcc/config/arc/arc.md                      | 154 ++++++++++++++++-------------
 gcc/config/arc/fpx.md                      |   7 +-
 gcc/testsuite/gcc.target/arc/ieee_eq.c     |  47 +++++++++
 gcc/testsuite/gcc.target/arc/trsub.c       |  10 ++
 gcc/testsuite/gcc.target/arc/umulsihi3_z.c |  23 +++++
 libgcc/config/arc/ieee-754/eqdf2.S         |  13 ++-
 8 files changed, 186 insertions(+), 82 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/ieee_eq.c
 create mode 100644 gcc/testsuite/gcc.target/arc/trsub.c
 create mode 100644 gcc/testsuite/gcc.target/arc/umulsihi3_z.c

--
1.9.1
* [PATCH 6/6] [ARC] Various instruction pattern fixes
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC
To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

    * config/arc/arc.md (mulsidi3): Change operand 0 predicate to
    register_operand.
    (umulsidi3): Likewise.
    (indirect_jump): Fix jump instruction assembly patterns.
    (arcset<code>): Change operand 1 predicate to nonmemory_operand.
    (arcsetltu, arcsetgeu): Likewise.
    (arcsethi, arcsetls): Fix pattern.
---
 gcc/config/arc/arc.md | 125 +++++++++++++++++++++++++++-----------------------
 1 file changed, 67 insertions(+), 58 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 6731072..170ac1c 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1964,7 +1964,7 @@
    (set_attr "cond" "nocond,canuse,nocond,canuse_limm,canuse,nocond")])
 
 (define_expand "mulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
+  [(set (match_operand:DI 0 "register_operand" "")
         (mult:DI (sign_extend:DI(match_operand:SI 1 "register_operand" ""))
                  (sign_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
   "TARGET_ANY_MPY"
@@ -2200,9 +2200,9 @@
 }")
 
 (define_expand "umulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
-        (mult:DI (zero_extend:DI(match_operand:SI 1 "register_operand" ""))
-                 (zero_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
+  [(set (match_operand:DI 0 "register_operand" "")
+        (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" ""))
+                 (zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""))))]
   ""
 {
   if (TARGET_MPY)
@@ -3673,7 +3673,12 @@
 (define_insn "indirect_jump"
   [(set (pc) (match_operand:SI 0 "nonmemory_operand" "L,I,Cal,Rcqq,r"))]
   ""
-  "j%!%* [%0]%&"
+  "@
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* [%0]%&
+   j%!%* [%0]%&"
   [(set_attr "type" "jump")
    (set_attr "iscompact" "false,false,false,maybe,false")
    (set_attr "cond" "canuse,canuse_limm,canuse,canuse,canuse")])
@@ -5425,90 +5430,94 @@
 (define_code_iterator arcCC_cond [eq ne gt lt ge le])
 
 (define_insn "arcset<code>"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r")
-        (arcCC_cond:SI (match_operand:SI 1 "register_operand" "0,r,0,r,0,0,r")
-                       (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,n,n")))]
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r,r")
+        (arcCC_cond:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,0,r")
+                       (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,n,n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "set<code>%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
 ])
 
 (define_insn "arcsetltu"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r")
-        (ltu:SI (match_operand:SI 1 "register_operand" "0,r,0,r,0, 0, r")
-                (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))]
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r, r, r")
+        (ltu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0, 0, r")
+                (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I, n, n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "setlo%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
 ])
 
 (define_insn "arcsetgeu"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r")
-        (geu:SI (match_operand:SI 1 "register_operand" "0,r,0,r,0, 0, r")
-                (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))]
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r, r, r")
+        (geu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0, 0, r")
+                (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I, n, n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "seths%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
 ])
 
 ;; Special cases of SETCC
 (define_insn_and_split "arcsethi"
-  [(set (match_operand:SI 0 "register_operand" "=r,r, r,r")
-        (gtu:SI (match_operand:SI 1 "register_operand" "r,r, r,r")
-                (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
+  [(set (match_operand:SI 0 "register_operand" "=r, r,r,r")
+        (gtu:SI (match_operand:SI 1 "nonmemory_operand" "r, r,r,n")
+                (match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
-  "setlo%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* sethi a,b,u6 => seths a,b,u6 + 1. */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
-    DONE;
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+      {
+        /* sethi a,b,u6 => seths a,b,u6 + 1. */
+        operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+        emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
+        DONE;
+      }
+    else
+      {
+        emit_insn (gen_arcsetltu (operands[0], operands[2], operands[1]));
+        DONE;
+      }
   }"
-  [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+  [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 (define_insn_and_split "arcsetls"
-  [(set (match_operand:SI 0 "register_operand" "=r,r, r,r")
-        (leu:SI (match_operand:SI 1 "register_operand" "r,r, r,r")
-                (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
+  [(set (match_operand:SI 0 "register_operand" "=r, r,r,r")
+        (leu:SI (match_operand:SI 1 "nonmemory_operand" "r, r,r,n")
+                (match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
-  "seths%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* setls a,b,u6 => setlo a,b,u6 + 1. */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
-    DONE;
-  }"
-  [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+      {
+        /* setls a,b,u6 => setlo a,b,u6 + 1. */
+        operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+        emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
+        DONE;
+      }
+    else
+      {
+        emit_insn (gen_arcsetgeu (operands[0], operands[2], operands[1]));
+        DONE;
+      }
+  }"
+  [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 ; Any mode that needs to be solved by secondary reload
 (define_mode_iterator SRI [QI HI])
--
1.9.1
* Re: [PATCH 6/6] [ARC] Various instruction pattern fixes
From: Claudiu Zissulescu @ 2016-04-18 18:26 UTC
To: Claudiu Zissulescu, gcc-patches; +Cc: gnu, Francois.Bedard, jeremy.bennett

Forgot to add the reload cases. Here is the updated patch.

//Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

    * config/arc/arc.md (mulsidi3): Change operand 0 predicate to
    register_operand.
    (umulsidi3): Likewise.
    (indirect_jump): Fix jump instruction assembly patterns.
    (arcset<code>): Change operand 1 predicate to nonmemory_operand.
    (arcsetltu, arcsetgeu): Likewise.
    (arcsethi, arcsetls): Fix pattern.
---
 gcc/config/arc/arc.md | 146 ++++++++++++++++++++++++++++----------------------
 1 file changed, 83 insertions(+), 63 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 6731072..9d87b76 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1964,7 +1964,7 @@
    (set_attr "cond" "nocond,canuse,nocond,canuse_limm,canuse,nocond")])
 
 (define_expand "mulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
+  [(set (match_operand:DI 0 "register_operand" "")
         (mult:DI (sign_extend:DI(match_operand:SI 1 "register_operand" ""))
                  (sign_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
   "TARGET_ANY_MPY"
@@ -2200,9 +2200,9 @@
 }")
 
 (define_expand "umulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
-        (mult:DI (zero_extend:DI(match_operand:SI 1 "register_operand" ""))
-                 (zero_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
+  [(set (match_operand:DI 0 "register_operand" "")
+        (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" ""))
+                 (zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""))))]
   ""
 {
   if (TARGET_MPY)
@@ -3673,7 +3673,12 @@
 (define_insn "indirect_jump"
   [(set (pc) (match_operand:SI 0 "nonmemory_operand" "L,I,Cal,Rcqq,r"))]
   ""
-  "j%!%* [%0]%&"
+  "@
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* [%0]%&
+   j%!%* [%0]%&"
   [(set_attr "type" "jump")
    (set_attr "iscompact" "false,false,false,maybe,false")
    (set_attr "cond" "canuse,canuse_limm,canuse,canuse,canuse")])
@@ -5425,90 +5430,105 @@
 (define_code_iterator arcCC_cond [eq ne gt lt ge le])
 
 (define_insn "arcset<code>"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r")
-        (arcCC_cond:SI (match_operand:SI 1 "register_operand" "0,r,0,r,0,0,r")
-                       (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,n,n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r,r")
+        (arcCC_cond:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,0,r")
+                       (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,n,n")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
   "set<code>%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
 ])
 
 (define_insn "arcsetltu"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r")
-        (ltu:SI (match_operand:SI 1 "register_operand" "0,r,0,r,0, 0, r")
-                (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r, r, r")
+        (ltu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0, 0, r")
+                (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I, n, n")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
   "setlo%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
 ])
 
 (define_insn "arcsetgeu"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r")
-        (geu:SI (match_operand:SI 1 "register_operand" "0,r,0,r,0, 0, r")
-                (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r, r, r")
+        (geu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0, 0, r")
+                (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I, n, n")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
   "seths%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
 ])
 
 ;; Special cases of SETCC
 (define_insn_and_split "arcsethi"
-  [(set (match_operand:SI 0 "register_operand" "=r,r, r,r")
-        (gtu:SI (match_operand:SI 1 "register_operand" "r,r, r,r")
-                (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
-  "setlo%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  [(set (match_operand:SI 0 "register_operand" "=r, r,r,r")
+        (gtu:SI (match_operand:SI 1 "nonmemory_operand" "r, r,r,n")
+                (match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
+
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* sethi a,b,u6 => seths a,b,u6 + 1. */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
-    DONE;
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+      {
+        /* sethi a,b,u6 => seths a,b,u6 + 1. */
+        operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+        emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
+        DONE;
+      }
+    else
+      {
+        emit_insn (gen_arcsetltu (operands[0], operands[2], operands[1]));
+        DONE;
+      }
   }"
-  [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+  [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 (define_insn_and_split "arcsetls"
-  [(set (match_operand:SI 0 "register_operand" "=r,r, r,r")
-        (leu:SI (match_operand:SI 1 "register_operand" "r,r, r,r")
-                (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
-  "seths%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  [(set (match_operand:SI 0 "register_operand" "=r, r,r,r")
+        (leu:SI (match_operand:SI 1 "nonmemory_operand" "r, r,r,n")
+                (match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* setls a,b,u6 => setlo a,b,u6 + 1. */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
-    DONE;
-  }"
-  [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+      {
+        /* setls a,b,u6 => setlo a,b,u6 + 1. */
+        operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+        emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
+        DONE;
+      }
+    else
+      {
+        emit_insn (gen_arcsetgeu (operands[0], operands[2], operands[1]));
+        DONE;
+      }
+  }"
+  [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 ; Any mode that needs to be solved by secondary reload
 (define_mode_iterator SRI [QI HI])
--
2.5.1
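The split code in the arcsethi and arcsetls patterns relies on two arithmetic identities for unsigned comparisons: a constant operand can be bumped by one to turn "greater than" into "greater or equal" (and "less or equal" into "less than"), and otherwise the operands can be swapped together with the comparison. A minimal sketch that checks both identities on sample values is given below. It is plain C, not GCC code; the helper names are made up for illustration, and the assumption that the C62 constraint accepts constants 0..62 (so that the incremented value still fits a 6-bit unsigned field) is mine, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of the four unsigned comparisons (result is 0 or 1).
   The mnemonics in the comments follow the patterns in the patch.  */
static int gtu(uint32_t a, uint32_t b) { return a > b; }   /* sethi */
static int leu(uint32_t a, uint32_t b) { return a <= b; }  /* setls */
static int geu(uint32_t a, uint32_t b) { return a >= b; }  /* seths */
static int ltu(uint32_t a, uint32_t b) { return a < b; }   /* setlo */

/* Count how many sample inputs violate the identities used by the split.
   A return value of 0 means every checked case agrees.  */
static int count_split_mismatches(void)
{
  static const uint32_t samples[] = {
    0u, 1u, 2u, 61u, 62u, 63u, 64u,
    0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu
  };
  const int n = (int) (sizeof samples / sizeof samples[0]);
  int bad = 0;

  for (int i = 0; i < n; i++)
    {
      uint32_t a = samples[i];

      /* Constant branch.  ASSUMPTION: C62 means 0..62, so c + 1 <= 63.  */
      for (uint32_t c = 0; c <= 62; c++)
        {
          bad += gtu(a, c) != geu(a, c + 1);  /* sethi a,b,u6 => seths a,b,u6+1 */
          bad += leu(a, c) != ltu(a, c + 1);  /* setls a,b,u6 => setlo a,b,u6+1 */
        }

      /* Fallback branch: swap the operands and use the swapped comparison.  */
      for (int j = 0; j < n; j++)
        {
          uint32_t b = samples[j];
          bad += gtu(a, b) != ltu(b, a);
          bad += leu(a, b) != geu(b, a);
        }
    }
  return bad;
}

/* Convenience wrapper: aborts if any identity fails on the samples.  */
void check_split_identities(void)
{
  assert(count_split_mismatches() == 0);
}
```

Calling check_split_identities() from a test driver is enough to exercise it. The increment identity only holds while c + 1 does not wrap, which is presumably why the patterns restrict that branch to a small constant range.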
* Re: [PATCH 6/6] [ARC] Various instruction pattern fixes
From: Joern Wolfgang Rennecke @ 2016-04-28 12:31 UTC
To: Claudiu Zissulescu, Claudiu Zissulescu, gcc-patches
Cc: Francois.Bedard, jeremy.bennett

On 18/04/16 19:25, Claudiu Zissulescu wrote:
> Forgot to add the reload cases. Here is the updated patch.
>
> //Claudiu
>
>
> gcc/
> 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com>
>
> * config/arc/arc.md (mulsidi3): Change operand 0 predicate to
> register_operand.
> (umulsidi3): Likewise.
> (indirect_jump): Fix jump instruction assembly patterns.
> (arcset<code>): Change operand 1 predicate to nonmemory_operand.
> (arcsetltu, arcsetgeu): Likewise.

ChangeLog omission: You are also adding an r/n/r alternative.

> (arcsethi, arcsetls): Fix pattern.

Otherwise this is OK.

If the constant / register comparisons come from an expander, in
general the expander should be fixed to swap the operands and
use the swapped comparison code, to get canonical rtl.
OTOH, constant re-materialization during register allocation can change
a reg-reg into a constant-reg comparison, and at that stage,
canonicalization would not be expected.
* RE: [PATCH 6/6] [ARC] Various instruction pattern fixes
From: Claudiu Zissulescu @ 2016-05-02 11:21 UTC
To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

> > gcc/
> > 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com>
> >
> > * config/arc/arc.md (mulsidi3): Change operand 0 predicate to
> > register_operand.
> > (umulsidi3): Likewise.
> > (indirect_jump): Fix jump instruction assembly patterns.
> > (arcset<code>): Change operand 1 predicate to nonmemory_operand.
> > (arcsetltu, arcsetgeu): Likewise.
> ChangeLog omission: You are also adding an r/n/r alternative.
> > (arcsethi, arcsetls): Fix pattern.
> Otherwise this is OK.
>
> If the constant / register comparisons come from an expander, in
> general the expander should be fixed to swap the operands and
> use the swapped comparison code, to get canonical rtl.
> OTOH, constant re-materialization during register allocation can change
> a reg-reg into a constant-reg comparison, and at that stage,
> canonicalization would not be expected.

I will commit this patch without the arcset* mods; this is safer.

Thanks!
Claudiu
* [PATCH 3/6] [ARC] Pass mfpuda to assembler.
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC
To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

    * config/arc/arc.h (ASM_SPEC): Pass mfpuda to assembler.
---
 gcc/config/arc/arc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 1c2a38d..299e63a 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -153,7 +153,7 @@ along with GCC; see the file COPYING3. If not see
 %{mcpu=ARC700:-mEA} \
 %{!mcpu=*:" ASM_DEFAULT "} \
 %{mbarrel-shifter} %{mno-mpy} %{mmul64} %{mmul32x16:-mdsp-packa} %{mnorm} \
-%{mswap} %{mEA} %{mmin-max} %{mspfp*} %{mdpfp*} \
+%{mswap} %{mEA} %{mmin-max} %{mspfp*} %{mdpfp*} %{mfpu=fpuda*:-mfpuda} \
 %{msimd} \
 %{mmac-d16} %{mmac-24} %{mdsp-packa} %{mcrc} %{mdvbf} %{mtelephony} %{mxy} \
 %{mcpu=ARC700|!mcpu=*:%{mlock}} \
--
1.9.1
* Re: [PATCH 3/6] [ARC] Pass mfpuda to assembler.
From: Joern Wolfgang Rennecke @ 2016-04-28 10:30 UTC
To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

On 18/04/16 15:33, Claudiu Zissulescu wrote:
> OK to apply?
> Claudiu
>
> gcc/
> 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com>
>
> * config/arc/arc.h (ASM_SPEC): Pass mfpuda to assembler.
>

OK.
* RE: [PATCH 3/6] [ARC] Pass mfpuda to assembler.
From: Claudiu Zissulescu @ 2016-04-28 13:10 UTC
To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

Committed r235568.

Thanks,
Claudiu

> > gcc/
> > 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com>
> >
> > * config/arc/arc.h (ASM_SPEC): Pass mfpuda to assembler.
> >
> OK.
* [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant.
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC
To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

The combine pass may conclude that the umulhisi3_imm pattern can also
accept sign-extended 16-bit constants. This patch prevents combine from
considering this pattern suitable.

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

    * config/arc/arc.md (umulhisi3_imm): Avoid unwanted match for sign
    extend 16-bit constants.
    * testsuite/gcc.target/arc/umulsihi3_z.c: New file.
---
 gcc/config/arc/arc.md                      |  3 ++-
 gcc/testsuite/gcc.target/arc/umulsihi3_z.c | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/umulsihi3_z.c

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 74530b1..6731072 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1729,7 +1729,8 @@
 (define_insn "umulhisi3_imm"
   [(set (match_operand:SI 0 "register_operand" "=r, r,r, r, r")
         (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, r,0, 0, r"))
-                 (match_operand:HI 2 "short_const_int_operand" " L, L,I,C16,C16")))]
+                 (match_operand:HI 2 "short_const_int_operand" " L, L,I,C16,C16")))
+   (use (match_dup 2))]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
   [(set_attr "length" "4,4,4,8,8")
diff --git a/gcc/testsuite/gcc.target/arc/umulsihi3_z.c b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
new file mode 100644
index 0000000..cf1c00d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
@@ -0,0 +1,23 @@
+/* Check if the optimizers are not removing the umulsihi3_imm
+   instruction. */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline" } */
+
+#include <stdint.h>
+
+static int32_t test (int16_t reg_val)
+{
+  int32_t x = (reg_val & 0xf) * 62500;
+  return x;
+}
+
+int main (void)
+{
+  volatile int32_t x = 0xc172;
+  x = test (x);
+
+  if (x != 0x0001e848)
+    __builtin_abort ();
+  return 0;
+}
+
--
1.9.1
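To make the failure mode in the test case concrete, here is a small host-side sketch of the arithmetic involved. It does not model the ARC mpyuw instruction or GCC's combine pass; it only contrasts treating the 16-bit constant 62500 (0xf424) as an unsigned value with reinterpreting the same bit pattern as a signed 16-bit value. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply with the 16-bit immediate taken as an unsigned value,
   which is what the test case above expects.  */
static uint32_t mul_zero_extended(uint16_t reg, uint16_t imm)
{
  return (uint32_t) reg * (uint32_t) imm;
}

/* Multiply after reinterpreting the same 16 bits as a signed value.
   The sign extension is done by hand so the code does not depend on
   implementation-defined narrowing conversions.  */
static int32_t mul_sign_extended(uint16_t reg, uint16_t imm)
{
  int32_t simm = imm >= 0x8000 ? (int32_t) imm - 0x10000 : (int32_t) imm;
  return (int32_t) reg * simm;
}

/* Reproduce the numbers from the test case: (0xc172 & 0xf) * 62500.  */
void check_umulhisi_example(void)
{
  uint16_t reg = 0xc172 & 0xf;                            /* 2 */

  assert(mul_zero_extended(reg, 62500) == 0x0001e848u);   /* 125000 */
  assert(mul_sign_extended(reg, 62500) == -6072);         /* 2 * -3036 */
}
```

The expected value 0x0001e848 in the test is the zero-extended product; a sign-extended reading of the constant would give a different, negative result, which is the kind of mismatch the test is meant to catch.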
* Re: [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant.
From: Joern Wolfgang Rennecke @ 2016-04-28 11:47 UTC
To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

On 18/04/16 15:33, Claudiu Zissulescu wrote:
> The combine pass may conclude umulhisi3_imm pattern can accept also sign
> extended 16-bit constants. This patch prohibits the combine in considering
> this pattern as suitable.
>
> OK to apply?
> Claudiu
>
> gcc/
> 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com>
>
> * config/arc/arc.md (umulhisi3_imm): Avoid unwanted match for sign
> extend 16-bit constants.
...
> * testsuite/gcc.target/arc/umulsihi3_z.c: New file.
> - (match_operand:HI 2 "short_const_int_operand" " L, L,I,C16,C16")))]
> + (match_operand:HI 2 "short_const_int_operand" " L, L,I,C16,C16")))
> + (use (match_dup 2))]
>

That's not the way to fix it. Get the predicates and constraints right.
* [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.
From: Claudiu Zissulescu @ 2016-04-28 17:12 UTC
To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

Please find the updated patch.

Claudiu

gcc/
2016-04-28  Claudiu Zissulescu  <claziss@synopsys.com>

    * config/arc/arc.h (UNSIGNED_INT12, UNSIGNED_INT16): Define.
    * config/arc/arc.md (umulhisi3): Use arc_short_operand predicate.
    (umulhisi3_imm): Update predicates and constraint letters.
    (umulhisi3_reg): Declare instruction as commutative.
    * config/arc/constraints.md (U12, U16): New constraints.
    * config/arc/predicates.md (short_unsigned_const_operand): New
    predicate.
    (arc_short_operand): Likewise.
    * testsuite/gcc.target/arc/umulsihi3_z.c: New file.
---
 gcc/config/arc/arc.h                       |  2 ++
 gcc/config/arc/arc.md                      | 14 +++++++-------
 gcc/config/arc/constraints.md              | 11 +++++++++++
 gcc/config/arc/predicates.md               |  8 ++++++++
 gcc/testsuite/gcc.target/arc/umulsihi3_z.c | 23 +++++++++++++++++++++++
 5 files changed, 51 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/umulsihi3_z.c

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 37c1afa..1b75099 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -795,6 +795,8 @@ extern enum reg_class arc_regno_reg_class[];
 #define UNSIGNED_INT6(X) ((unsigned) (X) < 0x40)
 #define UNSIGNED_INT7(X) ((unsigned) (X) < 0x80)
 #define UNSIGNED_INT8(X) ((unsigned) (X) < 0x100)
+#define UNSIGNED_INT12(X) ((unsigned) (X) < 0x800)
+#define UNSIGNED_INT16(X) ((unsigned) (X) < 0x10000)
 #define IS_ONE(X) ((X) == 1)
 #define IS_ZERO(X) ((X) == 0)
 
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 8ec0ce0..e0f74e4 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1720,21 +1720,21 @@
 (define_expand "umulhisi3"
   [(set (match_operand:SI 0 "register_operand" "")
         (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" ""))
-                 (zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))]
+                 (zero_extend:SI (match_operand:HI 2 "arc_short_operand" ""))))]
   "TARGET_MPYW"
   "{
     if (CONSTANT_P (operands[2]))
     {
-      emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
-      DONE;
+   emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
+   DONE;
     }
   }"
 )
 
 (define_insn "umulhisi3_imm"
-  [(set (match_operand:SI 0 "register_operand" "=r, r,r, r, r")
-        (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, r,0, 0, r"))
-                 (match_operand:HI 2 "short_const_int_operand" " L, L,I,C16,C16")))]
+  [(set (match_operand:SI 0 "register_operand" "=r, r, r, r, r")
+        (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "%0, r, 0, 0, r"))
+                 (match_operand:HI 2 "short_unsigned_const_operand" " L, L,U12,U16,U16")))]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
   [(set_attr "length" "4,4,4,8,8")
@@ -1746,7 +1746,7 @@
 
 (define_insn "umulhisi3_reg"
   [(set (match_operand:SI 0 "register_operand" "=Rcqq, r, r")
-        (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, 0, r"))
+        (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " %0, 0, r"))
                  (zero_extend:SI (match_operand:HI 2 "register_operand" " Rcqq, r, r"))))]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md
index 668b60a..cdf94ef 100644
--- a/gcc/config/arc/constraints.md
+++ b/gcc/config/arc/constraints.md
@@ -427,3 +427,14 @@
   "A memory with only a base register"
   (match_operand 0 "mem_noofs_operand"))
 
+(define_constraint "U12"
+  "@internal
+   An unsigned 12-bit integer constant."
+  (and (match_code "const_int")
+       (match_test "UNSIGNED_INT12 (ival)")))
+
+(define_constraint "U16"
+  "@internal
+   An unsigned 16-bit integer constant"
+  (and (match_code "const_int")
+       (match_test "UNSIGNED_INT16 (ival)")))
diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
index 3c657c6..9542b22 100644
--- a/gcc/config/arc/predicates.md
+++ b/gcc/config/arc/predicates.md
@@ -819,3 +819,11 @@
 (define_predicate "double_register_operand"
   (ior (match_test "even_register_operand (op, mode)")
        (match_test "arc_double_register_operand (op, mode)")))
+
+(define_predicate "short_unsigned_const_operand"
+  (and (match_code "const_int")
+       (match_test "satisfies_constraint_U16 (op)")))
+
+(define_predicate "arc_short_operand"
+  (ior (match_test "register_operand (op, mode)")
+       (match_test "short_unsigned_const_operand (op, mode)")))
diff --git a/gcc/testsuite/gcc.target/arc/umulsihi3_z.c b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
new file mode 100644
index 0000000..cf1c00d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
@@ -0,0 +1,23 @@
+/* Check if the optimizers are not removing the umulsihi3_imm
+   instruction. */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline" } */
+
+#include <stdint.h>
+
+static int32_t test (int16_t reg_val)
+{
+  int32_t x = (reg_val & 0xf) * 62500;
+  return x;
+}
+
+int main (void)
+{
+  volatile int32_t x = 0xc172;
+  x = test (x);
+
+  if (x != 0x0001e848)
+    __builtin_abort ();
+  return 0;
+}
+
--
1.9.1
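As a rough illustration of how the two range macros added to arc.h classify constants, the sketch below copies the macro bodies exactly as posted in the patch and applies them to a few values on the host. The helper narrowest_unsigned_class and the use of long long as a stand-in for GCC's HOST_WIDE_INT are assumptions made for this example only; they are not part of the patch.

```c
#include <assert.h>

/* Macro bodies copied from the arc.h hunk above.  Note that, as posted,
   UNSIGNED_INT12 uses the bound 0x800.  */
#define UNSIGNED_INT12(X) ((unsigned) (X) < 0x800)
#define UNSIGNED_INT16(X) ((unsigned) (X) < 0x10000)

/* Hypothetical helper: report which macro first accepts IVAL.
   Returns 12, 16, or 0 when neither macro accepts the value.  */
static int narrowest_unsigned_class(long long ival)
{
  if (UNSIGNED_INT12 (ival))
    return 12;
  if (UNSIGNED_INT16 (ival))
    return 16;
  return 0;
}

/* A few sample classifications, including the constant from the test.  */
void check_constant_classes(void)
{
  assert(narrowest_unsigned_class(15) == 12);
  assert(narrowest_unsigned_class(0x7ff) == 12);
  assert(narrowest_unsigned_class(0x800) == 16);
  assert(narrowest_unsigned_class(62500) == 16);    /* 0xf424 */
  assert(narrowest_unsigned_class(-3036) == 0);     /* 0xf424 sign-extended */
  assert(narrowest_unsigned_class(0x10000) == 0);
}
```

The case of interest is the negative value: the cast to unsigned turns it into a very large number, so neither macro accepts it, whereas the non-negative form 62500 is accepted by the 16-bit test.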
* Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant. 2016-04-28 17:12 ` [PATCH] " Claudiu Zissulescu @ 2016-04-28 17:46 ` Joern Wolfgang Rennecke 2016-04-28 20:31 ` Claudiu Zissulescu 0 siblings, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-28 17:46 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett On 28/04/16 18:10, Claudiu Zissulescu wrote: > Please find the updated patch. > > Claudiu > > gcc/ > 2016-04-28 Claudiu Zissulescu <claziss@synopsys.com> > > * config/arc/arc.h (UNSIGNED_INT12, UNSIGNED_INT16): Define. > * config/arc/arc.md (umulhisi3): Use arc_short_operand predicate. > (umulhisi3_imm): Update predicates and constraint letters. > (umulhisi3_reg): Declare instruction as commutative. > * config/arc/constraints.md (U12, U16): New constraints. I'm not sure how to feel about this. U16 looks intuitive, but we have traditionally used U for memory constraints. And we use it for ARC for that purpose, too, even though with a compatible constraint length of 3. I suppose it's fine if you're sure we never want to have an addressing mode that's best described with "12" or "16", or some other number we might want for an unsigned integer. Otherwise, I'd suggest using a traditional integer letter. 'J' is free. > > (define_expand "umulhisi3" > [(set (match_operand:SI 0 "register_operand" "") > (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "")) > - (zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))] > + (zero_extend:SI (match_operand:HI 2 "arc_short_operand" ""))))] > "TARGET_MPYW" > "{ > if (CONSTANT_P (operands[2])) > { > - emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2])); > - DONE; > + emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2])); > + DONE; Why do you remove half of the indentation? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant. 2016-04-28 17:46 ` Joern Wolfgang Rennecke @ 2016-04-28 20:31 ` Claudiu Zissulescu 2016-04-28 20:57 ` Joern Wolfgang Rennecke 0 siblings, 1 reply; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-28 20:31 UTC (permalink / raw) To: Joern Wolfgang Rennecke, Claudiu Zissulescu, gcc-patches Cc: Francois.Bedard, jeremy.bennett > > Otherwise, I'd suggest using a traditional integer letter. 'J' is free. Thanks for the suggestion, I will use 'J'. > Why do you remove half of the indentation? Unwanted reformatting, sorry for this, I will revert it. I have the feeling you are happy with my new patch. Is there anything to be added to it besides fixing the above issues? Thanks, Claudiu ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant. 2016-04-28 20:31 ` Claudiu Zissulescu @ 2016-04-28 20:57 ` Joern Wolfgang Rennecke 2016-04-29 8:41 ` Claudiu Zissulescu 0 siblings, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-28 20:57 UTC (permalink / raw) To: Claudiu Zissulescu, Claudiu Zissulescu, gcc-patches Cc: Francois.Bedard, jeremy.bennett On 28/04/16 21:31, Claudiu Zissulescu wrote: >> >> Otherwise, I'd suggest using a traditional integer letter. 'J' is free. > Thanks for the suggestion, I will use 'J'. > >> Why do you remove half of the indentation? > Unwanted reformatting, sorry for this, I will revert it. > > I have the feeling you are happy with my new patch. Is there anything > to be added to it besides fixing the above issues? No, otherwise it looks OK. ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant. 2016-04-28 20:57 ` Joern Wolfgang Rennecke @ 2016-04-29 8:41 ` Claudiu Zissulescu 0 siblings, 0 replies; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-29 8:41 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett Committed r235623. Thanks, Claudiu > -----Original Message----- > From: Joern Wolfgang Rennecke [mailto:gnu@amylaar.uk] > Sent: Thursday, April 28, 2016 10:57 PM > To: Claudiu Zissulescu; Claudiu Zissulescu; gcc-patches@gcc.gnu.org > Cc: Francois.Bedard@synopsys.com; jeremy.bennett@embecosm.com > Subject: Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit > constant. > > > > On 28/04/16 21:31, Claudiu Zissulescu wrote: > >> > >> Otherwise, I'd suggest using a traditional integer letter. 'J' is free. > > Thanks for the suggestion, I will use 'J'. > > > >> Why do you remove half of the indentation? > > Unwanted reformatting, sorry for this, I will revert it. > > > > I have the feeling you are happy with my new patch. Is there anything > > to be added to it besides fixing the above issues? > No, otherwise it looks OK. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian. 2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu ` (2 preceding siblings ...) 2016-04-18 14:35 ` [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant Claudiu Zissulescu @ 2016-04-18 14:35 ` Claudiu Zissulescu 2016-04-28 10:29 ` Joern Wolfgang Rennecke 2016-04-18 14:35 ` [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda Claudiu Zissulescu 2016-04-18 14:35 ` [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library Claudiu Zissulescu 5 siblings, 1 reply; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw) To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett OK to apply? Claudiu gcc/ 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.c (arc_process_double_reg_moves): Fix for big-endian compilation. * config/arc/arc.md (addf3): Likewise. (subdf3): Likewise. (muldf3): Likewise. --- gcc/config/arc/arc.c | 12 ++++++++---- gcc/config/arc/arc.md | 18 +++++++++--------- 2 files changed, 17 insertions(+), 13 deletions(-) diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index d60db50..f4bef3e 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -8647,8 +8647,10 @@ arc_process_double_reg_moves (rtx *operands) { /* When we have 'mov D, r' or 'mov D, D' then get the target register pair for use with LR insn. */ - rtx destHigh = simplify_gen_subreg(SImode, dest, DFmode, 4); - rtx destLow = simplify_gen_subreg(SImode, dest, DFmode, 0); + rtx destHigh = simplify_gen_subreg (SImode, dest, DFmode, + TARGET_BIG_ENDIAN ? 0 : 4); + rtx destLow = simplify_gen_subreg (SImode, dest, DFmode, + TARGET_BIG_ENDIAN ? 4 : 0); /* Produce the two LR insns to get the high and low parts. 
*/ emit_insn (gen_rtx_SET (destHigh, @@ -8665,8 +8667,10 @@ arc_process_double_reg_moves (rtx *operands) { /* When we have 'mov r, D' or 'mov D, D' and we have access to the LR insn get the target register pair. */ - rtx srcHigh = simplify_gen_subreg(SImode, src, DFmode, 4); - rtx srcLow = simplify_gen_subreg(SImode, src, DFmode, 0); + rtx srcHigh = simplify_gen_subreg (SImode, src, DFmode, + TARGET_BIG_ENDIAN ? 0 : 4); + rtx srcLow = simplify_gen_subreg (SImode, src, DFmode, + TARGET_BIG_ENDIAN ? 4 : 0); emit_insn (gen_rtx_UNSPEC_VOLATILE (Pmode, gen_rtvec (3, dest, srcHigh, srcLow), diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md index 9766547..74530b1 100644 --- a/gcc/config/arc/arc.md +++ b/gcc/config/arc/arc.md @@ -5681,9 +5681,9 @@ { if (GET_CODE (operands[2]) == CONST_DOUBLE) { - rtx high, low, tmp; - split_double (operands[2], &low, &high); - tmp = force_reg (SImode, high); + rtx first, second, tmp; + split_double (operands[2], &first, &second); + tmp = force_reg (SImode, TARGET_BIG_ENDIAN ? first : second); emit_insn (gen_adddf3_insn (operands[0], operands[1], operands[2], tmp, const0_rtx)); } @@ -5718,10 +5718,10 @@ if ((GET_CODE (operands[1]) == CONST_DOUBLE) || GET_CODE (operands[2]) == CONST_DOUBLE) { - rtx high, low, tmp; + rtx first, second, tmp; int const_index = ((GET_CODE (operands[1]) == CONST_DOUBLE) ? 1 : 2); - split_double (operands[const_index], &low, &high); - tmp = force_reg (SImode, high); + split_double (operands[const_index], &first, &second); + tmp = force_reg (SImode, TARGET_BIG_ENDIAN ? first : second); emit_insn (gen_subdf3_insn (operands[0], operands[1], operands[2], tmp, const0_rtx)); } @@ -5753,9 +5753,9 @@ { if (GET_CODE (operands[2]) == CONST_DOUBLE) { - rtx high, low, tmp; - split_double (operands[2], &low, &high); - tmp = force_reg (SImode, high); + rtx first, second, tmp; + split_double (operands[2], &first, &second); + tmp = force_reg (SImode, TARGET_BIG_ENDIAN ? 
first : second); emit_insn (gen_muldf3_insn (operands[0], operands[1], operands[2], tmp, const0_rtx)); } -- 1.9.1 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian. 2016-04-18 14:35 ` [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian Claudiu Zissulescu @ 2016-04-28 10:29 ` Joern Wolfgang Rennecke 2016-04-28 12:54 ` Claudiu Zissulescu 0 siblings, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-28 10:29 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett On 18/04/16 15:33, Claudiu Zissulescu wrote: > OK to apply? > Claudiu > > gcc/ > 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> > > * config/arc/arc.c (arc_process_double_reg_moves): Fix for > big-endian compilation. > * config/arc/arc.md (addf3): Likewise. > (subdf3): Likewise. > (muldf3): Likewise. > OK. FWIW, there is also a FIXME for a little-endian-centric use of split_double in arc.c:arc_rtx_costs. ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian. 2016-04-28 10:29 ` Joern Wolfgang Rennecke @ 2016-04-28 12:54 ` Claudiu Zissulescu 0 siblings, 0 replies; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-28 12:54 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett Fixed naming in arc_rtx_costs, committed r235567. Thanks, Claudiu >> gcc/ > > 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> > > > > * config/arc/arc.c (arc_process_double_reg_moves): Fix for > > big-endian compilation. > > * config/arc/arc.md (addf3): Likewise. > > (subdf3): Likewise. > > (muldf3): Likewise. > > > OK. > > FWIW, there is also a FIXME for a little-endian-centric use of > split_double in arc.c:arc_rtx_costs. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda. 2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu ` (3 preceding siblings ...) 2016-04-18 14:35 ` [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian Claudiu Zissulescu @ 2016-04-18 14:35 ` Claudiu Zissulescu 2016-04-28 10:05 ` Joern Wolfgang Rennecke 2016-04-18 14:35 ` [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library Claudiu Zissulescu 5 siblings, 1 reply; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw) To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett The double-precision floating-point assist instructions do not implement the reverse double subtract instruction (drsub) found in the FPX extension, hence this patch. OK to apply? Claudiu gcc/ 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.md (cpu_facility): Add fpx variant. (subdf3): Prohibit use reverse sub when assist operations option is enabled. * config/arc/fpx.md (subdf3_insn, *dsubh_peep2_insn): Allow drsub instructions only when FPX is enabled. * testsuite/gcc.target/arc/trsub.c: New test. 
--- gcc/config/arc/arc.md | 8 +++++++- gcc/config/arc/fpx.md | 7 ++++--- gcc/testsuite/gcc.target/arc/trsub.c | 10 ++++++++++ 3 files changed, 21 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arc/trsub.c diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md index 4193d26..9766547 100644 --- a/gcc/config/arc/arc.md +++ b/gcc/config/arc/arc.md @@ -265,7 +265,7 @@ - get_attr_length (insn)"))) ; for ARCv2 we need to disable/enable different instruction alternatives -(define_attr "cpu_facility" "std,av1,av2" +(define_attr "cpu_facility" "std,av1,av2,fpx" (const_string "std")) ; We should consider all the instructions enabled until otherwise @@ -277,6 +277,10 @@ (and (eq_attr "cpu_facility" "av2") (not (match_test "TARGET_V2"))) (const_string "no") + + (and (eq_attr "cpu_facility" "fpx") + (match_test "TARGET_FP_DP_AX")) + (const_string "no") ] (const_string "yes"))) @@ -5709,6 +5713,8 @@ " if (TARGET_DPFP) { + if (TARGET_FP_DP_AX && (GET_CODE (operands[1]) == CONST_DOUBLE)) + operands[1] = force_reg (DFmode, operands[1]); if ((GET_CODE (operands[1]) == CONST_DOUBLE) || GET_CODE (operands[2]) == CONST_DOUBLE) { diff --git a/gcc/config/arc/fpx.md b/gcc/config/arc/fpx.md index b790600..2e11157 100644 --- a/gcc/config/arc/fpx.md +++ b/gcc/config/arc/fpx.md @@ -304,7 +304,8 @@ drsubh%F0%F2 0,%H1,%L1 drsubh%F0%F2 0,%3,%L1" [(set_attr "type" "dpfp_addsub") - (set_attr "length" "4,8,4,8")]) + (set_attr "length" "4,8,4,8") + (set_attr "cpu_facility" "*,*,fpx,fpx")]) ;; ;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Peephole for following conversion @@ -613,5 +614,5 @@ drsubh%F0%F2 %H6, %H1, %L1 drsubh%F0%F2 %H6, %3, %L1" [(set_attr "type" "dpfp_addsub") - (set_attr "length" "4,8,4,8")] -) + (set_attr "length" "4,8,4,8") + (set_attr "cpu_facility" "*,*,fpx,fpx")]) diff --git a/gcc/testsuite/gcc.target/arc/trsub.c b/gcc/testsuite/gcc.target/arc/trsub.c new file mode 100644 index 0000000..031935f 
--- /dev/null +++ b/gcc/testsuite/gcc.target/arc/trsub.c @@ -0,0 +1,10 @@ +/* Tests if we generate rsub instructions when compiling using + floating point assist instructions. */ +/* { dg-do compile } */ +/* { dg-options "-mfpu=fpuda -mcpu=arcem" } */ + +double foo (double a) +{ + return ((double) 0.12 - a); +} +/* { dg-final { scan-assembler-not "drsub.*" } } */ -- 1.9.1 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda. 2016-04-18 14:35 ` [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda Claudiu Zissulescu @ 2016-04-28 10:05 ` Joern Wolfgang Rennecke 2016-04-28 12:16 ` Claudiu Zissulescu 0 siblings, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-28 10:05 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett On 18/04/16 15:33, Claudiu Zissulescu wrote: > The double precision floating point assist instructions are not > implementing the reverse double subtract instruction (drsub) found in > the FPX extension, hence, this patch. > > OK to apply? > Claudiu > > gcc/ > 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> > > * config/arc/arc.md (cpu_facility): Add fpx variant. > (subdf3): Prohibit use reverse sub when assist operations option > is enabled. > * config/arc/fpx.md (subdf3_insn, *dsubh_peep2_insn): Allow drsub > instructions only when FPX is enabled. > * testsuite/gcc.target/arc/trsub.c: New test. > OK. ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda. 2016-04-28 10:05 ` Joern Wolfgang Rennecke @ 2016-04-28 12:16 ` Claudiu Zissulescu 0 siblings, 0 replies; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-28 12:16 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett Committed r235562. Thanks, Claudiu > > > > gcc/ > > 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> > > > > * config/arc/arc.md (cpu_facility): Add fpx variant. > > (subdf3): Prohibit use reverse sub when assist operations option > > is enabled. > > * config/arc/fpx.md (subdf3_insn, *dsubh_peep2_insn): Allow drsub > > instructions only when FPX is enabled. > > * testsuite/gcc.target/arc/trsub.c: New test. > > > OK. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu ` (4 preceding siblings ...) 2016-04-18 14:35 ` [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda Claudiu Zissulescu @ 2016-04-18 14:35 ` Claudiu Zissulescu 2016-04-28 11:27 ` Joern Wolfgang Rennecke 5 siblings, 1 reply; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw) To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett OK to apply? Claudiu gcc/ 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> * testsuite/gcc.target/arc/ieee_eq.c: New test. libgcc/ 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/ieee-754/eqdf2.S: Handle FPX NaN. --- gcc/testsuite/gcc.target/arc/ieee_eq.c | 47 ++++++++++++++++++++++++++++++++++ libgcc/config/arc/ieee-754/eqdf2.S | 13 ++++++---- 2 files changed, 55 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arc/ieee_eq.c diff --git a/gcc/testsuite/gcc.target/arc/ieee_eq.c b/gcc/testsuite/gcc.target/arc/ieee_eq.c new file mode 100644 index 0000000..70aebad --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/ieee_eq.c @@ -0,0 +1,47 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <stdio.h> +#include <float.h> + +#define TEST_EQ(TYPE,X,Y,RES) \ + do { \ + volatile TYPE a, b; \ + a = (TYPE) X; \ + b = (TYPE) Y; \ + if ((a == b) != RES) \ + { \ + printf ("Runtime computation error @%d. %g " \ + "!= %g\n", __LINE__, a, b); \ + error = 1; \ + } \ + } while (0) + +#ifndef __HS__ +/* Special type of NaN found when using double FPX instructions. 
*/ +static const unsigned long long __nan = 0x7FF0000080000000ULL; +# define W (*(double *) &__nan) +#else +# define W __builtin_nan ("") +#endif + +#define Q __builtin_nan ("") +#define H __builtin_inf () + +int main (void) +{ + int error = 0; + + TEST_EQ (double, 1, 1, 1); + TEST_EQ (double, 1, 2, 0); + TEST_EQ (double, W, W, 0); + TEST_EQ (double, Q, Q, 0); + TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1); + TEST_EQ (double, __DBL_MIN__, __DBL_MIN__, 1); + TEST_EQ (double, H, H, 1); + + if (error) + __builtin_abort (); + + return 0; +} diff --git a/libgcc/config/arc/ieee-754/eqdf2.S b/libgcc/config/arc/ieee-754/eqdf2.S index bc7d88e..3b23e04 100644 --- a/libgcc/config/arc/ieee-754/eqdf2.S +++ b/libgcc/config/arc/ieee-754/eqdf2.S @@ -58,11 +58,14 @@ __eqdf2: well predictable (as seen from the branch predictor). */ __eqdf2: brne.d DBL0H,DBL1H,.Lhighdiff - bmsk r12,DBL0H,20 -#ifdef DPFP_COMPAT - or.f 0,DBL0L,DBL1L - bset.ne r12,r12,21 -#endif /* DPFP_COMPAT */ +#ifndef __HS__ + /* The next two instructions are required to recognize the FPX + NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as + oposite to 0x7ff8_0000_0000_0000. */ + or.f 0,DBL0L,DBL1L + bset.ne DBL0H,DBL0H,19 +#endif /* __HS__ */ + bmsk r12,DBL0H,20 add1.f r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN. */ j_s.d [blink] cmp.cc DBL0L,DBL1L -- 1.9.1 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-18 14:35 ` [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library Claudiu Zissulescu @ 2016-04-28 11:27 ` Joern Wolfgang Rennecke 2016-04-28 11:35 ` Claudiu Zissulescu 0 siblings, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-28 11:27 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett On 18/04/16 15:33, Claudiu Zissulescu wrote: > OK to apply? No. You are clobbering DBL0H. Besides, why would you change any of the code, apart from the argument to #ifdef and the comments? ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-28 11:27 ` Joern Wolfgang Rennecke @ 2016-04-28 11:35 ` Claudiu Zissulescu 2016-04-28 11:41 ` Joern Wolfgang Rennecke 0 siblings, 1 reply; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-28 11:35 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett > Besides, why would you change any of the code, apart from the argument > to #ifdef and the comments? The existing code does not work: it gives wrong results. I think the test shows this if you run it without the libgcc changes. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-28 11:35 ` Claudiu Zissulescu @ 2016-04-28 11:41 ` Joern Wolfgang Rennecke 2016-04-28 11:43 ` Claudiu Zissulescu 2016-04-28 14:12 ` Claudiu Zissulescu 0 siblings, 2 replies; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-28 11:41 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett On 28/04/16 12:35, Claudiu Zissulescu wrote: >> Besides, why would you change any of the code, apart from the argument >> to #ifdef and the comments? > It is not working/giving wrong results. I think, the test shows you this if you run it without all the libgcc mods. I can't. Where exactly does the test go wrong? Can you show a trace of __eqdf2 with register values? ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-28 11:41 ` Joern Wolfgang Rennecke @ 2016-04-28 11:43 ` Claudiu Zissulescu 2016-04-28 14:12 ` Claudiu Zissulescu 1 sibling, 0 replies; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-28 11:43 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett > > Where exactly does the test go wrong? I will try to trace it back to when I developed it; too much time has passed since then. It is probably something related to big-endian. ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-28 11:41 ` Joern Wolfgang Rennecke 2016-04-28 11:43 ` Claudiu Zissulescu @ 2016-04-28 14:12 ` Claudiu Zissulescu 2016-04-28 15:03 ` Joern Wolfgang Rennecke 1 sibling, 1 reply; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-28 14:12 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

Hi,

> Where exactly does the test go wrong?

The test which fails is this one:

  TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);

From the test file included in the patch.

> Can you show a trace of __eqdf2 with register values?

Sure thing, running for ARC700, using original implementation and enabled guarded code for FPX handling:

[0x000002a2] 0xc000 K Z ld_s r0,[sp,0x0] : lw [0x5000c0c0] => 0xffffffff : (w1) r0 <= 0xffffffff *
[0x000002a4] 0xc101 K Z ld_s r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fefffff : (w1) r1 <= 0x7fefffff *
[0x000002a6] 0xc202 K Z ld_s r2,[sp,0x8] : lw [0x5000c0c8] => 0xffffffff : (w1) r2 <= 0xffffffff *
[0x000002a8] 0xc303 K Z ld_s r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fefffff : (w1) r3 <= 0x7fefffff *
[0x000002aa] 0x0aea0000 K Z bl 0x2e8 : (w0) r31 <= 0x000002ae *
[0x00000590] 0x091d00e1 K Z brne.d r1,r3,0x1c
[0x00000594] 0x2153050c K Z bmsk r12,r1,0x14 : (w0) r12 <= 0x000fffff *
[0x00000598] 0x200580be K Z or.f 0,r0,r2 *
[0x0000059c] 0x24cf1562 K N bset.ne r12,r12,0x15 : (w0) r12 <= 0x002fffff *
[0x000005a0] 0x2414904c K N add1.f r12,r12,r1 : (w0) r12 <= 0x000ffffd *
[0x000005a4] 0x7fe0 K C j_s.d [blink] *
[0x000005a6] 0x20cc8086 KD C cmp.cc r0,r2

For reference, the routine:

        .global __eqdf2
        .balign 4
        HIDDEN_FUNC(__eqdf2)
        /* Good performance as long as the difference in high word is
           well predictable (as seen from the branch predictor).  */
__eqdf2:
        brne.d  DBL0H,DBL1H,.Lhighdiff
        bmsk    r12,DBL0H,20
#ifndef __HS__
        /* The next two instructions are required to recognize the FPX
           NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as
           oposite to 0x7ff8_0000_0000_0000.  */
        or.f    0,DBL0L,DBL1L
        bset.ne r12,r12,21
#endif /* __HS__ */
        add1.f  r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
        j_s.d   [blink]
        cmp.cc  DBL0L,DBL1L
        .balign 4
.Lhighdiff:
        or      r12,DBL0H,DBL1H
        or.f    0,DBL0L,DBL1L
        j_s.d   [blink]
        bmsk.eq.f r12,r12,30
        ENDFUNC(__eqdf2)

All those results were collected using nsimfree.

Please let me know if you need more info,
Claudiu

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-28 14:12 ` Claudiu Zissulescu @ 2016-04-28 15:03 ` Joern Wolfgang Rennecke 2016-04-29 10:18 ` [PATCH] " Claudiu Zissulescu 0 siblings, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-28 15:03 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

On 28/04/16 15:11, Claudiu Zissulescu wrote:
> Sure thing, running for ARC700, using original implementation and enabled guarded code for FPX handling:
>
> [0x000002a2] 0xc000 K Z ld_s r0,[sp,0x0] : lw [0x5000c0c0] => 0xffffffff : (w1) r0 <= 0xffffffff *
> [0x000002a4] 0xc101 K Z ld_s r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fefffff : (w1) r1 <= 0x7fefffff *
> [0x000002a6] 0xc202 K Z ld_s r2,[sp,0x8] : lw [0x5000c0c8] => 0xffffffff : (w1) r2 <= 0xffffffff *
> [0x000002a8] 0xc303 K Z ld_s r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fefffff : (w1) r3 <= 0x7fefffff *
> [0x000002aa] 0x0aea0000 K Z bl 0x2e8 : (w0) r31 <= 0x000002ae *
> [0x00000590] 0x091d00e1 K Z brne.d r1,r3,0x1c
> [0x00000594] 0x2153050c K Z bmsk r12,r1,0x14 : (w0) r12 <= 0x000fffff *
> [0x00000598] 0x200580be K Z or.f 0,r0,r2 *
> [0x0000059c] 0x24cf1562 K N bset.ne r12,r12,0x15 : (w0) r12 <= 0x002fffff *
> [0x000005a0] 0x2414904c K N add1.f r12,r12,r1 : (w0) r12 <= 0x000ffffd *
> [0x000005a4] 0x7fe0 K C j_s.d [blink] *
> [0x000005a6] 0x20cc8086 KD C cmp.cc r0,r2
>
>

I see, we basically have an overflow.
I think the DPFP_COMPAT / __HS__ variant should be something like:

        brne    DBL0H,DBL1H,.Lhighdiff
        mov_s   r12,0x00200000
        or.f    0,DBL0L,DBL1L
        bset.ne r12,r12,0
        add1.f  r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
        j_s.d   [blink]
        cmp.cc  DBL0L,DBL1L
        ...

Where the mov_s could be replaced with something else that loads the same value, depending on what instructions are supported.

^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-28 15:03 ` Joern Wolfgang Rennecke @ 2016-04-29 10:18 ` Claudiu Zissulescu 2016-04-29 10:23 ` Joern Wolfgang Rennecke 2016-04-29 10:27 ` Joern Wolfgang Rennecke 0 siblings, 2 replies; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-29 10:18 UTC (permalink / raw) To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett This is the updated patch on handling FPX NaNs. Ok to apply? Claudiu gcc/ 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> * testsuite/gcc.target/arc/ieee_eq.c: New test. libgcc/ 2016-04-18 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/ieee-754/eqdf2.S: Handle FPX NaN. --- gcc/testsuite/gcc.target/arc/ieee_eq.c | 47 ++++++++++++++++++++++++++++++++++ libgcc/config/arc/ieee-754/eqdf2.S | 15 +++++++---- 2 files changed, 57 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arc/ieee_eq.c diff --git a/gcc/testsuite/gcc.target/arc/ieee_eq.c b/gcc/testsuite/gcc.target/arc/ieee_eq.c new file mode 100644 index 0000000..70aebad --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/ieee_eq.c @@ -0,0 +1,47 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <stdio.h> +#include <float.h> + +#define TEST_EQ(TYPE,X,Y,RES) \ + do { \ + volatile TYPE a, b; \ + a = (TYPE) X; \ + b = (TYPE) Y; \ + if ((a == b) != RES) \ + { \ + printf ("Runtime computation error @%d. %g " \ + "!= %g\n", __LINE__, a, b); \ + error = 1; \ + } \ + } while (0) + +#ifndef __HS__ +/* Special type of NaN found when using double FPX instructions. 
*/ +static const unsigned long long __nan = 0x7FF0000080000000ULL; +# define W (*(double *) &__nan) +#else +# define W __builtin_nan ("") +#endif + +#define Q __builtin_nan ("") +#define H __builtin_inf () + +int main (void) +{ + int error = 0; + + TEST_EQ (double, 1, 1, 1); + TEST_EQ (double, 1, 2, 0); + TEST_EQ (double, W, W, 0); + TEST_EQ (double, Q, Q, 0); + TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1); + TEST_EQ (double, __DBL_MIN__, __DBL_MIN__, 1); + TEST_EQ (double, H, H, 1); + + if (error) + __builtin_abort (); + + return 0; +} diff --git a/libgcc/config/arc/ieee-754/eqdf2.S b/libgcc/config/arc/ieee-754/eqdf2.S index bc7d88e..7e80ef5 100644 --- a/libgcc/config/arc/ieee-754/eqdf2.S +++ b/libgcc/config/arc/ieee-754/eqdf2.S @@ -58,11 +58,16 @@ __eqdf2: well predictable (as seen from the branch predictor). */ __eqdf2: brne.d DBL0H,DBL1H,.Lhighdiff - bmsk r12,DBL0H,20 -#ifdef DPFP_COMPAT - or.f 0,DBL0L,DBL1L - bset.ne r12,r12,21 -#endif /* DPFP_COMPAT */ +#ifndef __HS__ + /* The next two instructions are required to recognize the FPX + NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as + oposite to 0x7ff8_0000_0000_0000. */ + or.f 0,DBL0L,DBL1L + mov_s r12,0x00200000 + bset.ne r12,r12,0 +#else + bmsk r12,DBL0H,20 +#endif /* __HS__ */ add1.f r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN. */ j_s.d [blink] cmp.cc DBL0L,DBL1L -- 1.9.1 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-29 10:18 ` [PATCH] " Claudiu Zissulescu @ 2016-04-29 10:23 ` Joern Wolfgang Rennecke 2016-04-29 10:27 ` Joern Wolfgang Rennecke 1 sibling, 0 replies; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-29 10:23 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett On 29/04/16 11:16, Claudiu Zissulescu wrote: > This is the updated patch on handling FPX NaNs. > > Ok to apply? > Claudiu > > OK. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-29 10:18 ` [PATCH] " Claudiu Zissulescu 2016-04-29 10:23 ` Joern Wolfgang Rennecke @ 2016-04-29 10:27 ` Joern Wolfgang Rennecke 2016-04-29 10:31 ` Claudiu Zissulescu 1 sibling, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-29 10:27 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett P.S.: the .d suffix on the branch was there just for scheduling purposes - not sure if that actually helped any chip's pipeline, or if it was just a bug in the documentation. ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-29 10:27 ` Joern Wolfgang Rennecke @ 2016-04-29 10:31 ` Claudiu Zissulescu 2016-04-29 10:37 ` Joern Wolfgang Rennecke 0 siblings, 1 reply; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-29 10:31 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett It should do the job, at least for EM, where the jump takes 2 cycles; by using delay slots we can make all the cycles count. HS has a branch prediction mechanism, hence filling the delay slot doesn't have as big an impact as on EM or even earlier CPUs. //Claudiu > -----Original Message----- > From: Joern Wolfgang Rennecke [mailto:gnu@amylaar.uk] > Sent: Friday, April 29, 2016 12:27 PM > To: Claudiu Zissulescu; gcc-patches@gcc.gnu.org > Cc: Francois.Bedard@synopsys.com; jeremy.bennett@embecosm.com > Subject: Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point > library. > > P.S.: the .d suffix on the branch was there just for scheduling purposes - > not sure if that actually helped any chip's pipeline, or if it was just > a bug > in the documentation. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-29 10:31 ` Claudiu Zissulescu @ 2016-04-29 10:37 ` Joern Wolfgang Rennecke 2016-04-29 10:47 ` Claudiu Zissulescu 0 siblings, 1 reply; 34+ messages in thread From: Joern Wolfgang Rennecke @ 2016-04-29 10:37 UTC (permalink / raw) To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett On 29/04/16 11:31, Claudiu Zissulescu wrote: > It should do the job, at least for EM where the jump takes 2 cycles, and by means of using delay slots we can make all the cycles count. HS has a branch prediction mechanism, hence, filling up the delay slot doesn't have such a big impact as in EM or even earlier CPUs. No, the alternative is to hide the delay slot, so if the branch is predicted properly, the case with different high words should be faster without the .d suffix. I.e., eagerly filling the delay slot like this has a bigger - negative - impact on performance. ^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH] [ARC] Handle FPX NaN within optimized floating point library. 2016-04-29 10:37 ` Joern Wolfgang Rennecke @ 2016-04-29 10:47 ` Claudiu Zissulescu 0 siblings, 0 replies; 34+ messages in thread From: Claudiu Zissulescu @ 2016-04-29 10:47 UTC (permalink / raw) To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett > > It should do the job, at least for EM where the jump takes 2 cycles, and by > means of using delay slots we can make all the cycles count. HS has a branch > prediction mechanism, hence, filling up the delay slot doesn't have such a big > impact as in EM or even earlier CPUs. > No, the alternative is to hide the delay slot, so if the branch is > predicted properly, the case with > different high words should be faster without the .d suffix. > > I.e., eagerly filling the delay slot like this has a bigger - negative > - impact on performance. If we are talking about HS, then we can add another flag 'T', which should instruct the branch predictor that we expect this branch to be taken. However, I haven't seen any impact of this flag on the code, and the compiler generates this. In general, the HS branch prediction has some particularities. Although what you say makes perfect sense, I am almost sure it doesn't apply in the case of HS because of the way it is implemented. But this is a good point; I will try to keep it in mind and ask the hw guys what is best. //Claudiu ^ permalink raw reply [flat|nested] 34+ messages in thread