* [Bug target/64345] [SH] Improve single bit extraction
2014-12-17 22:21 [Bug target/64345] New: [SH] Improve single bit extraction olegendo at gcc dot gnu.org
@ 2015-01-24 13:05 ` olegendo at gcc dot gnu.org
2015-07-19 8:54 ` olegendo at gcc dot gnu.org
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-01-24 13:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64345
--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat Jan 24 13:04:53 2015
New Revision: 220081
URL: https://gcc.gnu.org/viewcvs?rev=220081&root=gcc&view=rev
Log:
gcc/
PR target/49263
PR target/53987
PR target/64345
PR target/59533
PR target/52933
PR target/54236
PR target/51244
* config/sh/sh-protos.h
(sh_extending_set_of_reg::can_use_as_unextended_reg,
sh_extending_set_of_reg::use_as_unextended_reg,
sh_is_nott_insn, sh_movt_set_dest, sh_movrt_set_dest, sh_is_movt_insn,
sh_is_movrt_insn, sh_insn_operands_modified_between_p,
sh_reg_dead_or_unused_after_insn, sh_in_recog_treg_set_expr,
sh_recog_treg_set_expr, sh_split_treg_set_expr): New functions.
(sh_treg_insns): New class.
* config/sh/sh.c (TARGET_LEGITIMATE_COMBINED_INSN): Define target hook.
(scope_counter): New class.
(sh_legitimate_combined_insn, sh_is_nott_insn, sh_movt_set_dest,
sh_movrt_set_dest, sh_reg_dead_or_unused_after_insn,
sh_extending_set_of_reg::can_use_as_unextended_reg,
sh_extending_set_of_reg::use_as_unextended_reg, sh_recog_treg_set_expr,
sh_in_recog_treg_set_expr, sh_try_split_insn_simple,
sh_split_treg_set_expr): New functions.
(addsubcosts): Handle treg_set_expr.
(sh_rtx_costs): Handle IF_THEN_ELSE and ZERO_EXTRACT.
(sh_rtx_costs): Use arith_reg_operand in SIGN_EXTEND and ZERO_EXTEND.
(sh_rtx_costs): Handle additional bit test patterns in EQ and AND cases.
(sh_insn_operands_modified_between_p): Make non-static.
* config/sh/predicates.md (zero_extend_movu_operand): Allow
simple_mem_operand in addition to displacement_mem_operand.
(zero_extend_operand): Don't allow zero_extend_movu_operand.
(treg_set_expr, treg_set_expr_not_const01,
arith_reg_or_treg_set_expr): New predicates.
* config/sh/sh.md (tstsi_t): Use arith_reg_operand and
arith_or_int_operand instead of logical_operand. Convert to
insn_and_split. Try to optimize constant operand in splitter.
(tsthi_t, tstqi_t): Fold into *tst<mode>_t. Convert to insn_and_split.
(*tstqi_t_zero): Delete.
(*tst<mode>_t_subregs): Add !sh_in_recog_treg_set_expr split condition.
(tstsi_t_and_not): Delete.
(tst<mode>_t_zero_extract_eq): Rename to *tst<mode>_t_zero_extract.
Convert to insn_and_split.
(unnamed split, tstsi_t_zero_extract_xor,
tstsi_t_zero_extract_subreg_xor_little,
tstsi_t_zero_extract_subreg_xor_big): Delete.
(*tstsi_t_shift_mask): New insn_and_split.
(cmpeqsi_t, cmpgesi_t): Add new split for const_int 0 operands and try
to recombine with surrounding insns when splitting.
(*negtstsi): Add !sh_in_recog_treg_set_expr condition.
(cmp_div0s_0, cmp_div0s_1, *cmp_div0s_0, *cmp_div0s_1): Rewrite as ...
(cmp_div0s, *cmp_div0s_1, *cmp_div0s_2, *cmp_div0s_3, *cmp_div0s_4,
*cmp_div0s_5, *cmp_div0s_6): ... these new insn_and_split patterns.
(*cbranch_div0s: Delete.
(*addc): Convert to insn_and_split. Use treg_set_expr as 3rd operand.
Try to recombine with surrounding insns when splitting. Add operand
order variants.
(*addc_t_r, *addc_r_t): Use treg_set_expr_not_const01.
(*addc_r_r_1, *addc_r_lsb, *addc_r_r_lsb, *addc_r_lsb_r, *addc_r_msb,
*addc_r_r_msb, *addc_2r_msb): Delete.
(*addc_2r_lsb): Rename to *addc_2r_t. Use treg_set_expr. Add operand
order variant.
(*addc_negreg_t): New insn_and_split.
(*subc): Convert to insn_and_split. Use treg_set_expr as 3rd operand.
Try to recombine with surrounding insns when splitting.
Add operand order variants.
(*subc_negt_reg, *subc_negreg_t, *reg_lsb_t, *reg_msb_t): New
insn_and_split patterns.
(*rotcr): Use arith_reg_or_treg_set_expr. Try to recombine with
surrounding insns when splitting.
(unnamed rotcr split): Use arith_reg_or_treg_set_expr.
(*rotcl): Likewise. Add zero_extract variant.
(*ashrsi2_31): New insn_and_split.
(*negc): Convert to insn_and_split. Use treg_set_expr.
(*zero_extend<mode>si2_disp_mem): Update comment.
(movrt_negc, *movrt_negc, nott): Add !sh_in_recog_treg_set_expr split
condition.
(*mov_t_msb_neg, mov_neg_si_t): Use treg_set_expr. Try to recombine
with surrounding insns when splitting.
(any_treg_expr_to_reg): New insn_and_split.
(*neg_zero_extract_0, *neg_zero_extract_1, *neg_zero_extract_2,
*neg_zero_extract_3, *neg_zero_extract_4, *neg_zero_extract_5,
*neg_zero_extract_6, *zero_extract_0, *zero_extract_1,
*zero_extract_2): New single bit zero extract patterns.
(bld_reg, *bld_regqi): Fold into bld<mode>_reg.
(*get_thread_pointersi, store_gbr, *mov<mode>_gbr_load,
*mov<mode>_gbr_load, *mov<mode>_gbr_load, *mov<mode>_gbr_load,
*movdi_gbr_load): Use arith_reg_dest instead of register_operand for
set destination.
(set_thread_pointersi, load_gbr): Use arith_reg_operand instead of
register_operand for set source.
gcc/testsuite/
PR target/49263
PR target/53987
PR target/64345
PR target/59533
PR target/52933
PR target/54236
PR target/51244
* gcc.target/sh/pr64345-1.c: New.
* gcc.target/sh/pr64345-2.c: New.
* gcc.target/sh/pr59533-1.c: New.
* gcc.target/sh/pr49263.c: Adjust matching of expected insns.
* gcc.target/sh/pr52933-2.c: Likewise.
* gcc.target/sh/pr54089-1.c: Likewise.
* gcc.target/sh/pr54236-1.c: Likewise.
* gcc.target/sh/pr51244-20-sh2a.c: Likewise.
* gcc.target/sh/pr49263-1.c: Remove xfails.
* gcc.target/sh/pr49263-2.c: Likewise.
* gcc.target/sh/pr49263-3.c: Likewise.
* gcc.target/sh/pr53987-1.c: Likewise.
* gcc.target/sh/pr52933-1.c: Adjust matching of expected insns.
(test_24, test_25, test_26, test_27, test_28, test_29, test_30): New.
* gcc.target/sh/pr51244-12.c: Adjust matching of expected insns.
(test05, test06, test07, test08, test09, test10, test11, test12): New.
* gcc.target/sh/pr54236-3.c: Adjust matching of expected insns.
(test_002, test_003, test_004, test_005, test_006, test_007, test_008,
test_009): New.
* gcc.target/sh/pr51244-4.c: Adjust matching of expected insns.
(test_02): New.
Added:
trunk/gcc/testsuite/gcc.target/sh/pr59533-1.c
trunk/gcc/testsuite/gcc.target/sh/pr64345-1.c
trunk/gcc/testsuite/gcc.target/sh/pr64345-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/predicates.md
trunk/gcc/config/sh/sh-protos.h
trunk/gcc/config/sh/sh.c
trunk/gcc/config/sh/sh.md
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/sh/pr49263-1.c
trunk/gcc/testsuite/gcc.target/sh/pr49263-2.c
trunk/gcc/testsuite/gcc.target/sh/pr49263-3.c
trunk/gcc/testsuite/gcc.target/sh/pr49263.c
trunk/gcc/testsuite/gcc.target/sh/pr51244-12.c
trunk/gcc/testsuite/gcc.target/sh/pr51244-20-sh2a.c
trunk/gcc/testsuite/gcc.target/sh/pr51244-4.c
trunk/gcc/testsuite/gcc.target/sh/pr52933-1.c
trunk/gcc/testsuite/gcc.target/sh/pr52933-2.c
trunk/gcc/testsuite/gcc.target/sh/pr53987-1.c
trunk/gcc/testsuite/gcc.target/sh/pr54089-1.c
trunk/gcc/testsuite/gcc.target/sh/pr54236-1.c
trunk/gcc/testsuite/gcc.target/sh/pr54236-3.c
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug target/64345] [SH] Improve single bit extraction
2014-12-17 22:21 [Bug target/64345] New: [SH] Improve single bit extraction olegendo at gcc dot gnu.org
2015-01-24 13:05 ` [Bug target/64345] " olegendo at gcc dot gnu.org
@ 2015-07-19 8:54 ` olegendo at gcc dot gnu.org
2015-07-19 13:11 ` olegendo at gcc dot gnu.org
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-07-19 8:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64345
--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> ---
A recent change in the middle end has triggered this:
FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times negc 2
The test
int
test_07 (int *vec)
{
/* Must not see a 'sett' or 'addc' here.
This is a case where combine tries to produce
'a + (0 - b) + 1' out of 'a - b + 1'.
On non-SH2A there is a 'tst + negc', on SH2A a 'bld + movt'. */
int z = vec[0];
int vi = vec[1];
int zi = vec[2];
if (zi != 0 && z < -1)
vi -= (((vi >> 7) & 0x01) << 1) - 1;
return vi;
}
fails to produce the expected tst + negc sequence.
A reduced case is:
int test (int vi)
{
return vi - (((vi >> 6) & 0x01) << 1);
}
-m2a -O2 (gcc 5):
bld #6,r4
movt r0
rts
add r0,r0
trunk:
mov #-5,r1
mov r4,r0
shld r1,r0
rts
and #2,r0
-m4 -O2 (gcc 5):
mov r4,r0
tst #64,r0
mov #-1,r0
negc r0,r0
rts
add r0,r0
trunk:
mov #-5,r1
mov r4,r0
shld r1,r0
rts
and #2,r0
For the more complex expression
int test (int vi)
{
return vi - (((vi >> 6) & 0x01) << 1) - 1;
}
-m4 -O2 (gcc 5):
mov r4,r0
mov #-1,r1
tst #64,r0
negc r1,r1
add r1,r1
rts
subc r1,r0
trunk:
mov #-5,r1
mov r4,r0
shld r1,r0
sett
and #2,r0
subc r0,r4
rts
mov r4,r0
It seems the zero_extract patterns are not used for this anymore. Instead
combine is looking for something like
(set (reg:SI 168 [ D.1649 ])
(and:SI (lshiftrt:SI (reg:SI 4 r4 [ vi ])
(const_int 5 [0x5]))
(const_int 2 [0x2])))
This pattern should be added as a zero_extract variant.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug target/64345] [SH] Improve single bit extraction
2014-12-17 22:21 [Bug target/64345] New: [SH] Improve single bit extraction olegendo at gcc dot gnu.org
2015-01-24 13:05 ` [Bug target/64345] " olegendo at gcc dot gnu.org
2015-07-19 8:54 ` olegendo at gcc dot gnu.org
@ 2015-07-19 13:11 ` olegendo at gcc dot gnu.org
2015-09-19 10:01 ` olegendo at gcc dot gnu.org
2015-09-21 13:49 ` olegendo at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-07-19 13:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64345
--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #2)
The following pattern seems to fix the test case.
===================================================================
--- gcc/config/sh/sh.md (revision 225987)
+++ gcc/config/sh/sh.md (working copy)
@@ -14220,6 +14220,31 @@
[(set (reg:SI T_REG)
(zero_extract:SI (match_dup 0) (const_int 1) (match_dup 1)))])
+(define_insn_and_split "*zero_extract_3"
+ [(set (match_operand:SI 0 "arith_reg_dest")
+ (and:SI (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand")
+ (match_operand 2 "const_int_operand"))
+ (match_operand 3 "const_int_operand")))
+ (clobber (reg:SI T_REG))]
+ "TARGET_SH1 && can_create_pseudo_p ()
+ && exact_log2 (INTVAL (operands[3])) >= 0"
+ "#"
+ "&& 1"
+ [(const_int 0)]
+{
+ int rshift = INTVAL (operands[2]);
+ int lshift = exact_log2 (INTVAL (operands[3]));
+
+ rtx tmp = gen_reg_rtx (SImode);
+ emit_insn (gen_rtx_PARALLEL (VOIDmode,
+ gen_rtvec (2,
+ gen_rtx_SET (tmp,
+ gen_rtx_ZERO_EXTRACT (SImode, operands[1], const1_rtx,
+ GEN_INT (rshift + lshift))),
+ gen_rtx_CLOBBER (VOIDmode, get_t_reg_rtx ()))));
+ emit_insn (gen_ashlsi3 (operands[0], tmp, GEN_INT (lshift)));
+})
+
;; -------------------------------------------------------------------------
;; SH2A instructions for bitwise operations.
;; FIXME: Convert multiple instruction insns to insn_and_split.
However, in some cases it might indeed be better to emit a shift-and sequence,
especially if the shift value is 1,2,8,16.
For instance
return (((vi >> 6) & 0x01) << 3);
results in
mov r4,r0
tst #64,r0
mov #-1,r0
negc r0,r0
shll2 r0
rts
add r0,r0
which is better as:
mov r4,r0
shll2 r0
shlr r0
rts
and #8,r0
Even more so when the bit test constant doesn't fit into 8 bits:
return (((vi >> 10) & 0x01) << 1);
mov.w .L2,r1
mov #-1,r0
tst r1,r4
negc r0,r0
rts
add r0,r0
.align 1
.L2:
.short 1024
better:
mov r4,r0
shlr8 r0
shlr r0
rts
and #2,r0
Generally, it seems better to use an shift-and sequence, unless the tst-negc
sequence can simplify the trailing shift for SH variants without dynamic
shifts.
^ permalink raw reply [flat|nested] 6+ messages in thread