* [patch,avr] minor tweaks for 8-bit operations
@ 2016-07-13 11:12 Georg-Johann Lay
2016-07-14 3:56 ` Senthil Kumar Selvaraj
2016-07-14 6:36 ` Denis Chertykov
0 siblings, 2 replies; 5+ messages in thread
From: Georg-Johann Lay @ 2016-07-13 11:12 UTC (permalink / raw)
To: GCC Patches; +Cc: Denis Chertykov
[-- Attachment #1: Type: text/plain, Size: 892 bytes --]
This patch contains some unrelated tweaks
- Supplying no-ldregs variant for andqi3, iorqi3 where a const_int mask affects
only 1 bit
- Some patterns that match situations with zero_extend that can be performed
with less instructions / register pressure.
- comparing HI against -1
Ok for trunk?
Johann
gcc/
Minor tweaks for QImode.
* config/avr/predicates.md (const_m255_to_m1_operand): New.
* config/avr/constraints.md (Cn8, Ca1, Co1, Yx2): New constraints.
* config/avr/avr.md (add<mode>3) <ALL1>: Make "r,0,r" more
expensive.
(*cmphi.zero-extend.0, *cmphi.zero-extend.1)
(*usum_widenqihi3, *udiff_widenqihi3)
(*addhi3_zero_extend.const): New combiner insns.
(andqi3, iorqi3): Provide "l" (NO_LD_REGS) alternative if
just 1 bit is affected.
* config/avr/avr.c (avr_out_bitop) <QImode>: Don't access xop[3].
(avr_out_compare) [EQ,NE]: Tweak comparing d-regs against -1.
[-- Attachment #2: tweak-qi.diff --]
[-- Type: text/x-patch, Size: 8700 bytes --]
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c (revision 237643)
+++ config/avr/avr.c (working copy)
@@ -5346,6 +5346,34 @@ avr_out_compare (rtx_insn *insn, rtx *xo
}
}
+ /* Comparisons == -1 and != -1 of a d-register that's used after the
+ comparison. (If it's unused after we use CPI / SBCI or ADIW sequence
+ from below.) Instead of CPI Rlo,-1 / LDI Rx,-1 / CPC Rhi,Rx we can
+ use CPI Rlo,-1 / CPC Rhi,Rlo which is 1 instruction shorter:
+ If CPI is true then Rlo contains -1 and we can use Rlo instead of Rx
+ when CPC'ing the high part. If CPI is false then CPC cannot render
+ the result to true. This also works for the more generic case where
+ the constant is of the form 0xabab. */
+
+ if (n_bytes == 2
+ && xval != 0
+ && test_hard_reg_class (LD_REGS, xreg)
+ && compare_eq_p (insn)
+ && !reg_unused_after (insn, xreg))
+ {
+ rtx xlo8 = simplify_gen_subreg (QImode, xval, mode, 0);
+ rtx xhi8 = simplify_gen_subreg (QImode, xval, mode, 1);
+
+ if (INTVAL (xlo8) == INTVAL (xhi8))
+ {
+ xop[0] = xreg;
+ xop[1] = xlo8;
+
+ return avr_asm_len ("cpi %A0,%1" CR_TAB
+ "cpc %B0,%A0", xop, plen, 2);
+ }
+ }
+
for (i = 0; i < n_bytes; i++)
{
/* We compare byte-wise. */
@@ -7687,11 +7715,11 @@ avr_out_bitop (rtx insn, rtx *xop, int *
/* op[0]: 8-bit destination register
op[1]: 8-bit const int
- op[2]: 8-bit clobber register or SCRATCH
+ op[2]: 8-bit clobber register, SCRATCH or NULL_RTX.
op[3]: 8-bit register containing 0xff or NULL_RTX */
rtx op[4];
- op[2] = xop[3];
+ op[2] = QImode == mode ? NULL_RTX : xop[3];
op[3] = NULL_RTX;
if (plen)
Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md (revision 237643)
+++ config/avr/avr.md (working copy)
@@ -1240,6 +1240,33 @@ (define_insn "*addhi3.sign_extend1"
[(set_attr "length" "5")
(set_attr "cc" "clobber")])
+(define_insn "*addhi3_zero_extend.const"
+ [(set (match_operand:HI 0 "register_operand" "=d")
+ (plus:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "0"))
+ (match_operand:HI 2 "const_m255_to_m1_operand" "Cn8")))]
+ ""
+ "subi %A0,%n2\;sbc %B0,%B0"
+ [(set_attr "length" "2")
+ (set_attr "cc" "set_czn")])
+
+(define_insn "*usum_widenqihi3"
+ [(set (match_operand:HI 0 "register_operand" "=r")
+ (plus:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "0"))
+ (zero_extend:HI (match_operand:QI 2 "register_operand" "r"))))]
+ ""
+ "add %A0,%2\;clr %B0\;rol %B0"
+ [(set_attr "length" "3")
+ (set_attr "cc" "clobber")])
+
+(define_insn "*udiff_widenqihi3"
+ [(set (match_operand:HI 0 "register_operand" "=r")
+ (minus:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "0"))
+ (zero_extend:HI (match_operand:QI 2 "register_operand" "r"))))]
+ ""
+ "sub %A0,%2\;sbc %B0,%B0"
+ [(set_attr "length" "2")
+ (set_attr "cc" "set_czn")])
+
(define_insn "*addhi3_sp"
[(set (match_operand:HI 1 "stack_register_operand" "=q")
(plus:HI (match_operand:HI 2 "stack_register_operand" "q")
@@ -3102,15 +3129,16 @@ (define_insn "*udivmodsi4_call"
; and
(define_insn "andqi3"
- [(set (match_operand:QI 0 "register_operand" "=??r,d")
- (and:QI (match_operand:QI 1 "register_operand" "%0,0")
- (match_operand:QI 2 "nonmemory_operand" "r,i")))]
+ [(set (match_operand:QI 0 "register_operand" "=??r,d,*l")
+ (and:QI (match_operand:QI 1 "register_operand" "%0,0,0")
+ (match_operand:QI 2 "nonmemory_operand" "r,i,Ca1")))]
""
"@
and %0,%2
- andi %0,lo8(%2)"
- [(set_attr "length" "1,1")
- (set_attr "cc" "set_zn,set_zn")])
+ andi %0,lo8(%2)
+ * return avr_out_bitop (insn, operands, NULL);"
+ [(set_attr "length" "1,1,2")
+ (set_attr "cc" "set_zn,set_zn,none")])
(define_insn "andhi3"
[(set (match_operand:HI 0 "register_operand" "=??r,d,d,r ,r")
@@ -3184,15 +3212,16 @@ (define_peephole2 ; andi
;; ior
(define_insn "iorqi3"
- [(set (match_operand:QI 0 "register_operand" "=??r,d")
- (ior:QI (match_operand:QI 1 "register_operand" "%0,0")
- (match_operand:QI 2 "nonmemory_operand" "r,i")))]
+ [(set (match_operand:QI 0 "register_operand" "=??r,d,*l")
+ (ior:QI (match_operand:QI 1 "register_operand" "%0,0,0")
+ (match_operand:QI 2 "nonmemory_operand" "r,i,Co1")))]
""
"@
or %0,%2
- ori %0,lo8(%2)"
- [(set_attr "length" "1,1")
- (set_attr "cc" "set_zn,set_zn")])
+ ori %0,lo8(%2)
+ * return avr_out_bitop (insn, operands, NULL);"
+ [(set_attr "length" "1,1,2")
+ (set_attr "cc" "set_zn,set_zn,none")])
(define_insn "iorhi3"
[(set (match_operand:HI 0 "register_operand" "=??r,d,d,r ,r")
@@ -4607,6 +4636,25 @@ (define_insn "*cmpqi_sign_extend"
[(set_attr "cc" "compare")
(set_attr "length" "1")])
+
+(define_insn "*cmphi.zero-extend.0"
+ [(set (cc0)
+ (compare (zero_extend:HI (match_operand:QI 0 "register_operand" "r"))
+ (match_operand:HI 1 "register_operand" "r")))]
+ ""
+ "cp %0,%A1\;cpc __zero_reg__,%B1"
+ [(set_attr "cc" "compare")
+ (set_attr "length" "2")])
+
+(define_insn "*cmphi.zero-extend.1"
+ [(set (cc0)
+ (compare (match_operand:HI 0 "register_operand" "r")
+ (zero_extend:HI (match_operand:QI 1 "register_operand" "r"))))]
+ ""
+ "cp %A0,%1\;cpc %B0,__zero_reg__"
+ [(set_attr "cc" "compare")
+ (set_attr "length" "2")])
+
;; "*cmphi"
;; "*cmphq" "*cmpuhq"
;; "*cmpha" "*cmpuha"
Index: config/avr/constraints.md
===================================================================
--- config/avr/constraints.md (revision 237643)
+++ config/avr/constraints.md (working copy)
@@ -133,6 +133,11 @@ (define_constraint "C07"
(and (match_code "const_int")
(match_test "ival == 7")))
+(define_constraint "Ca1"
+ "Constant 1-byte integer that allows AND by means of CLT + BLD."
+ (and (match_code "const_int")
+ (match_test "avr_popcount_each_byte (op, 1, 1<<7)")))
+
(define_constraint "Ca2"
"Constant 2-byte integer that allows AND without clobber register."
(and (match_code "const_int")
@@ -148,6 +153,11 @@ (define_constraint "Ca4"
(and (match_code "const_int")
(match_test "avr_popcount_each_byte (op, 4, (1<<0) | (1<<7) | (1<<8))")))
+(define_constraint "Co1"
+ "Constant 1-byte integer that allows AND by means of SET + BLD."
+ (and (match_code "const_int")
+ (match_test "avr_popcount_each_byte (op, 1, 1<<1)")))
+
(define_constraint "Co2"
"Constant 2-byte integer that allows OR without clobber register."
(and (match_code "const_int")
@@ -193,6 +203,11 @@ (define_constraint "C0f"
(and (match_code "const_int")
(match_test "!avr_has_nibble_0xf (op)")))
+(define_constraint "Cn8"
+ "A negative constant integer in the range @minus{}255 @dots{} @minus{}1."
+ (and (match_code "const_int")
+ (match_test "IN_RANGE (ival, -255, -1)")))
+
;; CONST_FIXED is no element of 'n' so cook our own.
;; "i" or "s" would match but because the insn uses iterators that cover
;; INT_MODE, "i" or "s" is not always possible.
@@ -230,6 +245,12 @@ (define_constraint "Ym2"
(match_test "-2 == INTVAL (avr_to_int_mode (op))"))
(match_test "satisfies_constraint_Cm2 (op)")))
+(define_constraint "Yx2"
+ "Fixed-point or integer constant not in the range @minus{}2 @dots{} 2"
+ (and (ior (match_code "const_int")
+ (match_code "const_fixed"))
+ (match_test "!IN_RANGE (INTVAL (avr_to_int_mode (op)), -2, 2)")))
+
;; Similar to "IJ" used with ADIW/SBIW, but for CONST_FIXED.
(define_constraint "YIJ"
Index: config/avr/predicates.md
===================================================================
--- config/avr/predicates.md (revision 237643)
+++ config/avr/predicates.md (working copy)
@@ -114,6 +114,11 @@ (define_predicate "const_2_to_6_operand"
(and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 2, 6)")))
+;; Return 1 if OP is constant integer -255..-1.
+(define_predicate "const_m255_to_m1_operand"
+ (and (match_code "const_int")
+ (match_test "IN_RANGE (INTVAL (op), -255, -1)")))
+
;; Returns true if OP is either the constant zero or a register.
(define_predicate "reg_or_0_operand"
(ior (match_operand 0 "register_operand")
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch,avr] minor tweaks for 8-bit operations
2016-07-13 11:12 [patch,avr] minor tweaks for 8-bit operations Georg-Johann Lay
@ 2016-07-14 3:56 ` Senthil Kumar Selvaraj
2016-07-14 12:09 ` Georg-Johann Lay
2016-07-14 6:36 ` Denis Chertykov
1 sibling, 1 reply; 5+ messages in thread
From: Senthil Kumar Selvaraj @ 2016-07-14 3:56 UTC (permalink / raw)
To: Georg-Johann Lay; +Cc: GCC Patches, Denis Chertykov
Georg-Johann Lay writes:
> This patch contains some unrelated tweaks
>
> - Supplying no-ldregs variant for andqi3, iorqi3 where a const_int mask affects
> only 1 bit
>
> - Some patterns that match situations with zero_extend that can be performed
> with less instructions / register pressure.
From my (admittedly limited) attempts at code size benchmarking, this
one will help a lot I think, considering ints are used everywhere and
they end up always taking 2 regs, even when one would do. Do you have any numbers on the code size improvement this provides?
Regards
Senthil
>
> - comparing HI against -1
>
> Ok for trunk?
>
> Johann
>
>
> gcc/
> Minor tweaks for QImode.
>
> * config/avr/predicates.md (const_m255_to_m1_operand): New.
> * config/avr/constraints.md (Cn8, Ca1, Co1, Yx2): New constraints.
> * config/avr/avr.md (add<mode>3) <ALL1>: Make "r,0,r" more
> expensive.
> (*cmphi.zero-extend.0, *cmphi.zero-extend.1)
> (*usum_widenqihi3, *udiff_widenqihi3)
> (*addhi3_zero_extend.const): New combiner insns.
> (andqi3, iorqi3): Provide "l" (NO_LD_REGS) alternative if
> just 1 bit is affected.
> * config/avr/avr.c (avr_out_bitop) <QImode>: Don't access xop[3].
> (avr_out_compare) [EQ,NE]: Tweak comparing d-regs against -1.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch,avr] minor tweaks for 8-bit operations
2016-07-13 11:12 [patch,avr] minor tweaks for 8-bit operations Georg-Johann Lay
2016-07-14 3:56 ` Senthil Kumar Selvaraj
@ 2016-07-14 6:36 ` Denis Chertykov
2016-07-29 16:14 ` Georg-Johann Lay
1 sibling, 1 reply; 5+ messages in thread
From: Denis Chertykov @ 2016-07-14 6:36 UTC (permalink / raw)
To: Georg-Johann Lay; +Cc: GCC Patches
2016-07-13 14:12 GMT+03:00 Georg-Johann Lay <avr@gjlay.de>:
>
> This patch contains some unrelated tweaks
>
> - Supplying no-ldregs variant for andqi3, iorqi3 where a const_int mask affects only 1 bit
>
> - Some patterns that match situations with zero_extend that can be performed with less instructions / register pressure.
>
> - comparing HI against -1
>
> Ok for trunk?
>
> Johann
>
>
> gcc/
> Minor tweaks for QImode.
>
> * config/avr/predicates.md (const_m255_to_m1_operand): New.
> * config/avr/constraints.md (Cn8, Ca1, Co1, Yx2): New constraints.
> * config/avr/avr.md (add<mode>3) <ALL1>: Make "r,0,r" more
> expensive.
> (*cmphi.zero-extend.0, *cmphi.zero-extend.1)
> (*usum_widenqihi3, *udiff_widenqihi3)
> (*addhi3_zero_extend.const): New combiner insns.
> (andqi3, iorqi3): Provide "l" (NO_LD_REGS) alternative if
> just 1 bit is affected.
> * config/avr/avr.c (avr_out_bitop) <QImode>: Don't access xop[3].
> (avr_out_compare) [EQ,NE]: Tweak comparing d-regs against -1.
Please apply.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch,avr] minor tweaks for 8-bit operations
2016-07-14 3:56 ` Senthil Kumar Selvaraj
@ 2016-07-14 12:09 ` Georg-Johann Lay
0 siblings, 0 replies; 5+ messages in thread
From: Georg-Johann Lay @ 2016-07-14 12:09 UTC (permalink / raw)
To: Senthil Kumar Selvaraj; +Cc: GCC Patches, Denis Chertykov
On 14.07.2016 05:55, Senthil Kumar Selvaraj wrote:
>
> Georg-Johann Lay writes:
>
>> This patch contains some unrelated tweaks
>>
>> [...]
>>
>> - Some patterns that match situations with zero_extend that can be performed
>> with less instructions / register pressure.
>
> From my (admittedly limited) attempts at code size benchmarking, this
> one will help a lot I think, considering ints are used everywhere and
> they end up always taking 2 regs, even when one would do. Do you have any numbers on the code size improvement this provides?
>
> Regards
> Senthil
No, I usually propose such patches when I am coming across respective situation
in generated asm or in a dump file and it's clear that the change won't reduce
performance loss in other places. My expectations on the overall gains are
low, in particular if some subreg-lowering already split from one HI to two QI
so that there is no zero_extend any more that could be caught.
For the QImoe plus, and, ior with const_int it might pay to add a "d" clobber
if the register ends up in "l" regs. The compiler sometimes reloads the whole
stuff into d-regs, i.e. instead of
LDI R20, -42
ADD R10, R20
we get
MOV R20, R10
ADD R20, -42
MOV R10, R20
I even came across situations where reload reloads the const_int to a "l"
register which might cost up to 4 instructions for the constant alone!
Johann
>>
>> - comparing HI against -1
>>
>> Ok for trunk?
>>
>> Johann
>>
>>
>> gcc/
>> Minor tweaks for QImode.
>>
>> * config/avr/predicates.md (const_m255_to_m1_operand): New.
>> * config/avr/constraints.md (Cn8, Ca1, Co1, Yx2): New constraints.
>> * config/avr/avr.md (add<mode>3) <ALL1>: Make "r,0,r" more
>> expensive.
>> (*cmphi.zero-extend.0, *cmphi.zero-extend.1)
>> (*usum_widenqihi3, *udiff_widenqihi3)
>> (*addhi3_zero_extend.const): New combiner insns.
>> (andqi3, iorqi3): Provide "l" (NO_LD_REGS) alternative if
>> just 1 bit is affected.
>> * config/avr/avr.c (avr_out_bitop) <QImode>: Don't access xop[3].
>> (avr_out_compare) [EQ,NE]: Tweak comparing d-regs against -1.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch,avr] minor tweaks for 8-bit operations
2016-07-14 6:36 ` Denis Chertykov
@ 2016-07-29 16:14 ` Georg-Johann Lay
0 siblings, 0 replies; 5+ messages in thread
From: Georg-Johann Lay @ 2016-07-29 16:14 UTC (permalink / raw)
To: Denis Chertykov; +Cc: GCC Patches
On 14.07.2016 08:36, Denis Chertykov wrote:
> 2016-07-13 14:12 GMT+03:00 Georg-Johann Lay <avr@gjlay.de>:
>>
>> This patch contains some unrelated tweaks
>>
>> - Supplying no-ldregs variant for andqi3, iorqi3 where a const_int mask affects only 1 bit
>>
>> - Some patterns that match situations with zero_extend that can be performed with less instructions / register pressure.
>>
>> - comparing HI against -1
>>
>> Ok for trunk?
>>
>> Johann
>>
>>
>> gcc/
>> Minor tweaks for QImode.
>>
>> * config/avr/predicates.md (const_m255_to_m1_operand): New.
>> * config/avr/constraints.md (Cn8, Ca1, Co1, Yx2): New constraints.
>> * config/avr/avr.md (add<mode>3) <ALL1>: Make "r,0,r" more
>> expensive.
>> (*cmphi.zero-extend.0, *cmphi.zero-extend.1)
>> (*usum_widenqihi3, *udiff_widenqihi3)
>> (*addhi3_zero_extend.const): New combiner insns.
>> (andqi3, iorqi3): Provide "l" (NO_LD_REGS) alternative if
>> just 1 bit is affected.
>> * config/avr/avr.c (avr_out_bitop) <QImode>: Don't access xop[3].
>> (avr_out_compare) [EQ,NE]: Tweak comparing d-regs against -1.
>
>
>
> Please apply.
>
Also applied this addendum; not wrong code but to match the comments.
Johann
* config/avr/avr.c (avr_out_compare): Use const0_rtx instead of 0
when testing for compares against constants of the form 0xabab.
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c (revision 238878)
+++ config/avr/avr.c (revision 238879)
@@ -5357,7 +5357,7 @@ avr_out_compare (rtx_insn *insn, rtx *xo
the constant is of the form 0xabab. */
if (n_bytes == 2
- && xval != 0
+ && xval != const0_rtx
&& test_hard_reg_class (LD_REGS, xreg)
&& compare_eq_p (insn)
&& !reg_unused_after (insn, xreg))
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-07-29 16:14 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-13 11:12 [patch,avr] minor tweaks for 8-bit operations Georg-Johann Lay
2016-07-14 3:56 ` Senthil Kumar Selvaraj
2016-07-14 12:09 ` Georg-Johann Lay
2016-07-14 6:36 ` Denis Chertykov
2016-07-29 16:14 ` Georg-Johann Lay
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).