* Re: [PATCH v2] PR target/30315 i386 missed optimization detecting overflows
@ 2007-08-07 15:16 Uros Bizjak
2007-08-07 18:03 ` Rask Ingemann Lambertsen
2007-08-09 23:11 ` Rask Ingemann Lambertsen
0 siblings, 2 replies; 12+ messages in thread
From: Uros Bizjak @ 2007-08-07 15:16 UTC (permalink / raw)
To: GCC Patches; +Cc: Rask Ingemann Lambertsen
Hello!
> 2007-08-04 Rask Ingemann Lambertsen <rask@sygehus.dk>
>
> PR target/30315
> * config/i386/i386.md (*adddi3_cc_overflow1_rex64): New.
> (*adddi3_cc_overflow2_rex64): New.
> (*addsi3_cc_overflow1): New.
> (*addsi3_cc_overflow2): New.
> (*addhi3_cc_overflow1): New.
> (*addhi3_cc_overflow2): New.
> (*addqi3_cc_overflow1): New.
> (*addqi3_cc_overflow2): New.
> (*subdi3_cc_overflow1_rex64): New.
> (*subdi3_cc_overflow2_rex64): New.
> (*subsi3_cc_overflow1): New.
> (*subsi3_cc_overflow2): New.
> (*subhi3_cc_overflow1): New.
> (*subhi3_cc_overflow2): New.
> (*subqi3_cc_overflow1): New.
> (*subqi3_cc_overflow2): New.
> * config/i386/predicates.md (fcmov_comparison_operator): Accept
> CCCmode for LTU, GTU, LEU and GEU.
> (ix86_comparison_operator): Likewise.
> (ix86_carry_flag_operator): Carry set if LTU or GTU in CCCmode.
> * gcc/config/i386/i386.c (put_condition_code): Support CCCmode.
> (ix86_cc_mode): Use CCCmode when testing for overflow of PLUS
> or MINUS expressions.
You should simply name a vector form of the pattern, like
("*sub<mode>3_xx_overflow1") in the ChangeLog, where quotes could be
added according to your personal preference.
Regarding your patch - IMO the form of the pattern where you only
clobber scratch register is missing. You could compare i.e. existing
*addsi_2 and *addsi_3 patterns, where your patch currently implements
only *addsi_2-like form. This would optimize following example:
--cut here--
void
fooia (unsigned int a, unsigned int b)
{
unsigned int sum = a + b;
if (sum < a)
abort ();
}
--cut here--
Currently, gcc generates:
addl %edi, %esi
subq $8, %rsp
.LCFI0:
cmpl %esi, %edi
ja .L5
addq $8, %rsp
ret
.L5:
call abort
BTW: You introduced excellent approach to macroize these patterns. Do
you perhaps plan to macroize existing patterns using these macros as
well? ;-)
Uros.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] PR target/30315 i386 missed optimization detecting overflows
2007-08-07 15:16 [PATCH v2] PR target/30315 i386 missed optimization detecting overflows Uros Bizjak
@ 2007-08-07 18:03 ` Rask Ingemann Lambertsen
2007-08-09 23:11 ` Rask Ingemann Lambertsen
1 sibling, 0 replies; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-07 18:03 UTC (permalink / raw)
To: Uros Bizjak; +Cc: GCC Patches
On Tue, Aug 07, 2007 at 05:15:56PM +0200, Uros Bizjak wrote:
> Hello!
>
> You should simply name a vector form of the pattern, like
> ("*sub<mode>3_xx_overflow1") in the ChangeLog, where quotes could be
> added according to your personal preference.
Ok.
> Regarding your patch - IMO the form of the pattern where you only
> clobber scratch register is missing.
It is. I'll fix that and update the testcase also.
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] PR target/30315 i386 missed optimization detecting overflows
2007-08-07 15:16 [PATCH v2] PR target/30315 i386 missed optimization detecting overflows Uros Bizjak
2007-08-07 18:03 ` Rask Ingemann Lambertsen
@ 2007-08-09 23:11 ` Rask Ingemann Lambertsen
2007-08-10 22:22 ` Uros Bizjak
1 sibling, 1 reply; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-09 23:11 UTC (permalink / raw)
To: Uros Bizjak; +Cc: GCC Patches
On Tue, Aug 07, 2007 at 05:15:56PM +0200, Uros Bizjak wrote:
> You should simply name a vector form of the pattern, like
> ("*sub<mode>3_xx_overflow1") in the ChangeLog, where quotes could be
> added according to your personal preference.
>
> Regarding your patch - IMO the form of the pattern where you only
> clobber scratch register is missing. You could compare i.e. existing
> *addsi_2 and *addsi_3 patterns, where your patch currently implements
> only *addsi_2-like form. This would optimize following example:
Added. I also added zext versions, but the testcase doesn't trigger them,
so there's another missed optimization there.
> BTW: You introduced excellent approach to macroize these patterns. Do
> you perhaps plan to macroize existing patterns using these macros as
> well? ;-)
Well, for a start, I've chosen the macros such that they can be used for
other patterns as well. I've macroized a little further this time.
I bootstrapped and tested this revised patch on x86_64-unknown-linux-gnu
with no regressions. Ok for trunk?
2007-08-10 Rask Ingemann Lambertsen <rask@sygehus.dk>
PR target/30315
* config/i386/i386.md (SWI)(addsub): New.
(*<optab><mode>3_cc_overflow1): New.
(*<optab><mode>3_cc_overflow2): New.
(*<optab><mode>3_cconly_overflow1): New.
(*<optab><mode>3_cconly_overflow2): New.
(*<optab>si3_zext_cc_overflow1): New.
(*<optab>si3_zext_cc_overflow2): New.
* config/i386/predicates.md (fcmov_comparison_operator): Accept
CCCmode for LTU, GTU, LEU and GEU.
(ix86_comparison_operator): Likewise.
(ix86_carry_flag_operator): Carry flag is set if LTU or GTU in CCCmode.
* gcc/config/i386/i386.c (put_condition_code): Support CCCmode.
(ix86_cc_mode): Use CCCmode when testing for overflow of PLUS
or MINUS expressions.
testsuite/
2007-08-10 Rask Ingemann Lambertsen <rask@sygehus.dk>
PR target/30315
* gcc.target/i386/pr30315.c: New.
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md (revision 127179)
+++ gcc/config/i386/i386.md (working copy)
@@ -488,6 +488,36 @@
[(set_attr "length" "128")
(set_attr "type" "multi")])
+(define_code_macro addsub [plus minus])
+
+;; Base name to use for define_insn and such.
+(define_code_attr optab [(plus "add") (minus "sub")])
+
+;; Mark commutative operators as such in constraints.
+(define_code_attr pct [(plus "%") (minus "")])
+
+;; Instruction mnemonic for output templates.
+(define_code_attr mnemonic [(plus "add") (minus "sub")])
+
+;; All single word integer modes.
+(define_mode_macro SWI [QI HI SI (DI "TARGET_64BIT")])
+
+;; Instruction suffix for integer modes.
+(define_mode_attr imodesuffix [(QI "b") (HI "w") (SI "l") (DI "q")])
+
+;; Register class for integer modes.
+(define_mode_attr r [(QI "q") (HI "r") (SI "r") (DI "r")])
+
+;; Immediate operand constraint for integer modes.
+(define_mode_attr i [(QI "i") (HI "i") (SI "i") (DI "e")])
+
+;; General operand predicate for integer modes.
+(define_mode_attr general_operand
+ [(QI "general_operand")
+ (HI "general_operand")
+ (SI "general_operand")
+ (DI "x86_64_general_operand")])
+
;; All x87 floating point modes
(define_mode_macro X87MODEF [SF DF XF])
@@ -4855,6 +4885,82 @@
[(set_attr "type" "alu")
(set_attr "mode" "DI")])
+(define_insn "*<optab><mode>3_cc_overflow1"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (addsub:SWI (match_operand:SWI 1 "nonimmediate_operand" "<pct>0,0")
+ (match_operand:SWI 2 "<general_operand>" "<r><i>,<r>m"))
+ (match_dup 1)))
+ (set (match_operand:SWI 0 "nonimmediate_operand" "=<r>m,<r>")
+ (addsub:SWI (match_dup 1) (match_dup 2)))]
+ "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
+ "<mnemonic>{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*<optab><mode>3_cc_overflow2"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (addsub:SWI (match_operand:SWI 1 "nonimmediate_operand" "<pct>0,0")
+ (match_operand:SWI 2 "<general_operand>" "<r><i>,<r>m"))
+ (match_dup 2)))
+ (set (match_operand:SWI 0 "nonimmediate_operand" "=<r>m,<r>")
+ (addsub:SWI (match_dup 1) (match_dup 2)))]
+ "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
+ "<mnemonic>{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*<optab><mode>3_cconly_overflow1"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (addsub:SWI (match_operand:SWI 1 "nonimmediate_operand" "<pct>0")
+ (match_operand:SWI 2 "<general_operand>" "<r><i>m"))
+ (match_dup 1)))
+ (clobber (match_scratch:SWI 0 "=<r>"))]
+ "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
+ "<mnemonic>{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*<optab><mode>3_cconly_overflow2"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (addsub:SWI (match_operand:SWI 1 "nonimmediate_operand" "<pct>0")
+ (match_operand:SWI 2 "<general_operand>" "<r><i>m"))
+ (match_dup 2)))
+ (clobber (match_scratch:SWI 0 "=<r>"))]
+ "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
+ "<mnemonic>{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*<optab>si3_zext_cc_overflow1"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (addsub:SI (match_operand:SI 1 "nonimmediate_operand" "<pct>0")
+ (match_operand:SI 2 "general_operand" "g"))
+ (match_dup 1)))
+ (set (match_operand:DI 0 "register_operand" "=r")
+ (zero_extend:DI (addsub:SI (match_dup 1) (match_dup 2))))]
+ "TARGET_64BIT && ix86_binary_operator_ok (<CODE>, SImode, operands)"
+ "<mnemonic>{l}\t{%2, %k0|%k0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "SI")])
+
+(define_insn "*<optab>si3_zext_cc_overflow2"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (addsub:SI (match_operand:SI 1 "nonimmediate_operand" "<pct>0")
+ (match_operand:SI 2 "general_operand" "g"))
+ (match_dup 2)))
+ (set (match_operand:DI 0 "register_operand" "=r")
+ (zero_extend:DI (addsub:SI (match_dup 1) (match_dup 2))))]
+ "TARGET_64BIT && ix86_binary_operator_ok (<CODE>, SImode, operands)"
+ "<mnemonic>{l}\t{%2, %k0|%k0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "SI")])
+
(define_insn "addqi3_carry"
[(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
(plus:QI (plus:QI (match_operand:QI 3 "ix86_carry_flag_operator" "")
Index: gcc/config/i386/predicates.md
===================================================================
--- gcc/config/i386/predicates.md (revision 127179)
+++ gcc/config/i386/predicates.md (working copy)
@@ -879,7 +879,8 @@
switch (code)
{
case LTU: case GTU: case LEU: case GEU:
- if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode)
+ if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode
+ || inmode == CCCmode)
return 1;
return 0;
case ORDERED: case UNORDERED:
@@ -924,7 +925,11 @@
|| inmode == CCGOCmode || inmode == CCNOmode)
return 1;
return 0;
- case LTU: case GTU: case LEU: case ORDERED: case UNORDERED: case GEU:
+ case LTU: case GTU: case LEU: case GEU:
+ if (inmode == CCmode || inmode == CCCmode)
+ return 1;
+ return 0;
+ case ORDERED: case UNORDERED:
if (inmode == CCmode)
return 1;
return 0;
@@ -939,7 +944,7 @@
;; Return 1 if OP is a valid comparison operator testing carry flag to be set.
(define_predicate "ix86_carry_flag_operator"
- (match_code "ltu,lt,unlt,gt,ungt,le,unle,ge,unge,ltgt,uneq")
+ (match_code "ltu,lt,unlt,gtu,gt,ungt,le,unle,ge,unge,ltgt,uneq")
{
enum machine_mode inmode = GET_MODE (XEXP (op, 0));
enum rtx_code code = GET_CODE (op);
@@ -957,6 +962,8 @@
return 0;
code = ix86_fp_compare_code_to_integer (code);
}
+ else if (inmode == CCCmode)
+ return code == LTU || code == GTU;
else if (inmode != CCmode)
return 0;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 127179)
+++ gcc/config/i386/i386.c (working copy)
@@ -8254,8 +8254,12 @@ put_condition_code (enum rtx_code code,
case GTU:
/* ??? Use "nbe" instead of "a" for fcmov lossage on some assemblers.
Those same assemblers have the same but opposite lossage on cmov. */
- gcc_assert (mode == CCmode);
- suffix = fp ? "nbe" : "a";
+ if (mode == CCmode)
+ suffix = fp ? "nbe" : "a";
+ else if (mode == CCCmode)
+ suffix = "b";
+ else
+ gcc_unreachable ();
break;
case LT:
switch (mode)
@@ -8275,7 +8279,7 @@ put_condition_code (enum rtx_code code,
}
break;
case LTU:
- gcc_assert (mode == CCmode);
+ gcc_assert (mode == CCmode || mode == CCCmode);
suffix = "b";
break;
case GE:
@@ -8297,7 +8301,7 @@ put_condition_code (enum rtx_code code,
break;
case GEU:
/* ??? As above. */
- gcc_assert (mode == CCmode);
+ gcc_assert (mode == CCmode || mode == CCCmode);
suffix = fp ? "nb" : "ae";
break;
case LE:
@@ -8305,8 +8309,13 @@ put_condition_code (enum rtx_code code,
suffix = "le";
break;
case LEU:
- gcc_assert (mode == CCmode);
- suffix = "be";
+ /* ??? As above. */
+ if (mode == CCmode)
+ suffix = "be";
+ else if (mode == CCCmode)
+ suffix = fp ? "nb" : "ae";
+ else
+ gcc_unreachable ();
break;
case UNORDERED:
suffix = fp ? "u" : "p";
@@ -11033,10 +11042,23 @@ ix86_cc_mode (enum rtx_code code, rtx op
return CCZmode;
/* Codes needing carry flag. */
case GEU: /* CF=0 */
- case GTU: /* CF=0 & ZF=0 */
case LTU: /* CF=1 */
+ /* Detect overflow checks. They need just the carry flag. */
+ if (GET_CODE (op0) == PLUS
+ && (rtx_equal_p (op1, XEXP (op0, 0))
+ || rtx_equal_p (op1, XEXP (op0, 1))))
+ return CCCmode;
+ else
+ return CCmode;
+ case GTU: /* CF=0 & ZF=0 */
case LEU: /* CF=1 | ZF=1 */
- return CCmode;
+ /* Detect overflow checks. They need just the carry flag. */
+ if (GET_CODE (op0) == MINUS
+ && (rtx_equal_p (op1, XEXP (op0, 0))
+ || rtx_equal_p (op1, XEXP (op0, 1))))
+ return CCCmode;
+ else
+ return CCmode;
/* Codes possibly doable only with sign flag when
comparing against zero. */
case GE: /* SF=OF or SF=0 */
Index: gcc/testsuite/gcc.target/i386/pr30315.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr30315.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr30315.c (revision 0)
@@ -0,0 +1,101 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "cmp" } } */
+
+extern void abort (void);
+int c;
+
+#define PLUSCC1(T, t, C) \
+T pluscc##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ abort (); \
+ return sum; \
+}
+#define PLUSCC(T, t) PLUSCC1(T, t, a) PLUSCC1(T, t, b)
+
+#define INCCC1(T, t, C) \
+T inccc##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ c ++; \
+ return sum; \
+}
+#define INCCC(T, t) INCCC1(T, t, a) INCCC1(T, t, b)
+
+#define PLUSCCONLY1(T, t, C) \
+void pluscconly##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ abort (); \
+}
+#define PLUSCCONLY(T, t) PLUSCCONLY1(T, t, a) PLUSCCONLY1(T, t, b)
+
+#define MINUSCC1(T, t, C) \
+T minuscc##t##C (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > C) \
+ abort (); \
+ return difference; \
+}
+#define MINUSCC(T, t) MINUSCC1(T, t, a) MINUSCC1(T, t, b)
+
+#define DECCC1(T, t, C) \
+T deccc##t##C (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > C) \
+ c --; \
+ return difference; \
+}
+#define DECCC(T, t) DECCC1(T, t, a) DECCC1(T, t, b)
+
+#define MINUSCCONLY1(T, t, C) \
+void minuscconly##t##C (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > C) \
+ abort (); \
+}
+#define MINUSCCONLY(T, t) MINUSCCONLY1(T, t, a) MINUSCCONLY1(T, t, b)
+
+#define TEST(T, t) \
+ PLUSCC(T, t) \
+ PLUSCCONLY(T, t) \
+ INCCC(T, t) \
+ MINUSCC(T, t) \
+ MINUSCCONLY(T, t) \
+ DECCC(T, t)
+
+TEST (unsigned long, l)
+TEST (unsigned int, i)
+TEST (unsigned short, s)
+TEST (unsigned char, c)
+
+#define PLUSCCZEXT(C) \
+unsigned long pluscczext##C (unsigned int a, unsigned int b) \
+{ \
+ unsigned int sum = a + b; \
+ if (sum < C) \
+ abort (); \
+ return sum; \
+}
+
+PLUSCCZEXT(a)
+PLUSCCZEXT(b)
+
+#define MINUSCCZEXT(C) \
+unsigned long minuscczext##C (unsigned int a, unsigned int b) \
+{ \
+ unsigned int difference = a - b; \
+ if (difference > C) \
+ abort (); \
+ return difference; \
+}
+
+MINUSCCZEXT(a)
+MINUSCCZEXT(b)
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] PR target/30315 i386 missed optimization detecting overflows
2007-08-09 23:11 ` Rask Ingemann Lambertsen
@ 2007-08-10 22:22 ` Uros Bizjak
2007-08-14 11:04 ` [PATCH v4] " Rask Ingemann Lambertsen
0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2007-08-10 22:22 UTC (permalink / raw)
To: Rask Ingemann Lambertsen; +Cc: GCC Patches
Rask Ingemann Lambertsen wrote:
> 2007-08-10 Rask Ingemann Lambertsen <rask@sygehus.dk>
>
> PR target/30315
> * config/i386/i386.md (SWI)(addsub): New.
> (*<optab><mode>3_cc_overflow1): New.
> (*<optab><mode>3_cc_overflow2): New.
> (*<optab><mode>3_cconly_overflow1): New.
> (*<optab><mode>3_cconly_overflow2): New.
> (*<optab>si3_zext_cc_overflow1): New.
> (*<optab>si3_zext_cc_overflow2): New.
> * config/i386/predicates.md (fcmov_comparison_operator): Accept
> CCCmode for LTU, GTU, LEU and GEU.
> (ix86_comparison_operator): Likewise.
> (ix86_carry_flag_operator): Carry flag is set if LTU or GTU in CCCmode.
> * gcc/config/i386/i386.c (put_condition_code): Support CCCmode.
> (ix86_cc_mode): Use CCCmode when testing for overflow of PLUS
> or MINUS expressions.
>
> testsuite/
>
> 2007-08-10 Rask Ingemann Lambertsen <rask@sygehus.dk>
>
> PR target/30315
> * gcc.target/i386/pr30315.c: New.
>
>
This is OK for mainline, but please consider a few minor remarks below.
Thanks,
Uros.
> Index: gcc/config/i386/i386.md
> ===================================================================
> --- gcc/config/i386/i386.md (revision 127179)
> +++ gcc/config/i386/i386.md (working copy)
> @@ -488,6 +488,36 @@
> [(set_attr "length" "128")
> (set_attr "type" "multi")])
>
> +(define_code_macro addsub [plus minus])
> +
>
Perhaps a better name for a code macro would be something like
"plusminus" ...
> +;; Base name to use for define_insn and such.
> +(define_code_attr optab [(plus "add") (minus "sub")])
>
... leaving "addsub" keyword for a more descriptive define_insn
macroized pattern name.
> +
> +;; Mark commutative operators as such in constraints.
> +(define_code_attr pct [(plus "%") (minus "")])
>
Perhaps "comm" instead of "pct"?
> +
> +;; Instruction mnemonic for output templates.
> +(define_code_attr mnemonic [(plus "add") (minus "sub")])
>
The definition above is not needed, "addsub" keyword can also be used
instead.
> +
> +;; All single word integer modes.
> +(define_mode_macro SWI [QI HI SI (DI "TARGET_64BIT")])
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] PR target/30315 i386 missed optimization detecting overflows
2007-08-10 22:22 ` Uros Bizjak
@ 2007-08-14 11:04 ` Rask Ingemann Lambertsen
2007-08-14 12:25 ` Uros Bizjak
2007-08-14 13:35 ` Paolo Bonzini
0 siblings, 2 replies; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-14 11:04 UTC (permalink / raw)
To: Uros Bizjak; +Cc: GCC Patches
On Sat, Aug 11, 2007 at 12:21:29AM +0200, Uros Bizjak wrote:
>
> >Index: gcc/config/i386/i386.md
> >===================================================================
> >--- gcc/config/i386/i386.md (revision 127179)
> >+++ gcc/config/i386/i386.md (working copy)
> >@@ -488,6 +488,36 @@
> > [(set_attr "length" "128")
> > (set_attr "type" "multi")])
> >
> >+(define_code_macro addsub [plus minus])
> >+
> >
>
> Perhaps a better name for a code macro would be something like
> "plusminus" ...
> >+;; Base name to use for define_insn and such.
> >+(define_code_attr optab [(plus "add") (minus "sub")])
> >
>
> ... leaving "addsub" keyword for a more descriptive define_insn
> macroized pattern name.
> >+
> >+;; Mark commutative operators as such in constraints.
> >+(define_code_attr pct [(plus "%") (minus "")])
> >
>
> Perhaps "comm" instead of "pct"?
>
> >+
> >+;; Instruction mnemonic for output templates.
> >+(define_code_attr mnemonic [(plus "add") (minus "sub")])
> >
>
> The definition above is not needed, "addsub" keyword can also be used
> instead.
Here is a revised patch. I made the above changes and also used combine's
hook for canonicalizing comparison so we don't need two versions of each
pattern. Additionally, I made the sub*cconly_overflow pattern clobberless.
The testcase was adjusted accordingly. Bootstrapped and tested on
x86_64-unknown-linux-gnu (all default languages, --enable-checking=yes,rtl)
with no regressions. Ok for trunk?
2007-08-12 Rask Ingemann Lambertsen <rask@sygehus.dk>
PR target/30315
* config/i386/i386.h (CANONICALIZE_COMPARISON): New.
* config/i386/i386.md (plusminus)(addsub)(SWI): New.
(*<addsub><mode>3_cc_overflow): New.
(*add<mode>3_cconly_overflow): New.
(*sub<mode>3_cconly_overflow): New.
(*<addsub>si3_zext_cc_overflow): New.
* config/i386/predicates.md (fcmov_comparison_operator): Accept
CCCmode for LTU, GTU, LEU and GEU.
(ix86_comparison_operator): Likewise.
(ix86_carry_flag_operator): Carry flag is set if LTU or GTU in CCCmode.
* gcc/config/i386/i386.c (put_condition_code): Support CCCmode.
(ix86_cc_mode): Use CCCmode when testing for overflow of PLUS
or MINUS expressions.
testsuite/
2007-08-12 Rask Ingemann Lambertsen <rask@sygehus.dk>
PR target/30315
* gcc.target/i386/pr30315.c: New.
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h (revision 127179)
+++ gcc/config/i386/i386.h (working copy)
@@ -2065,6 +2065,16 @@ do { \
#define SELECT_CC_MODE(OP, X, Y) ix86_cc_mode ((OP), (X), (Y))
+/* Canonicalize overflow checks to save on the insn patterns. We change
+ "a + b < b" into "a + b < a" and "a + b >= b" into "a + b >= a". */
+#define CANONICALIZE_COMPARISON(code, op0, op1) \
+{ \
+ if ((code == LTU || code == GEU) \
+ && GET_CODE (op0) == PLUS \
+ && rtx_equal_p (op1, XEXP (op0, 1))) \
+ op1 = XEXP (op0, 0); \
+}
+
/* Return nonzero if MODE implies a floating point inequality can be
reversed. */
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md (revision 127179)
+++ gcc/config/i386/i386.md (working copy)
@@ -488,6 +488,33 @@
[(set_attr "length" "128")
(set_attr "type" "multi")])
+(define_code_macro plusminus [plus minus])
+
+;; Base name for define_insn and insn mnemonic.
+(define_code_attr addsub [(plus "add") (minus "sub")])
+
+;; Mark commutative operators as such in constraints.
+(define_code_attr comm [(plus "%") (minus "")])
+
+;; All single word integer modes.
+(define_mode_macro SWI [QI HI SI (DI "TARGET_64BIT")])
+
+;; Instruction suffix for integer modes.
+(define_mode_attr imodesuffix [(QI "b") (HI "w") (SI "l") (DI "q")])
+
+;; Register class for integer modes.
+(define_mode_attr r [(QI "q") (HI "r") (SI "r") (DI "r")])
+
+;; Immediate operand constraint for integer modes.
+(define_mode_attr i [(QI "i") (HI "i") (SI "i") (DI "e")])
+
+;; General operand predicate for integer modes.
+(define_mode_attr general_operand
+ [(QI "general_operand")
+ (HI "general_operand")
+ (SI "general_operand")
+ (DI "x86_64_general_operand")])
+
;; All x87 floating point modes
(define_mode_macro X87MODEF [SF DF XF])
@@ -4855,6 +4882,56 @@
[(set_attr "type" "alu")
(set_attr "mode" "DI")])
+(define_insn "*<addsub><mode>3_cc_overflow"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plusminus:SWI
+ (match_operand:SWI 1 "nonimmediate_operand" "<comm>0,0")
+ (match_operand:SWI 2 "<general_operand>" "<r><i>,<r>m"))
+ (match_dup 1)))
+ (set (match_operand:SWI 0 "nonimmediate_operand" "=<r>m,<r>")
+ (plusminus:SWI (match_dup 1) (match_dup 2)))]
+ "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
+ "<addsub>{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*add<mode>3_cconly_overflow"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plus:SWI (match_operand:SWI 1 "nonimmediate_operand" "%0")
+ (match_operand:SWI 2 "<general_operand>" "<r><i>m"))
+ (match_dup 1)))
+ (clobber (match_scratch:SWI 0 "=<r>"))]
+ "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
+ "add{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*sub<mode>3_cconly_overflow"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (minus:SWI (match_operand:SWI 0 "nonimmediate_operand" "<r>m,<r>")
+ (match_operand:SWI 1 "<general_operand>" "<r><i>,<r>m"))
+ (match_dup 0)))]
+ ""
+ "cmp{<imodesuffix>}\t{%1, %0|%0, %1}"
+ [(set_attr "type" "icmp")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*<addsub>si3_zext_cc_overflow"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plusminus:SI (match_operand:SI 1 "nonimmediate_operand" "<comm>0")
+ (match_operand:SI 2 "general_operand" "g"))
+ (match_dup 1)))
+ (set (match_operand:DI 0 "register_operand" "=r")
+ (zero_extend:DI (plusminus:SI (match_dup 1) (match_dup 2))))]
+ "TARGET_64BIT && ix86_binary_operator_ok (<CODE>, SImode, operands)"
+ "<addsub>{l}\t{%2, %k0|%k0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "SI")])
+
(define_insn "addqi3_carry"
[(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
(plus:QI (plus:QI (match_operand:QI 3 "ix86_carry_flag_operator" "")
Index: gcc/config/i386/predicates.md
===================================================================
--- gcc/config/i386/predicates.md (revision 127179)
+++ gcc/config/i386/predicates.md (working copy)
@@ -879,7 +879,8 @@
switch (code)
{
case LTU: case GTU: case LEU: case GEU:
- if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode)
+ if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode
+ || inmode == CCCmode)
return 1;
return 0;
case ORDERED: case UNORDERED:
@@ -924,7 +925,11 @@
|| inmode == CCGOCmode || inmode == CCNOmode)
return 1;
return 0;
- case LTU: case GTU: case LEU: case ORDERED: case UNORDERED: case GEU:
+ case LTU: case GTU: case LEU: case GEU:
+ if (inmode == CCmode || inmode == CCCmode)
+ return 1;
+ return 0;
+ case ORDERED: case UNORDERED:
if (inmode == CCmode)
return 1;
return 0;
@@ -939,7 +944,7 @@
;; Return 1 if OP is a valid comparison operator testing carry flag to be set.
(define_predicate "ix86_carry_flag_operator"
- (match_code "ltu,lt,unlt,gt,ungt,le,unle,ge,unge,ltgt,uneq")
+ (match_code "ltu,lt,unlt,gtu,gt,ungt,le,unle,ge,unge,ltgt,uneq")
{
enum machine_mode inmode = GET_MODE (XEXP (op, 0));
enum rtx_code code = GET_CODE (op);
@@ -957,6 +962,8 @@
return 0;
code = ix86_fp_compare_code_to_integer (code);
}
+ else if (inmode == CCCmode)
+ return code == LTU || code == GTU;
else if (inmode != CCmode)
return 0;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 127179)
+++ gcc/config/i386/i386.c (working copy)
@@ -8254,8 +8254,12 @@ put_condition_code (enum rtx_code code,
case GTU:
/* ??? Use "nbe" instead of "a" for fcmov lossage on some assemblers.
Those same assemblers have the same but opposite lossage on cmov. */
- gcc_assert (mode == CCmode);
- suffix = fp ? "nbe" : "a";
+ if (mode == CCmode)
+ suffix = fp ? "nbe" : "a";
+ else if (mode == CCCmode)
+ suffix = "b";
+ else
+ gcc_unreachable ();
break;
case LT:
switch (mode)
@@ -8275,7 +8279,7 @@ put_condition_code (enum rtx_code code,
}
break;
case LTU:
- gcc_assert (mode == CCmode);
+ gcc_assert (mode == CCmode || mode == CCCmode);
suffix = "b";
break;
case GE:
@@ -8297,7 +8301,7 @@ put_condition_code (enum rtx_code code,
break;
case GEU:
/* ??? As above. */
- gcc_assert (mode == CCmode);
+ gcc_assert (mode == CCmode || mode == CCCmode);
suffix = fp ? "nb" : "ae";
break;
case LE:
@@ -8305,8 +8309,13 @@ put_condition_code (enum rtx_code code,
suffix = "le";
break;
case LEU:
- gcc_assert (mode == CCmode);
- suffix = "be";
+ /* ??? As above. */
+ if (mode == CCmode)
+ suffix = "be";
+ else if (mode == CCCmode)
+ suffix = fp ? "nb" : "ae";
+ else
+ gcc_unreachable ();
break;
case UNORDERED:
suffix = fp ? "u" : "p";
@@ -11033,10 +11042,21 @@ ix86_cc_mode (enum rtx_code code, rtx op
return CCZmode;
/* Codes needing carry flag. */
case GEU: /* CF=0 */
- case GTU: /* CF=0 & ZF=0 */
case LTU: /* CF=1 */
+ /* Detect overflow checks. They need just the carry flag. */
+ if (GET_CODE (op0) == PLUS
+ && rtx_equal_p (op1, XEXP (op0, 0)))
+ return CCCmode;
+ else
+ return CCmode;
+ case GTU: /* CF=0 & ZF=0 */
case LEU: /* CF=1 | ZF=1 */
- return CCmode;
+ /* Detect overflow checks. They need just the carry flag. */
+ if (GET_CODE (op0) == MINUS
+ && rtx_equal_p (op1, XEXP (op0, 0)))
+ return CCCmode;
+ else
+ return CCmode;
/* Codes possibly doable only with sign flag when
comparing against zero. */
case GE: /* SF=OF or SF=0 */
Index: gcc/testsuite/gcc.target/i386/pr30315.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr30315.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr30315.c (revision 0)
@@ -0,0 +1,97 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "cmp" 4 } } */
+
+extern void abort (void);
+int c;
+
+#define PLUSCC1(T, t, C) \
+T pluscc##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ abort (); \
+ return sum; \
+}
+#define PLUSCC(T, t) PLUSCC1(T, t, a) PLUSCC1(T, t, b)
+
+#define INCCC1(T, t, C) \
+T inccc##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ c ++; \
+ return sum; \
+}
+#define INCCC(T, t) INCCC1(T, t, a) INCCC1(T, t, b)
+
+#define PLUSCCONLY1(T, t, C) \
+void pluscconly##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ abort (); \
+}
+#define PLUSCCONLY(T, t) PLUSCCONLY1(T, t, a) PLUSCCONLY1(T, t, b)
+
+#define MINUSCC(T, t) \
+T minuscc##t (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > a) \
+ abort (); \
+ return difference; \
+}
+
+#define DECCC(T, t) \
+T deccc##t (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > a) \
+ c --; \
+ return difference; \
+}
+
+#define MINUSCCONLY(T, t) \
+void minuscconly##t (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > a) \
+ abort (); \
+}
+
+#define TEST(T, t) \
+ PLUSCC(T, t) \
+ PLUSCCONLY(T, t) \
+ INCCC(T, t) \
+ MINUSCC(T, t) \
+ MINUSCCONLY(T, t) \
+ DECCC(T, t)
+
+TEST (unsigned long, l)
+TEST (unsigned int, i)
+TEST (unsigned short, s)
+TEST (unsigned char, c)
+
+#define PLUSCCZEXT(C) \
+unsigned long pluscczext##C (unsigned int a, unsigned int b) \
+{ \
+ unsigned int sum = a + b; \
+ if (sum < C) \
+ abort (); \
+ return sum; \
+}
+
+PLUSCCZEXT(a)
+PLUSCCZEXT(b)
+
+#define MINUSCCZEXT \
+unsigned long minuscczext (unsigned int a, unsigned int b) \
+{ \
+ unsigned int difference = a - b; \
+ if (difference > a) \
+ abort (); \
+ return difference; \
+}
+
+MINUSCCZEXT
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] PR target/30315 i386 missed optimization detecting overflows
2007-08-14 11:04 ` [PATCH v4] " Rask Ingemann Lambertsen
@ 2007-08-14 12:25 ` Uros Bizjak
2007-08-14 13:35 ` Paolo Bonzini
1 sibling, 0 replies; 12+ messages in thread
From: Uros Bizjak @ 2007-08-14 12:25 UTC (permalink / raw)
To: Rask Ingemann Lambertsen; +Cc: GCC Patches
Rask Ingemann Lambertsen wrote:
> Here is a revised patch. I made the above changes and also used combine's
> hook for canonicalizing comparison so we don't need two versions of each
> pattern. Additionally, I made the sub*cconly_overflow pattern clobberless.
> The testcase was adjusted accordingly. Bootstrapped and tested on
> x86_64-unknown-linux-gnu (all default languages, --enable-checking=yes,rtl)
> with no regressions. Ok for trunk?
>
> 2007-08-12 Rask Ingemann Lambertsen <rask@sygehus.dk>
>
> PR target/30315
> * config/i386/i386.h (CANONICALIZE_COMPARISON): New.
> * config/i386/i386.md (plusminus)(addsub)(SWI): New.
> (*<addsub><mode>3_cc_overflow): New.
> (*add<mode>3_cconly_overflow): New.
> (*sub<mode>3_cconly_overflow): New.
> (*<addsub>si3_zext_cc_overflow): New.
> * config/i386/predicates.md (fcmov_comparison_operator): Accept
> CCCmode for LTU, GTU, LEU and GEU.
> (ix86_comparison_operator): Likewise.
> (ix86_carry_flag_operator): Carry flag is set if LTU or GTU in CCCmode.
> * gcc/config/i386/i386.c (put_condition_code): Support CCCmode.
> (ix86_cc_mode): Use CCCmode when testing for overflow of PLUS
> or MINUS expressions.
>
> testsuite/
>
> 2007-08-12 Rask Ingemann Lambertsen <rask@sygehus.dk>
>
> PR target/30315
> * gcc.target/i386/pr30315.c: New.
>
This is OK for mainline.
Thanks,
Uros.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] PR target/30315 i386 missed optimization detecting overflows
2007-08-14 11:04 ` [PATCH v4] " Rask Ingemann Lambertsen
2007-08-14 12:25 ` Uros Bizjak
@ 2007-08-14 13:35 ` Paolo Bonzini
2007-08-14 16:55 ` Rask Ingemann Lambertsen
2007-08-29 17:52 ` Rask Ingemann Lambertsen
1 sibling, 2 replies; 12+ messages in thread
From: Paolo Bonzini @ 2007-08-14 13:35 UTC (permalink / raw)
To: Rask Ingemann Lambertsen; +Cc: Uros Bizjak, GCC Patches
> +/* Canonicalize overflow checks to save on the insn patterns. We change
> + "a + b < b" into "a + b < a" and "a + b >= b" into "a + b >= a". */
> +#define CANONICALIZE_COMPARISON(code, op0, op1) \
> +{ \
> + if ((code == LTU || code == GEU) \
> + && GET_CODE (op0) == PLUS \
> + && rtx_equal_p (op1, XEXP (op0, 1))) \
> + op1 = XEXP (op0, 0); \
> +}
Sorry for complicating your useful work.
It looks to me like this canonicalization belongs in simplify-rtx.c. Of
course, this can be done in a follow-up patch.
Paolo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] PR target/30315 i386 missed optimization detecting overflows
2007-08-14 13:35 ` Paolo Bonzini
@ 2007-08-14 16:55 ` Rask Ingemann Lambertsen
2007-08-29 17:52 ` Rask Ingemann Lambertsen
1 sibling, 0 replies; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-14 16:55 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Uros Bizjak, GCC Patches
On Tue, Aug 14, 2007 at 03:34:53PM +0200, Paolo Bonzini wrote:
>
> >+/* Canonicalize overflow checks to save on the insn patterns. We change
> >+ "a + b < b" into "a + b < a" and "a + b >= b" into "a + b >= a". */
> >+#define CANONICALIZE_COMPARISON(code, op0, op1) \
> >+{ \
> >+ if ((code == LTU || code == GEU) \
> >+ && GET_CODE (op0) == PLUS \
> >+ && rtx_equal_p (op1, XEXP (op0, 1))) \
> >+ op1 = XEXP (op0, 0); \
> >+}
>
> Sorry for complicating your useful work.
>
> It looks to me like this canonicalization belongs in simplify-rtx.c. Of
> course, this can be done in a follow-up patch.
I agree. We also want to do something at the tree level so
sum = a + b;
if (sum < a)
abort ();
if (sum < b)
abort ();
is optimized into
sum = a + b;
if (sum < a)
abort ();
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] PR target/30315 i386 missed optimization detecting overflows
2007-08-14 13:35 ` Paolo Bonzini
2007-08-14 16:55 ` Rask Ingemann Lambertsen
@ 2007-08-29 17:52 ` Rask Ingemann Lambertsen
2007-08-30 11:42 ` Rask Ingemann Lambertsen
1 sibling, 1 reply; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-29 17:52 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Uros Bizjak, GCC Patches
On Tue, Aug 14, 2007 at 03:34:53PM +0200, Paolo Bonzini wrote:
>
> It looks to me like this canonicalization belongs in simplify-rtx.c. Of
> course, this can be done in a follow-up patch.
So here is the follow-up patch. Bootstrapped and tested on
x86_64-unknown-linux-gnu with no new failures. I also checked that make dvi
generates no new warnings. Ok for trunk?
2007-08-28 Rask Ingemann Lambertsen <rask@sygehus.dk>
PR target/30315
* config/i386/i386.h (CANONICALIZE_COMPARISON): Delete.
* simplify-rtx.c (simplify_relational_operation_1): Add the
canonicalization from i386.h.
* doc/md.texi (Canonicalization of Instructions): Document it.
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h (revision 127838)
+++ gcc/config/i386/i386.h (working copy)
@@ -2068,16 +2068,6 @@ do { \
#define SELECT_CC_MODE(OP, X, Y) ix86_cc_mode ((OP), (X), (Y))
-/* Canonicalize overflow checks to save on the insn patterns. We change
- "a + b < b" into "a + b < a" and "a + b >= b" into "a + b >= a". */
-#define CANONICALIZE_COMPARISON(code, op0, op1) \
-{ \
- if ((code == LTU || code == GEU) \
- && GET_CODE (op0) == PLUS \
- && rtx_equal_p (op1, XEXP (op0, 1))) \
- op1 = XEXP (op0, 0); \
-}
-
/* Return nonzero if MODE implies a floating point inequality can be
reversed. */
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c (revision 127838)
+++ gcc/simplify-rtx.c (working copy)
@@ -3722,6 +3722,12 @@ simplify_relational_operation_1 (enum rt
}
}
+ /* Canonicalize (LTU/GEU (PLUS a b) b) as (LTU/GEU (PLUS a b) a). */
+ if ((code == LTU || code == GEU)
+ && GET_CODE (op0) == PLUS
+ && rtx_equal_p (op1, XEXP (op0, 1)))
+ return simplify_gen_relational (code, mode, cmp_mode, op0, XEXP (op0, 0));
+
if (op1 == const0_rtx)
{
/* Canonicalize (GTU x 0) as (NE x 0). */
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi (revision 127838)
+++ gcc/doc/md.texi (working copy)
@@ -1,5 +1,5 @@
@c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1996, 1998, 1999, 2000, 2001,
-@c 2002, 2003, 2004, 2005, 2006 Free Software Foundation, Inc.
+@c 2002, 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.
@@ -5446,6 +5446,11 @@ An operand of @code{neg}, @code{not}, @c
above.
@item
+@code{(ltu (plus @var{a} @var{b}) @var{b})} is converted to
+@code{(ltu (plus @var{a} @var{b}) @var{a})}. Likewise with @code{geu} instead
+of @code{ltu}.
+
+@item
@code{(minus @var{x} (const_int @var{n}))} is converted to
@code{(plus @var{x} (const_int @var{-n}))}.
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] PR target/30315 i386 missed optimization detecting overflows
2007-08-29 17:52 ` Rask Ingemann Lambertsen
@ 2007-08-30 11:42 ` Rask Ingemann Lambertsen
2007-08-30 21:21 ` Rask Ingemann Lambertsen
0 siblings, 1 reply; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-30 11:42 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Uros Bizjak, GCC Patches
On Wed, Aug 29, 2007 at 07:30:37PM +0200, Rask Ingemann Lambertsen wrote:
> Index: gcc/simplify-rtx.c
> ===================================================================
> --- gcc/simplify-rtx.c (revision 127838)
> +++ gcc/simplify-rtx.c (working copy)
> @@ -3722,6 +3722,12 @@ simplify_relational_operation_1 (enum rt
> }
> }
>
> + /* Canonicalize (LTU/GEU (PLUS a b) b) as (LTU/GEU (PLUS a b) a). */
> + if ((code == LTU || code == GEU)
> + && GET_CODE (op0) == PLUS
> + && rtx_equal_p (op1, XEXP (op0, 1)))
> + return simplify_gen_relational (code, mode, cmp_mode, op0, XEXP (op0, 0));
> +
I've changed my mind on this, we want (ltu (plus a b) b) instead. Why? A
can be complex expression. Consider the test case from
<URL:http://gcc.gnu.org/ml/gcc-patches/2007-08/msg02152.html>. For a
STORE_FLAG_VALUE == 1 target, the canonicalization I proposed gives
(parallel [
(set (reg:CCC 17 flags)
(compare:CCC (plus:SI (plus:SI (ltu:SI (reg:CCC 17 flags)
(const_int 0 [0x0]))
(reg/v:SI 60 [ sum.10 ]))
(reg/v:SI 64 [ daddr ]))
(plus:SI (ltu:SI (reg:CCC 17 flags)
(const_int 0 [0x0]))
(reg/v:SI 60 [ sum.10 ]))))
(set (reg/v:SI 59 [ sum.11 ])
(plus:SI (plus:SI (ltu:SI (reg:CCC 17 flags)
(const_int 0 [0x0]))
(reg/v:SI 60 [ sum.10 ]))
(reg/v:SI 64 [ daddr ])))
])
which is a mess. The pattern becomes simpler if it's written
(parallel [
(set (reg:CCC 17 flags)
(compare:CCC (plus:SI (plus:SI (ltu:SI (reg:CCC 17 flags)
(const_int 0 [0x0]))
(reg/v:SI 60 [ sum.10 ]))
(reg/v:SI 64 [ daddr ]))
(reg/v:SI 64 [ daddr ])))
(set (reg/v:SI 59 [ sum.11 ])
(plus:SI (plus:SI (ltu:SI (reg:CCC 17 flags)
(const_int 0 [0x0]))
(reg/v:SI 60 [ sum.10 ]))
(reg/v:SI 64 [ daddr ])))
])
or, for a STORE_FLAG_VALUE == -1 target,
(parallel [
(set (reg:CCC 17 flags)
(compare:CCC (plus:SI (minus:SI (reg/v:SI 60 [ sum.10 ])
(ltu:SI (reg:CCC 17 flags)
(const_int 0 [0x0])))
(reg/v:SI 64 [ daddr ]))
(reg/v:SI 64 [ daddr ])))
(set (reg/v:SI 59 [ sum.11 ])
(plus:SI (minus:SI (reg/v:SI 60 [ sum.10 ])
(ltu:SI (reg:CCC 17 flags)
(const_int 0 [0x0])))
(reg/v:SI 64 [ daddr ])))
])
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] PR target/30315 i386 missed optimization detecting overflows
2007-08-30 11:42 ` Rask Ingemann Lambertsen
@ 2007-08-30 21:21 ` Rask Ingemann Lambertsen
0 siblings, 0 replies; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-30 21:21 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Uros Bizjak, GCC Patches
On Thu, Aug 30, 2007 at 12:47:22PM +0200, Rask Ingemann Lambertsen wrote:
> On Wed, Aug 29, 2007 at 07:30:37PM +0200, Rask Ingemann Lambertsen wrote:
> > Index: gcc/simplify-rtx.c
> > ===================================================================
> > --- gcc/simplify-rtx.c (revision 127838)
> > +++ gcc/simplify-rtx.c (working copy)
> > @@ -3722,6 +3722,12 @@ simplify_relational_operation_1 (enum rt
> > }
> > }
> >
> > + /* Canonicalize (LTU/GEU (PLUS a b) b) as (LTU/GEU (PLUS a b) a). */
> > + if ((code == LTU || code == GEU)
> > + && GET_CODE (op0) == PLUS
> > + && rtx_equal_p (op1, XEXP (op0, 1)))
> > + return simplify_gen_relational (code, mode, cmp_mode, op0, XEXP (op0, 0));
> > +
>
> I've changed my mind on this, we want (ltu (plus a b) b) instead. Why? A
> can be complex expression.
And I've changed my mind back, since the original canonicalization allows
macroization of insn patterns because PLUS and MINUS then have the
comparison operands in the same order: (compare (plus/minus a b) a).
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2] PR target/30315 i386 missed optimization detecting overflows
2007-08-03 19:35 Rask Ingemann Lambertsen
@ 2007-08-04 13:07 ` Rask Ingemann Lambertsen
0 siblings, 0 replies; 12+ messages in thread
From: Rask Ingemann Lambertsen @ 2007-08-04 13:07 UTC (permalink / raw)
To: gcc-patches
Here is a revised patch. Compared with the previous one, I
1) Fixed the ix86_carry_flag_operator predicate to not reject GTU.
2) Added HImode and QImode versions using a mode macro.
3) Changed the testcase to use macros and test "unsigned short" and
"unsigned char" also. Additionally, it now tests that constructs like
if (sum < a)
c ++;
and
if (difference > b)
c --;
are optimized too. This revealed the bug mentioned in 1).
Multiword modes such as uint128_t (or uint64_t with -m32) are still not
optimized. Additional insn patterns and splitters are needed for that.
Bootstrapped and tested on x86_64-unknown-linux-gnu with no regressions.
Ok for trunk?
:ADDPATCH target i386:
2007-08-04 Rask Ingemann Lambertsen <rask@sygehus.dk>
PR target/30315
* config/i386/i386.md (*adddi3_cc_overflow1_rex64): New.
(*adddi3_cc_overflow2_rex64): New.
(*addsi3_cc_overflow1): New.
(*addsi3_cc_overflow2): New.
(*addhi3_cc_overflow1): New.
(*addhi3_cc_overflow2): New.
(*addqi3_cc_overflow1): New.
(*addqi3_cc_overflow2): New.
(*subdi3_cc_overflow1_rex64): New.
(*subdi3_cc_overflow2_rex64): New.
(*subsi3_cc_overflow1): New.
(*subsi3_cc_overflow2): New.
(*subhi3_cc_overflow1): New.
(*subhi3_cc_overflow2): New.
(*subqi3_cc_overflow1): New.
(*subqi3_cc_overflow2): New.
* config/i386/predicates.md (fcmov_comparison_operator): Accept
CCCmode for LTU, GTU, LEU and GEU.
(ix86_comparison_operator): Likewise.
(ix86_carry_flag_operator): Carry set if LTU or GTU in CCCmode.
* gcc/config/i386/i386.c (put_condition_code): Support CCCmode.
(ix86_cc_mode): Use CCCmode when testing for overflow of PLUS
or MINUS expressions.
testsuite/
2007-08-04 Rask Ingemann Lambertsen <rask@sygehus.dk>
PR target/30315
* gcc/testsuite/gcc.target/i386/pr30315.c: New.
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md (revision 127179)
+++ gcc/config/i386/i386.md (working copy)
@@ -488,6 +488,15 @@
[(set_attr "length" "128")
(set_attr "type" "multi")])
+;; All integer modes of at most 32 bits.
+(define_mode_macro LE32I [QI HI SI])
+
+;; Instruction suffix for integer modes.
+(define_mode_attr imodesuffix [(QI "b") (HI "w") (SI "l")])
+
+;; Register class for integer modes.
+(define_mode_attr r [(QI "q") (HI "r") (SI "r")])
+
;; All x87 floating point modes
(define_mode_macro X87MODEF [SF DF XF])
@@ -4855,6 +4864,32 @@
[(set_attr "type" "alu")
(set_attr "mode" "DI")])
+(define_insn "*adddi3_cc_overflow1_rex64"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plus:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0")
+ (match_operand:DI 2 "x86_64_general_operand" "re,rm"))
+ (match_dup 1)))
+ (set (match_operand:DI 0 "nonimmediate_operand" "=rm,r")
+ (plus:DI (match_dup 1) (match_dup 2)))]
+ "TARGET_64BIT && ix86_binary_operator_ok (PLUS, DImode, operands)"
+ "add{q}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "DI")])
+
+(define_insn "*adddi3_cc_overflow2_rex64"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plus:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0")
+ (match_operand:DI 2 "x86_64_general_operand" "re,rm"))
+ (match_dup 2)))
+ (set (match_operand:DI 0 "nonimmediate_operand" "=rm,r")
+ (plus:DI (match_dup 1) (match_dup 2)))]
+ "TARGET_64BIT && ix86_binary_operator_ok (PLUS, DImode, operands)"
+ "add{q}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "DI")])
+
(define_insn "addqi3_carry"
[(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
(plus:QI (plus:QI (match_operand:QI 3 "ix86_carry_flag_operator" "")
@@ -4916,6 +4951,32 @@
[(set_attr "type" "alu")
(set_attr "mode" "SI")])
+(define_insn "*add<mode>3_cc_overflow1"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plus:LE32I (match_operand:LE32I 1 "nonimmediate_operand" "%0,0")
+ (match_operand:LE32I 2 "general_operand" "<r>i,<r>m"))
+ (match_dup 1)))
+ (set (match_operand:LE32I 0 "nonimmediate_operand" "=<r>m,<r>")
+ (plus:LE32I (match_dup 1) (match_dup 2)))]
+ "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
+ "add{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*add<mode>3_cc_overflow2"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plus:LE32I (match_operand:LE32I 1 "nonimmediate_operand" "%0,0")
+ (match_operand:LE32I 2 "general_operand" "<r>i,<r>m"))
+ (match_dup 2)))
+ (set (match_operand:LE32I 0 "nonimmediate_operand" "=<r>m,<r>")
+ (plus:LE32I (match_dup 1) (match_dup 2)))]
+ "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
+ "add{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
(define_insn "addqi3_cc"
[(set (reg:CC FLAGS_REG)
(unspec:CC [(match_operand:QI 1 "nonimmediate_operand" "%0,0")
@@ -6608,6 +6669,32 @@
[(set_attr "type" "alu")
(set_attr "mode" "DI")])
+(define_insn "*subdi3_cc_overflow1_rex64"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (minus:DI (match_operand:DI 1 "nonimmediate_operand" "0,0")
+ (match_operand:DI 2 "x86_64_general_operand" "re,rm"))
+ (match_dup 1)))
+ (set (match_operand:DI 0 "nonimmediate_operand" "=rm,r")
+ (minus:DI (match_dup 1) (match_dup 2)))]
+ "TARGET_64BIT && ix86_binary_operator_ok (MINUS, DImode, operands)"
+ "sub{q}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "DI")])
+
+(define_insn "*subdi3_cc_overflow2_rex64"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (minus:DI (match_operand:DI 1 "nonimmediate_operand" "0,0")
+ (match_operand:DI 2 "x86_64_general_operand" "re,rm"))
+ (match_dup 2)))
+ (set (match_operand:DI 0 "nonimmediate_operand" "=rm,r")
+ (minus:DI (match_dup 1) (match_dup 2)))]
+ "TARGET_64BIT && ix86_binary_operator_ok (MINUS, DImode, operands)"
+ "sub{q}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "DI")])
+
(define_insn "subqi3_carry"
[(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
(minus:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")
@@ -6700,6 +6787,32 @@
[(set_attr "type" "alu")
(set_attr "mode" "SI")])
+(define_insn "*sub<mode>3_cc_overflow1"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (minus:LE32I (match_operand:LE32I 1 "nonimmediate_operand" "0,0")
+ (match_operand:LE32I 2 "general_operand" "<r>i,<r>m"))
+ (match_dup 1)))
+ (set (match_operand:LE32I 0 "nonimmediate_operand" "=<r>m,<r>")
+ (minus:LE32I (match_dup 1) (match_dup 2)))]
+ "ix86_binary_operator_ok (MINUS, <MODE>mode, operands)"
+ "sub{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*sub<mode>3_cc_overflow2"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (minus:LE32I (match_operand:LE32I 1 "nonimmediate_operand" "0,0")
+ (match_operand:LE32I 2 "general_operand" "<r>i,<r>m"))
+ (match_dup 2)))
+ (set (match_operand:LE32I 0 "nonimmediate_operand" "=<r>m,<r>")
+ (minus:LE32I (match_dup 1) (match_dup 2)))]
+ "ix86_binary_operator_ok (MINUS, <MODE>mode, operands)"
+ "sub{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
(define_insn "*subsi_2_zext"
[(set (reg FLAGS_REG)
(compare
Index: gcc/config/i386/predicates.md
===================================================================
--- gcc/config/i386/predicates.md (revision 127179)
+++ gcc/config/i386/predicates.md (working copy)
@@ -879,7 +879,8 @@
switch (code)
{
case LTU: case GTU: case LEU: case GEU:
- if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode)
+ if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode
+ || inmode == CCCmode)
return 1;
return 0;
case ORDERED: case UNORDERED:
@@ -924,7 +925,11 @@
|| inmode == CCGOCmode || inmode == CCNOmode)
return 1;
return 0;
- case LTU: case GTU: case LEU: case ORDERED: case UNORDERED: case GEU:
+ case LTU: case GTU: case LEU: case GEU:
+ if (inmode == CCmode || inmode == CCCmode)
+ return 1;
+ return 0;
+ case ORDERED: case UNORDERED:
if (inmode == CCmode)
return 1;
return 0;
@@ -939,7 +944,7 @@
;; Return 1 if OP is a valid comparison operator testing carry flag to be set.
(define_predicate "ix86_carry_flag_operator"
- (match_code "ltu,lt,unlt,gt,ungt,le,unle,ge,unge,ltgt,uneq")
+ (match_code "ltu,lt,unlt,gtu,gt,ungt,le,unle,ge,unge,ltgt,uneq")
{
enum machine_mode inmode = GET_MODE (XEXP (op, 0));
enum rtx_code code = GET_CODE (op);
@@ -957,6 +962,8 @@
return 0;
code = ix86_fp_compare_code_to_integer (code);
}
+ else if (inmode == CCCmode)
+ return code == LTU || code == GTU;
else if (inmode != CCmode)
return 0;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 127179)
+++ gcc/config/i386/i386.c (working copy)
@@ -8254,8 +8254,12 @@ put_condition_code (enum rtx_code code,
case GTU:
/* ??? Use "nbe" instead of "a" for fcmov lossage on some assemblers.
Those same assemblers have the same but opposite lossage on cmov. */
- gcc_assert (mode == CCmode);
- suffix = fp ? "nbe" : "a";
+ if (mode == CCmode)
+ suffix = fp ? "nbe" : "a";
+ else if (mode == CCCmode)
+ suffix = "b";
+ else
+ gcc_unreachable ();
break;
case LT:
switch (mode)
@@ -8275,7 +8279,7 @@ put_condition_code (enum rtx_code code,
}
break;
case LTU:
- gcc_assert (mode == CCmode);
+ gcc_assert (mode == CCmode || mode == CCCmode);
suffix = "b";
break;
case GE:
@@ -8297,7 +8301,7 @@ put_condition_code (enum rtx_code code,
break;
case GEU:
/* ??? As above. */
- gcc_assert (mode == CCmode);
+ gcc_assert (mode == CCmode || mode == CCCmode);
suffix = fp ? "nb" : "ae";
break;
case LE:
@@ -8305,8 +8309,13 @@ put_condition_code (enum rtx_code code,
suffix = "le";
break;
case LEU:
- gcc_assert (mode == CCmode);
- suffix = "be";
+ /* ??? As above. */
+ if (mode == CCmode)
+ suffix = "be";
+ else if (mode == CCCmode)
+ suffix = fp ? "nb" : "ae";
+ else
+ gcc_unreachable ();
break;
case UNORDERED:
suffix = fp ? "u" : "p";
@@ -11033,10 +11042,23 @@ ix86_cc_mode (enum rtx_code code, rtx op
return CCZmode;
/* Codes needing carry flag. */
case GEU: /* CF=0 */
- case GTU: /* CF=0 & ZF=0 */
case LTU: /* CF=1 */
+ /* Detect overflow checks. They need just the carry flag. */
+ if (GET_CODE (op0) == PLUS
+ && (rtx_equal_p (op1, XEXP (op0, 0))
+ || rtx_equal_p (op1, XEXP (op0, 1))))
+ return CCCmode;
+ else
+ return CCmode;
+ case GTU: /* CF=0 & ZF=0 */
case LEU: /* CF=1 | ZF=1 */
- return CCmode;
+ /* Detect overflow checks. They need just the carry flag. */
+ if (GET_CODE (op0) == MINUS
+ && (rtx_equal_p (op1, XEXP (op0, 0))
+ || rtx_equal_p (op1, XEXP (op0, 1))))
+ return CCCmode;
+ else
+ return CCmode;
/* Codes possibly doable only with sign flag when
comparing against zero. */
case GE: /* SF=OF or SF=0 */
Index: gcc/testsuite/gcc.target/i386/pr30315.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr30315.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr30315.c (revision 0)
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "cmp" } } */
+
+extern void abort (void);
+long int c;
+
+#define FOO1(T, t, C) \
+T foo##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ abort (); \
+ return sum; \
+}
+#define FOO(T, t) FOO1(T, t, a) FOO1(T, t, b)
+
+#define QWE1(T, t, C) \
+T qwe##t##C (T a, T b) \
+{ \
+ T sum = a + b; \
+ if (sum < C) \
+ c ++; \
+ return sum; \
+}
+#define QWE(T, t) QWE1(T, t, a) QWE1(T, t, b)
+
+#define BAR1(T, t, C) \
+T bar##t##C (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > C) \
+ abort (); \
+ return difference; \
+}
+#define BAR(T, t) BAR1(T, t, a) BAR1(T, t, b)
+
+#define BAZ1(T, t, C) \
+T baz##t##C (T a, T b) \
+{ \
+ T difference = a - b; \
+ if (difference > C) \
+ c --; \
+ return difference; \
+}
+#define BAZ(T, t) BAZ1(T, t, a) BAZ1(T, t, b)
+
+#define TEST(T, t) FOO(T, t) BAR(T, t) BAZ(T, t) QWE(T, t)
+
+TEST (unsigned long, l)
+TEST (unsigned int, i)
+TEST (unsigned short, s)
+TEST (unsigned char, c)
--
Rask Ingemann Lambertsen
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2007-08-30 19:20 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-07 15:16 [PATCH v2] PR target/30315 i386 missed optimization detecting overflows Uros Bizjak
2007-08-07 18:03 ` Rask Ingemann Lambertsen
2007-08-09 23:11 ` Rask Ingemann Lambertsen
2007-08-10 22:22 ` Uros Bizjak
2007-08-14 11:04 ` [PATCH v4] " Rask Ingemann Lambertsen
2007-08-14 12:25 ` Uros Bizjak
2007-08-14 13:35 ` Paolo Bonzini
2007-08-14 16:55 ` Rask Ingemann Lambertsen
2007-08-29 17:52 ` Rask Ingemann Lambertsen
2007-08-30 11:42 ` Rask Ingemann Lambertsen
2007-08-30 21:21 ` Rask Ingemann Lambertsen
-- strict thread matches above, loose matches on Subject: below --
2007-08-03 19:35 Rask Ingemann Lambertsen
2007-08-04 13:07 ` [PATCH v2] " Rask Ingemann Lambertsen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).