* [PATCH (1/7)] New optab framework for widening multiplies
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
@ 2011-06-23 14:39 ` Andrew Stubbs
2011-07-09 15:38 ` Andrew Stubbs
2011-06-23 14:41 ` [PATCH (2/7)] Widening multiplies by more than one mode Andrew Stubbs
From: Andrew Stubbs @ 2011-06-23 14:39 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 2996 bytes --]
This patch should have no effect on the compiler output. It merely
replaces one way to represent widening operations with another, and
refactors the other parts of the compiler to match. The rest of the
patch set uses this new framework to implement the optimization
improvements.
I considered and discarded many approaches to this patch before arriving
at this solution, and I feel sure that there'll be somebody out there
who will think I chose the wrong one, so let me first explain how I got
here.
The aim is to be able to encode and query optabs that have any given
input mode, and any given output mode. This is similar to the
convert_optab, but not compatible with that optab since it is handled
completely differently in the code.
(Just to be clear, the existing widening multiply support only covers
instructions that widen by *one* mode, so it's only ever been necessary
to know the output mode, up to now.)
Option 1 was to add a second dimension to the handlers table in optab_d,
but I discarded this option because it would increase the memory usage
by the square of the number of modes, which is a bit much.
Option 2 was to add a whole new optab, similar to optab_d, but with a
second dimension like convert_optab_d. However, this turned out to cause
far too many pointer type mismatches in the code, and would have been
very difficult to fix up.
Option 3 was to add new optab entries for widening by two modes, by
three modes, and so on. True, I would only need to add one extra set for
what I need, but there are so many places in the code that compare
against smul_widen_optab, for example, and would need to be taught about
the new entries, that it seemed like a bad idea.
Option 4 was to have a separate table that contained the widening
operations, and refer to that whenever a widening entry in the main
optab is referenced, but I found that there was no easy way to do the
mapping without putting some sort of switch table in
widening_optab_handler, and that negates the other advantages.
So, what I've done in the end is add a new pointer entry "widening" into
optab_d, and dynamically build the widening operations table for each
optab that needs it. I've then added new accessor functions that take
both input and output modes, and altered the code to use them where
appropriate.
The downside of this approach is that the optab entries for widening
operations now have two "handlers" tables, one of which is redundant.
That said, those cases are in the minority, and it is the smaller table
that is unused.
If people find that very distasteful, it might be possible to remove the
*_widen_optab entries and unify smul_optab with smul_widen_optab, and so
on, and save space that way. I've not done so yet, but I expect I could
if people feel strongly about it.
As a side-effect, it's now possible for any optab to be "widening",
should some target happen to have a widening add, shift, or whatever.
Is this patch OK?
Andrew
[-- Attachment #2: widening-multiplies-1.patch --]
[-- Type: text/x-patch, Size: 14510 bytes --]
2011-06-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* expr.c (expand_expr_real_2): Use widening_optab_handler.
* genopinit.c (optabs): Use set_widening_optab_handler for $N.
(gen_insn): $N now means $b must be wider than $a, not consecutive.
* optabs.c (expand_widen_pattern_expr): Use widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (widening_optab_handlers): New struct.
(optab_d): New member, 'widening'.
(widening_optab_handler): New function.
(set_widening_optab_handler): New function.
* tree-ssa-math-opts.c (convert_mult_to_widen): Use
widening_optab_handler.
(convert_plusminus_to_widen): Likewise.
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7634,7 +7634,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
this_optab = usmul_widen_optab;
if (mode == GET_MODE_2XWIDER_MODE (innermode))
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -7661,7 +7662,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
if (mode == GET_MODE_2XWIDER_MODE (innermode)
&& TREE_CODE (treeop0) != INTEGER_CST)
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
EXPAND_NORMAL);
@@ -7669,7 +7671,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (optab_handler (other_optab, mode) != CODE_FOR_nothing
+ if (widening_optab_handler (other_optab, mode, innermode)
+ != CODE_FOR_nothing
&& innermode == word_mode)
{
rtx htem, hipart;
--- a/gcc/genopinit.c
+++ b/gcc/genopinit.c
@@ -46,10 +46,12 @@ along with GCC; see the file COPYING3. If not see
used. $A and $B are replaced with the full name of the mode; $a and $b
are replaced with the short form of the name, as above.
- If $N is present in the pattern, it means the two modes must be consecutive
- widths in the same mode class (e.g, QImode and HImode). $I means that
- only full integer modes should be considered for the next mode, and $F
- means that only float modes should be considered.
+ If $N is present in the pattern, it means the two modes must be in
+ the same mode class, and $b must be greater than $a (e.g, QImode
+ and HImode).
+
+ $I means that only full integer modes should be considered for the
+ next mode, and $F means that only float modes should be considered.
$P means that both full and partial integer modes should be considered.
$Q means that only fixed-point modes should be considered.
@@ -99,17 +101,17 @@ static const char * const optabs[] =
"set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))",
"set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))",
"set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))",
- "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)",
- "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)",
- "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)",
- "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)",
- "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)",
- "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)",
- "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)",
- "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)",
- "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)",
- "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)",
- "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)",
+ "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)",
+ "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)",
+ "set_widening_optab_handler (usmul_widen_optab, $B, $A, CODE_FOR_$(usmul$a$b3$)$N)",
+ "set_widening_optab_handler (smadd_widen_optab, $B, $A, CODE_FOR_$(madd$a$b4$)$N)",
+ "set_widening_optab_handler (umadd_widen_optab, $B, $A, CODE_FOR_$(umadd$a$b4$)$N)",
+ "set_widening_optab_handler (ssmadd_widen_optab, $B, $A, CODE_FOR_$(ssmadd$a$b4$)$N)",
+ "set_widening_optab_handler (usmadd_widen_optab, $B, $A, CODE_FOR_$(usmadd$a$b4$)$N)",
+ "set_widening_optab_handler (smsub_widen_optab, $B, $A, CODE_FOR_$(msub$a$b4$)$N)",
+ "set_widening_optab_handler (umsub_widen_optab, $B, $A, CODE_FOR_$(umsub$a$b4$)$N)",
+ "set_widening_optab_handler (ssmsub_widen_optab, $B, $A, CODE_FOR_$(ssmsub$a$b4$)$N)",
+ "set_widening_optab_handler (usmsub_widen_optab, $B, $A, CODE_FOR_$(usmsub$a$b4$)$N)",
"set_optab_handler (sdiv_optab, $A, CODE_FOR_$(div$a3$))",
"set_optab_handler (ssdiv_optab, $A, CODE_FOR_$(ssdiv$Q$a3$))",
"set_optab_handler (sdivv_optab, $A, CODE_FOR_$(div$V$I$a3$))",
@@ -305,7 +307,7 @@ gen_insn (rtx insn)
{
int force_float = 0, force_int = 0, force_partial_int = 0;
int force_fixed = 0;
- int force_consec = 0;
+ int force_wider = 0;
int matches = 1;
for (pp = optabs[pindex]; pp[0] != '$' || pp[1] != '('; pp++)
@@ -323,7 +325,7 @@ gen_insn (rtx insn)
switch (*++pp)
{
case 'N':
- force_consec = 1;
+ force_wider = 1;
break;
case 'I':
force_int = 1;
@@ -392,7 +394,10 @@ gen_insn (rtx insn)
|| mode_class[i] == MODE_VECTOR_FRACT
|| mode_class[i] == MODE_VECTOR_UFRACT
|| mode_class[i] == MODE_VECTOR_ACCUM
- || mode_class[i] == MODE_VECTOR_UACCUM))
+ || mode_class[i] == MODE_VECTOR_UACCUM)
+ && (! force_wider
+ || *pp == 'a'
+ || m1 < i))
break;
}
@@ -412,8 +417,7 @@ gen_insn (rtx insn)
}
if (matches && pp[0] == '$' && pp[1] == ')'
- && *np == 0
- && (! force_consec || (int) GET_MODE_WIDER_MODE(m1) == m2))
+ && *np == 0)
break;
}
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -515,8 +515,8 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)));
+ icode = widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1242,7 +1242,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx target, int unsignedp, enum optab_methods methods,
rtx last)
{
- enum insn_code icode = optab_handler (binoptab, mode);
+ enum machine_mode from_mode = GET_MODE (op0);
+ enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1389,7 +1390,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && optab_handler (binoptab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (binoptab, mode, GET_MODE (op0))
+ != CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
unsignedp, methods, last);
@@ -1429,8 +1431,9 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab),
- GET_MODE_WIDER_MODE (mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (mode), mode)
!= CODE_FOR_nothing))
{
temp = expand_binop (GET_MODE_WIDER_MODE (mode),
@@ -1458,12 +1461,14 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+ if (widening_optab_handler (binoptab, wider_mode, mode)
+ != CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -1896,8 +1901,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
&& optab_handler (add_optab, word_mode) != CODE_FOR_nothing)
{
rtx product = NULL_RTX;
-
- if (optab_handler (umul_widen_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (umul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
true, methods);
@@ -1906,7 +1911,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
}
if (product == NULL_RTX
- && optab_handler (smul_widen_optab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (smul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
false, methods);
@@ -1997,7 +2003,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+ if (widening_optab_handler (binoptab, wider_mode, mode)
+ != CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
{
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -42,6 +42,11 @@ struct optab_handlers
int insn_code;
};
+struct widening_optab_handlers
+{
+ struct optab_handlers handlers[NUM_MACHINE_MODES][NUM_MACHINE_MODES];
+};
+
struct optab_d
{
enum rtx_code code;
@@ -50,6 +55,7 @@ struct optab_d
void (*libcall_gen)(struct optab_d *, const char *name, char suffix,
enum machine_mode);
struct optab_handlers handlers[NUM_MACHINE_MODES];
+ struct widening_optab_handlers *widening;
};
typedef struct optab_d * optab;
@@ -876,6 +882,23 @@ optab_handler (optab op, enum machine_mode mode)
+ (int) CODE_FOR_nothing);
}
+/* Like optab_handler, but for widening_operations that have a TO_MODE and
+ a FROM_MODE. */
+
+static inline enum insn_code
+widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode)
+{
+ if (to_mode == from_mode)
+ return optab_handler (op, to_mode);
+
+ if (op->widening)
+ return (enum insn_code) (op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ + (int) CODE_FOR_nothing);
+
+ return CODE_FOR_nothing;
+}
+
/* Record that insn CODE should be used to implement mode MODE of OP. */
static inline void
@@ -884,6 +907,26 @@ set_optab_handler (optab op, enum machine_mode mode, enum insn_code code)
op->handlers[(int) mode].insn_code = (int) code - (int) CODE_FOR_nothing;
}
+/* Like set_optab_handler, but for widening operations that have a TO_MODE
+ and a FROM_MODE. */
+
+static inline void
+set_widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode, enum insn_code code)
+{
+ if (to_mode == from_mode)
+ set_optab_handler (op, to_mode, code);
+ else
+ {
+ if (op->widening == NULL)
+ op->widening = (struct widening_optab_handlers *)
+ xcalloc (1, sizeof (struct widening_optab_handlers));
+
+ op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ = (int) code - (int) CODE_FOR_nothing;
+ }
+}
+
/* Return the insn used to perform conversion OP from mode FROM_MODE
to mode TO_MODE; return CODE_FOR_nothing if the target does not have
such an insn. */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2047,6 +2047,8 @@ convert_mult_to_widen (gimple stmt)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
+ enum machine_mode to_mode, from_mode;
+ optab op;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2056,12 +2058,17 @@ convert_mult_to_widen (gimple stmt)
if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
return false;
+ to_mode = TYPE_MODE (type);
+ from_mode = TYPE_MODE (type1);
+
if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
- handler = optab_handler (umul_widen_optab, TYPE_MODE (type));
+ op = umul_widen_optab;
else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
- handler = optab_handler (smul_widen_optab, TYPE_MODE (type));
+ op = smul_widen_optab;
else
- handler = optab_handler (usmul_widen_optab, TYPE_MODE (type));
+ op = usmul_widen_optab;
+
+ handler = widening_optab_handler (op, to_mode, from_mode);
if (handler == CODE_FOR_nothing)
return false;
@@ -2090,6 +2097,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
optab this_optab;
enum tree_code wmult_code;
+ enum insn_code handler;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2163,7 +2171,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (optab_handler (this_optab, TYPE_MODE (type)) == CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
+ == CODE_FOR_nothing)
return false;
/* ??? May need some type verification here? */
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-06-23 14:39 ` [PATCH (1/7)] New optab framework for widening multiplies Andrew Stubbs
@ 2011-07-09 15:38 ` Andrew Stubbs
2011-07-14 15:29 ` Andrew Stubbs
2011-07-22 13:01 ` Bernd Schmidt
From: Andrew Stubbs @ 2011-07-09 15:38 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 3533 bytes --]
On 23/06/11 15:37, Andrew Stubbs wrote:
> This patch should have no effect on the compiler output. It merely
> replaces one way to represent widening operations with another, and
> refactors the other parts of the compiler to match. The rest of the
> patch set uses this new framework to implement the optimization
> improvements.
>
> I considered and discarded many approaches to this patch before arriving
> at this solution, and I feel sure that there'll be somebody out there
> who will think I chose the wrong one, so let me first explain how I got
> here.
>
> The aim is to be able to encode and query optabs that have any given
> input mode, and any given output mode. This is similar to the
> convert_optab, but not compatible with that optab since it is handled
> completely differently in the code.
>
> (Just to be clear, the existing widening multiply support only covers
> instructions that widen by *one* mode, so it's only ever been necessary
> to know the output mode, up to now.)
>
> Option 1 was to add a second dimension to the handlers table in optab_d,
> but I discarded this option because it would increase the memory usage
> by the square of the number of modes, which is a bit much.
>
> Option 2 was to add a whole new optab, similar to optab_d, but with a
> second dimension like convert_optab_d. However, this turned out to cause
> far too many pointer type mismatches in the code, and would have been
> very difficult to fix up.
>
> Option 3 was to add new optab entries for widening by two modes, by
> three modes, and so on. True, I would only need to add one extra set for
> what I need, but there are so many places in the code that compare
> against smul_widen_optab, for example, and would need to be taught about
> the new entries, that it seemed like a bad idea.
>
> Option 4 was to have a separate table that contained the widening
> operations, and refer to that whenever a widening entry in the main
> optab is referenced, but I found that there was no easy way to do the
> mapping without putting some sort of switch table in
> widening_optab_handler, and that negates the other advantages.
>
> So, what I've done in the end is add a new pointer entry "widening" into
> optab_d, and dynamically build the widening operations table for each
> optab that needs it. I've then added new accessor functions that take
> both input and output modes, and altered the code to use them where
> appropriate.
>
> The downside of this approach is that the optab entries for widening
> operations now have two "handlers" tables, one of which is redundant.
> That said, those cases are in the minority, and it is the smaller table
> that is unused.
>
> If people find that very distasteful, it might be possible to remove the
> *_widen_optab entries and unify smul_optab with smul_widen_optab, and so
> on, and save space that way. I've not done so yet, but I expect I could
> if people feel strongly about it.
>
> As a side-effect, it's now possible for any optab to be "widening",
> should some target happen to have a widening add, shift, or whatever.
>
> Is this patch OK?
This update has been rebaselined to fix some conflicts with other recent
commits in this area.
I also identified a small bug that resulted in the operands to some
commutative operations being reversed. I don't believe the bug did any
harm, logically speaking, but I suppose there could be a testcase that
resulted in worse code being generated. With this fix, I now see exactly
matching output in all my testcases.
Andrew
[-- Attachment #2: widening-multiplies-1.patch --]
[-- Type: text/x-patch, Size: 14205 bytes --]
2011-07-09 Andrew Stubbs <ams@codesourcery.com>
gcc/
* expr.c (expand_expr_real_2): Use widening_optab_handler.
* genopinit.c (optabs): Use set_widening_optab_handler for $N.
(gen_insn): $N now means $b must be wider than $a, not consecutive.
* optabs.c (expand_widen_pattern_expr): Use widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (widening_optab_handlers): New struct.
(optab_d): New member, 'widening'.
(widening_optab_handler): New function.
(set_widening_optab_handler): New function.
* tree-ssa-math-opts.c (convert_mult_to_widen): Use
widening_optab_handler.
(convert_plusminus_to_widen): Likewise.
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7640,7 +7640,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
this_optab = usmul_widen_optab;
if (mode == GET_MODE_2XWIDER_MODE (innermode))
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -7667,7 +7668,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
if (mode == GET_MODE_2XWIDER_MODE (innermode)
&& TREE_CODE (treeop0) != INTEGER_CST)
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
EXPAND_NORMAL);
@@ -7675,7 +7677,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (optab_handler (other_optab, mode) != CODE_FOR_nothing
+ if (widening_optab_handler (other_optab, mode, innermode)
+ != CODE_FOR_nothing
&& innermode == word_mode)
{
rtx htem, hipart;
--- a/gcc/genopinit.c
+++ b/gcc/genopinit.c
@@ -46,10 +46,12 @@ along with GCC; see the file COPYING3. If not see
used. $A and $B are replaced with the full name of the mode; $a and $b
are replaced with the short form of the name, as above.
- If $N is present in the pattern, it means the two modes must be consecutive
- widths in the same mode class (e.g, QImode and HImode). $I means that
- only full integer modes should be considered for the next mode, and $F
- means that only float modes should be considered.
+ If $N is present in the pattern, it means the two modes must be in
+ the same mode class, and $b must be greater than $a (e.g, QImode
+ and HImode).
+
+ $I means that only full integer modes should be considered for the
+ next mode, and $F means that only float modes should be considered.
$P means that both full and partial integer modes should be considered.
$Q means that only fixed-point modes should be considered.
@@ -99,17 +101,17 @@ static const char * const optabs[] =
"set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))",
"set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))",
"set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))",
- "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)",
- "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)",
- "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)",
- "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)",
- "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)",
- "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)",
- "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)",
- "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)",
- "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)",
- "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)",
- "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)",
+ "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)",
+ "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)",
+ "set_widening_optab_handler (usmul_widen_optab, $B, $A, CODE_FOR_$(usmul$a$b3$)$N)",
+ "set_widening_optab_handler (smadd_widen_optab, $B, $A, CODE_FOR_$(madd$a$b4$)$N)",
+ "set_widening_optab_handler (umadd_widen_optab, $B, $A, CODE_FOR_$(umadd$a$b4$)$N)",
+ "set_widening_optab_handler (ssmadd_widen_optab, $B, $A, CODE_FOR_$(ssmadd$a$b4$)$N)",
+ "set_widening_optab_handler (usmadd_widen_optab, $B, $A, CODE_FOR_$(usmadd$a$b4$)$N)",
+ "set_widening_optab_handler (smsub_widen_optab, $B, $A, CODE_FOR_$(msub$a$b4$)$N)",
+ "set_widening_optab_handler (umsub_widen_optab, $B, $A, CODE_FOR_$(umsub$a$b4$)$N)",
+ "set_widening_optab_handler (ssmsub_widen_optab, $B, $A, CODE_FOR_$(ssmsub$a$b4$)$N)",
+ "set_widening_optab_handler (usmsub_widen_optab, $B, $A, CODE_FOR_$(usmsub$a$b4$)$N)",
"set_optab_handler (sdiv_optab, $A, CODE_FOR_$(div$a3$))",
"set_optab_handler (ssdiv_optab, $A, CODE_FOR_$(ssdiv$Q$a3$))",
"set_optab_handler (sdivv_optab, $A, CODE_FOR_$(div$V$I$a3$))",
@@ -305,7 +307,7 @@ gen_insn (rtx insn)
{
int force_float = 0, force_int = 0, force_partial_int = 0;
int force_fixed = 0;
- int force_consec = 0;
+ int force_wider = 0;
int matches = 1;
for (pp = optabs[pindex]; pp[0] != '$' || pp[1] != '('; pp++)
@@ -323,7 +325,7 @@ gen_insn (rtx insn)
switch (*++pp)
{
case 'N':
- force_consec = 1;
+ force_wider = 1;
break;
case 'I':
force_int = 1;
@@ -392,7 +394,10 @@ gen_insn (rtx insn)
|| mode_class[i] == MODE_VECTOR_FRACT
|| mode_class[i] == MODE_VECTOR_UFRACT
|| mode_class[i] == MODE_VECTOR_ACCUM
- || mode_class[i] == MODE_VECTOR_UACCUM))
+ || mode_class[i] == MODE_VECTOR_UACCUM)
+ && (! force_wider
+ || *pp == 'a'
+ || m1 < i))
break;
}
@@ -412,8 +417,7 @@ gen_insn (rtx insn)
}
if (matches && pp[0] == '$' && pp[1] == ')'
- && *np == 0
- && (! force_consec || (int) GET_MODE_WIDER_MODE(m1) == m2))
+ && *np == 0)
break;
}
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -515,8 +515,8 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)));
+ icode = widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1242,7 +1242,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx target, int unsignedp, enum optab_methods methods,
rtx last)
{
- enum insn_code icode = optab_handler (binoptab, mode);
+ enum machine_mode from_mode = GET_MODE (op0);
+ enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1389,7 +1390,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && optab_handler (binoptab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (binoptab, mode, GET_MODE (op0))
+ != CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
unsignedp, methods, last);
@@ -1429,8 +1431,9 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (binoptab == smul_optab
&& GET_MODE_2XWIDER_MODE (mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab),
- GET_MODE_2XWIDER_MODE (mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_2XWIDER_MODE (mode), mode)
!= CODE_FOR_nothing))
{
temp = expand_binop (GET_MODE_2XWIDER_MODE (mode),
@@ -1457,12 +1460,14 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+ if (optab_handler (binoptab, wider_mode)
+ != CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -1895,8 +1900,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
&& optab_handler (add_optab, word_mode) != CODE_FOR_nothing)
{
rtx product = NULL_RTX;
-
- if (optab_handler (umul_widen_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (umul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
true, methods);
@@ -1905,7 +1910,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
}
if (product == NULL_RTX
- && optab_handler (smul_widen_optab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (smul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
false, methods);
@@ -1996,7 +2002,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+ if (widening_optab_handler (binoptab, wider_mode, mode)
+ != CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
{
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -42,6 +42,11 @@ struct optab_handlers
int insn_code;
};
+struct widening_optab_handlers
+{
+ struct optab_handlers handlers[NUM_MACHINE_MODES][NUM_MACHINE_MODES];
+};
+
struct optab_d
{
enum rtx_code code;
@@ -50,6 +55,7 @@ struct optab_d
void (*libcall_gen)(struct optab_d *, const char *name, char suffix,
enum machine_mode);
struct optab_handlers handlers[NUM_MACHINE_MODES];
+ struct widening_optab_handlers *widening;
};
typedef struct optab_d * optab;
@@ -876,6 +882,23 @@ optab_handler (optab op, enum machine_mode mode)
+ (int) CODE_FOR_nothing);
}
+/* Like optab_handler, but for widening_operations that have a TO_MODE and
+ a FROM_MODE. */
+
+static inline enum insn_code
+widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode)
+{
+ if (to_mode == from_mode)
+ return optab_handler (op, to_mode);
+
+ if (op->widening)
+ return (enum insn_code) (op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ + (int) CODE_FOR_nothing);
+
+ return CODE_FOR_nothing;
+}
+
/* Record that insn CODE should be used to implement mode MODE of OP. */
static inline void
@@ -884,6 +907,26 @@ set_optab_handler (optab op, enum machine_mode mode, enum insn_code code)
op->handlers[(int) mode].insn_code = (int) code - (int) CODE_FOR_nothing;
}
+/* Like set_optab_handler, but for widening operations that have a TO_MODE
+ and a FROM_MODE. */
+
+static inline void
+set_widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode, enum insn_code code)
+{
+ if (to_mode == from_mode)
+ set_optab_handler (op, to_mode, code);
+ else
+ {
+ if (op->widening == NULL)
+ op->widening = (struct widening_optab_handlers *)
+ xcalloc (1, sizeof (struct widening_optab_handlers));
+
+ op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ = (int) code - (int) CODE_FOR_nothing;
+ }
+}
+
/* Return the insn used to perform conversion OP from mode FROM_MODE
to mode TO_MODE; return CODE_FOR_nothing if the target does not have
such an insn. */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2055,6 +2055,8 @@ convert_mult_to_widen (gimple stmt)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
+ enum machine_mode to_mode, from_mode;
+ optab op;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2064,12 +2066,17 @@ convert_mult_to_widen (gimple stmt)
if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
return false;
+ to_mode = TYPE_MODE (type);
+ from_mode = TYPE_MODE (type1);
+
if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
- handler = optab_handler (umul_widen_optab, TYPE_MODE (type));
+ op = umul_widen_optab;
else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
- handler = optab_handler (smul_widen_optab, TYPE_MODE (type));
+ op = smul_widen_optab;
else
- handler = optab_handler (usmul_widen_optab, TYPE_MODE (type));
+ op = usmul_widen_optab;
+
+ handler = widening_optab_handler (op, to_mode, from_mode);
if (handler == CODE_FOR_nothing)
return false;
@@ -2171,7 +2178,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (optab_handler (this_optab, TYPE_MODE (type)) == CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
+ == CODE_FOR_nothing)
return false;
/* ??? May need some type verification here? */
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-07-09 15:38 ` Andrew Stubbs
@ 2011-07-14 15:29 ` Andrew Stubbs
2011-07-22 13:01 ` Bernd Schmidt
1 sibling, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-14 15:29 UTC (permalink / raw)
Cc: gcc-patches, patches
Ping. This is the last unreviewed patch in this series ...
Thanks
Andrew
On 09/07/11 15:43, Andrew Stubbs wrote:
> On 23/06/11 15:37, Andrew Stubbs wrote:
>> This patch should have no effect on the compiler output. It merely
>> replaces one way to represent widening operations with another, and
>> refactors the other parts of the compiler to match. The rest of the
>> patch set uses this new framework to implement the optimization
>> improvements.
>>
>> I considered and discarded many approaches to this patch before arriving
>> at this solution, and I feel sure that there'll be somebody out there
>> who will think I chose the wrong one, so let me first explain how I got
>> here ....
>>
>> The aim is to be able to encode and query optabs that have any given
>> input mode, and any given output mode. This is similar to the
>> convert_optab, but not compatible with that optab since it is handled
>> completely differently in the code.
>>
>> (Just to be clear, the existing widening multiply support only covers
>> instructions that widen by *one* mode, so it's only ever been necessary
>> to know the output mode, up to now.)
>>
>> Option 1 was to add a second dimension to the handlers table in optab_d,
>> but I discarded this option because it would increase the memory usage
>> by the square of the number of modes, which is a bit much.
>>
>> Option 2 was to add a whole new optab, similar to optab_d, but with a
>> second dimension like convert_optab_d, however this turned out to cause
>> way too many pointer type mismatches in the code, and would have been
>> very difficult to fix up.
>>
>> Option 3 was to add new optab entries for widening by two modes, by
>> three modes, and so on. True, I would only need to add one extra set for
>> what I need, but there would be so many places in the code that compare
>> against smul_widen_optab, for example, that would need to be taught
>> about these, that it seemed like a bad idea.
>>
>> Option 4 was to have a separate table that contained the widening
>> operations, and refer to that whenever a widening entry in the main
>> optab is referenced, but I found that there was no easy way to do the
>> mapping without putting some sort of switch table in
>> widening_optab_handler, and that negates the other advantages.
>>
>> So, what I've done in the end is add a new pointer entry "widening" into
>> optab_d, and dynamically build the widening operations table for each
>> optab that needs it. I've then added new accessor functions that take
>> both input and output modes, and altered the code to use them where
>> appropriate.
>>
>> The down-side of this approach is that the optab entries for widening
>> operations now have two "handlers" tables, one of which is redundant.
>> That said, those cases are in the minority, and it is the smaller table
>> which is unused.
>>
>> If people find that very distasteful, it might be possible to remove the
>> *_widen_optab entries and unify smul_optab with smul_widen_optab, and so
>> on, and save space that way. I've not done so yet, but I expect I could
>> if people feel strongly about it.
>>
>> As a side-effect, it's now possible for any optab to be "widening",
>> should some target happen to have a widening add, shift, or whatever.
>>
>> Is this patch OK?
>
> This update has been rebaselined to fix some conflicts with other recent
> commits in this area.
>
> I also identified a small bug which resulted in the operands to some
> commutative operations being reversed. I don't believe the bug did any
> harm, logically speaking, but I suppose there could be a testcase that
> resulted in worse code being generated. With this fix, I now see exactly
> matching output in all my testcases.
>
> Andrew
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-07-09 15:38 ` Andrew Stubbs
2011-07-14 15:29 ` Andrew Stubbs
@ 2011-07-22 13:01 ` Bernd Schmidt
2011-07-22 13:50 ` Andrew Stubbs
1 sibling, 1 reply; 107+ messages in thread
From: Bernd Schmidt @ 2011-07-22 13:01 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On 07/09/11 16:43, Andrew Stubbs wrote:
>> So, what I've done in the end is add a new pointer entry "widening" into
>> optab_d, and dynamically build the widening operations table for each
>> optab that needs it. I've then added new accessor functions that take
>> both input and output modes, and altered the code to use them where
>> appropriate.
I think this is a reasonable approach given the way our code is structured.
> @@ -1242,7 +1242,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
> rtx target, int unsignedp, enum optab_methods methods,
> rtx last)
> {
> - enum insn_code icode = optab_handler (binoptab, mode);
> + enum machine_mode from_mode = GET_MODE (op0);
> + enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
Please add a new function along the lines of
enum machine_mode
widened_mode (enum machine_mode to_mode, rtx op0, rtx op1)
{
if (GET_MODE (op1) == VOIDmode)
return GET_MODE (op0);
gcc_assert (GET_MODE (op0) == GET_MODE (op1));
return GET_MODE (op0);
}
I'll want to extend this at some point to allow widening multiplies
where only one operand is widened (with a new set of optabs).
> - if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
> + if (optab_handler (binoptab, wider_mode)
> + != CODE_FOR_nothing
Spurious formatting change.
Otherwise ok.
Bernd
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-07-22 13:01 ` Bernd Schmidt
@ 2011-07-22 13:50 ` Andrew Stubbs
2011-07-22 14:01 ` Bernd Schmidt
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-22 13:50 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: gcc-patches, patches
On 22/07/11 13:34, Bernd Schmidt wrote:
>> @@ -1242,7 +1242,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
>> > rtx target, int unsignedp, enum optab_methods methods,
>> > rtx last)
>> > {
>> > - enum insn_code icode = optab_handler (binoptab, mode);
>> > + enum machine_mode from_mode = GET_MODE (op0);
>> > + enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
> Please add a new function along the lines of
>
> enum machine_mode
> widened_mode (enum machine_mode to_mode, rtx op0, rtx op1)
> {
> if (GET_MODE (op1) == VOIDmode)
> return GET_MODE (op0);
> gcc_assert (GET_MODE (op0) == GET_MODE (op1));
> return GET_MODE (op0);
> }
>
> I'll want to extend this at some point to allow widening multiplies
> where only one operand is widened (with a new set of optabs).
Sorry, I don't quite understand what you're getting at here?
expand_binop_directly is only ever used, I think, when the tree
optimizer has already identified what insn to use. Both before and after
my patch, the tree-cfg gimple verification requires that both op0 and
op1 are the same mode, and non-widening operations are always the same
mode, so I think my code is perfectly adequate. Is that not so?
If you want to add support for machine instructions that only widen one
input, then that's surely a separate problem? If the target mode is
smaller than the combined size of the inputs, then the changes to the
widening_mul pass would be non-trivial.
If the point is just to be absolutely certain that the inputs are valid
then I'm happy to add the function. BTW, did you mean to have the
unused parameter?
Andrew
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-07-22 13:50 ` Andrew Stubbs
@ 2011-07-22 14:01 ` Bernd Schmidt
2011-07-22 15:52 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Bernd Schmidt @ 2011-07-22 14:01 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On 07/22/11 15:27, Andrew Stubbs wrote:
> On 22/07/11 13:34, Bernd Schmidt wrote:
>>> @@ -1242,7 +1242,8 @@ expand_binop_directly (enum machine_mode mode,
>>> optab binoptab,
>>> > rtx target, int unsignedp, enum optab_methods methods,
>>> > rtx last)
>>> > {
>>> > - enum insn_code icode = optab_handler (binoptab, mode);
>>> > + enum machine_mode from_mode = GET_MODE (op0);
>>> > + enum insn_code icode = widening_optab_handler (binoptab, mode,
>>> from_mode);
>> Please add a new function along the lines of
>>
>> enum machine_mode
>> widened_mode (enum machine_mode to_mode, rtx op0, rtx op1)
>> {
>> if (GET_MODE (op1) == VOIDmode)
>> return GET_MODE (op0);
>> gcc_assert (GET_MODE (op0) == GET_MODE (op1));
>> return GET_MODE (op0);
>> }
>>
>> I'll want to extend this at some point to allow widening multiplies
>> where only one operand is widened (with a new set of optabs).
>
> Sorry, I don't quite understand what you're getting at here?
>
> expand_binop_directly is only ever used, I think, when the tree
> optimizer has already identified what insn to use. Both before and after
> my patch, the tree-cfg gimple verification requires that both op0 and
> op1 are the same mode, and non-widening operations are always the same
> mode, so I think my code is perfectly adequate. Is that not so?
For the moment, yes.
Oh well, let's shelve it and do it later.
Bernd
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-07-22 14:01 ` Bernd Schmidt
@ 2011-07-22 15:52 ` Andrew Stubbs
2011-08-19 14:41 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-22 15:52 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 167 bytes --]
On 22/07/11 14:28, Bernd Schmidt wrote:
> Oh well, let's shelve it and do it later.
Here's an updated patch with the formatting problem you found fixed.
OK?
Andrew
[-- Attachment #2: widening-multiplies-1.patch --]
[-- Type: text/x-patch, Size: 14052 bytes --]
2011-07-22 Andrew Stubbs <ams@codesourcery.com>
gcc/
* expr.c (expand_expr_real_2): Use widening_optab_handler.
* genopinit.c (optabs): Use set_widening_optab_handler for $N.
(gen_insn): $N now means $a must be wider than $b, not consecutive.
* optabs.c (expand_widen_pattern_expr): Use widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (widening_optab_handlers): New struct.
(optab_d): New member, 'widening'.
(widening_optab_handler): New function.
(set_widening_optab_handler): New function.
* tree-ssa-math-opts.c (convert_mult_to_widen): Use
widening_optab_handler.
(convert_plusminus_to_widen): Likewise.
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7662,7 +7662,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
this_optab = usmul_widen_optab;
if (mode == GET_MODE_2XWIDER_MODE (innermode))
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -7689,7 +7690,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
if (mode == GET_MODE_2XWIDER_MODE (innermode)
&& TREE_CODE (treeop0) != INTEGER_CST)
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
EXPAND_NORMAL);
@@ -7697,7 +7699,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (optab_handler (other_optab, mode) != CODE_FOR_nothing
+ if (widening_optab_handler (other_optab, mode, innermode)
+ != CODE_FOR_nothing
&& innermode == word_mode)
{
rtx htem, hipart;
--- a/gcc/genopinit.c
+++ b/gcc/genopinit.c
@@ -46,10 +46,12 @@ along with GCC; see the file COPYING3. If not see
used. $A and $B are replaced with the full name of the mode; $a and $b
are replaced with the short form of the name, as above.
- If $N is present in the pattern, it means the two modes must be consecutive
- widths in the same mode class (e.g, QImode and HImode). $I means that
- only full integer modes should be considered for the next mode, and $F
- means that only float modes should be considered.
+ If $N is present in the pattern, it means the two modes must be in
+ the same mode class, and $b must be greater than $a (e.g, QImode
+ and HImode).
+
+ $I means that only full integer modes should be considered for the
+ next mode, and $F means that only float modes should be considered.
$P means that both full and partial integer modes should be considered.
$Q means that only fixed-point modes should be considered.
@@ -99,17 +101,17 @@ static const char * const optabs[] =
"set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))",
"set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))",
"set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))",
- "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)",
- "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)",
- "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)",
- "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)",
- "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)",
- "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)",
- "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)",
- "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)",
- "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)",
- "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)",
- "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)",
+ "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)",
+ "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)",
+ "set_widening_optab_handler (usmul_widen_optab, $B, $A, CODE_FOR_$(usmul$a$b3$)$N)",
+ "set_widening_optab_handler (smadd_widen_optab, $B, $A, CODE_FOR_$(madd$a$b4$)$N)",
+ "set_widening_optab_handler (umadd_widen_optab, $B, $A, CODE_FOR_$(umadd$a$b4$)$N)",
+ "set_widening_optab_handler (ssmadd_widen_optab, $B, $A, CODE_FOR_$(ssmadd$a$b4$)$N)",
+ "set_widening_optab_handler (usmadd_widen_optab, $B, $A, CODE_FOR_$(usmadd$a$b4$)$N)",
+ "set_widening_optab_handler (smsub_widen_optab, $B, $A, CODE_FOR_$(msub$a$b4$)$N)",
+ "set_widening_optab_handler (umsub_widen_optab, $B, $A, CODE_FOR_$(umsub$a$b4$)$N)",
+ "set_widening_optab_handler (ssmsub_widen_optab, $B, $A, CODE_FOR_$(ssmsub$a$b4$)$N)",
+ "set_widening_optab_handler (usmsub_widen_optab, $B, $A, CODE_FOR_$(usmsub$a$b4$)$N)",
"set_optab_handler (sdiv_optab, $A, CODE_FOR_$(div$a3$))",
"set_optab_handler (ssdiv_optab, $A, CODE_FOR_$(ssdiv$Q$a3$))",
"set_optab_handler (sdivv_optab, $A, CODE_FOR_$(div$V$I$a3$))",
@@ -305,7 +307,7 @@ gen_insn (rtx insn)
{
int force_float = 0, force_int = 0, force_partial_int = 0;
int force_fixed = 0;
- int force_consec = 0;
+ int force_wider = 0;
int matches = 1;
for (pp = optabs[pindex]; pp[0] != '$' || pp[1] != '('; pp++)
@@ -323,7 +325,7 @@ gen_insn (rtx insn)
switch (*++pp)
{
case 'N':
- force_consec = 1;
+ force_wider = 1;
break;
case 'I':
force_int = 1;
@@ -392,7 +394,10 @@ gen_insn (rtx insn)
|| mode_class[i] == MODE_VECTOR_FRACT
|| mode_class[i] == MODE_VECTOR_UFRACT
|| mode_class[i] == MODE_VECTOR_ACCUM
- || mode_class[i] == MODE_VECTOR_UACCUM))
+ || mode_class[i] == MODE_VECTOR_UACCUM)
+ && (! force_wider
+ || *pp == 'a'
+ || m1 < i))
break;
}
@@ -412,8 +417,7 @@ gen_insn (rtx insn)
}
if (matches && pp[0] == '$' && pp[1] == ')'
- && *np == 0
- && (! force_consec || (int) GET_MODE_WIDER_MODE(m1) == m2))
+ && *np == 0)
break;
}
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -515,8 +515,8 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)));
+ icode = widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1242,7 +1242,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx target, int unsignedp, enum optab_methods methods,
rtx last)
{
- enum insn_code icode = optab_handler (binoptab, mode);
+ enum machine_mode from_mode = GET_MODE (op0);
+ enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1389,7 +1390,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && optab_handler (binoptab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (binoptab, mode, GET_MODE (op0))
+ != CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
unsignedp, methods, last);
@@ -1429,8 +1431,9 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (binoptab == smul_optab
&& GET_MODE_2XWIDER_MODE (mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab),
- GET_MODE_2XWIDER_MODE (mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_2XWIDER_MODE (mode), mode)
!= CODE_FOR_nothing))
{
temp = expand_binop (GET_MODE_2XWIDER_MODE (mode),
@@ -1460,9 +1463,10 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -1895,8 +1899,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
&& optab_handler (add_optab, word_mode) != CODE_FOR_nothing)
{
rtx product = NULL_RTX;
-
- if (optab_handler (umul_widen_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (umul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
true, methods);
@@ -1905,7 +1909,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
}
if (product == NULL_RTX
- && optab_handler (smul_widen_optab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (smul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
false, methods);
@@ -1996,7 +2001,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+ if (widening_optab_handler (binoptab, wider_mode, mode)
+ != CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
{
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -42,6 +42,11 @@ struct optab_handlers
int insn_code;
};
+struct widening_optab_handlers
+{
+ struct optab_handlers handlers[NUM_MACHINE_MODES][NUM_MACHINE_MODES];
+};
+
struct optab_d
{
enum rtx_code code;
@@ -50,6 +55,7 @@ struct optab_d
void (*libcall_gen)(struct optab_d *, const char *name, char suffix,
enum machine_mode);
struct optab_handlers handlers[NUM_MACHINE_MODES];
+ struct widening_optab_handlers *widening;
};
typedef struct optab_d * optab;
@@ -876,6 +882,23 @@ optab_handler (optab op, enum machine_mode mode)
+ (int) CODE_FOR_nothing);
}
+/* Like optab_handler, but for widening operations that have a TO_MODE and
+ a FROM_MODE. */
+
+static inline enum insn_code
+widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode)
+{
+ if (to_mode == from_mode)
+ return optab_handler (op, to_mode);
+
+ if (op->widening)
+ return (enum insn_code) (op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ + (int) CODE_FOR_nothing);
+
+ return CODE_FOR_nothing;
+}
+
/* Record that insn CODE should be used to implement mode MODE of OP. */
static inline void
@@ -884,6 +907,26 @@ set_optab_handler (optab op, enum machine_mode mode, enum insn_code code)
op->handlers[(int) mode].insn_code = (int) code - (int) CODE_FOR_nothing;
}
+/* Like set_optab_handler, but for widening operations that have a TO_MODE
+ and a FROM_MODE. */
+
+static inline void
+set_widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode, enum insn_code code)
+{
+ if (to_mode == from_mode)
+ set_optab_handler (op, to_mode, code);
+ else
+ {
+ if (op->widening == NULL)
+ op->widening = (struct widening_optab_handlers *)
+ xcalloc (1, sizeof (struct widening_optab_handlers));
+
+ op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ = (int) code - (int) CODE_FOR_nothing;
+ }
+}
+
/* Return the insn used to perform conversion OP from mode FROM_MODE
to mode TO_MODE; return CODE_FOR_nothing if the target does not have
such an insn. */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2055,6 +2055,8 @@ convert_mult_to_widen (gimple stmt)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
+ enum machine_mode to_mode, from_mode;
+ optab op;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2064,12 +2066,17 @@ convert_mult_to_widen (gimple stmt)
if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
return false;
+ to_mode = TYPE_MODE (type);
+ from_mode = TYPE_MODE (type1);
+
if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
- handler = optab_handler (umul_widen_optab, TYPE_MODE (type));
+ op = umul_widen_optab;
else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
- handler = optab_handler (smul_widen_optab, TYPE_MODE (type));
+ op = smul_widen_optab;
else
- handler = optab_handler (usmul_widen_optab, TYPE_MODE (type));
+ op = usmul_widen_optab;
+
+ handler = widening_optab_handler (op, to_mode, from_mode);
if (handler == CODE_FOR_nothing)
return false;
@@ -2171,7 +2178,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (optab_handler (this_optab, TYPE_MODE (type)) == CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
+ == CODE_FOR_nothing)
return false;
/* ??? May need some type verification here? */
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-07-22 15:52 ` Andrew Stubbs
@ 2011-08-19 14:41 ` Andrew Stubbs
2011-08-19 14:55 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 14:41 UTC (permalink / raw)
Cc: Bernd Schmidt, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 702 bytes --]
On 22/07/11 16:34, Andrew Stubbs wrote:
> On 22/07/11 14:28, Bernd Schmidt wrote:
>> Oh well, let's shelve it and do it later.
>
> Here's an updated patch with the formatting problem you found fixed.
I've just committed an updated version of this patch (attached).
I found a number of subtle bugs while I was testing, and these have now
been corrected. In particular, I found that VOIDmode constants were not
handled correctly; I've added a function "widened_mode" along the lines
originally suggested by Bernd to deal with this. I also found one case
where different code was produced than previously; although it was
actually corrected later in the patch series, I've fixed it here now.
Andrew
[-- Attachment #2: widening-multiplies-1.patch --]
[-- Type: text/x-patch, Size: 15169 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* expr.c (expand_expr_real_2): Use widening_optab_handler.
* genopinit.c (optabs): Use set_widening_optab_handler for $N.
(gen_insn): $N now means $a must be wider than $b, not consecutive.
* optabs.c (widened_mode): New function.
(expand_widen_pattern_expr): Use widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (widening_optab_handlers): New struct.
(optab_d): New member, 'widening'.
(widening_optab_handler): New function.
(set_widening_optab_handler): New function.
* tree-ssa-math-opts.c (convert_mult_to_widen): Use
widening_optab_handler.
(convert_plusminus_to_widen): Likewise.
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8005,7 +8005,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
this_optab = usmul_widen_optab;
if (mode == GET_MODE_2XWIDER_MODE (innermode))
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -8032,7 +8033,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
if (mode == GET_MODE_2XWIDER_MODE (innermode)
&& TREE_CODE (treeop0) != INTEGER_CST)
{
- if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, mode, innermode)
+ != CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
EXPAND_NORMAL);
@@ -8040,7 +8042,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (optab_handler (other_optab, mode) != CODE_FOR_nothing
+ if (widening_optab_handler (other_optab, mode, innermode)
+ != CODE_FOR_nothing
&& innermode == word_mode)
{
rtx htem, hipart;
--- a/gcc/genopinit.c
+++ b/gcc/genopinit.c
@@ -46,10 +46,12 @@ along with GCC; see the file COPYING3. If not see
used. $A and $B are replaced with the full name of the mode; $a and $b
are replaced with the short form of the name, as above.
- If $N is present in the pattern, it means the two modes must be consecutive
- widths in the same mode class (e.g, QImode and HImode). $I means that
- only full integer modes should be considered for the next mode, and $F
- means that only float modes should be considered.
+ If $N is present in the pattern, it means the two modes must be in
+ the same mode class, and $b must be greater than $a (e.g, QImode
+ and HImode).
+
+ $I means that only full integer modes should be considered for the
+ next mode, and $F means that only float modes should be considered.
$P means that both full and partial integer modes should be considered.
$Q means that only fixed-point modes should be considered.
@@ -99,17 +101,17 @@ static const char * const optabs[] =
"set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))",
"set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))",
"set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))",
- "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)",
- "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)",
- "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)",
- "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)",
- "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)",
- "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)",
- "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)",
- "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)",
- "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)",
- "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)",
- "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)",
+ "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)",
+ "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)",
+ "set_widening_optab_handler (usmul_widen_optab, $B, $A, CODE_FOR_$(usmul$a$b3$)$N)",
+ "set_widening_optab_handler (smadd_widen_optab, $B, $A, CODE_FOR_$(madd$a$b4$)$N)",
+ "set_widening_optab_handler (umadd_widen_optab, $B, $A, CODE_FOR_$(umadd$a$b4$)$N)",
+ "set_widening_optab_handler (ssmadd_widen_optab, $B, $A, CODE_FOR_$(ssmadd$a$b4$)$N)",
+ "set_widening_optab_handler (usmadd_widen_optab, $B, $A, CODE_FOR_$(usmadd$a$b4$)$N)",
+ "set_widening_optab_handler (smsub_widen_optab, $B, $A, CODE_FOR_$(msub$a$b4$)$N)",
+ "set_widening_optab_handler (umsub_widen_optab, $B, $A, CODE_FOR_$(umsub$a$b4$)$N)",
+ "set_widening_optab_handler (ssmsub_widen_optab, $B, $A, CODE_FOR_$(ssmsub$a$b4$)$N)",
+ "set_widening_optab_handler (usmsub_widen_optab, $B, $A, CODE_FOR_$(usmsub$a$b4$)$N)",
"set_optab_handler (sdiv_optab, $A, CODE_FOR_$(div$a3$))",
"set_optab_handler (ssdiv_optab, $A, CODE_FOR_$(ssdiv$Q$a3$))",
"set_optab_handler (sdivv_optab, $A, CODE_FOR_$(div$V$I$a3$))",
@@ -305,7 +307,7 @@ gen_insn (rtx insn)
{
int force_float = 0, force_int = 0, force_partial_int = 0;
int force_fixed = 0;
- int force_consec = 0;
+ int force_wider = 0;
int matches = 1;
for (pp = optabs[pindex]; pp[0] != '$' || pp[1] != '('; pp++)
@@ -323,7 +325,7 @@ gen_insn (rtx insn)
switch (*++pp)
{
case 'N':
- force_consec = 1;
+ force_wider = 1;
break;
case 'I':
force_int = 1;
@@ -392,7 +394,10 @@ gen_insn (rtx insn)
|| mode_class[i] == MODE_VECTOR_FRACT
|| mode_class[i] == MODE_VECTOR_UFRACT
|| mode_class[i] == MODE_VECTOR_ACCUM
- || mode_class[i] == MODE_VECTOR_UACCUM))
+ || mode_class[i] == MODE_VECTOR_UACCUM)
+ && (! force_wider
+ || *pp == 'a'
+ || m1 < i))
break;
}
@@ -412,8 +417,7 @@ gen_insn (rtx insn)
}
if (matches && pp[0] == '$' && pp[1] == ')'
- && *np == 0
- && (! force_consec || (int) GET_MODE_WIDER_MODE(m1) == m2))
+ && *np == 0)
break;
}
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -225,6 +225,30 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
return 1;
}
\f
+/* Given two input operands, OP0 and OP1, determine what the correct from_mode
+ for a widening operation would be. In most cases this would be the mode of
+ OP0, but if that's a constant it'll be VOIDmode, which isn't useful. */
+
+static enum machine_mode
+widened_mode (enum machine_mode to_mode, rtx op0, rtx op1)
+{
+ enum machine_mode m0 = GET_MODE (op0);
+ enum machine_mode m1 = GET_MODE (op1);
+ enum machine_mode result;
+
+ if (m0 == VOIDmode && m1 == VOIDmode)
+ return to_mode;
+ else if (m0 == VOIDmode || GET_MODE_SIZE (m0) < GET_MODE_SIZE (m1))
+ result = m1;
+ else
+ result = m0;
+
+ if (GET_MODE_SIZE (result) > GET_MODE_SIZE (to_mode))
+ return to_mode;
+
+ return result;
+}
+\f
/* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP
says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need
not actually do a sign-extend or zero-extend, but can leave the
@@ -515,8 +539,8 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)));
+ icode = widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1242,7 +1266,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx target, int unsignedp, enum optab_methods methods,
rtx last)
{
- enum insn_code icode = optab_handler (binoptab, mode);
+ enum machine_mode from_mode = widened_mode (mode, op0, op1);
+ enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1389,7 +1414,9 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && optab_handler (binoptab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (binoptab, mode,
+ widened_mode (mode, op0, op1))
+ != CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
unsignedp, methods, last);
@@ -1429,8 +1456,9 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (binoptab == smul_optab
&& GET_MODE_2XWIDER_MODE (mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab),
- GET_MODE_2XWIDER_MODE (mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_2XWIDER_MODE (mode), mode)
!= CODE_FOR_nothing))
{
temp = expand_binop (GET_MODE_2XWIDER_MODE (mode),
@@ -1460,9 +1488,10 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode))
+ && (widening_optab_handler ((unsignedp ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -1895,8 +1924,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
&& optab_handler (add_optab, word_mode) != CODE_FOR_nothing)
{
rtx product = NULL_RTX;
-
- if (optab_handler (umul_widen_optab, mode) != CODE_FOR_nothing)
+ if (widening_optab_handler (umul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
true, methods);
@@ -1905,7 +1934,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
}
if (product == NULL_RTX
- && optab_handler (smul_widen_optab, mode) != CODE_FOR_nothing)
+ && widening_optab_handler (smul_widen_optab, mode, word_mode)
+ != CODE_FOR_nothing)
{
product = expand_doubleword_mult (mode, op0, op1, target,
false, methods);
@@ -1997,6 +2027,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+ || widening_optab_handler (binoptab, wider_mode, mode)
+ != CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
{
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -42,6 +42,11 @@ struct optab_handlers
int insn_code;
};
+struct widening_optab_handlers
+{
+ struct optab_handlers handlers[NUM_MACHINE_MODES][NUM_MACHINE_MODES];
+};
+
struct optab_d
{
enum rtx_code code;
@@ -50,6 +55,7 @@ struct optab_d
void (*libcall_gen)(struct optab_d *, const char *name, char suffix,
enum machine_mode);
struct optab_handlers handlers[NUM_MACHINE_MODES];
+ struct widening_optab_handlers *widening;
};
typedef struct optab_d * optab;
@@ -879,6 +885,23 @@ optab_handler (optab op, enum machine_mode mode)
+ (int) CODE_FOR_nothing);
}
+/* Like optab_handler, but for widening operations that have a TO_MODE
+ a FROM_MODE. */
+
+static inline enum insn_code
+widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode)
+{
+ if (to_mode == from_mode || from_mode == VOIDmode)
+ return optab_handler (op, to_mode);
+
+ if (op->widening)
+ return (enum insn_code) (op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ + (int) CODE_FOR_nothing);
+
+ return CODE_FOR_nothing;
+}
+
/* Record that insn CODE should be used to implement mode MODE of OP. */
static inline void
@@ -887,6 +910,26 @@ set_optab_handler (optab op, enum machine_mode mode, enum insn_code code)
op->handlers[(int) mode].insn_code = (int) code - (int) CODE_FOR_nothing;
}
+/* Like set_optab_handler, but for widening operations that have a TO_MODE
+ and a FROM_MODE. */
+
+static inline void
+set_widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode, enum insn_code code)
+{
+ if (to_mode == from_mode)
+ set_optab_handler (op, to_mode, code);
+ else
+ {
+ if (op->widening == NULL)
+ op->widening = (struct widening_optab_handlers *)
+ xcalloc (1, sizeof (struct widening_optab_handlers));
+
+ op->widening->handlers[(int) to_mode][(int) from_mode].insn_code
+ = (int) code - (int) CODE_FOR_nothing;
+ }
+}
+
/* Return the insn used to perform conversion OP from mode FROM_MODE
to mode TO_MODE; return CODE_FOR_nothing if the target does not have
such an insn. */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2056,6 +2056,8 @@ convert_mult_to_widen (gimple stmt)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
+ enum machine_mode to_mode, from_mode;
+ optab op;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2065,12 +2067,17 @@ convert_mult_to_widen (gimple stmt)
if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
return false;
+ to_mode = TYPE_MODE (type);
+ from_mode = TYPE_MODE (type1);
+
if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
- handler = optab_handler (umul_widen_optab, TYPE_MODE (type));
+ op = umul_widen_optab;
else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
- handler = optab_handler (smul_widen_optab, TYPE_MODE (type));
+ op = smul_widen_optab;
else
- handler = optab_handler (usmul_widen_optab, TYPE_MODE (type));
+ op = usmul_widen_optab;
+
+ handler = widening_optab_handler (op, to_mode, from_mode);
if (handler == CODE_FOR_nothing)
return false;
@@ -2172,7 +2179,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (optab_handler (this_optab, TYPE_MODE (type)) == CODE_FOR_nothing)
+ if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
+ == CODE_FOR_nothing)
return false;
/* ??? May need some type verification here? */
^ permalink raw reply [flat|nested] 107+ messages in thread
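[Editor's note: the lazily allocated two-dimensional handler table that this patch adds to `struct optab_d` can be illustrated standalone. The sketch below is hypothetical C, not GCC code: it uses small integers for machine modes, a plain `-1` in place of the `CODE_FOR_nothing` bias, and `malloc` in place of `xcalloc`; all names (`toy_optab`, `toy_set_widening_handler`, `toy_widening_handler`) are invented for illustration.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for enum machine_mode and CODE_FOR_nothing.  */
#define NUM_MODES 8
#define NOTHING   (-1)

/* Second-level table, indexed [to_mode][from_mode], mirroring
   struct widening_optab_handlers.  */
struct widening_table
{
  int handlers[NUM_MODES][NUM_MODES];
};

/* Mirrors the relevant part of struct optab_d: the existing per-mode
   table, plus a widening table allocated only when first needed, so
   optabs that never widen pay only one pointer of overhead.  */
struct toy_optab
{
  int handlers[NUM_MODES];
  struct widening_table *widening;
};

static void
toy_set_widening_handler (struct toy_optab *op, int to_mode, int from_mode,
                          int code)
{
  if (to_mode == from_mode)
    {
      /* Same-mode entries go in the ordinary table, as in
         set_widening_optab_handler.  */
      op->handlers[to_mode] = code;
      return;
    }
  if (op->widening == NULL)
    {
      int i, j;
      op->widening = malloc (sizeof (struct widening_table));
      for (i = 0; i < NUM_MODES; i++)
        for (j = 0; j < NUM_MODES; j++)
          op->widening->handlers[i][j] = NOTHING;
    }
  op->widening->handlers[to_mode][from_mode] = code;
}

static int
toy_widening_handler (struct toy_optab *op, int to_mode, int from_mode)
{
  if (to_mode == from_mode)
    return op->handlers[to_mode];
  if (op->widening != NULL)
    return op->widening->handlers[to_mode][from_mode];
  return NOTHING;  /* No widening table allocated: no handler.  */
}
```

The full two-dimensional table is only the square of the mode count *per optab that actually registers a widening insn*, which is what makes option 1's across-the-board cost avoidable.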
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-08-19 14:41 ` Andrew Stubbs
@ 2011-08-19 14:55 ` Richard Guenther
2011-08-19 15:07 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-08-19 14:55 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Bernd Schmidt, gcc-patches, patches
On Fri, Aug 19, 2011 at 4:18 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 22/07/11 16:34, Andrew Stubbs wrote:
>>
>> On 22/07/11 14:28, Bernd Schmidt wrote:
>>>
>>> Oh well, let's shelve it and do it later.
>>
>> Here's an updated patch with the formatting problem you found fixed.
>
> I've just committed an updated version of this patch (attached).
>
> I found a number of subtle bugs while I was testing, and these have now been
> corrected. In particular, I found that VOIDmode constants were not handled
> correctly; I've added a function "widened_mode" along the lines originally
> suggested by Bernd to deal with this. I also found one case where different
> code was produced than previously; although it was corrected later in
> the patch series, I've fixed it here now.
Seems one patch in the series has broken bootstrap on x86_64 when building
the 32bit libgcc multilib in stage1.
Richard.
> Andrew
>
>
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-08-19 14:55 ` Richard Guenther
@ 2011-08-19 15:07 ` Andrew Stubbs
2011-08-19 16:40 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 15:07 UTC (permalink / raw)
To: Richard Guenther; +Cc: Bernd Schmidt, gcc-patches, patches
On 19/08/11 15:45, Richard Guenther wrote:
> Seems one patch in the series has broken bootstrap on x86_64 when building
> the 32bit libgcc multilib in stage1.
Oh? Hopefully that'll be fixed when I complete the patchset. Patches 8
and 9 (of 7) did fix issues with the earlier patches.
Andrew
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (1/7)] New optab framework for widening multiplies
2011-08-19 15:07 ` Andrew Stubbs
@ 2011-08-19 16:40 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 16:40 UTC (permalink / raw)
Cc: Richard Guenther, Bernd Schmidt, gcc-patches, patches
On 19/08/11 15:51, Andrew Stubbs wrote:
> On 19/08/11 15:45, Richard Guenther wrote:
>> Seems one patch in the series has broken bootstrap on x86_64 when building
>> the 32bit libgcc multilib in stage1.
>
> Oh? Hopefully that'll be fixed when I complete the patchset. Patches 8
> and 9 (of 7) did fix issues with the earlier patches.
Seems fine now. Sorry for the trouble.
Andrew
^ permalink raw reply [flat|nested] 107+ messages in thread
* [PATCH (2/7)] Widening multiplies by more than one mode
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
2011-06-23 14:39 ` [PATCH (1/7)] New optab framework for widening multiplies Andrew Stubbs
@ 2011-06-23 14:41 ` Andrew Stubbs
2011-07-12 10:15 ` Andrew Stubbs
2011-06-23 14:42 ` [PATCH (3/7)] Widening multiply-and-accumulate pattern matching Andrew Stubbs
` (7 subsequent siblings)
9 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-23 14:41 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 803 bytes --]
This patch has two effects:
1. It permits the use of widening multiply instructions that widen by
more than one mode. E.g. HImode -> DImode.
2. It enables the use of widening multiply instructions for (extended)
inputs of narrower mode than the instruction takes. E.g. QImode ->
DImode where only HI->DI or SI->DI is available.
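[Editor's note: a hand-written illustration of the source-level pattern these two effects target, not part of the patch. On a hypothetical target offering only, say, an SI->DI widening multiply, effect 2 lets the HImode inputs below still reach that instruction after being extended.]

```c
#include <assert.h>

/* The cast plus multiply is recognized as a single widening multiply
   (WIDEN_MULT_EXPR).  Before this patch, a direct HI->DI instruction was
   required; with it, a HI->SI or SI->DI widening multiply can be used
   instead.  */
long long
widen_mult (short a, short b)
{
  return (long long) a * b;  /* HImode x HImode -> DImode */
}
```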
Hopefully, most of the patch is self-explanatory, but here are a few notes:
The code introduces a temporary FIXME comment; this will be removed
later in the patch series. In fact, this is not a new restriction;
previously "type1" and "type2" were implicitly identical because they
were required to be one mode smaller than "type".
I regard the ARM portion of this patch as obvious, so I don't think I
need an ARM maintainer to read this.
Is the patch OK?
Andrew
[-- Attachment #2: widening-multiplies-2.patch --]
[-- Type: text/x-patch, Size: 10879 bytes --]
2011-06-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/arm/arm.md (maddhidi4): Remove '*' from name.
* expr.c (expand_expr_real_2): Use find_widening_optab_handler.
* optabs.c (find_widening_optab_handler): New function.
(expand_widen_pattern_expr): Use find_widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (find_widening_optab_handler): New prototype.
* tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
type precision rules.
(verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Allow widening by
more than one mode.
Explicitly disallow mis-matched input types.
(convert_mult_to_widen): Use find_widening_optab_handler.
(convert_plusminus_to_widen): Likewise.
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1857,7 +1857,7 @@
(set_attr "predicable" "yes")]
)
-(define_insn "*maddhidi4"
+(define_insn "maddhidi4"
[(set (match_operand:DI 0 "s_register_operand" "=r")
(plus:DI
(mult:DI (sign_extend:DI
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7632,19 +7632,16 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
{
enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0));
this_optab = usmul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode))
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
+ != CODE_FOR_nothing)
{
- if (widening_optab_handler (this_optab, mode, innermode)
- != CODE_FOR_nothing)
- {
- if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
- expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
- EXPAND_NORMAL);
- else
- expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
- EXPAND_NORMAL);
- goto binop3;
- }
+ if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
+ expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
+ EXPAND_NORMAL);
+ else
+ expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
+ EXPAND_NORMAL);
+ goto binop3;
}
}
/* Check for a multiplication with matching signedness. */
@@ -7659,10 +7656,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab;
this_optab = zextend_p ? umul_widen_optab : smul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode)
- && TREE_CODE (treeop0) != INTEGER_CST)
+ if (TREE_CODE (treeop0) != INTEGER_CST)
{
- if (widening_optab_handler (this_optab, mode, innermode)
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
!= CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -7671,7 +7667,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (widening_optab_handler (other_optab, mode, innermode)
+ if (find_widening_optab_handler (other_optab, mode, innermode, 0)
!= CODE_FOR_nothing
&& innermode == word_mode)
{
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -225,6 +225,32 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
return 1;
}
\f
+/* Find a widening optab even if it doesn't widen as much as we want.
+ E.g. if from_mode is HImode, and to_mode is DImode, and there is no
+ direct HI->DI insn, then return SI->DI, if that exists.
+ If PERMIT_NON_WIDENING is nonzero then this can be used with
+ non-widening optabs also. */
+
+enum insn_code
+find_widening_optab_handler (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening)
+{
+ for (; (permit_non_widening || from_mode != to_mode)
+ && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
+ && from_mode != VOIDmode;
+ from_mode = GET_MODE_WIDER_MODE (from_mode))
+ {
+ enum insn_code handler = widening_optab_handler (op, to_mode,
+ from_mode);
+
+ if (handler != CODE_FOR_nothing)
+ return handler;
+ }
+
+ return CODE_FOR_nothing;
+}
+\f
/* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP
says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need
not actually do a sign-extend or zero-extend, but can leave the
@@ -515,8 +541,9 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = widening_optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
+ icode = find_widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)),
+ tmode0, 0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1243,7 +1270,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx last)
{
enum machine_mode from_mode = GET_MODE (op0);
- enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
+ enum insn_code icode = find_widening_optab_handler (binoptab, mode,
+ from_mode, 1);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1390,7 +1418,7 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && widening_optab_handler (binoptab, mode, GET_MODE (op0))
+ && find_widening_optab_handler (binoptab, mode, GET_MODE (op0), 1)
!= CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
@@ -1461,14 +1489,15 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (widening_optab_handler (binoptab, wider_mode, mode)
+ if (find_widening_optab_handler (binoptab, wider_mode, mode, 1)
!= CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (widening_optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode),
- mode)
+ && (find_widening_optab_handler ((unsignedp
+ ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode, 0)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -2003,7 +2032,7 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (widening_optab_handler (binoptab, wider_mode, mode)
+ if (find_widening_optab_handler (binoptab, wider_mode, mode, 1)
!= CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -807,6 +807,10 @@ extern rtx expand_copysign (rtx, rtx, rtx);
extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
+/* Find a widening optab even if it doesn't widen as much as we want. */
+extern enum insn_code find_widening_optab_handler (optab, enum machine_mode,
+ enum machine_mode, int);
+
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
shift amount vs. machines that take a vector for the shift amount. */
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3577,7 +3577,7 @@ do_pointer_plus_expr_check:
case WIDEN_MULT_EXPR:
if (TREE_CODE (lhs_type) != INTEGER_TYPE)
return true;
- return ((2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type))
+ return ((2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type))
|| (TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type)));
case WIDEN_SUM_EXPR:
@@ -3668,7 +3668,7 @@ verify_gimple_assign_ternary (gimple stmt)
&& !FIXED_POINT_TYPE_P (rhs1_type))
|| !useless_type_conversion_p (rhs1_type, rhs2_type)
|| !useless_type_conversion_p (lhs_type, rhs3_type)
- || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)
+ || 2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type)
|| TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))
{
error ("type mismatch in widening multiply-accumulate expression");
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1950,8 +1950,8 @@ struct gimple_opt_pass pass_optimize_bswap =
/* Return true if RHS is a suitable operand for a widening multiplication.
There are two cases:
- - RHS makes some value twice as wide. Store that value in *NEW_RHS_OUT
- if so, and store its type in *TYPE_OUT.
+ - RHS makes some value at least twice as wide. Store that value
+ in *NEW_RHS_OUT if so, and store its type in *TYPE_OUT.
- RHS is an integer constant. Store that value in *NEW_RHS_OUT if so,
but leave *TYPE_OUT untouched. */
@@ -1979,7 +1979,7 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
- || TYPE_PRECISION (type1) * 2 != TYPE_PRECISION (type))
+ || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
*new_rhs_out = rhs1;
@@ -2035,6 +2035,10 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
+ /* FIXME: remove this restriction. */
+ if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
+ return false;
+
return true;
}
@@ -2068,7 +2072,7 @@ convert_mult_to_widen (gimple stmt)
else
op = usmul_widen_optab;
- handler = widening_optab_handler (op, to_mode, from_mode);
+ handler = find_widening_optab_handler (op, to_mode, from_mode, 0);
if (handler == CODE_FOR_nothing)
return false;
@@ -2171,8 +2175,10 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
- == CODE_FOR_nothing)
+ handler = find_widening_optab_handler (this_optab, TYPE_MODE (type),
+ TYPE_MODE (type1), 0);
+
+ if (handler == CODE_FOR_nothing)
return false;
/* ??? May need some type verification here? */
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-06-23 14:41 ` [PATCH (2/7)] Widening multiplies by more than one mode Andrew Stubbs
@ 2011-07-12 10:15 ` Andrew Stubbs
2011-07-12 11:05 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-12 10:15 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 1368 bytes --]
On 23/06/11 15:39, Andrew Stubbs wrote:
> This patch has two effects:
>
> 1. It permits the use of widening multiply instructions that widen by
> more than one mode. E.g. HImode -> DImode.
>
> 2. It enables the use of widening multiply instructions for (extended)
> inputs of narrower mode than the instruction takes. E.g. QImode ->
> DImode where only HI->DI or SI->DI is available.
>
> Hopefully, most of the patch is self-explanatory, but here are a few notes:
>
> The code introduces a temporary FIXME comment; this will be removed
> later in the patch series. In fact, this is not a new restriction;
> previously "type1" and "type2" were implicitly identical because they
> were required to be one mode smaller than "type".
>
> I regard the ARM portion of this patch as obvious, so I don't think I
> need an ARM maintainer to read this.
>
> Is the patch OK?
I found a bug in this patch. It seems I do need to add casts for the
inputs to widening multiplies (even though I know the registers are
already fine), because otherwise something is insisting on truncating
the values to the minimum width, which isn't helpful when the instruction
actually takes wider inputs.
The mode changing bits from patch 4 have therefore been moved here. I've
made the changes Richard Guenther requested there, I think.
Otherwise, the patch is the same as before.
Andrew
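[Editor's note: a hand-written sketch of the cast insertion described above, expressed as source code rather than compiler output. The target and its available patterns are hypothetical: assume the narrowest suitable instruction is an SImode -> DImode widening multiply, so QImode inputs must first be widened to SImode to match the instruction's operand modes.]

```c
#include <assert.h>

/* Conceptually, this is what build_and_insert_cast arranges for: the
   QImode operands are explicitly widened to the instruction's input
   mode (SImode here) instead of being truncated back down.  */
long long
mul_qi (signed char a, signed char b)
{
  int wa = a;                  /* explicit widening cast inserted */
  int wb = b;
  return (long long) wa * wb;  /* now matches the SI->DI pattern */
}
```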
[-- Attachment #2: widening-multiplies-2.patch --]
[-- Type: text/x-patch, Size: 15998 bytes --]
2011-07-11 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/arm/arm.md (maddhidi4): Remove '*' from name.
* expr.c (expand_expr_real_2): Use find_widening_optab_handler.
* optabs.c (find_widening_optab_handler_and_mode): New function.
(expand_widen_pattern_expr): Use find_widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (find_widening_optab_handler): New macro define.
(find_widening_optab_handler_and_mode): New prototype.
* tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
type precision rules.
(verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
* tree-ssa-math-opts.c (build_and_insert_cast): New function.
(is_widening_mult_rhs_p): Allow widening by more than one mode.
Explicitly disallow mis-matched input types.
(convert_mult_to_widen): Use find_widening_optab_handler, and cast
input types to fit the new handler.
(convert_plusminus_to_widen): Likewise.
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1857,7 +1857,7 @@
(set_attr "predicable" "yes")]
)
-(define_insn "*maddhidi4"
+(define_insn "maddhidi4"
[(set (match_operand:DI 0 "s_register_operand" "=r")
(plus:DI
(mult:DI (sign_extend:DI
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7638,19 +7638,16 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
{
enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0));
this_optab = usmul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode))
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
+ != CODE_FOR_nothing)
{
- if (widening_optab_handler (this_optab, mode, innermode)
- != CODE_FOR_nothing)
- {
- if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
- expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
- EXPAND_NORMAL);
- else
- expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
- EXPAND_NORMAL);
- goto binop3;
- }
+ if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
+ expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
+ EXPAND_NORMAL);
+ else
+ expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
+ EXPAND_NORMAL);
+ goto binop3;
}
}
/* Check for a multiplication with matching signedness. */
@@ -7665,10 +7662,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab;
this_optab = zextend_p ? umul_widen_optab : smul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode)
- && TREE_CODE (treeop0) != INTEGER_CST)
+ if (TREE_CODE (treeop0) != INTEGER_CST)
{
- if (widening_optab_handler (this_optab, mode, innermode)
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
!= CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -7677,7 +7673,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (widening_optab_handler (other_optab, mode, innermode)
+ if (find_widening_optab_handler (other_optab, mode, innermode, 0)
!= CODE_FOR_nothing
&& innermode == word_mode)
{
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -225,6 +225,37 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
return 1;
}
\f
+/* Find a widening optab even if it doesn't widen as much as we want.
+ E.g. if from_mode is HImode, and to_mode is DImode, and there is no
+ direct HI->DI insn, then return SI->DI, if that exists.
+ If PERMIT_NON_WIDENING is nonzero then this can be used with
+ non-widening optabs also. */
+
+enum insn_code
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening,
+ enum machine_mode *found_mode)
+{
+ for (; (permit_non_widening || from_mode != to_mode)
+ && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
+ && from_mode != VOIDmode;
+ from_mode = GET_MODE_WIDER_MODE (from_mode))
+ {
+ enum insn_code handler = widening_optab_handler (op, to_mode,
+ from_mode);
+
+ if (handler != CODE_FOR_nothing)
+ {
+ if (found_mode)
+ *found_mode = from_mode;
+ return handler;
+ }
+ }
+
+ return CODE_FOR_nothing;
+}
+\f
/* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP
says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need
not actually do a sign-extend or zero-extend, but can leave the
@@ -515,8 +546,9 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = widening_optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
+ icode = find_widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)),
+ tmode0, 0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1243,7 +1275,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx last)
{
enum machine_mode from_mode = GET_MODE (op0);
- enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
+ enum insn_code icode = find_widening_optab_handler (binoptab, mode,
+ from_mode, 1);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1390,7 +1423,7 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && widening_optab_handler (binoptab, mode, GET_MODE (op0))
+ && find_widening_optab_handler (binoptab, mode, GET_MODE (op0), 1)
!= CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
@@ -1464,10 +1497,11 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
!= CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (widening_optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode),
- mode)
+ && (find_widening_optab_handler ((unsignedp
+ ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode, 0)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -2002,7 +2036,7 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (widening_optab_handler (binoptab, wider_mode, mode)
+ if (find_widening_optab_handler (binoptab, wider_mode, mode, 1)
!= CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -807,6 +807,15 @@ extern rtx expand_copysign (rtx, rtx, rtx);
extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
+/* Find a widening optab even if it doesn't widen as much as we want. */
+#define find_widening_optab_handler(A,B,C,D) \
+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+ enum machine_mode,
+ enum machine_mode,
+ int,
+ enum machine_mode *);
+
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
shift amount vs. machines that take a vector for the shift amount. */
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3577,7 +3577,7 @@ do_pointer_plus_expr_check:
case WIDEN_MULT_EXPR:
if (TREE_CODE (lhs_type) != INTEGER_TYPE)
return true;
- return ((2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type))
+ return ((2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type))
|| (TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type)));
case WIDEN_SUM_EXPR:
@@ -3668,7 +3668,7 @@ verify_gimple_assign_ternary (gimple stmt)
&& !FIXED_POINT_TYPE_P (rhs1_type))
|| !useless_type_conversion_p (rhs1_type, rhs2_type)
|| !useless_type_conversion_p (lhs_type, rhs3_type)
- || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)
+ || 2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type)
|| TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))
{
error ("type mismatch in widening multiply-accumulate expression");
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1086,6 +1086,16 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type,
return result;
}
+/* Build a gimple assignment to cast VAL to TARGET. Insert the statement
+ prior to GSI's current position, and return the fresh SSA name. */
+
+static tree
+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
+ tree target, tree val)
+{
+ return build_and_insert_binop (gsi, loc, target, CONVERT_EXPR, val, NULL);
+}
+
/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC. If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -1958,8 +1968,8 @@ struct gimple_opt_pass pass_optimize_bswap =
/* Return true if RHS is a suitable operand for a widening multiplication.
There are two cases:
- - RHS makes some value twice as wide. Store that value in *NEW_RHS_OUT
- if so, and store its type in *TYPE_OUT.
+ - RHS makes some value at least twice as wide. Store that value
+ in *NEW_RHS_OUT if so, and store its type in *TYPE_OUT.
- RHS is an integer constant. Store that value in *NEW_RHS_OUT if so,
but leave *TYPE_OUT untouched. */
@@ -1987,7 +1997,7 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
- || TYPE_PRECISION (type1) * 2 != TYPE_PRECISION (type))
+ || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
*new_rhs_out = rhs1;
@@ -2043,6 +2053,10 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
+ /* FIXME: remove this restriction. */
+ if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
+ return false;
+
return true;
}
@@ -2051,7 +2065,7 @@ is_widening_mult_p (gimple stmt,
value is true iff we converted the statement. */
static bool
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
@@ -2076,13 +2090,34 @@ convert_mult_to_widen (gimple stmt)
else
op = usmul_widen_optab;
- handler = widening_optab_handler (op, to_mode, from_mode);
+ handler = find_widening_optab_handler_and_mode (op, to_mode, from_mode,
+ 0, &from_mode);
if (handler == CODE_FOR_nothing)
return false;
- gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
- gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
+ if (from_mode != TYPE_MODE (type1))
+ {
+ location_t loc = gimple_location (stmt);
+ tree tmp1, tmp2;
+
+ tmp1 = create_tmp_var (
+ build_nonstandard_integer_type (
+ GET_MODE_PRECISION (from_mode), TYPE_UNSIGNED (type1)),
+ NULL);
+ tmp2 = TYPE_UNSIGNED (type1) == TYPE_UNSIGNED (type2)
+ ? tmp1
+ : create_tmp_var (
+ build_nonstandard_integer_type (
+ GET_MODE_PRECISION (from_mode), TYPE_UNSIGNED (type1)),
+ NULL);
+
+ rhs1 = build_and_insert_cast (gsi, loc, tmp1, rhs1);
+ rhs2 = build_and_insert_cast (gsi, loc, tmp2, rhs2);
+ }
+
+ gimple_assign_set_rhs1 (stmt, rhs1);
+ gimple_assign_set_rhs2 (stmt, rhs2);
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
update_stmt (stmt);
widen_mul_stats.widen_mults_inserted++;
@@ -2105,6 +2140,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
optab this_optab;
enum tree_code wmult_code;
+ enum insn_code handler;
+ enum machine_mode from_mode;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2138,36 +2175,27 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
return false;
- if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
+ /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
+ is_widening_mult_p, but we still need the rhs returns.
+
+ It might also appear that it would be sufficient to use the existing
+ operands of the widening multiply, but that would limit the choice of
+ multiply-and-accumulate instructions. */
+ if (code == PLUS_EXPR
+ && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
{
if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
}
- else if (rhs2_code == MULT_EXPR)
+ else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
}
- else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR)
- {
- mult_rhs1 = gimple_assign_rhs1 (rhs1_stmt);
- mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt);
- type1 = TREE_TYPE (mult_rhs1);
- type2 = TREE_TYPE (mult_rhs2);
- add_rhs = rhs2;
- }
- else if (rhs2_code == WIDEN_MULT_EXPR)
- {
- mult_rhs1 = gimple_assign_rhs1 (rhs2_stmt);
- mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt);
- type1 = TREE_TYPE (mult_rhs1);
- type2 = TREE_TYPE (mult_rhs2);
- add_rhs = rhs1;
- }
else
return false;
@@ -2178,15 +2206,29 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
- == CODE_FOR_nothing)
+ handler = find_widening_optab_handler_and_mode (this_optab,
+ TYPE_MODE (type),
+ TYPE_MODE (type1), 0,
+ &from_mode);
+
+ if (handler == CODE_FOR_nothing)
return false;
- /* ??? May need some type verification here? */
+ if (TYPE_MODE (type1) != from_mode)
+ {
+ location_t loc = gimple_location (stmt);
+ tree tmp;
+
+ tmp = create_tmp_var (
+ build_nonstandard_integer_type (
+ GET_MODE_PRECISION (from_mode), TYPE_UNSIGNED (type1)),
+ NULL);
+
+ mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
+ mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
+ }
- gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code,
- fold_convert (type1, mult_rhs1),
- fold_convert (type2, mult_rhs2),
+ gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));
widen_mul_stats.maccs_inserted++;
@@ -2398,7 +2440,7 @@ execute_optimize_widening_mul (void)
switch (code)
{
case MULT_EXPR:
- if (!convert_mult_to_widen (stmt)
+ if (!convert_mult_to_widen (stmt, &gsi)
&& convert_mult_to_fma (stmt,
gimple_assign_rhs1 (stmt),
gimple_assign_rhs2 (stmt)))
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-12 10:15 ` Andrew Stubbs
@ 2011-07-12 11:05 ` Richard Guenther
2011-07-12 11:14 ` Richard Guenther
2011-07-14 14:17 ` Andrew Stubbs
0 siblings, 2 replies; 107+ messages in thread
From: Richard Guenther @ 2011-07-12 11:05 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Tue, Jul 12, 2011 at 11:50 AM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 23/06/11 15:39, Andrew Stubbs wrote:
>>
>> This patch has two effects:
>>
>> 1. It permits the use of widening multiply instructions that widen by
>> more than one mode. E.g. HImode -> DImode.
>>
>> 2. It enables the use of widening multiply instructions for (extended)
>> inputs of narrower mode than the instruction takes. E.g. QImode ->
>> DImode where only HI->DI or SI->DI is available.
>>
>> Hopefully, most of the patch is self-explanatory, but here are a few notes:
>>
>> The code introduces a temporary FIXME comment; this will be removed
>> later in the patch series. In fact, this is not a new restriction;
>> previously "type1" and "type2" were implicitly identical because they
>> were required to be one mode smaller than "type".
>>
>> I regard the ARM portion of this patch as obvious, so I don't think I
>> need an ARM maintainer to read this.
>>
>> Is the patch OK?
>
> I found a bug in this patch. It seems I do need to add casts for the inputs
> to widening multiplies (even though I know the registers are already fine),
> because otherwise something is insisting on truncating the values to the
> minimum width, which isn't helpful when it's actually an instruction with
> wider inputs.
>
> The mode changing bits from patch 4 have therefore been moved here. I've
> made the changes Richard Guenther requested there, I think.
>
> Otherwise, the patch is the same as before.
I wonder if we want to restrict the WIDEN_* operations to operate
on types that have matching type/mode precision(**). Consider
struct {
int a : 7;
int b : 7;
} x;
short c = x.a * x.b;
which will be represented as (short)((int)<7-bit-type-with-QImode> *
(int)<7-bit-type-with-QImode>).
I wonder if you can do some experiments with bitfield types and see
if your patch series handles them correctly.
As for the patch, please update tree.def with the new requirements
for the WIDEN_* codes.
As for the bitfield precisions, we probably want to reject types that
do not have TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE
(type)). Or maybe we can allow them if we generate
correct and good code for them?
+ tmp2 = TYPE_UNSIGNED (type1) == TYPE_UNSIGNED (type2)
+ ? tmp1
+ : create_tmp_var (
+ build_nonstandard_integer_type (
+ GET_MODE_PRECISION (from_mode), TYPE_UNSIGNED (type1)),
+ NULL);
please use an if () stmt to avoid gross formatting.
+ if (TYPE_MODE (type1) != from_mode)
these kind of checks are unsafe if type1 does not have a TYPE_PRECISION
equal to its mode precision.
Thanks,
Richard.
> Andrew
>
>
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-12 11:05 ` Richard Guenther
@ 2011-07-12 11:14 ` Richard Guenther
2011-07-12 11:38 ` Andrew Stubbs
2011-07-21 19:51 ` Joseph S. Myers
2011-07-14 14:17 ` Andrew Stubbs
1 sibling, 2 replies; 107+ messages in thread
From: Richard Guenther @ 2011-07-12 11:14 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Tue, Jul 12, 2011 at 1:04 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 12, 2011 at 11:50 AM, Andrew Stubbs <ams@codesourcery.com> wrote:
>> On 23/06/11 15:39, Andrew Stubbs wrote:
>>>
>>> This patch has two effects:
>>>
>>> 1. It permits the use of widening multiply instructions that widen by
>>> more than one mode. E.g. HImode -> DImode.
>>>
>>> 2. It enables the use of widening multiply instructions for (extended)
>>> inputs of narrower mode than the instruction takes. E.g. QImode ->
>>> DImode where only HI->DI or SI->DI is available.
>>>
>>> Hopefully, most of the patch is self-explanatory, but here are a few notes:
>>>
>>> The code introduces a temporary FIXME comment; this will be removed
>>> later in the patch series. In fact, this is not a new restriction;
>>> previously "type1" and "type2" were implicitly identical because they
>>> were required to be one mode smaller than "type".
>>>
>>> I regard the ARM portion of this patch as obvious, so I don't think I
>>> need an ARM maintainer to read this.
>>>
>>> Is the patch OK?
>>
>> I found a bug in this patch. It seems I do need to add casts for the inputs
>> to widening multiplies (even though I know the registers are already fine),
>> because otherwise something is insisting on truncating the values to the
>> minimum width, which isn't helpful when it's actually an instruction with
>> wider inputs.
>>
>> The mode changing bits from patch 4 have therefore been moved here. I've
>> made the changes Richard Guenther requested there, I think.
>>
>> Otherwise, the patch is the same as before.
>
> I wonder if we want to restrict the WIDEN_* operations to operate
> on types that have matching type/mode precision(**). Consider
>
> struct {
> int a : 7;
> int b : 7;
> } x;
>
> short c = x.a * x.b;
>
> which will be represented as (short)((int)<7-bit-type-with-QImode> *
> (int)<7-bit-type-with-QImode>).
>
> I wonder if you can do some experiments with bitfield types and see
> if your patch series handles them correctly.
>
> As for the patch, please update tree.def with the new requirements
> for the WIDEN_* codes.
>
> As for the bitfield precisions, we probably want to reject types that
> do not have TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE
> (type)). Or maybe we can allow them if we generate
> correct and good code for them?
>
> + tmp2 = TYPE_UNSIGNED (type1) == TYPE_UNSIGNED (type2)
> + ? tmp1
> + : create_tmp_var (
> + build_nonstandard_integer_type (
> + GET_MODE_PRECISION (from_mode), TYPE_UNSIGNED (type1)),
> + NULL);
>
> please use an if () stmt to avoid gross formatting.
>
> + if (TYPE_MODE (type1) != from_mode)
>
> these kind of checks are unsafe if type1 does not have a TYPE_PRECISION
> equal to its mode precision.
(**) We really ought to forbid any arithmetic on types that have non-mode
precision and only allow conversions to/from such types.
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-12 11:14 ` Richard Guenther
@ 2011-07-12 11:38 ` Andrew Stubbs
2011-07-12 11:51 ` Richard Guenther
2011-07-21 19:51 ` Joseph S. Myers
1 sibling, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-12 11:38 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
On 12/07/11 12:05, Richard Guenther wrote:
> (**) We really ought to forbid any arithmetic on types that have non-mode
> precision and only allow conversions to/from such types.
Hmmm, presumably the problem is that we might have a compatible
precision, but the backends actually work with purely mode-sized types?
That does sound problematic. :(
Does the recent bitfield lowering activity have any effect on this? I.e.
does it make it a moot point by the time we get to the widen_mult pass?
Andrew
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-12 11:38 ` Andrew Stubbs
@ 2011-07-12 11:51 ` Richard Guenther
0 siblings, 0 replies; 107+ messages in thread
From: Richard Guenther @ 2011-07-12 11:51 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Tue, Jul 12, 2011 at 1:26 PM, Andrew Stubbs <andrew.stubbs@gmail.com> wrote:
> On 12/07/11 12:05, Richard Guenther wrote:
>>
>> (**) We really ought to forbid any arithmetic on types that have non-mode
>> precision and only allow conversions to/from such types.
>
> Hmmm, presumably the problem is that we might have a compatible precision,
> but the backends actually work with purely mode-sized types?
>
> That does sound problematic. :(
>
> Does the recent bitfield lowering activity have any effect on this? I.e.
> does it make it a moot point by the time we get to the widen_mult pass?
No, the bitfield lowering will only change the types of memory loads,
not the types of the quantities we eventually see in the IL. Thus for
my example we'd still see the casts from 7-bit types.
Richard.
> Andrew
>
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-12 11:14 ` Richard Guenther
2011-07-12 11:38 ` Andrew Stubbs
@ 2011-07-21 19:51 ` Joseph S. Myers
2011-07-22 8:58 ` Andrew Stubbs
1 sibling, 1 reply; 107+ messages in thread
From: Joseph S. Myers @ 2011-07-21 19:51 UTC (permalink / raw)
To: Richard Guenther; +Cc: Andrew Stubbs, gcc-patches, patches
On Tue, 12 Jul 2011, Richard Guenther wrote:
> (**) We really ought to forbid any arithmetic on types that have non-mode
> precision and only allow conversions to/from such types.
Arithmetic on such types is a perfectly reasonable notion to have in
language-independent code and carry out language-independent optimizations
on. There may well be a case for lowering such arithmetic earlier than
the present point at which it's lowered (expand), but it isn't obvious
that gimplification is the right point for that lowering either.
--
Joseph S. Myers
joseph@codesourcery.com
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-21 19:51 ` Joseph S. Myers
@ 2011-07-22 8:58 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-22 8:58 UTC (permalink / raw)
To: Joseph S. Myers; +Cc: Richard Guenther, gcc-patches, patches
On 21/07/11 20:29, Joseph S. Myers wrote:
> On Tue, 12 Jul 2011, Richard Guenther wrote:
>
>> (**) We really ought to forbid any arithmetic on types that have non-mode
>> precision and only allow conversions to/from such types.
>
> Arithmetic on such types is a perfectly reasonable notion to have in
> language-independent code and carry out language-independent optimizations
> on. There may well be a case for lowering such arithmetic earlier than
> the present point at which it's lowered (expand), but it isn't obvious
> that gimplification is the right point for that lowering either.
This optimization deals with real machine instructions, and so the
inputs must always be in whole-mode sizes. With my patch, this pass
inserts conversions to ensure this is the case.
However, the code takes the true precision of each input into account
when selecting the optimal machine instruction to use, so I think it
should satisfy both goals.
Andrew
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-12 11:05 ` Richard Guenther
2011-07-12 11:14 ` Richard Guenther
@ 2011-07-14 14:17 ` Andrew Stubbs
2011-07-14 14:24 ` Richard Guenther
1 sibling, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-14 14:17 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 943 bytes --]
On 12/07/11 12:04, Richard Guenther wrote:
> I wonder if we want to restrict the WIDEN_* operations to operate
> on types that have matching type/mode precision(**).
I've now modified the patch to allow bitfields, or other cases where the
precision is smaller than the mode-size. I've also addressed the
formatting issues you pointed out (and in fact reorganised the code
slightly to make the rest of the series a bit cleaner).
As in the previous version of this patch, it's necessary to convert the
input values to the proper mode for the machine instruction, so the
basic tools for supporting the bitfields were already there - I just had
to tweak the conditionals to take bitfields into account.
The only thing I haven't done is modify tree.def. Looking at it though,
I don't think anything needs changing? The code is still valid, and the
comments are correct (in fact, they may have been wrong before).
Is this version OK?
Andrew
[-- Attachment #2: widening-multiplies-2.patch --]
[-- Type: text/x-patch, Size: 17355 bytes --]
2011-07-14 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/arm/arm.md (maddhidi4): Remove '*' from name.
* expr.c (expand_expr_real_2): Use find_widening_optab_handler.
* optabs.c (find_widening_optab_handler_and_mode): New function.
(expand_widen_pattern_expr): Use find_widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (find_widening_optab_handler): New macro define.
(find_widening_optab_handler_and_mode): New prototype.
* tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
type precision rules.
(verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
* tree-ssa-math-opts.c (build_and_insert_cast): New function.
(is_widening_mult_rhs_p): Allow widening by more than one mode.
Explicitly disallow mis-matched input types.
(convert_mult_to_widen): Use find_widening_optab_handler, and cast
input types to fit the new handler.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-bitfield-1.c: New file.
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1857,7 +1857,7 @@
(set_attr "predicable" "yes")]
)
-(define_insn "*maddhidi4"
+(define_insn "maddhidi4"
[(set (match_operand:DI 0 "s_register_operand" "=r")
(plus:DI
(mult:DI (sign_extend:DI
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7638,19 +7638,16 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
{
enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0));
this_optab = usmul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode))
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
+ != CODE_FOR_nothing)
{
- if (widening_optab_handler (this_optab, mode, innermode)
- != CODE_FOR_nothing)
- {
- if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
- expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
- EXPAND_NORMAL);
- else
- expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
- EXPAND_NORMAL);
- goto binop3;
- }
+ if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
+ expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
+ EXPAND_NORMAL);
+ else
+ expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
+ EXPAND_NORMAL);
+ goto binop3;
}
}
/* Check for a multiplication with matching signedness. */
@@ -7665,10 +7662,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab;
this_optab = zextend_p ? umul_widen_optab : smul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode)
- && TREE_CODE (treeop0) != INTEGER_CST)
+ if (TREE_CODE (treeop0) != INTEGER_CST)
{
- if (widening_optab_handler (this_optab, mode, innermode)
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
!= CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -7677,7 +7673,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (widening_optab_handler (other_optab, mode, innermode)
+ if (find_widening_optab_handler (other_optab, mode, innermode, 0)
!= CODE_FOR_nothing
&& innermode == word_mode)
{
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -225,6 +225,37 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
return 1;
}
\f
+/* Find a widening optab even if it doesn't widen as much as we want.
+ E.g. if from_mode is HImode, and to_mode is DImode, and there is no
+ direct HI->SI insn, then return SI->DI, if that exists.
+ If PERMIT_NON_WIDENING is non-zero then this can be used with
+ non-widening optabs also. */
+
+enum insn_code
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening,
+ enum machine_mode *found_mode)
+{
+ for (; (permit_non_widening || from_mode != to_mode)
+ && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
+ && from_mode != VOIDmode;
+ from_mode = GET_MODE_WIDER_MODE (from_mode))
+ {
+ enum insn_code handler = widening_optab_handler (op, to_mode,
+ from_mode);
+
+ if (handler != CODE_FOR_nothing)
+ {
+ if (found_mode)
+ *found_mode = from_mode;
+ return handler;
+ }
+ }
+
+ return CODE_FOR_nothing;
+}
+\f
/* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP
says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need
not actually do a sign-extend or zero-extend, but can leave the
@@ -515,8 +546,9 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = widening_optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
+ icode = find_widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)),
+ tmode0, 0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1243,7 +1275,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx last)
{
enum machine_mode from_mode = GET_MODE (op0);
- enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
+ enum insn_code icode = find_widening_optab_handler (binoptab, mode,
+ from_mode, 1);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1390,7 +1423,7 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && widening_optab_handler (binoptab, mode, GET_MODE (op0))
+ && find_widening_optab_handler (binoptab, mode, GET_MODE (op0), 1)
!= CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
@@ -1464,10 +1497,11 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
!= CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (widening_optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode),
- mode)
+ && (find_widening_optab_handler ((unsignedp
+ ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode, 0)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -2002,7 +2036,7 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (widening_optab_handler (binoptab, wider_mode, mode)
+ if (find_widening_optab_handler (binoptab, wider_mode, mode, 1)
!= CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -807,6 +807,15 @@ extern rtx expand_copysign (rtx, rtx, rtx);
extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
+/* Find a widening optab even if it doesn't widen as much as we want. */
+#define find_widening_optab_handler(A,B,C,D) \
+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+ enum machine_mode,
+ enum machine_mode,
+ int,
+ enum machine_mode *);
+
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
shift amount vs. machines that take a vector for the shift amount. */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+struct bf
+{
+ int a : 3;
+ int b : 15;
+ int c : 3;
+};
+
+long long
+foo (long long a, struct bf b, struct bf c)
+{
+ return a + b.b * c.b;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3577,7 +3577,7 @@ do_pointer_plus_expr_check:
case WIDEN_MULT_EXPR:
if (TREE_CODE (lhs_type) != INTEGER_TYPE)
return true;
- return ((2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type))
+ return ((2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type))
|| (TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type)));
case WIDEN_SUM_EXPR:
@@ -3668,7 +3668,7 @@ verify_gimple_assign_ternary (gimple stmt)
&& !FIXED_POINT_TYPE_P (rhs1_type))
|| !useless_type_conversion_p (rhs1_type, rhs2_type)
|| !useless_type_conversion_p (lhs_type, rhs3_type)
- || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)
+ || 2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type)
|| TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))
{
error ("type mismatch in widening multiply-accumulate expression");
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1086,6 +1086,16 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type,
return result;
}
+/* Build a gimple assignment to cast VAL to TARGET. Insert the statement
+ prior to GSI's current position, and return the fresh SSA name. */
+
+static tree
+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
+ tree target, tree val)
+{
+ return build_and_insert_binop (gsi, loc, target, CONVERT_EXPR, val, NULL);
+}
+
/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC. If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -1958,8 +1968,8 @@ struct gimple_opt_pass pass_optimize_bswap =
/* Return true if RHS is a suitable operand for a widening multiplication.
There are two cases:
- - RHS makes some value twice as wide. Store that value in *NEW_RHS_OUT
- if so, and store its type in *TYPE_OUT.
+ - RHS makes some value at least twice as wide. Store that value
+ in *NEW_RHS_OUT if so, and store its type in *TYPE_OUT.
- RHS is an integer constant. Store that value in *NEW_RHS_OUT if so,
but leave *TYPE_OUT untouched. */
@@ -1987,7 +1997,7 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
- || TYPE_PRECISION (type1) * 2 != TYPE_PRECISION (type))
+ || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
*new_rhs_out = rhs1;
@@ -2043,6 +2053,10 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
+ /* FIXME: remove this restriction. */
+ if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
+ return false;
+
return true;
}
@@ -2051,12 +2065,14 @@ is_widening_mult_p (gimple stmt,
value is true iff we converted the statement. */
static bool
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
- tree lhs, rhs1, rhs2, type, type1, type2;
+ tree lhs, rhs1, rhs2, type, type1, type2, tmp;
enum insn_code handler;
- enum machine_mode to_mode, from_mode;
+ enum machine_mode to_mode, from_mode, actual_mode;
optab op;
+ int actual_precision;
+ location_t loc = gimple_location (stmt);
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2076,13 +2092,32 @@ convert_mult_to_widen (gimple stmt)
else
op = usmul_widen_optab;
- handler = widening_optab_handler (op, to_mode, from_mode);
+ handler = find_widening_optab_handler_and_mode (op, to_mode, from_mode,
+ 0, &actual_mode);
if (handler == CODE_FOR_nothing)
return false;
- gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
- gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
+ /* Ensure that the inputs to the handler are in the correct precision
+ for the opcode. This will be the full mode size. */
+ actual_precision = GET_MODE_PRECISION (actual_mode);
+ if (actual_precision != TYPE_PRECISION (type1))
+ {
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, TYPE_UNSIGNED (type1)),
+ NULL);
+ rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
+
+ /* Reuse the same type info, if possible. */
+ if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, TYPE_UNSIGNED (type2)),
+ NULL);
+ rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
+ }
+
+ gimple_assign_set_rhs1 (stmt, rhs1);
+ gimple_assign_set_rhs2 (stmt, rhs2);
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
update_stmt (stmt);
widen_mul_stats.widen_mults_inserted++;
@@ -2100,11 +2135,15 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code code)
{
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
- tree type, type1, type2;
+ tree type, type1, type2, tmp;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
optab this_optab;
enum tree_code wmult_code;
+ enum insn_code handler;
+ enum machine_mode to_mode, from_mode, actual_mode;
+ location_t loc = gimple_location (stmt);
+ int actual_precision;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2138,39 +2177,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
return false;
- if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
+ /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
+ is_widening_mult_p, but we still need the rhs returns.
+
+ It might also appear that it would be sufficient to use the existing
+ operands of the widening multiply, but that would limit the choice of
+ multiply-and-accumulate instructions. */
+ if (code == PLUS_EXPR
+ && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
{
if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
}
- else if (rhs2_code == MULT_EXPR)
+ else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
}
- else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR)
- {
- mult_rhs1 = gimple_assign_rhs1 (rhs1_stmt);
- mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt);
- type1 = TREE_TYPE (mult_rhs1);
- type2 = TREE_TYPE (mult_rhs2);
- add_rhs = rhs2;
- }
- else if (rhs2_code == WIDEN_MULT_EXPR)
- {
- mult_rhs1 = gimple_assign_rhs1 (rhs2_stmt);
- mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt);
- type1 = TREE_TYPE (mult_rhs1);
- type2 = TREE_TYPE (mult_rhs2);
- add_rhs = rhs1;
- }
else
return false;
+ to_mode = TYPE_MODE (type);
+ from_mode = TYPE_MODE (type1);
+
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
return false;
@@ -2178,15 +2211,26 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
- == CODE_FOR_nothing)
+ handler = find_widening_optab_handler_and_mode (this_optab, to_mode,
+ from_mode, 0, &actual_mode);
+
+ if (handler == CODE_FOR_nothing)
return false;
- /* ??? May need some type verification here? */
+ /* Ensure that the inputs to the handler are in the correct precision
+ for the opcode. This will be the full mode size. */
+ actual_precision = GET_MODE_PRECISION (actual_mode);
+ if (actual_precision != TYPE_PRECISION (type1))
+ {
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, TYPE_UNSIGNED (type1)),
+ NULL);
+
+ mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
+ mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
+ }
- gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code,
- fold_convert (type1, mult_rhs1),
- fold_convert (type2, mult_rhs2),
+ gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));
widen_mul_stats.maccs_inserted++;
@@ -2398,7 +2442,7 @@ execute_optimize_widening_mul (void)
switch (code)
{
case MULT_EXPR:
- if (!convert_mult_to_widen (stmt)
+ if (!convert_mult_to_widen (stmt, &gsi)
&& convert_mult_to_fma (stmt,
gimple_assign_rhs1 (stmt),
gimple_assign_rhs2 (stmt)))
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-14 14:17 ` Andrew Stubbs
@ 2011-07-14 14:24 ` Richard Guenther
2011-08-19 14:45 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-14 14:24 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jul 14, 2011 at 4:10 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 12/07/11 12:04, Richard Guenther wrote:
>>
>> I wonder if we want to restrict the WIDEN_* operations to operate
>> on types that have matching type/mode precision(**).
>
> I've now modified the patch to allow bitfields, or other case where the
> precision is smaller than the mode-size. I've also addressed the formatting
> issues you pointed out (and in fact reorganised the code slightly to make
> the rest of the series a bit cleaner).
>
> As in the previous version of this patch, it's necessary to convert the
> input values to the proper mode for the machine instruction, so the basic
> tools for supporting the bitfields were already there - I just had to tweak
> the conditionals to take bitfields into account.
>
> The only thing I haven't done is modify tree.def. Looking at it though, I
> don't think anything needs changing? The code is still valid, and the comments
> are correct (in fact, they may have been wrong before).
Ah, it indeed talks about at least twice the precision already.
> Is this version OK?
Ok.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (2/7)] Widening multiplies by more than one mode
2011-07-14 14:24 ` Richard Guenther
@ 2011-08-19 14:45 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 14:45 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 285 bytes --]
On 14/07/11 15:15, Richard Guenther wrote:
>> Is this version OK?
> Ok.
I've just committed this slightly updated patch.
I found some bugs while testing, now fixed. Most of the changes in this
patch are context changes, and using widened_mode to handle VOIDmode
constants.
Andrew
[-- Attachment #2: widening-multiplies-2.patch --]
[-- Type: text/x-patch, Size: 17544 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/arm/arm.md (maddhidi4): Remove '*' from name.
* expr.c (expand_expr_real_2): Use find_widening_optab_handler.
* optabs.c (find_widening_optab_handler_and_mode): New function.
(expand_widen_pattern_expr): Use find_widening_optab_handler.
(expand_binop_directly): Likewise.
(expand_binop): Likewise.
* optabs.h (find_widening_optab_handler): New macro define.
(find_widening_optab_handler_and_mode): New prototype.
* tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
type precision rules.
(verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
* tree-ssa-math-opts.c (build_and_insert_cast): New function.
(is_widening_mult_rhs_p): Allow widening by more than one mode.
Explicitly disallow mis-matched input types.
(convert_mult_to_widen): Use find_widening_optab_handler, and cast
input types to fit the new handler.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-bitfield-1.c: New file.
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1857,7 +1857,7 @@
(set_attr "predicable" "yes")]
)
-(define_insn "*maddhidi4"
+(define_insn "maddhidi4"
[(set (match_operand:DI 0 "s_register_operand" "=r")
(plus:DI
(mult:DI (sign_extend:DI
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8003,19 +8003,16 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
{
enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0));
this_optab = usmul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode))
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
+ != CODE_FOR_nothing)
{
- if (widening_optab_handler (this_optab, mode, innermode)
- != CODE_FOR_nothing)
- {
- if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
- expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
- EXPAND_NORMAL);
- else
- expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
- EXPAND_NORMAL);
- goto binop3;
- }
+ if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
+ expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
+ EXPAND_NORMAL);
+ else
+ expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
+ EXPAND_NORMAL);
+ goto binop3;
}
}
/* Check for a multiplication with matching signedness. */
@@ -8030,10 +8027,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab;
this_optab = zextend_p ? umul_widen_optab : smul_widen_optab;
- if (mode == GET_MODE_2XWIDER_MODE (innermode)
- && TREE_CODE (treeop0) != INTEGER_CST)
+ if (TREE_CODE (treeop0) != INTEGER_CST)
{
- if (widening_optab_handler (this_optab, mode, innermode)
+ if (find_widening_optab_handler (this_optab, mode, innermode, 0)
!= CODE_FOR_nothing)
{
expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -8042,7 +8038,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
unsignedp, this_optab);
return REDUCE_BIT_FIELD (temp);
}
- if (widening_optab_handler (other_optab, mode, innermode)
+ if (find_widening_optab_handler (other_optab, mode, innermode, 0)
!= CODE_FOR_nothing
&& innermode == word_mode)
{
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -249,6 +249,37 @@ widened_mode (enum machine_mode to_mode, rtx op0, rtx op1)
return result;
}
\f
+/* Find a widening optab even if it doesn't widen as much as we want.
+ E.g. if from_mode is HImode, and to_mode is DImode, and there is no
+ direct HI->SI insn, then return SI->DI, if that exists.
+ If PERMIT_NON_WIDENING is non-zero then this can be used with
+ non-widening optabs also. */
+
+enum insn_code
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening,
+ enum machine_mode *found_mode)
+{
+ for (; (permit_non_widening || from_mode != to_mode)
+ && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
+ && from_mode != VOIDmode;
+ from_mode = GET_MODE_WIDER_MODE (from_mode))
+ {
+ enum insn_code handler = widening_optab_handler (op, to_mode,
+ from_mode);
+
+ if (handler != CODE_FOR_nothing)
+ {
+ if (found_mode)
+ *found_mode = from_mode;
+ return handler;
+ }
+ }
+
+ return CODE_FOR_nothing;
+}
+\f
/* Widen OP to MODE and return the rtx for the widened operand. UNSIGNEDP
says whether OP is signed or unsigned. NO_EXTEND is nonzero if we need
not actually do a sign-extend or zero-extend, but can leave the
@@ -539,8 +570,9 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
if (ops->code == WIDEN_MULT_PLUS_EXPR
|| ops->code == WIDEN_MULT_MINUS_EXPR)
- icode = widening_optab_handler (widen_pattern_optab,
- TYPE_MODE (TREE_TYPE (ops->op2)), tmode0);
+ icode = find_widening_optab_handler (widen_pattern_optab,
+ TYPE_MODE (TREE_TYPE (ops->op2)),
+ tmode0, 0);
else
icode = optab_handler (widen_pattern_optab, tmode0);
gcc_assert (icode != CODE_FOR_nothing);
@@ -1267,7 +1299,8 @@ expand_binop_directly (enum machine_mode mode, optab binoptab,
rtx last)
{
enum machine_mode from_mode = widened_mode (mode, op0, op1);
- enum insn_code icode = widening_optab_handler (binoptab, mode, from_mode);
+ enum insn_code icode = find_widening_optab_handler (binoptab, mode,
+ from_mode, 1);
enum machine_mode xmode0 = insn_data[(int) icode].operand[1].mode;
enum machine_mode xmode1 = insn_data[(int) icode].operand[2].mode;
enum machine_mode mode0, mode1, tmp_mode;
@@ -1414,8 +1447,8 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
/* If we can do it with a three-operand insn, do so. */
if (methods != OPTAB_MUST_WIDEN
- && widening_optab_handler (binoptab, mode,
- widened_mode (mode, op0, op1))
+ && find_widening_optab_handler (binoptab, mode,
+ widened_mode (mode, op0, op1), 1)
!= CODE_FOR_nothing)
{
temp = expand_binop_directly (mode, binoptab, op0, op1, target,
@@ -1488,10 +1521,11 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
|| (binoptab == smul_optab
&& GET_MODE_WIDER_MODE (wider_mode) != VOIDmode
- && (widening_optab_handler ((unsignedp ? umul_widen_optab
- : smul_widen_optab),
- GET_MODE_WIDER_MODE (wider_mode),
- mode)
+ && (find_widening_optab_handler ((unsignedp
+ ? umul_widen_optab
+ : smul_widen_optab),
+ GET_MODE_WIDER_MODE (wider_mode),
+ mode, 0)
!= CODE_FOR_nothing)))
{
rtx xop0 = op0, xop1 = op1;
@@ -2026,8 +2060,7 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
wider_mode != VOIDmode;
wider_mode = GET_MODE_WIDER_MODE (wider_mode))
{
- if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
- || widening_optab_handler (binoptab, wider_mode, mode)
+ if (find_widening_optab_handler (binoptab, wider_mode, mode, 1)
!= CODE_FOR_nothing
|| (methods == OPTAB_LIB
&& optab_libfunc (binoptab, wider_mode)))
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -807,6 +807,15 @@ extern rtx expand_copysign (rtx, rtx, rtx);
extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
+/* Find a widening optab even if it doesn't widen as much as we want. */
+#define find_widening_optab_handler(A,B,C,D) \
+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+ enum machine_mode,
+ enum machine_mode,
+ int,
+ enum machine_mode *);
+
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
shift amount vs. machines that take a vector for the shift amount. */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-bitfield-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+struct bf
+{
+ int a : 3;
+ int b : 15;
+ int c : 3;
+};
+
+long long
+foo (long long a, struct bf b, struct bf c)
+{
+ return a + b.b * c.b;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3564,7 +3564,7 @@ do_pointer_plus_expr_check:
case WIDEN_MULT_EXPR:
if (TREE_CODE (lhs_type) != INTEGER_TYPE)
return true;
- return ((2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type))
+ return ((2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type))
|| (TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type)));
case WIDEN_SUM_EXPR:
@@ -3655,7 +3655,7 @@ verify_gimple_assign_ternary (gimple stmt)
&& !FIXED_POINT_TYPE_P (rhs1_type))
|| !useless_type_conversion_p (rhs1_type, rhs2_type)
|| !useless_type_conversion_p (lhs_type, rhs3_type)
- || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)
+ || 2 * TYPE_PRECISION (rhs1_type) > TYPE_PRECISION (lhs_type)
|| TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))
{
error ("type mismatch in widening multiply-accumulate expression");
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1086,6 +1086,16 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type,
return result;
}
+/* Build a gimple assignment to cast VAL to TARGET. Insert the statement
+ prior to GSI's current position, and return the fresh SSA name. */
+
+static tree
+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
+ tree target, tree val)
+{
+ return build_and_insert_binop (gsi, loc, target, CONVERT_EXPR, val, NULL);
+}
+
/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC. If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -1959,8 +1969,8 @@ struct gimple_opt_pass pass_optimize_bswap =
/* Return true if RHS is a suitable operand for a widening multiplication.
There are two cases:
- - RHS makes some value twice as wide. Store that value in *NEW_RHS_OUT
- if so, and store its type in *TYPE_OUT.
+ - RHS makes some value at least twice as wide. Store that value
+ in *NEW_RHS_OUT if so, and store its type in *TYPE_OUT.
- RHS is an integer constant. Store that value in *NEW_RHS_OUT if so,
but leave *TYPE_OUT untouched. */
@@ -1988,7 +1998,7 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
- || TYPE_PRECISION (type1) * 2 != TYPE_PRECISION (type))
+ || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
*new_rhs_out = rhs1;
@@ -2044,6 +2054,10 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
+ /* FIXME: remove this restriction. */
+ if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
+ return false;
+
return true;
}
@@ -2052,12 +2066,14 @@ is_widening_mult_p (gimple stmt,
value is true iff we converted the statement. */
static bool
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
- tree lhs, rhs1, rhs2, type, type1, type2;
+ tree lhs, rhs1, rhs2, type, type1, type2, tmp;
enum insn_code handler;
- enum machine_mode to_mode, from_mode;
+ enum machine_mode to_mode, from_mode, actual_mode;
optab op;
+ int actual_precision;
+ location_t loc = gimple_location (stmt);
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2077,13 +2093,32 @@ convert_mult_to_widen (gimple stmt)
else
op = usmul_widen_optab;
- handler = widening_optab_handler (op, to_mode, from_mode);
+ handler = find_widening_optab_handler_and_mode (op, to_mode, from_mode,
+ 0, &actual_mode);
if (handler == CODE_FOR_nothing)
return false;
- gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
- gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
+ /* Ensure that the inputs to the handler are in the correct precision
+ for the opcode. This will be the full mode size. */
+ actual_precision = GET_MODE_PRECISION (actual_mode);
+ if (actual_precision != TYPE_PRECISION (type1))
+ {
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, TYPE_UNSIGNED (type1)),
+ NULL);
+ rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
+
+ /* Reuse the same type info, if possible. */
+ if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, TYPE_UNSIGNED (type2)),
+ NULL);
+ rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
+ }
+
+ gimple_assign_set_rhs1 (stmt, rhs1);
+ gimple_assign_set_rhs2 (stmt, rhs2);
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
update_stmt (stmt);
widen_mul_stats.widen_mults_inserted++;
@@ -2101,11 +2136,15 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code code)
{
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
- tree type, type1, type2;
+ tree type, type1, type2, tmp;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
optab this_optab;
enum tree_code wmult_code;
+ enum insn_code handler;
+ enum machine_mode to_mode, from_mode, actual_mode;
+ location_t loc = gimple_location (stmt);
+ int actual_precision;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2139,39 +2178,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
return false;
- if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
+ /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
+ is_widening_mult_p, but we still need the rhs returns.
+
+ It might also appear that it would be sufficient to use the existing
+ operands of the widening multiply, but that would limit the choice of
+ multiply-and-accumulate instructions. */
+ if (code == PLUS_EXPR
+ && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
{
if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
}
- else if (rhs2_code == MULT_EXPR)
+ else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
}
- else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR)
- {
- mult_rhs1 = gimple_assign_rhs1 (rhs1_stmt);
- mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt);
- type1 = TREE_TYPE (mult_rhs1);
- type2 = TREE_TYPE (mult_rhs2);
- add_rhs = rhs2;
- }
- else if (rhs2_code == WIDEN_MULT_EXPR)
- {
- mult_rhs1 = gimple_assign_rhs1 (rhs2_stmt);
- mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt);
- type1 = TREE_TYPE (mult_rhs1);
- type2 = TREE_TYPE (mult_rhs2);
- add_rhs = rhs1;
- }
else
return false;
+ to_mode = TYPE_MODE (type);
+ from_mode = TYPE_MODE (type1);
+
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
return false;
@@ -2179,15 +2212,26 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
- if (widening_optab_handler (this_optab, TYPE_MODE (type), TYPE_MODE (type1))
- == CODE_FOR_nothing)
+ handler = find_widening_optab_handler_and_mode (this_optab, to_mode,
+ from_mode, 0, &actual_mode);
+
+ if (handler == CODE_FOR_nothing)
return false;
- /* ??? May need some type verification here? */
+ /* Ensure that the inputs to the handler are in the correct precision
+ for the opcode. This will be the full mode size. */
+ actual_precision = GET_MODE_PRECISION (actual_mode);
+ if (actual_precision != TYPE_PRECISION (type1))
+ {
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, TYPE_UNSIGNED (type1)),
+ NULL);
+
+ mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
+ mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
+ }
- gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code,
- fold_convert (type1, mult_rhs1),
- fold_convert (type2, mult_rhs2),
+ gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));
widen_mul_stats.maccs_inserted++;
@@ -2399,7 +2443,7 @@ execute_optimize_widening_mul (void)
switch (code)
{
case MULT_EXPR:
- if (!convert_mult_to_widen (stmt)
+ if (!convert_mult_to_widen (stmt, &gsi)
&& convert_mult_to_fma (stmt,
gimple_assign_rhs1 (stmt),
gimple_assign_rhs2 (stmt)))
* [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
2011-06-23 14:39 ` [PATCH (1/7)] New optab framework for widening multiplies Andrew Stubbs
2011-06-23 14:41 ` [PATCH (2/7)] Widening multiplies by more than one mode Andrew Stubbs
@ 2011-06-23 14:42 ` Andrew Stubbs
2011-06-23 16:28 ` Richard Guenther
2011-06-23 21:55 ` Janis Johnson
2011-06-23 14:43 ` [PATCH (4/7)] Unsigned multiplies using wider signed multiplies Andrew Stubbs
` (6 subsequent siblings)
9 siblings, 2 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-23 14:42 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 288 bytes --]
There are many cases where the widening_mult pass does not recognise
widening multiply-and-accumulate cases simply because there is a type
conversion step between the multiply and add statements.
This patch should rectify that simply by looking beyond those conversions.
OK?
Andrew
[-- Attachment #2: widening-multiplies-3.patch --]
[-- Type: text/x-patch, Size: 1978 bytes --]
2011-06-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Look for
multiply statement beyond NOP_EXPR statements.
gcc/testsuite/
* gcc.target/arm/umlal-1.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/umlal-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2114,26 +2114,39 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
wmult_code = WIDEN_MULT_PLUS_EXPR;
- rhs1 = gimple_assign_rhs1 (stmt);
- rhs2 = gimple_assign_rhs2 (stmt);
-
- if (TREE_CODE (rhs1) == SSA_NAME)
+ rhs1_stmt = stmt;
+ do
{
- rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
- if (is_gimple_assign (rhs1_stmt))
- rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+ rhs1_code = ERROR_MARK;
+ rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+
+ if (TREE_CODE (rhs1) == SSA_NAME)
+ {
+ rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+ if (is_gimple_assign (rhs1_stmt))
+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+ }
+ else
+ return false;
}
- else
- return false;
+ while (rhs1_code == NOP_EXPR);
- if (TREE_CODE (rhs2) == SSA_NAME)
+ rhs2_stmt = stmt;
+ do
{
- rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
- if (is_gimple_assign (rhs2_stmt))
- rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+ rhs2_code = ERROR_MARK;
+ rhs2 = gimple_assign_rhs2 (rhs2_stmt);
+
+ if (rhs2 && TREE_CODE (rhs2) == SSA_NAME)
+ {
+ rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+ if (is_gimple_assign (rhs2_stmt))
+ rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+ }
+ else
+ return false;
}
- else
- return false;
+ while (rhs2_code == NOP_EXPR);
if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
{
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-23 14:42 ` [PATCH (3/7)] Widening multiply-and-accumulate pattern matching Andrew Stubbs
@ 2011-06-23 16:28 ` Richard Guenther
2011-06-24 8:14 ` Andrew Stubbs
2011-06-23 21:55 ` Janis Johnson
1 sibling, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-06-23 16:28 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs <andrew.stubbs@linaro.org> wrote:
> There are many cases where the widening_mult pass does not recognise
> widening multiply-and-accumulate cases simply because there is a type
> conversion step between the multiply and add statements.
>
> This patch should rectify that simply by looking beyond those conversions.
That's surely wrong for (int)(short)int_var. You have to constrain
the conversions
you look through properly.
Richard.
> OK?
>
> Andrew
>
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-23 16:28 ` Richard Guenther
@ 2011-06-24 8:14 ` Andrew Stubbs
2011-06-24 9:31 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-24 8:14 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
On 23/06/11 17:26, Richard Guenther wrote:
> On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs<andrew.stubbs@linaro.org> wrote:
>> There are many cases where the widening_mult pass does not recognise
>> widening multiply-and-accumulate cases simply because there is a type
>> conversion step between the multiply and add statements.
>>
>> This patch should rectify that simply by looking beyond those conversions.
>
> That's surely wrong for (int)(short)int_var. You have to constrain
> the conversions
> you look through properly.
To be clear, it only skips past NOP_EXPR. Is it not the case that what
you're describing would need a CONVERT_EXPR?
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-24 8:14 ` Andrew Stubbs
@ 2011-06-24 9:31 ` Richard Guenther
2011-06-24 14:08 ` Stubbs, Andrew
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-06-24 9:31 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Fri, Jun 24, 2011 at 10:05 AM, Andrew Stubbs
<andrew.stubbs@linaro.org> wrote:
> On 23/06/11 17:26, Richard Guenther wrote:
>>
>> On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs<andrew.stubbs@linaro.org>
>> wrote:
>>>
>>> There are many cases where the widening_mult pass does not recognise
>>> widening multiply-and-accumulate cases simply because there is a type
>>> conversion step between the multiply and add statements.
>>>
>>> This patch should rectify that simply by looking beyond those
>>> conversions.
>>
>> That's surely wrong for (int)(short)int_var. You have to constrain
>> the conversions
>> you look through properly.
>
> To be clear, it only skips past NOP_EXPR. Is it not the case that what
> you're describing would need a CONVERT_EXPR?
NOP_EXPR is the same as CONVERT_EXPR.
Richard.
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-24 9:31 ` Richard Guenther
@ 2011-06-24 14:08 ` Stubbs, Andrew
2011-06-24 16:13 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Stubbs, Andrew @ 2011-06-24 14:08 UTC (permalink / raw)
To: Richard Guenther; +Cc: Andrew Stubbs, gcc-patches, patches
On 24/06/11 09:28, Richard Guenther wrote:
>> > To be clear, it only skips past NOP_EXPR. Is it not the case that what
>> > you're describing would need a CONVERT_EXPR?
> NOP_EXPR is the same as CONVERT_EXPR.
Are you sure?
I thought this was safe because the internals manual says:
NOP_EXPR
These nodes are used to represent conversions that do not require any
code-generation ....
CONVERT_EXPR
These nodes are similar to NOP_EXPRs, but are used in those
situations where code may need to be generated ....
So, I tried this example:
int
foo (int a, short b, short c)
{
int bc = b * c;
return a + (short)bc;
}
Both before and after my patch, GCC gives:
mul r2, r1, r2
sxtah r0, r0, r2
(where, SXTAH means sign-extend the third operand from HImode to SImode
and add to the second operand.)
The dump after the widening_mult pass is:
foo (int a, short int b, short int c)
{
int bc;
int D.2018;
short int D.2017;
int D.2016;
int D.2015;
int D.2014;
<bb 2>:
D.2014_2 = (int) b_1(D);
D.2015_4 = (int) c_3(D);
bc_5 = b_1(D) w* c_3(D);
D.2017_6 = (short int) bc_5;
D.2018_7 = (int) D.2017_6;
D.2016_9 = D.2018_7 + a_8(D);
return D.2016_9;
}
Where you can clearly see that the addition has not been recognised as a
multiply-and-accumulate.
When I step through convert_plusminus_to_widen, I can see that the
reason it has not matched is because "D.2017_6 = (short int) bc_5" is
encoded with a CONVERT_EXPR, just as the manual said it would be.
So, according to the manual, and my (admittedly limited) experiments,
skipping over NOP_EXPR does appear to be safe.
But you say that it isn't safe. So now I'm confused. :(
I can certainly add checks to make sure that the skipped operations
actually don't make any important changes to the value, but do I need to?
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-24 14:08 ` Stubbs, Andrew
@ 2011-06-24 16:13 ` Richard Guenther
2011-06-24 18:22 ` Stubbs, Andrew
2011-06-28 11:32 ` Andrew Stubbs
0 siblings, 2 replies; 107+ messages in thread
From: Richard Guenther @ 2011-06-24 16:13 UTC (permalink / raw)
To: Stubbs, Andrew; +Cc: Andrew Stubbs, gcc-patches, patches
On Fri, Jun 24, 2011 at 3:46 PM, Stubbs, Andrew
<Andrew_Stubbs@mentor.com> wrote:
> On 24/06/11 09:28, Richard Guenther wrote:
>>> > To be clear, it only skips past NOP_EXPR. Is it not the case that what
>>> > you're describing would need a CONVERT_EXPR?
>> NOP_EXPR is the same as CONVERT_EXPR.
>
> Are you sure?
Yes, definitely. They are synonyms of each other (an unfinished merging
process); the usual check for them is via CONVERT_EXPR_P.
> I thought this was safe because the internals manual says:
>
> NOP_EXPR
> These nodes are used to represent conversions that do not require any
> code-generation ....
>
> CONVERT_EXPR
> These nodes are similar to NOP_EXPRs, but are used in those
> situations where code may need to be generated ....
Which is wrong (sorry).
> So, I tried this example:
>
> int
> foo (int a, short b, short c)
> {
> int bc = b * c;
> return a + (short)bc;
> }
>
> Both before and after my patch, GCC gives:
>
> mul r2, r1, r2
> sxtah r0, r0, r2
>
> (where, SXTAH means sign-extend the third operand from HImode to SImode
> and add to the second operand.)
>
> The dump after the widening_mult pass is:
>
> foo (int a, short int b, short int c)
> {
> int bc;
> int D.2018;
> short int D.2017;
> int D.2016;
> int D.2015;
> int D.2014;
>
> <bb 2>:
> D.2014_2 = (int) b_1(D);
> D.2015_4 = (int) c_3(D);
> bc_5 = b_1(D) w* c_3(D);
> D.2017_6 = (short int) bc_5;
> D.2018_7 = (int) D.2017_6;
> D.2016_9 = D.2018_7 + a_8(D);
> return D.2016_9;
>
> }
>
> Where you can clearly see that the addition has not been recognised as a
> multiply-and-accumulate.
>
> When I step through convert_plusminus_to_widen, I can see that the
> reason it has not matched is because "D.2017_6 = (short int) bc_5" is
> encoded with a CONVERT_EXPR, just as the manual said it would be.
A NOP_EXPR in this place would be valid as well. The merging hasn't
been completed and at least the C frontend still generates CONVERT_EXPRs
in some cases.
> So, according to the manual, and my (admittedly limited) experiments,
> skipping over NOP_EXPR does appear to be safe.
>
> But you say that it isn't safe. So now I'm confused. :(
>
> I can certainly add checks to make sure that the skipped operations
> actually don't make any important changes to the value, but do I need to?
Yes.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-24 16:13 ` Richard Guenther
@ 2011-06-24 18:22 ` Stubbs, Andrew
2011-06-25 9:58 ` Richard Guenther
2011-06-28 11:32 ` Andrew Stubbs
1 sibling, 1 reply; 107+ messages in thread
From: Stubbs, Andrew @ 2011-06-24 18:22 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
On 24/06/11 16:47, Richard Guenther wrote:
>> > I can certainly add checks to make sure that the skipped operations
>> > actually don't make any important changes to the value, but do I need to?
> Yes.
Ok, I'll go away and do that then.
BTW, I see useless_type_conversion_p, but that's not quite what I want.
Is there an equivalent existing function to determine whether a
conversion changes the logical/arithmetic meaning of a type?
I mean, conversion to a wider mode is not "useless", but it is harmless,
whereas conversion to a narrower mode may truncate the value.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-24 18:22 ` Stubbs, Andrew
@ 2011-06-25 9:58 ` Richard Guenther
0 siblings, 0 replies; 107+ messages in thread
From: Richard Guenther @ 2011-06-25 9:58 UTC (permalink / raw)
To: Stubbs, Andrew; +Cc: gcc-patches, patches
On Fri, Jun 24, 2011 at 6:58 PM, Stubbs, Andrew
<Andrew_Stubbs@mentor.com> wrote:
> On 24/06/11 16:47, Richard Guenther wrote:
>>> > I can certainly add checks to make sure that the skipped operations
>>> > actually don't make any important changes to the value, but do I need to?
>> Yes.
>
> Ok, I'll go away and do that then.
>
> BTW, I see useless_type_conversion_p, but that's not quite what I want.
> Is there an equivalent existing function to determine whether a
> conversion changes the logical/arithmetic meaning of a type?
>
> I mean, conversion to a wider mode is not "useless", but it is harmless,
> whereas conversion to a narrower mode may truncate the value.
Well, you have to decide that for the concrete situation based on
the signedness and precision of the types involved. All such
conversions change the logical/arithmetic meaning of a type if
seen in the right context.
Richard.
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-24 16:13 ` Richard Guenther
2011-06-24 18:22 ` Stubbs, Andrew
@ 2011-06-28 11:32 ` Andrew Stubbs
2011-06-28 12:48 ` Richard Guenther
1 sibling, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-28 11:32 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 371 bytes --]
On 24/06/11 16:47, Richard Guenther wrote:
>> I can certainly add checks to make sure that the skipped operations
>> > actually don't make any important changes to the value, but do I need to?
> Yes.
OK, how about this patch?
I've added checks to make sure the value is not truncated at any point.
I've also changed the test cases to address Janis' comments.
Andrew
[-- Attachment #2: widening-multiplies-3.patch --]
[-- Type: text/x-patch, Size: 5383 bytes --]
2011-06-28 Andrew Stubbs <ams@codesourcery.com>
gcc/
* gimple.h (tree_ssa_harmless_type_conversion): New prototype.
(tree_ssa_strip_harmless_type_conversions): New prototype.
(harmless_type_conversion_p): New prototype.
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Look for
multiply statement beyond no-op conversion statements.
* tree-ssa.c (harmless_type_conversion_p): New function.
(tree_ssa_harmless_type_conversion): New function.
(tree_ssa_strip_harmless_type_conversions): New function.
gcc/testsuite/
* gcc.target/arm/wmul-5.c: New file.
* gcc.target/arm/no-wmla-1.c: New file.
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1090,8 +1090,11 @@ extern bool validate_gimple_arglist (const_gimple, ...);
/* In tree-ssa.c */
extern bool tree_ssa_useless_type_conversion (tree);
+extern bool tree_ssa_harmless_type_conversion (tree);
extern tree tree_ssa_strip_useless_type_conversions (tree);
+extern tree tree_ssa_strip_harmless_type_conversions (tree);
extern bool useless_type_conversion_p (tree, tree);
+extern bool harmless_type_conversion_p (tree, tree);
extern bool types_compatible_p (tree, tree);
/* Return the code for GIMPLE statement G. */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+ return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2117,23 +2117,19 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
rhs1 = gimple_assign_rhs1 (stmt);
rhs2 = gimple_assign_rhs2 (stmt);
- if (TREE_CODE (rhs1) == SSA_NAME)
- {
- rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
- if (is_gimple_assign (rhs1_stmt))
- rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
- }
- else
+ if (TREE_CODE (rhs1) != SSA_NAME
+ || TREE_CODE (rhs2) != SSA_NAME)
return false;
- if (TREE_CODE (rhs2) == SSA_NAME)
- {
- rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
- if (is_gimple_assign (rhs2_stmt))
- rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
- }
- else
- return false;
+ rhs1 = tree_ssa_strip_harmless_type_conversions (rhs1);
+ rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+ if (is_gimple_assign (rhs1_stmt))
+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+
+ rhs2 = tree_ssa_strip_harmless_type_conversions (rhs2);
+ rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+ if (is_gimple_assign (rhs2_stmt))
+ rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
{
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1484,6 +1484,33 @@ useless_type_conversion_p (tree outer_type, tree inner_type)
return false;
}
+/* Return true if the conversion from INNER_TYPE to OUTER_TYPE will
+ not alter the arithmetic meaning of a type, otherwise return false.
+
+ For example, widening an integer type leaves the value unchanged,
+ but narrowing an integer type can cause truncation.
+
+ Note that switching between signed and unsigned modes doesn't change
+ the underlying representation, and so is harmless.
+
+ This function is not yet a complete definition of what is harmless
+ but should reject everything that is not. */
+
+bool
+harmless_type_conversion_p (tree outer_type, tree inner_type)
+{
+ /* If it's useless, it's also harmless. */
+ if (useless_type_conversion_p (outer_type, inner_type))
+ return true;
+
+ if (INTEGRAL_TYPE_P (inner_type)
+ && INTEGRAL_TYPE_P (outer_type)
+ && TYPE_PRECISION (inner_type) <= TYPE_PRECISION (outer_type))
+ return true;
+
+ return false;
+}
+
/* Return true if a conversion from either type of TYPE1 and TYPE2
to the other is not required. Otherwise return false. */
@@ -1515,6 +1542,29 @@ tree_ssa_useless_type_conversion (tree expr)
return false;
}
+/* Return true if EXPR is a harmless type conversion, otherwise return
+ false. */
+
+bool
+tree_ssa_harmless_type_conversion (tree expr)
+{
+ gimple stmt;
+
+ if (TREE_CODE (expr) != SSA_NAME)
+ return false;
+
+ stmt = SSA_NAME_DEF_STMT (expr);
+
+ if (!is_gimple_assign (stmt))
+ return false;
+
+ if (!CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt)))
+ return false;
+
+ return harmless_type_conversion_p (TREE_TYPE (gimple_assign_lhs (stmt)),
+ TREE_TYPE (gimple_assign_rhs1 (stmt)));
+}
+
/* Strip conversions from EXP according to
tree_ssa_useless_type_conversion and return the resulting
expression. */
@@ -1527,6 +1577,18 @@ tree_ssa_strip_useless_type_conversions (tree exp)
return exp;
}
+/* Strip conversions from EXP according to
+ tree_ssa_harmless_type_conversion and return the resulting
+ expression. */
+
+tree
+tree_ssa_strip_harmless_type_conversions (tree exp)
+{
+ while (tree_ssa_harmless_type_conversion (exp))
+ exp = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (exp));
+ return exp;
+}
+
/* Internal helper for walk_use_def_chains. VAR, FN and DATA are as
described in walk_use_def_chains.
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-28 11:32 ` Andrew Stubbs
@ 2011-06-28 12:48 ` Richard Guenther
2011-06-28 16:37 ` Michael Matz
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-06-28 12:48 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Tue, Jun 28, 2011 at 12:47 PM, Andrew Stubbs <andrew.stubbs@gmail.com> wrote:
> On 24/06/11 16:47, Richard Guenther wrote:
>>>
>>> I can certainly add checks to make sure that the skipped operations
>>> > actually don't make any important changes to the value, but do I need
>>> > to?
>>
>> Yes.
>
> OK, how about this patch?
I'd name the predicate value_preserving_conversion_p which I think
is what you mean. harmless isn't really descriptive.
Note that you include non-value-preserving conversions, namely
int -> unsigned int. Don't dispatch to useless_type_conversion_p,
it's easy to enumerate which conversions are value-preserving.
Don't try to match the tree_ssa_useless_* set of functions, instead
put the value_preserving_conversion_p predicate in tree.[ch] and
a suitable function using it in tree-ssa-math-opts.c.
Thanks,
Richard.
> I've added checks to make sure the value is not truncated at any point.
>
> I've also changed the test cases to address Janis' comments.
>
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-28 12:48 ` Richard Guenther
@ 2011-06-28 16:37 ` Michael Matz
2011-06-28 16:48 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Michael Matz @ 2011-06-28 16:37 UTC (permalink / raw)
To: Richard Guenther; +Cc: Andrew Stubbs, gcc-patches, patches
Hi,
On Tue, 28 Jun 2011, Richard Guenther wrote:
> I'd name the predicate value_preserving_conversion_p which I think is
> what you mean. harmless isn't really descriptive.
>
> Note that you include non-value-preserving conversions, namely int ->
> unsigned int.
It seems that Andrew really does want to accept them. If so
value_preserving_conversion_p would be the wrong name. It seems to me he
wants to accept those conversions that make it possible to retrieve the
old value, i.e. when "T1 x; (T1)(T2)x == x", then T1->T2 has the
to-be-named property. bits_preserving? Hmm.
Ciao,
Michael.
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-28 16:37 ` Michael Matz
@ 2011-06-28 16:48 ` Andrew Stubbs
2011-06-28 17:09 ` Michael Matz
2011-07-01 16:40 ` Bernd Schmidt
0 siblings, 2 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-28 16:48 UTC (permalink / raw)
To: Michael Matz; +Cc: Richard Guenther, gcc-patches, patches
On 28/06/11 16:53, Michael Matz wrote:
> On Tue, 28 Jun 2011, Richard Guenther wrote:
>> I'd name the predicate value_preserving_conversion_p which I think is
>> what you mean. harmless isn't really descriptive.
>>
>> Note that you include non-value-preserving conversions, namely int ->
>> unsigned int.
>
> It seems that Andrew really does want to accept them. If so
> value_preserving_conversion_p would be the wrong name. It seems to me he
> wants to accept those conversions that make it possible to retrieve the
> old value, i.e. when "T1 x; (T1)(T2)x == x", then T1->T2 has the
> to-be-named property. bits_preserving? Hmm.
What I want (and I'm not totally clear on what this actually means) is
to be able to optimize all the cases where the end result will be the
same as the compiler produces now (using multiple multiply, shift, and
add operations).
Ok, so that's an obvious statement, but the point is that, right now,
the compiler does nothing special when you cast from int -> unsigned
int, or vice-versa, and I want to capture that somehow. There are some
exceptions, I'm sure, but what are they?
What is clear is that I don't want to just assume that casting from one
signedness to the other is a show-stopper.
For example:
unsigned long long
foo (unsigned long long a, unsigned char b, unsigned char c)
{
return a + b * c;
}
This appears to be entirely unsigned maths with plenty of spare
precision, and therefore a dead cert for any SI->DI
multiply-and-accumulate instruction, but not so - it is represented
internally as:
signed int tmp = (signed int)b * (signed int)c;
unsigned long long result = a + (unsigned long long)tmp;
Notice the unexpected signed int in the middle! I need to be able to get
past that to optimize this properly.
I've tried various test cases in which I cast signedness and mode around
a bit, and so far it appears to behave safely, but probably I'm not being
cunning enough.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-28 16:48 ` Andrew Stubbs
@ 2011-06-28 17:09 ` Michael Matz
2011-07-01 11:58 ` Stubbs, Andrew
2011-07-01 16:40 ` Bernd Schmidt
1 sibling, 1 reply; 107+ messages in thread
From: Michael Matz @ 2011-06-28 17:09 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Richard Guenther, gcc-patches, patches
Hi,
On Tue, 28 Jun 2011, Andrew Stubbs wrote:
> What I want (and I'm not totally clear on what this actually means) is
> to be able to optimize all the cases where the end result will be the
> same as the compiler produces now (using multiple multiply, shift, and
> add operations).
Okay, then you really want to look through value-preserving conversions.
> Ok, so that's an obvious statement, but the point is that, right now,
> the compiler does nothing special when you cast from int -> unsigned
> int, or vice-versa, and I want to capture that somehow. There are some
> exceptions, I'm sure, but what are they?
Same-sized signed <-> unsigned conversions aren't value preserving:
unsigned char c = 255; (signed char)c == -1; 255 != -1
unsigned -> larger sized signed is value preserving
unsigned char c = 255; (signed short)c == 255;
signed -> unsigned never is value preserving
> multiply-and-accumulate instruction, but not so - it is represented
> internally as:
>
> signed int tmp = (signed int)b * (signed int)c;
> unsigned long long result = a + (unsigned long long)tmp;
>
> Notice the unexpected signed int in the middle!
Yeah, the C standard requires this.
> I need to be able to get past that to optimize this properly.
Then you're lucky because unsigned char -> signed int is an embedding,
hence value preserving. I thought we had a predicate for such conversions
already, but seems I was wrong. So, create it as Richi said, but
enumerate explicitely the cases you want to handle, and include only those
that really are value preserving.
Ciao,
Michael.
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-28 17:09 ` Michael Matz
@ 2011-07-01 11:58 ` Stubbs, Andrew
2011-07-01 12:25 ` Richard Guenther
2011-07-01 12:33 ` Paolo Bonzini
0 siblings, 2 replies; 107+ messages in thread
From: Stubbs, Andrew @ 2011-07-01 11:58 UTC (permalink / raw)
To: Michael Matz; +Cc: Andrew Stubbs, Richard Guenther, gcc-patches, patches
On 28/06/11 17:37, Michael Matz wrote:
>> What I want (and I'm not totally clear on what this actually means) is
>> > to be able to optimize all the cases where the end result will be the
>> > same as the compiler produces now (using multiple multiply, shift, and
>> > add operations).
> Okay, then you really want to look through value-preserving conversions.
>
>> > Ok, so that's an obvious statement, but the point is that, right now,
>> > the compiler does nothing special when you cast from int -> unsigned
>> > int, or vice-versa, and I want to capture that somehow. There are some
>> > exceptions, I'm sure, but what are they?
> Same-sized signed<-> unsigned conversions aren't value preserving:
> unsigned char c = 255; (signed char)c == -1; 255 != -1
> unsigned -> larger sized signed is value preserving
> unsigned char c = 255; (signed short)c == 255;
> signed -> unsigned never is value preserving
OK, so I've tried implementing this, and I've run into a problem:
Given this test case:
unsigned long long
foo (unsigned long long a, signed char *b, signed char *c)
{
return a + *b * *c;
}
Those rules say that it should not be suitable for optimization because
there's an implicit cast from signed int to unsigned long long.
Without any widening multiplications allowed, GCC gives this code (for ARM):
ldrsb r2, [r2, #0]
ldrsb r3, [r3, #0]
mul r2, r2, r3
adds r0, r0, r2
adc r1, r1, r2, asr #31
This is exactly what a signed widening multiply-and-accumulate with
smlalbb would have done!
OK, so the types in the testcase are a bit contrived, but my point is
that I want to be able to use the widening-mult instructions everywhere
that they would produce the same output and gcc would otherwise, and gcc
just doesn't seem that interested in signed<->unsigned conversions.
So, I'm happy to put in checks to ensure that truncations are not
ignored, but I'm really not sure what's the right thing to do with the
extends and signedness switches.
Any suggestions?
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 11:58 ` Stubbs, Andrew
@ 2011-07-01 12:25 ` Richard Guenther
2011-07-04 14:23 ` Andrew Stubbs
2011-07-01 12:33 ` Paolo Bonzini
1 sibling, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-01 12:25 UTC (permalink / raw)
To: Stubbs, Andrew; +Cc: Michael Matz, Andrew Stubbs, gcc-patches, patches
On Fri, Jul 1, 2011 at 1:58 PM, Stubbs, Andrew <Andrew_Stubbs@mentor.com> wrote:
> On 28/06/11 17:37, Michael Matz wrote:
>>> What I want (and I'm not totally clear on what this actually means) is
>>> > to be able to optimize all the cases where the end result will be the
>>> > same as the compiler produces now (using multiple multiply, shift, and
>>> > add operations).
>> Okay, then you really want to look through value-preserving conversions.
>>
>>> > Ok, so that's an obvious statement, but the point is that, right now,
>>> > the compiler does nothing special when you cast from int -> unsigned
>>> > int, or vice-versa, and I want to capture that somehow. There are some
>>> > exceptions, I'm sure, but what are they?
>> Same-sized signed<-> unsigned conversions aren't value preserving:
>> unsigned char c = 255; (signed char)c == -1; 255 != -1
>> unsigned -> larger sized signed is value preserving
>> unsigned char c = 255; (signed short)c == 255;
>> signed -> unsigned never is value preserving
>
> OK, so I've tried implementing this, and I find I hit against a problem:
>
> Given this test case:
>
> unsigned long long
> foo (unsigned long long a, signed char *b, signed char *c)
> {
> return a + *b * *c;
> }
>
> Those rules say that it should not be suitable for optimization because
> there's an implicit cast from signed int to unsigned long long.
>
> Without any widening multiplications allowed, GCC gives this code (for ARM):
>
> ldrsb r2, [r2, #0]
> ldrsb r3, [r3, #0]
> mul r2, r2, r3
> adds r0, r0, r2
> adc r1, r1, r2, asr #31
>
> This is exactly what a signed widening multiply-and-accumulate with
> smlalbb would have done!
>
> OK, so the types in the testcase are a bit contrived, but my point is
> that I want to be able to use the widening-mult instructions everywhere
> that they would produce the same output and gcc would otherwise, and gcc
> just doesn't seem that interested in signed<->unsigned conversions.
>
> So, I'm happy to put in checks to ensure that truncations are not
> ignore, but I'm really not sure what's the right thing to do with the
> extends and signedness switches.
>
> Any suggestions?
Well - some operations work the same on both signedness if you
just care about the twos-complement result. This includes
multiplication (but not for example division). For this special
case I suggest to not bother trying to invent a generic predicate
but do something local in tree-ssa-math-opts.c.
Richard.
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 12:25 ` Richard Guenther
@ 2011-07-04 14:23 ` Andrew Stubbs
2011-07-07 10:00 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-04 14:23 UTC (permalink / raw)
To: Richard Guenther; +Cc: Michael Matz, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 1430 bytes --]
On 01/07/11 13:25, Richard Guenther wrote:
> Well - some operations work the same on both signedness if you
> just care about the twos-complement result. This includes
> multiplication (but not for example division). For this special
> case I suggest to not bother trying to invent a generic predicate
> but do something local in tree-ssa-math-opts.c.
OK, here's my updated patch.
I've taken the view that we *know* what size and signedness the result
of the multiplication is, and we know what size the input to the
addition must be, so all the check has to do is make sure it does that
same conversion, even if by a roundabout means.
What I hadn't grasped before is that when extending a value it's the
source type that is significant, not the destination, so the checks are
not as complex as I had thought.
So, this patch adds a test to ensure that:
1. the type is not truncated so far that we lose any information; and
2. the type is only ever extended in the proper signedness.
Also, just to be absolutely sure, I've also added a little bit of logic
to permit extends that are then undone by a truncate. I'm really not
sure what guarantees there are about what sort of cast sequences can
exist? Is this necessary? I haven't managed to coax it to generate any
examples of extends followed by truncates myself, but in any case, it's
hardly any code and it'll make sure it's future-proofed.
OK?
Andrew
[-- Attachment #2: widening-multiplies-3.patch --]
[-- Type: text/x-patch, Size: 8045 bytes --]
2011-06-28 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (valid_types_for_madd_p): New function.
(convert_plusminus_to_widen): Use valid_types_for_madd_p to
identify optimization candidates.
gcc/testsuite/
* gcc.target/arm/wmul-5.c: New file.
* gcc.target/arm/no-wmla-1.c: New file.
---
.../gcc/testsuite/gcc.target/arm/no-wmla-1.c | 11 ++
.../gcc/testsuite/gcc.target/arm/wmul-5.c | 10 ++
src/gcc-mainline/gcc/tree-ssa-math-opts.c | 112 ++++++++++++++++++--
3 files changed, 123 insertions(+), 10 deletions(-)
create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c
diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
new file mode 100644
index 0000000..17f7427
--- /dev/null
+++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+ return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c
new file mode 100644
index 0000000..65c43e3
--- /dev/null
+++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
diff --git a/src/gcc-mainline/gcc/tree-ssa-math-opts.c b/src/gcc-mainline/gcc/tree-ssa-math-opts.c
index d55ba57..5ef7bb4 100644
--- a/src/gcc-mainline/gcc/tree-ssa-math-opts.c
+++ b/src/gcc-mainline/gcc/tree-ssa-math-opts.c
@@ -2085,6 +2085,78 @@ convert_mult_to_widen (gimple stmt)
return true;
}
+/* Check the input types, TYPE1 and TYPE2 to a widening multiply,
+ and then the conversions between the output of the multiply, and
+ the input to an addition EXPR, to ensure that they are compatible with
+ a widening multiply-and-accumulate.
+
+ This function assumes that EXPR is a valid chain of conversion expressions
+ terminated by a multiplication.
+
+ This function tries NOT to make any (fragile) assumptions about what
+ sequence of conversions can exist in the input. */
+
+static bool
+valid_types_for_madd_p (tree type1, tree type2, tree expr)
+{
+ gimple stmt, prev_stmt;
+ enum tree_code code, prev_code;
+ tree prev_expr, type, prev_type;
+ int bitsize, prev_bitsize, initial_bitsize, min_bitsize;
+ bool initial_unsigned;
+
+ initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+ initial_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+ stmt = SSA_NAME_DEF_STMT (expr);
+ code = gimple_assign_rhs_code (stmt);
+ type = TREE_TYPE (expr);
+ bitsize = TYPE_PRECISION (type);
+ min_bitsize = bitsize;
+
+ if (code == MULT_EXPR || code == WIDEN_MULT_EXPR)
+ return true;
+
+ if (!INTEGRAL_TYPE_P (type)
+ || TYPE_PRECISION (type) < initial_bitsize)
+ return false;
+
+ /* Step through the conversions backwards. */
+ while (true)
+ {
+ prev_expr = gimple_assign_rhs1 (stmt);
+ prev_stmt = SSA_NAME_DEF_STMT (prev_expr);
+ prev_code = gimple_assign_rhs_code (prev_stmt);
+ prev_type = TREE_TYPE (prev_expr);
+ prev_bitsize = TYPE_PRECISION (prev_type);
+
+ if (prev_code == MULT_EXPR || prev_code == WIDEN_MULT_EXPR)
+ break;
+
+ /* If it's an unsuitable type or a truncate that damages the
+ original value, then we're done. */
+ if (!INTEGRAL_TYPE_P (prev_type)
+ || TYPE_PRECISION (prev_type) < initial_bitsize)
+ return false;
+
+ /* If we have the wrong sort of extend for the value, then it
+ could still be ok if we already saw a truncate that reverses
+ the effect. */
+ if (bitsize > prev_bitsize
+ && TYPE_UNSIGNED (prev_type) != initial_unsigned
+ && min_bitsize > prev_bitsize)
+ return false;
+
+ stmt = prev_stmt;
+ code = prev_code;
+ type = prev_type;
+ bitsize = prev_bitsize;
+ min_bitsize = bitsize < min_bitsize ? bitsize : min_bitsize;
+ }
+
+ return true;
+}
+
/* Process a single gimple statement STMT, which is found at the
iterator GSI and has a either a PLUS_EXPR or a MINUS_EXPR as its
rhs (given by CODE), and try to convert it into a
@@ -2098,6 +2170,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
tree type, type1, type2;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
+ tree tmp, mult_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
optab this_optab;
enum tree_code wmult_code;
@@ -2117,22 +2190,32 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
rhs1 = gimple_assign_rhs1 (stmt);
rhs2 = gimple_assign_rhs2 (stmt);
- if (TREE_CODE (rhs1) == SSA_NAME)
+ for (tmp = rhs1, rhs1_code = ERROR_MARK;
+ TREE_CODE (tmp) == SSA_NAME
+ && (CONVERT_EXPR_CODE_P (rhs1_code) || rhs1_code == ERROR_MARK);
+ tmp = gimple_assign_rhs1 (rhs1_stmt))
{
- rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
- if (is_gimple_assign (rhs1_stmt))
- rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+ rhs1_stmt = SSA_NAME_DEF_STMT (tmp);
+ if (!is_gimple_assign (rhs1_stmt))
+ break;
+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
}
- else
+
+ if (TREE_CODE (tmp) != SSA_NAME)
return false;
- if (TREE_CODE (rhs2) == SSA_NAME)
+ for (tmp = rhs2, rhs2_code = ERROR_MARK;
+ TREE_CODE (tmp) == SSA_NAME
+ && (CONVERT_EXPR_CODE_P (rhs2_code) || rhs2_code == ERROR_MARK);
+ tmp = gimple_assign_rhs1 (rhs2_stmt))
{
- rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
- if (is_gimple_assign (rhs2_stmt))
- rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+ rhs2_stmt = SSA_NAME_DEF_STMT (tmp);
+ if (!is_gimple_assign (rhs2_stmt))
+ break;
+ rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
}
- else
+
+ if (TREE_CODE (tmp) != SSA_NAME)
return false;
if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
@@ -2140,6 +2223,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
+ mult_rhs = rhs1;
add_rhs = rhs2;
}
else if (rhs2_code == MULT_EXPR)
@@ -2147,6 +2231,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
+ mult_rhs = rhs2;
add_rhs = rhs1;
}
else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR)
@@ -2155,6 +2240,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt);
type1 = TREE_TYPE (mult_rhs1);
type2 = TREE_TYPE (mult_rhs2);
+ mult_rhs = rhs1;
add_rhs = rhs2;
}
else if (rhs2_code == WIDEN_MULT_EXPR)
@@ -2163,6 +2249,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt);
type1 = TREE_TYPE (mult_rhs1);
type2 = TREE_TYPE (mult_rhs2);
+ mult_rhs = rhs2;
add_rhs = rhs1;
}
else
@@ -2171,6 +2258,11 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
return false;
+ /* Verify that the conversions between the mult and the add don't do
+ anything unexpected. */
+ if (!valid_types_for_madd_p (type1, type2, mult_rhs))
+ return false;
+
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-04 14:23 ` Andrew Stubbs
@ 2011-07-07 10:00 ` Richard Guenther
2011-07-07 10:27 ` Andrew Stubbs
2011-07-11 17:01 ` Andrew Stubbs
0 siblings, 2 replies; 107+ messages in thread
From: Richard Guenther @ 2011-07-07 10:00 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Michael Matz, gcc-patches, patches
On Mon, Jul 4, 2011 at 4:23 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 01/07/11 13:25, Richard Guenther wrote:
>>
>> Well - some operations work the same on both signedness if you
>> just care about the twos-complement result. This includes
>> multiplication (but not for example division). For this special
>> case I suggest to not bother trying to invent a generic predicate
>> but do something local in tree-ssa-math-opts.c.
>
> OK, here's my updated patch.
>
> I've taken the view that we *know* what size and signedness the result of
> the multiplication is, and we know what size the input to the addition must
> be, so all the check has to do is make sure it does that same conversion,
> even if by a roundabout means.
>
> What I hadn't grasped before is that when extending a value it's the source
> type that is significant, not the destination, so the checks are not as
> complex as I had thought.
>
> So, this patch adds a test to ensure that:
>
> 1. the type is not truncated so far that we lose any information; and
>
> 2. the type is only ever extended in the proper signedness.
>
> Also, just to be absolutely sure, I've also added a little bit of logic to
> permit extends that are then undone by a truncate. I'm really not sure what
> guarantees there are about what sort of cast sequences can exist? Is this
> necessary? I haven't managed to coax it to generate any examples of extends
> followed by truncates myself, but in any case, it's hardly any code and
> it'll make sure it's future-proofed.
>
> OK?
I think you should assume that series of widenings, (int)(short)char_variable
are already combined. Thus I believe you only need to consider a single
conversion in valid_types_for_madd_p.
+/* Check the input types, TYPE1 and TYPE2 to a widening multiply,
what are those types? Is TYPE1 the result type and TYPE2 the
operand type? If so why
+ initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
this?!
+ initial_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
that also looks odd. So probably TYPE1 isn't the result type. If they
are the types of the operands, then what operand is EXPR for?
I didn't look at the actual implementation of the function because of the
lack of understanding of the inputs.
- if (TREE_CODE (rhs1) == SSA_NAME)
+ for (tmp = rhs1, rhs1_code = ERROR_MARK;
+ TREE_CODE (tmp) == SSA_NAME
+ && (CONVERT_EXPR_CODE_P (rhs1_code) || rhs1_code == ERROR_MARK);
+ tmp = gimple_assign_rhs1 (rhs1_stmt))
{
- rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
- if (is_gimple_assign (rhs1_stmt))
- rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+ rhs1_stmt = SSA_NAME_DEF_STMT (tmp);
+ if (!is_gimple_assign (rhs1_stmt))
+ break;
+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
}
the result looks a bit like spaghetti code ... and lacks a comment
on what it is trying to do. It looks like it sees through an arbitrary
number of conversions - possibly ones that will make the
macc invalid, as for (short)int-var * short-var + int-var. So you'll
be pessimizing code by doing that unconditionally. As I said
above you should at most consider one intermediate conversion.
I believe the code should be arranged such that only valid
conversions are looked through in the first place. Valid, in
that the resulting types should still match the macc constraints.
Richard.
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-07 10:00 ` Richard Guenther
@ 2011-07-07 10:27 ` Andrew Stubbs
2011-07-07 12:18 ` Andrew Stubbs
2011-07-11 17:01 ` Andrew Stubbs
1 sibling, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-07 10:27 UTC (permalink / raw)
To: Richard Guenther; +Cc: Michael Matz, gcc-patches, patches
On 07/07/11 10:58, Richard Guenther wrote:
> I think you should assume that series of widenings, (int)(short)char_variable
> are already combined. Thus I believe you only need to consider a single
> conversion in valid_types_for_madd_p.
Hmm, I'm not so sure. I'll look into it a bit further.
> +/* Check the input types, TYPE1 and TYPE2 to a widening multiply,
>
> what are those types? Is TYPE1 the result type and TYPE2 the
> operand type? If so why
TYPE1 and TYPE2 are the inputs to the multiply. I thought I explained
that in the comment before the function.
> + initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
>
> this?!
The result of the multiply will be this many bits wide. This may be
narrower than the type that holds it.
E.g., 16-bit * 8-bit gives a result at most 24-bits wide, which will
usually be held in a 32- or 64-bit variable.
> + initial_unsigned = TYPE_UNSIGNED (type1)&& TYPE_UNSIGNED (type2);
>
> that also looks odd. So probably TYPE1 isn't the result type. If they
> are the types of the operands, then what operand is EXPR for?
EXPR, as the comment says, is the addition that follows the multiply.
> - if (TREE_CODE (rhs1) == SSA_NAME)
> + for (tmp = rhs1, rhs1_code = ERROR_MARK;
> + TREE_CODE (tmp) == SSA_NAME
> +&& (CONVERT_EXPR_CODE_P (rhs1_code) || rhs1_code == ERROR_MARK);
> + tmp = gimple_assign_rhs1 (rhs1_stmt))
> {
> - rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
> - if (is_gimple_assign (rhs1_stmt))
> - rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
> + rhs1_stmt = SSA_NAME_DEF_STMT (tmp);
> + if (!is_gimple_assign (rhs1_stmt))
> + break;
> + rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
> }
>
> the result looks a bit like spaghetti code ... and lacks a comment
> on what it is trying to do. It looks like it sees through an arbitrary
> number of conversions - possibly ones that will make the
> macc invalid, as for (short)int-var * short-var + int-var. So you'll
> be pessimizing code by doing that unconditionally. As I said
> above you should at most consider one intermediate conversion.
Ok, I need to add a comment here. The code does indeed look back through
an arbitrary number of conversions. It is searching for the last real
operation before the addition, hoping to find a multiply.
> I believe the code should be arranged such that only valid
> conversions are looked through in the first place. Valid, in
> that the resulting types should still match the macc constraints.
Well, it might be possible to discard some conversions initially, but
until the multiply is found, and its input types are known, we can't
know for certain what conversions are valid.
I think I need to explain what's going on here more clearly.
1. It finds an addition statement. It's not known yet whether it is
part of a multiply-and-accumulate, or not.
2. It follows the conversion chain back from each operand to see if
it finds a multiply, or widening multiply statement.
3. If it finds a non-widening multiply, it checks it to see if it
could be widening multiply-and-accumulate (it will already have been
rejected as a widening multiply on its own, but the addition might be
in a wider mode, or the target might provide multiply-and-accumulate
insns that don't have corresponding widening multiply insns).
4. (This is the new bit!) It looks to see if there are any
conversions between the multiply and addition that can safely be ignored.
5. If we get here, then emit any necessary conversion statements, and
convert the addition to a WIDEN_MULT_PLUS_EXPR.
Before these changes, any conversion between the multiply and addition
statements would prevent optimization, even though there are many cases
where the conversions are valid, and even inserted automatically.
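For illustration (my sketch, not a testcase from the patch set), this is the shape of source that produces such an intermediate conversion: the int-typed multiply result is widened to long long before the accumulate, and that single cast used to block the transformation.

```c
/* Hypothetical example of the pattern described above: the multiply
   produces an int, which is converted to long long before the addition.
   Previously, that intermediate cast prevented the
   WIDEN_MULT_PLUS_EXPR match.  */
long long
madd (long long acc, short a, short b)
{
  return acc + (long long) (a * b);
}
```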
I'm going to go away and find out whether there are really any cases
where there can legitimately be more than one conversion, and at least
update my patch with better commenting.
Thanks for your review.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-07 10:27 ` Andrew Stubbs
@ 2011-07-07 12:18 ` Andrew Stubbs
2011-07-07 12:34 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-07 12:18 UTC (permalink / raw)
Cc: Richard Guenther, Michael Matz, gcc-patches, patches
On 07/07/11 11:26, Andrew Stubbs wrote:
> On 07/07/11 10:58, Richard Guenther wrote:
>> I think you should assume that series of widenings,
>> (int)(short)char_variable
>> are already combined. Thus I believe you only need to consider a single
>> conversion in valid_types_for_madd_p.
>
> Hmm, I'm not so sure. I'll look into it a bit further.
OK, here's a test case that gives multiple conversions:
long long
foo (long long a, signed char b, signed char c)
{
int bc = b * c;
return a + (short)bc;
}
The dump right before the widen_mult pass gives:
foo (long long int a, signed char b, signed char c)
{
int bc;
long long int D.2018;
short int D.2017;
long long int D.2016;
int D.2015;
int D.2014;
<bb 2>:
D.2014_2 = (int) b_1(D);
D.2015_4 = (int) c_3(D);
bc_5 = D.2014_2 * D.2015_4;
D.2017_6 = (short int) bc_5;
D.2018_7 = (long long int) D.2017_6;
D.2016_9 = D.2018_7 + a_8(D);
return D.2016_9;
}
Here we have a multiply and accumulate done the long way. The 8-bit
inputs are widened to 32-bit, multiplied to give a 32-bit result (of
which only the lower 16-bits contain meaningful data), then truncated to
16-bits, and sign-extended up to 64-bits ready for the 64-bit addition.
This is slightly contrived, perhaps, but not unlike the sort of thing that
might occur when you have inline functions and macros, and most
importantly - it is mathematically valid!
So, here's the output from my patched widen_mult pass:
foo (long long int a, signed char b, signed char c)
{
int bc;
long long int D.2018;
short int D.2017;
long long int D.2016;
int D.2015;
int D.2014;
<bb 2>:
D.2014_2 = (int) b_1(D);
D.2015_4 = (int) c_3(D);
bc_5 = b_1(D) w* c_3(D);
D.2017_6 = (short int) bc_5;
D.2018_7 = (long long int) D.2017_6;
D.2016_9 = WIDEN_MULT_PLUS_EXPR <b_1(D), c_3(D), a_8(D)>;
return D.2016_9;
}
As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is
now redundant. (Ideally, this would be removed now, but in fact it
doesn't get eliminated until the RTL into_cfglayout pass. This is not
new behaviour.)
My point is that it's possible to have at least two conversions to
examine. Is it possible to have more? I don't know, but once I'm dealing
with two I might as well deal with an arbitrary number.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-07 12:18 ` Andrew Stubbs
@ 2011-07-07 12:34 ` Richard Guenther
2011-07-07 12:49 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-07 12:34 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Michael Matz, gcc-patches, patches
On Thu, Jul 7, 2011 at 1:43 PM, Andrew Stubbs <andrew.stubbs@gmail.com> wrote:
> On 07/07/11 11:26, Andrew Stubbs wrote:
>>
>> On 07/07/11 10:58, Richard Guenther wrote:
>>>
>>> I think you should assume that series of widenings,
>>> (int)(short)char_variable
>>> are already combined. Thus I believe you only need to consider a single
>>> conversion in valid_types_for_madd_p.
>>
>> Hmm, I'm not so sure. I'll look into it a bit further.
>
> OK, here's a test case that gives multiple conversions:
>
> long long
> foo (long long a, signed char b, signed char c)
> {
> int bc = b * c;
> return a + (short)bc;
> }
>
> The dump right before the widen_mult pass gives:
>
> foo (long long int a, signed char b, signed char c)
> {
> int bc;
> long long int D.2018;
> short int D.2017;
> long long int D.2016;
> int D.2015;
> int D.2014;
>
> <bb 2>:
> D.2014_2 = (int) b_1(D);
> D.2015_4 = (int) c_3(D);
> bc_5 = D.2014_2 * D.2015_4;
> D.2017_6 = (short int) bc_5;
Ok, so you have a truncation that is a no-op value-wise. I would
argue that this truncation should be removed independently of
whether we have a widening multiply instruction or not.
The technically most capable place to remove non-value-changing
truncations (and combine them with a successive conversion)
would be value-range propagation. Which already knows:
Value ranges after VRP:
b_1(D): VARYING
D.2698_2: [-128, 127]
c_3(D): VARYING
D.2699_4: [-128, 127]
bc_5: [-16256, 16384]
D.2701_6: [-16256, 16384]
D.2702_7: [-16256, 16384]
a_8(D): VARYING
D.2700_9: VARYING
thus truncating bc_5 to short does not change the value.
The simplification could be made when looking at the
statement
> D.2018_7 = (long long int) D.2017_6;
in vrp_fold_stmt, based on the fact that this conversion
converts from a value-preserving intermediate conversion.
Thus the transform would replace the D.2017_6 operand
with bc_5.
So yes, the case appears - but it shouldn't ;)
I'll cook up a quick patch for VRP.
Thanks,
Richard.
> D.2016_9 = D.2018_7 + a_8(D);
> return D.2016_9;
>
> }
>
> Here we have a multiply and accumulate done the long way. The 8-bit inputs
> are widened to 32-bit, multiplied to give a 32-bit result (of which only the
> lower 16-bits contain meaningful data), then truncated to 16-bits, and
> sign-extended up to 64-bits ready for the 64-bit addition.
>
> This is slightly contrived, perhaps, but not unlike the sort of thing that
> might occur when you have inline functions and macros, and most importantly
> - it is mathematically valid!
>
>
> So, here's the output from my patched widen_mult pass:
>
> foo (long long int a, signed char b, signed char c)
> {
> int bc;
> long long int D.2018;
> short int D.2017;
> long long int D.2016;
> int D.2015;
> int D.2014;
>
> <bb 2>:
> D.2014_2 = (int) b_1(D);
> D.2015_4 = (int) c_3(D);
> bc_5 = b_1(D) w* c_3(D);
> D.2017_6 = (short int) bc_5;
> D.2018_7 = (long long int) D.2017_6;
> D.2016_9 = WIDEN_MULT_PLUS_EXPR <b_1(D), c_3(D), a_8(D)>;
> return D.2016_9;
>
> }
>
> As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is now
> redundant. (Ideally, this would be removed now, but in fact it doesn't get
> eliminated until the RTL into_cfglayout pass. This is not new behaviour.)
>
>
> My point is that it's possible to have at least two conversions to examine.
> Is it possible to have more? I don't know, but once I'm dealing with two I
> might as well deal with an arbitrary number.
>
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-07 12:34 ` Richard Guenther
@ 2011-07-07 12:49 ` Richard Guenther
2011-07-08 12:55 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-07 12:49 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Michael Matz, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 2336 bytes --]
On Thu, Jul 7, 2011 at 2:28 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Jul 7, 2011 at 1:43 PM, Andrew Stubbs <andrew.stubbs@gmail.com> wrote:
>> On 07/07/11 11:26, Andrew Stubbs wrote:
>>>
>>> On 07/07/11 10:58, Richard Guenther wrote:
>>>>
>>>> I think you should assume that series of widenings,
>>>> (int)(short)char_variable
>>>> are already combined. Thus I believe you only need to consider a single
>>>> conversion in valid_types_for_madd_p.
>>>
>>> Hmm, I'm not so sure. I'll look into it a bit further.
>>
>> OK, here's a test case that gives multiple conversions:
>>
>> long long
>> foo (long long a, signed char b, signed char c)
>> {
>> int bc = b * c;
>> return a + (short)bc;
>> }
>>
>> The dump right before the widen_mult pass gives:
>>
>> foo (long long int a, signed char b, signed char c)
>> {
>> int bc;
>> long long int D.2018;
>> short int D.2017;
>> long long int D.2016;
>> int D.2015;
>> int D.2014;
>>
>> <bb 2>:
>> D.2014_2 = (int) b_1(D);
>> D.2015_4 = (int) c_3(D);
>> bc_5 = D.2014_2 * D.2015_4;
>> D.2017_6 = (short int) bc_5;
>
> Ok, so you have a truncation that is a no-op value-wise. I would
> argue that this truncation should be removed independently of
> whether we have a widening multiply instruction or not.
>
> The technically most capable place to remove non-value-changing
> truncations (and combine them with a successive conversion)
> would be value-range propagation. Which already knows:
>
> Value ranges after VRP:
>
> b_1(D): VARYING
> D.2698_2: [-128, 127]
> c_3(D): VARYING
> D.2699_4: [-128, 127]
> bc_5: [-16256, 16384]
> D.2701_6: [-16256, 16384]
> D.2702_7: [-16256, 16384]
> a_8(D): VARYING
> D.2700_9: VARYING
>
> thus truncating bc_5 to short does not change the value.
>
> The simplification could be made when looking at the
> statement
>
>> D.2018_7 = (long long int) D.2017_6;
>
> in vrp_fold_stmt, based on the fact that this conversion
> converts from a value-preserving intermediate conversion.
> Thus the transform would replace the D.2017_6 operand
> with bc_5.
>
> So yes, the case appears - but it shouldn't ;)
>
> I'll cook up a quick patch for VRP.
Like the attached. I'll finish and properly test it.
Richard.
[-- Attachment #2: p --]
[-- Type: application/octet-stream, Size: 10194 bytes --]
Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c (revision 175962)
+++ gcc/tree-vrp.c (working copy)
@@ -161,10 +161,10 @@ static VEC (switch_update, heap) *to_upd
static inline tree
vrp_val_max (const_tree type)
{
- if (!INTEGRAL_TYPE_P (type))
- return NULL_TREE;
+ if (INTEGRAL_TYPE_P (type))
+ return upper_bound_in_type (CONST_CAST_TREE (type), CONST_CAST_TREE (type));
- return TYPE_MAX_VALUE (type);
+ return NULL_TREE;
}
/* Return the minimum value for TYPE. */
@@ -172,10 +172,10 @@ vrp_val_max (const_tree type)
static inline tree
vrp_val_min (const_tree type)
{
- if (!INTEGRAL_TYPE_P (type))
- return NULL_TREE;
+ if (INTEGRAL_TYPE_P (type))
+ return lower_bound_in_type (CONST_CAST_TREE (type), CONST_CAST_TREE (type));
- return TYPE_MIN_VALUE (type);
+ return NULL_TREE;
}
/* Return whether VAL is equal to the maximum value of its type. This
@@ -565,7 +565,7 @@ set_value_range_to_nonnegative (value_ra
set_value_range (vr, VR_RANGE, zero,
(overflow_infinity
? positive_overflow_infinity (type)
- : TYPE_MAX_VALUE (type)),
+ : vrp_val_max (type)),
vr->equiv);
}
@@ -1627,7 +1627,7 @@ extract_range_from_assert (value_range_t
}
else if (cond_code == LE_EXPR || cond_code == LT_EXPR)
{
- min = TYPE_MIN_VALUE (type);
+ min = vrp_val_min (type);
if (limit_vr == NULL || limit_vr->type == VR_ANTI_RANGE)
max = limit;
@@ -1662,7 +1662,7 @@ extract_range_from_assert (value_range_t
}
else if (cond_code == GE_EXPR || cond_code == GT_EXPR)
{
- max = TYPE_MAX_VALUE (type);
+ max = vrp_val_max (type);
if (limit_vr == NULL || limit_vr->type == VR_ANTI_RANGE)
min = limit;
@@ -2079,11 +2079,11 @@ vrp_int_const_binop (enum tree_code code
|| code == ROUND_DIV_EXPR)
return (needs_overflow_infinity (TREE_TYPE (res))
? positive_overflow_infinity (TREE_TYPE (res))
- : TYPE_MAX_VALUE (TREE_TYPE (res)));
+ : vrp_val_max (TREE_TYPE (res)));
else
return (needs_overflow_infinity (TREE_TYPE (res))
? negative_overflow_infinity (TREE_TYPE (res))
- : TYPE_MIN_VALUE (TREE_TYPE (res)));
+ : vrp_val_min (TREE_TYPE (res)));
}
return res;
@@ -2888,8 +2888,8 @@ extract_range_from_unary_expr (value_ran
&& TYPE_PRECISION (inner_type) < TYPE_PRECISION (outer_type))
{
vr0.type = VR_RANGE;
- vr0.min = TYPE_MIN_VALUE (inner_type);
- vr0.max = TYPE_MAX_VALUE (inner_type);
+ vr0.min = vrp_val_min (inner_type);
+ vr0.max = vrp_val_max (inner_type);
}
/* If VR0 is a constant range or anti-range and the conversion is
@@ -2974,7 +2974,7 @@ extract_range_from_unary_expr (value_ran
}
}
else
- min = TYPE_MIN_VALUE (type);
+ min = vrp_val_min (type);
if (is_positive_overflow_infinity (vr0.min))
max = negative_overflow_infinity (type);
@@ -2993,7 +2993,7 @@ extract_range_from_unary_expr (value_ran
}
}
else
- max = TYPE_MIN_VALUE (type);
+ max = vrp_val_min (type);
}
else if (code == NEGATE_EXPR
&& TYPE_UNSIGNED (type))
@@ -3035,7 +3035,7 @@ extract_range_from_unary_expr (value_ran
else if (!vrp_val_is_min (vr0.min))
min = fold_unary_to_constant (code, type, vr0.min);
else if (!needs_overflow_infinity (type))
- min = TYPE_MAX_VALUE (type);
+ min = vrp_val_max (type);
else if (supports_overflow_infinity (type))
min = positive_overflow_infinity (type);
else
@@ -3049,7 +3049,7 @@ extract_range_from_unary_expr (value_ran
else if (!vrp_val_is_min (vr0.max))
max = fold_unary_to_constant (code, type, vr0.max);
else if (!needs_overflow_infinity (type))
- max = TYPE_MAX_VALUE (type);
+ max = vrp_val_max (type);
else if (supports_overflow_infinity (type)
/* We shouldn't generate [+INF, +INF] as set_value_range
doesn't like this and ICEs. */
@@ -3079,7 +3079,7 @@ extract_range_from_unary_expr (value_ran
TYPE_MIN_VALUE, remember -TYPE_MIN_VALUE = TYPE_MIN_VALUE. */
if (TYPE_OVERFLOW_WRAPS (type))
{
- tree type_min_value = TYPE_MIN_VALUE (type);
+ tree type_min_value = vrp_val_min (type);
min = (vr0.min != type_min_value
? int_const_binop (PLUS_EXPR, type_min_value,
@@ -3091,7 +3091,7 @@ extract_range_from_unary_expr (value_ran
if (overflow_infinity_range_p (&vr0))
min = negative_overflow_infinity (type);
else
- min = TYPE_MIN_VALUE (type);
+ min = vrp_val_min (type);
}
}
else
@@ -3112,7 +3112,7 @@ extract_range_from_unary_expr (value_ran
}
}
else
- max = TYPE_MAX_VALUE (type);
+ max = vrp_val_max (type);
}
}
@@ -3396,11 +3396,11 @@ adjust_range_with_scev (value_range_t *v
if (POINTER_TYPE_P (type) || !TYPE_MIN_VALUE (type))
tmin = lower_bound_in_type (type, type);
else
- tmin = TYPE_MIN_VALUE (type);
+ tmin = vrp_val_min (type);
if (POINTER_TYPE_P (type) || !TYPE_MAX_VALUE (type))
tmax = upper_bound_in_type (type, type);
else
- tmax = TYPE_MAX_VALUE (type);
+ tmax = vrp_val_max (type);
/* Try to use estimated number of iterations for the loop to constrain the
final value in the evolution. */
@@ -4318,8 +4318,8 @@ extract_code_and_val_from_cond_with_ops
if ((comp_code == GT_EXPR || comp_code == LT_EXPR)
&& INTEGRAL_TYPE_P (TREE_TYPE (val)))
{
- tree min = TYPE_MIN_VALUE (TREE_TYPE (val));
- tree max = TYPE_MAX_VALUE (TREE_TYPE (val));
+ tree min = vrp_val_min (TREE_TYPE (val));
+ tree max = vrp_val_max (TREE_TYPE (val));
if (comp_code == GT_EXPR
&& (!max
@@ -6685,7 +6685,7 @@ vrp_visit_phi_node (gimple phi)
{
if (!needs_overflow_infinity (TREE_TYPE (vr_result.min))
|| !vrp_var_may_overflow (lhs, phi))
- vr_result.min = TYPE_MIN_VALUE (TREE_TYPE (vr_result.min));
+ vr_result.min = vrp_val_min (TREE_TYPE (vr_result.min));
else if (supports_overflow_infinity (TREE_TYPE (vr_result.min)))
vr_result.min =
negative_overflow_infinity (TREE_TYPE (vr_result.min));
@@ -6697,7 +6697,7 @@ vrp_visit_phi_node (gimple phi)
{
if (!needs_overflow_infinity (TREE_TYPE (vr_result.max))
|| !vrp_var_may_overflow (lhs, phi))
- vr_result.max = TYPE_MAX_VALUE (TREE_TYPE (vr_result.max));
+ vr_result.max = vrp_val_max (TREE_TYPE (vr_result.max));
else if (supports_overflow_infinity (TREE_TYPE (vr_result.max)))
vr_result.max =
positive_overflow_infinity (TREE_TYPE (vr_result.max));
@@ -7119,7 +7119,7 @@ test_for_singularity (enum tree_code con
{
/* This should not be negative infinity; there is no overflow
here. */
- min = TYPE_MIN_VALUE (TREE_TYPE (op0));
+ min = vrp_val_min (TREE_TYPE (op0));
max = op1;
if (cond_code == LT_EXPR && !is_overflow_infinity (max))
@@ -7134,7 +7134,7 @@ test_for_singularity (enum tree_code con
{
/* This should not be positive infinity; there is no overflow
here. */
- max = TYPE_MAX_VALUE (TREE_TYPE (op0));
+ max = vrp_val_max (TREE_TYPE (op0));
min = op1;
if (cond_code == GT_EXPR && !is_overflow_infinity (min))
@@ -7342,6 +7342,33 @@ simplify_switch_using_ranges (gimple stm
return false;
}
+/* Simplify an integral conversion from an SSA name in STMT. */
+
+static bool
+simplify_conversion_using_ranges (gimple stmt)
+{
+ tree rhs1 = gimple_assign_rhs1 (stmt);
+ tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+ gimple def_stmt = SSA_NAME_DEF_STMT (rhs1);
+ value_range_t *vr;
+
+ if (!is_gimple_assign (def_stmt)
+ || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
+ return false;
+ rhs1 = gimple_assign_rhs1 (def_stmt);
+ if (TREE_CODE (rhs1) != SSA_NAME)
+ return false;
+ vr = get_value_range (rhs1);
+ if (vr->type != VR_RANGE)
+ return false;
+ if (!int_fits_type_p (vr->min, type)
+ || !int_fits_type_p (vr->max, type))
+ return false;
+ gimple_assign_set_rhs1 (stmt, rhs1);
+ update_stmt (stmt);
+ return true;
+}
+
/* Simplify STMT using ranges if possible. */
static bool
@@ -7351,6 +7378,7 @@ simplify_stmt_using_ranges (gimple_stmt_
if (is_gimple_assign (stmt))
{
enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
+ tree rhs1 = gimple_assign_rhs1 (stmt);
switch (rhs_code)
{
@@ -7364,7 +7392,7 @@ simplify_stmt_using_ranges (gimple_stmt_
or identity if the RHS is zero or one, and the LHS are known
to be boolean values. Transform all TRUTH_*_EXPR into
BIT_*_EXPR if both arguments are known to be boolean values. */
- if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))
+ if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
return simplify_truth_ops_using_ranges (gsi, stmt);
break;
@@ -7373,15 +7401,15 @@ simplify_stmt_using_ranges (gimple_stmt_
than zero and the second operand is an exact power of two. */
case TRUNC_DIV_EXPR:
case TRUNC_MOD_EXPR:
- if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))
+ if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
&& integer_pow2p (gimple_assign_rhs2 (stmt)))
return simplify_div_or_mod_using_ranges (stmt);
break;
/* Transform ABS (X) into X or -X as appropriate. */
case ABS_EXPR:
- if (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
- && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))
+ if (TREE_CODE (rhs1) == SSA_NAME
+ && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
return simplify_abs_using_ranges (stmt);
break;
@@ -7390,10 +7418,16 @@ simplify_stmt_using_ranges (gimple_stmt_
/* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
if all the bits being cleared are already cleared or
all the bits being set are already set. */
- if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))
+ if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
return simplify_bit_ops_using_ranges (gsi, stmt);
break;
+ CASE_CONVERT:
+ if (TREE_CODE (rhs1) == SSA_NAME
+ && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
+ return simplify_conversion_using_ranges (stmt);
+ break;
+
default:
break;
}
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-07 12:49 ` Richard Guenther
@ 2011-07-08 12:55 ` Andrew Stubbs
2011-07-08 13:22 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-08 12:55 UTC (permalink / raw)
To: Richard Guenther; +Cc: Michael Matz, gcc-patches, patches
On 07/07/11 13:37, Richard Guenther wrote:
>> I'll cook up a quick patch for VRP.
>
> Like the attached. I'll finish and properly test it.
Your patch appears to do the wrong thing for this test case:
int
foo (int a, short b, short c)
{
int bc = b * c;
return a + (short)bc;
}
With your patch, the input to the widening-mult pass now looks like this:
foo (int a, short int b, short int c)
{
int bc;
int D.2016;
int D.2015;
int D.2014;
<bb 2>:
D.2014_2 = (int) b_1(D);
D.2015_4 = (int) c_3(D);
bc_5 = D.2014_2 * D.2015_4;
D.2016_9 = bc_5 + a_8(D);
return D.2016_9;
}
It looks like when the user tries to deliberately break the maths your
patch seems to unbreak it.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-08 12:55 ` Andrew Stubbs
@ 2011-07-08 13:22 ` Richard Guenther
0 siblings, 0 replies; 107+ messages in thread
From: Richard Guenther @ 2011-07-08 13:22 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Michael Matz, gcc-patches, patches
On Fri, Jul 8, 2011 at 2:44 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 07/07/11 13:37, Richard Guenther wrote:
>>>
>>> I'll cook up a quick patch for VRP.
>>
>> Like the attached. I'll finish and properly test it.
>
> Your patch appears to do the wrong thing for this test case:
>
> int
> foo (int a, short b, short c)
> {
> int bc = b * c;
> return a + (short)bc;
> }
>
> With your patch, the input to the widening-mult pass now looks like this:
>
> foo (int a, short int b, short int c)
> {
> int bc;
> int D.2016;
> int D.2015;
> int D.2014;
>
> <bb 2>:
> D.2014_2 = (int) b_1(D);
> D.2015_4 = (int) c_3(D);
> bc_5 = D.2014_2 * D.2015_4;
> D.2016_9 = bc_5 + a_8(D);
> return D.2016_9;
>
> }
>
> It looks like when the user tries to deliberately break the maths your patch
> seems to unbreak it.
Yeah, I fixed that in the checked in version.
Richard.
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-07 10:00 ` Richard Guenther
2011-07-07 10:27 ` Andrew Stubbs
@ 2011-07-11 17:01 ` Andrew Stubbs
2011-07-12 11:05 ` Richard Guenther
2011-07-14 14:26 ` Andrew Stubbs
1 sibling, 2 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-11 17:01 UTC (permalink / raw)
To: Richard Guenther; +Cc: Michael Matz, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]
On 07/07/11 10:58, Richard Guenther wrote:
> I think you should assume that series of widenings, (int)(short)char_variable
> are already combined. Thus I believe you only need to consider a single
> conversion in valid_types_for_madd_p.
Ok, here's my new patch.
This version only allows one conversion between the multiply and
addition, so assumes that VRP has eliminated any needless ones.
That one conversion may either be a truncate, if the mode was too large
for the meaningful data, or an extend, which must be of the right flavour.
This means that this patch now has the same effect as the last patch,
for all valid cases (following you VRP patch), but rejects the cases
where the C language (unhelpfully) requires an intermediate temporary to
be of the 'wrong' signedness.
Hopefully the output will now be the same at both -O0 and -O2, and
programmers will continue to have to be careful about casting unsigned
variables whenever they expect purely unsigned math. :(
Is this one ok?
Andrew
[-- Attachment #2: widening-multiplies-3.patch --]
[-- Type: text/x-patch, Size: 4415 bytes --]
2011-07-11 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single
conversion statement separating multiply-and-accumulate.
gcc/testsuite/
* gcc.target/arm/wmul-5.c: New file.
* gcc.target/arm/no-wmla-1.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+ return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2135,6 +2135,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code code)
{
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
+ gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
tree type, type1, type2;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
@@ -2175,6 +2176,38 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
return false;
+ /* Allow for one conversion statement between the multiply
+ and addition/subtraction statement. If there are more than
+ one conversions then we assume they would invalidate this
+ transformation. If that's not the case then they should have
+ been folded before now. */
+ if (CONVERT_EXPR_CODE_P (rhs1_code))
+ {
+ conv1_stmt = rhs1_stmt;
+ rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+ if (TREE_CODE (rhs1) == SSA_NAME)
+ {
+ rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+ if (is_gimple_assign (rhs1_stmt))
+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+ }
+ else
+ return false;
+ }
+ if (CONVERT_EXPR_CODE_P (rhs2_code))
+ {
+ conv2_stmt = rhs2_stmt;
+ rhs2 = gimple_assign_rhs1 (rhs2_stmt);
+ if (TREE_CODE (rhs2) == SSA_NAME)
+ {
+ rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+ if (is_gimple_assign (rhs2_stmt))
+ rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+ }
+ else
+ return false;
+ }
+
/* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
is_widening_mult_p, but we still need the rhs returns.
@@ -2188,6 +2221,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
+ conv_stmt = conv1_stmt;
}
else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
@@ -2195,6 +2229,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
+ conv_stmt = conv2_stmt;
}
else
return false;
@@ -2202,6 +2237,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
return false;
+ /* If there was a conversion between the multiply and addition
+ then we need to make sure it fits a multiply-and-accumulate.
+ There should be a single mode change which does not change the
+ value. */
+ if (conv_stmt)
+ {
+ tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
+ tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
+ int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+ bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+ if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type))
+ {
+ /* Conversion is a truncate. */
+ if (TYPE_PRECISION (to_type) < data_size)
+ return false;
+ }
+ else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type))
+ {
+ /* Conversion is an extend. Check it's the right sort. */
+ if (TYPE_UNSIGNED (from_type) != is_unsigned
+ && !(is_unsigned && TYPE_PRECISION (from_type) > data_size))
+ return false;
+ }
+ /* else convert is a no-op for our purposes. */
+ }
+
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-11 17:01 ` Andrew Stubbs
@ 2011-07-12 11:05 ` Richard Guenther
2011-08-19 14:50 ` Andrew Stubbs
2011-07-14 14:26 ` Andrew Stubbs
1 sibling, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-12 11:05 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Michael Matz, gcc-patches, patches
On Mon, Jul 11, 2011 at 6:55 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 07/07/11 10:58, Richard Guenther wrote:
>>
>> I think you should assume that series of widenings,
>> (int)(short)char_variable
>> are already combined. Thus I believe you only need to consider a single
>> conversion in valid_types_for_madd_p.
>
> Ok, here's my new patch.
>
> This version only allows one conversion between the multiply and addition,
> so assumes that VRP has eliminated any needless ones.
>
> That one conversion may either be a truncate, if the mode was too large for
> the meaningful data, or an extend, which must be of the right flavour.
>
> This means that this patch now has the same effect as the last patch, for
> all valid cases (following your VRP patch), but rejects the cases where the C
> language (unhelpfully) requires an intermediate temporary to be of the
> 'wrong' signedness.
>
> Hopefully the output will now be the same between both -O0 and -O2, and
> programmers will continue to have to be careful about casting unsigned
> variables whenever they expect purely unsigned math. :(
>
> Is this one ok?
Ok.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-12 11:05 ` Richard Guenther
@ 2011-08-19 14:50 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 14:50 UTC (permalink / raw)
To: Richard Guenther; +Cc: Michael Matz, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 184 bytes --]
On 12/07/11 11:52, Richard Guenther wrote:
>> Is this one ok?
> Ok.
I've just committed this slightly modified patch.
The changes are mainly in the context and the testcase.
Andrew
[-- Attachment #2: widening-multiplies-3.patch --]
[-- Type: text/x-patch, Size: 4488 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single
conversion statement separating multiply-and-accumulate.
gcc/testsuite/
* gcc.target/arm/wmul-5.c: New file.
* gcc.target/arm/no-wmla-1.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+ return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "\tmul\t" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2136,6 +2136,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code code)
{
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
+ gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
tree type, type1, type2, tmp;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
@@ -2178,6 +2179,38 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
return false;
+ /* Allow for one conversion statement between the multiply
+ and addition/subtraction statement.  If there are two or more
+ conversions then we assume they would invalidate this
+ transformation. If that's not the case then they should have
+ been folded before now. */
+ if (CONVERT_EXPR_CODE_P (rhs1_code))
+ {
+ conv1_stmt = rhs1_stmt;
+ rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+ if (TREE_CODE (rhs1) == SSA_NAME)
+ {
+ rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+ if (is_gimple_assign (rhs1_stmt))
+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+ }
+ else
+ return false;
+ }
+ if (CONVERT_EXPR_CODE_P (rhs2_code))
+ {
+ conv2_stmt = rhs2_stmt;
+ rhs2 = gimple_assign_rhs1 (rhs2_stmt);
+ if (TREE_CODE (rhs2) == SSA_NAME)
+ {
+ rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+ if (is_gimple_assign (rhs2_stmt))
+ rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+ }
+ else
+ return false;
+ }
+
/* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
is_widening_mult_p, but we still need the rhs returns.
@@ -2191,6 +2224,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
+ conv_stmt = conv1_stmt;
}
else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
@@ -2198,6 +2232,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
+ conv_stmt = conv2_stmt;
}
else
return false;
@@ -2208,6 +2243,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
return false;
+ /* If there was a conversion between the multiply and addition
+ then we need to make sure it fits a multiply-and-accumulate.
+ There should be a single mode change which does not change the
+ value. */
+ if (conv_stmt)
+ {
+ tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
+ tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
+ int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+ bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+ if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type))
+ {
+ /* Conversion is a truncate. */
+ if (TYPE_PRECISION (to_type) < data_size)
+ return false;
+ }
+ else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type))
+ {
+ /* Conversion is an extend. Check it's the right sort. */
+ if (TYPE_UNSIGNED (from_type) != is_unsigned
+ && !(is_unsigned && TYPE_PRECISION (from_type) > data_size))
+ return false;
+ }
+ /* else convert is a no-op for our purposes. */
+ }
+
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-11 17:01 ` Andrew Stubbs
2011-07-12 11:05 ` Richard Guenther
@ 2011-07-14 14:26 ` Andrew Stubbs
2011-07-19 0:36 ` Janis Johnson
1 sibling, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-14 14:26 UTC (permalink / raw)
Cc: Richard Guenther, Michael Matz, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 153 bytes --]
This update changes only the context modified by changes to patch 2. The
patch has already been approved. I'm just posting it for completeness.
Andrew
[-- Attachment #2: widening-multiplies-3.patch --]
[-- Type: text/x-patch, Size: 4420 bytes --]
2011-07-14 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single
conversion statement separating multiply-and-accumulate.
gcc/testsuite/
* gcc.target/arm/wmul-5.c: New file.
* gcc.target/arm/no-wmla-1.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+ return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2135,6 +2135,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code code)
{
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
+ gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
tree type, type1, type2, tmp;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
@@ -2177,6 +2178,38 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
return false;
+ /* Allow for one conversion statement between the multiply
+ and addition/subtraction statement.  If there are two or more
+ conversions then we assume they would invalidate this
+ transformation. If that's not the case then they should have
+ been folded before now. */
+ if (CONVERT_EXPR_CODE_P (rhs1_code))
+ {
+ conv1_stmt = rhs1_stmt;
+ rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+ if (TREE_CODE (rhs1) == SSA_NAME)
+ {
+ rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+ if (is_gimple_assign (rhs1_stmt))
+ rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+ }
+ else
+ return false;
+ }
+ if (CONVERT_EXPR_CODE_P (rhs2_code))
+ {
+ conv2_stmt = rhs2_stmt;
+ rhs2 = gimple_assign_rhs1 (rhs2_stmt);
+ if (TREE_CODE (rhs2) == SSA_NAME)
+ {
+ rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+ if (is_gimple_assign (rhs2_stmt))
+ rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+ }
+ else
+ return false;
+ }
+
/* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
is_widening_mult_p, but we still need the rhs returns.
@@ -2190,6 +2223,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
+ conv_stmt = conv1_stmt;
}
else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
@@ -2197,6 +2231,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
+ conv_stmt = conv2_stmt;
}
else
return false;
@@ -2207,6 +2242,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
return false;
+ /* If there was a conversion between the multiply and addition
+ then we need to make sure it fits a multiply-and-accumulate.
+ There should be a single mode change which does not change the
+ value. */
+ if (conv_stmt)
+ {
+ tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
+ tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
+ int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+ bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+ if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type))
+ {
+ /* Conversion is a truncate. */
+ if (TYPE_PRECISION (to_type) < data_size)
+ return false;
+ }
+ else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type))
+ {
+ /* Conversion is an extend. Check it's the right sort. */
+ if (TYPE_UNSIGNED (from_type) != is_unsigned
+ && !(is_unsigned && TYPE_PRECISION (from_type) > data_size))
+ return false;
+ }
+ /* else convert is a no-op for our purposes. */
+ }
+
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-14 14:26 ` Andrew Stubbs
@ 2011-07-19 0:36 ` Janis Johnson
2011-07-19 9:01 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Janis Johnson @ 2011-07-19 0:36 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Richard Guenther, Michael Matz, gcc-patches, patches
On 07/14/2011 07:16 AM, Andrew Stubbs wrote:
> { dg-options "-O2 -march=armv7-a" }
The tests use "{ dg-options "-O2 -march=armv7-a" }" but -march will be
overridden for multilibs that specify -march, and might conflict with
other multilib options. If you really need that particular -march value
then use dg-skip-if to skip multilibs with conflicting or overriding
options, or else "dg-require-effective-target arm_dsp" to only run the
tests when the multilib already supports it; that has the advantage of
testing a wider range of arch values.
Janis
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-19 0:36 ` Janis Johnson
@ 2011-07-19 9:01 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-19 9:01 UTC (permalink / raw)
To: Janis Johnson; +Cc: Richard Guenther, Michael Matz, gcc-patches, patches
On 19/07/11 00:33, Janis Johnson wrote:
> On 07/14/2011 07:16 AM, Andrew Stubbs wrote:
>> { dg-options "-O2 -march=armv7-a" }
>
> The tests use "{ dg-options "-O2 -march=armv7-a" }" but -march will be
> overridden for multilibs that specify -march, and might conflict with
> other multilib options. If you really need that particular -march value
> then use dg-skip-if to skip multilibs with conflicting or overriding
> options, or else "dg-require-effective-target arm_dsp" to only run the
> tests when the multilib already supports it; that has the advantage of
> testing a wider range of arch values.
Yes, I know about this one. You committed that feature since I first
posted this, I think? I plan to make that change when I do the final commit.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 11:58 ` Stubbs, Andrew
2011-07-01 12:25 ` Richard Guenther
@ 2011-07-01 12:33 ` Paolo Bonzini
2011-07-01 13:31 ` Stubbs, Andrew
1 sibling, 1 reply; 107+ messages in thread
From: Paolo Bonzini @ 2011-07-01 12:33 UTC (permalink / raw)
To: Stubbs, Andrew
Cc: Michael Matz, Andrew Stubbs, Richard Guenther, gcc-patches, patches
On 07/01/2011 01:58 PM, Stubbs, Andrew wrote:
> Given this test case:
>
> unsigned long long
> foo (unsigned long long a, signed char *b, signed char *c)
> {
> return a + *b * *c;
> }
>
> Those rules say that it should not be suitable for optimization because
> there's an implicit cast from signed int to unsigned long long.
Got it now! Casts from signed to unsigned are not value-preserving, but
they are "bit-preserving": s32->s64 obviously is, and s32->u64 has the
same result bit-by-bit as the s64 result. The fact that s64 has an
implicit 1111... in front, while an u64 has an implicit 0000... does not
matter.
Is this the meaning of the predicate you want? I think so, based on the
discussion, but it's hard to say without seeing the cases enumerated
(i.e. a patch).
However, perhaps there is a catch. We can do the following thought
experiment. What would happen if you had multiple widening multiplies?
Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to
128-bit unsigned? I believe in this case you couldn't optimize 8-bit
signed to 128-bit unsigned. Would your code do it?
Paolo
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 12:33 ` Paolo Bonzini
@ 2011-07-01 13:31 ` Stubbs, Andrew
2011-07-01 14:41 ` Paolo Bonzini
2011-07-01 15:10 ` Stubbs, Andrew
0 siblings, 2 replies; 107+ messages in thread
From: Stubbs, Andrew @ 2011-07-01 13:31 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael Matz, Richard Guenther, gcc-patches, patches
On 01/07/11 13:33, Paolo Bonzini wrote:
> Got it now! Casts from signed to unsigned are not value-preserving, but
> they are "bit-preserving": s32->s64 obviously is, and s32->u64 has the
> same result bit-by-bit as the s64 result. The fact that s64 has an
> implicit 1111... in front, while an u64 has an implicit 0000... does not
> matter.
But, the 1111... and 0000... are not implicit. They are very real, and
if applied incorrectly will change the result, I think.
> Is this the meaning of the predicate you want? I think so, based on the
> discussion, but it's hard to say without seeing the cases enumerated
> (i.e. a patch).
The purpose of this predicate is to determine whether any type
conversions that occur between the output of a widening multiply, and
the input of an addition have any bearing on the end result.
We know what the effective output type of the multiply is (the size is
2x the input type, and the signed if either one of the inputs in
signed), and we know what the input type of the addition is, but any
amount of junk can lie in between. The problem is determining if it *is*
junk.
In an ideal world there would only be two cases to consider:
1. No conversion needed.
2. A single sign-extend or zero-extend (according to the type of the
inputs) to match the input size of the addition.
Anything else would be unsuitable for optimization. Of course, it's
never that simple, but it should still be possible to boil down a list
of conversions to one of these cases, if it's valid.
The signedness of the input to the addition is not significant - the
code would be the same either way. But it is important not to try to
zero-extend something that started out signed, and not to sign-extend
something that started out unsigned.
> However, perhaps there is a catch. We can do the following thought
> experiment. What would happen if you had multiple widening multiplies?
> Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to 128-bit
> unsigned? I believe in this case you couldn't optimize 8-bit signed to
> 128-bit unsigned. Would your code do it?
My code does not attempt to combine multiple multiplies. In any case, if
you have two multiplications, surely you have at least three input
values, so they can't be combined?
It does attempt to combine a multiply and an addition, where a suitable
madd* insn is available. (This is not new; I'm just trying to do it in
more cases.)
I have considered the case where you have "(a * b) + (c * d)", but have
not yet coded anything for it. At present, the code will simply choose
whichever multiply happens to find itself the first input operand of the
plus, and ignores the other, even if the first turns out not to be a
suitable candidate.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 13:31 ` Stubbs, Andrew
@ 2011-07-01 14:41 ` Paolo Bonzini
2011-07-01 14:55 ` Stubbs, Andrew
2011-07-01 15:10 ` Stubbs, Andrew
1 sibling, 1 reply; 107+ messages in thread
From: Paolo Bonzini @ 2011-07-01 14:41 UTC (permalink / raw)
To: Stubbs, Andrew; +Cc: Michael Matz, Richard Guenther, gcc-patches, patches
On 07/01/2011 03:30 PM, Stubbs, Andrew wrote:
>> > However, perhaps there is a catch. We can do the following thought
>> > experiment. What would happen if you had multiple widening multiplies?
>> > Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to 128-bit
>> > unsigned? I believe in this case you couldn't optimize 8-bit signed to
>> > 128-bit unsigned. Would your code do it?
> My code does not attempt to combine multiple multiplies. In any case, if
> you have two multiplications, surely you have at least three input
> values, so they can't be combined?
What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to
(u128)c + (u128)((s8)a * (s8)b).
Paolo
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 14:41 ` Paolo Bonzini
@ 2011-07-01 14:55 ` Stubbs, Andrew
2011-07-01 15:54 ` Paolo Bonzini
0 siblings, 1 reply; 107+ messages in thread
From: Stubbs, Andrew @ 2011-07-01 14:55 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael Matz, Richard Guenther, gcc-patches, patches
On 01/07/11 15:40, Paolo Bonzini wrote:
> On 07/01/2011 03:30 PM, Stubbs, Andrew wrote:
>>> > However, perhaps there is a catch. We can do the following thought
>>> > experiment. What would happen if you had multiple widening multiplies?
>>> > Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to
>>> 128-bit
>>> > unsigned? I believe in this case you couldn't optimize 8-bit signed to
>>> > 128-bit unsigned. Would your code do it?
>> My code does not attempt to combine multiple multiplies. In any case, if
>> you have two multiplications, surely you have at least three input
>> values, so they can't be combined?
>
> What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to
> (u128)c + (u128)((s8)a * (s8)b).
Oh I see, sorry. Yes, that's exactly what I'm trying to do here.
No, wait, I don't see. Where are these multiple widening multiplies
you're talking about? I only see one multiply?
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 14:55 ` Stubbs, Andrew
@ 2011-07-01 15:54 ` Paolo Bonzini
2011-07-01 18:18 ` Stubbs, Andrew
0 siblings, 1 reply; 107+ messages in thread
From: Paolo Bonzini @ 2011-07-01 15:54 UTC (permalink / raw)
To: Stubbs, Andrew; +Cc: Michael Matz, Richard Guenther, gcc-patches, patches
On 07/01/2011 04:55 PM, Stubbs, Andrew wrote:
>> >
>> > What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to
>> > (u128)c + (u128)((s8)a * (s8)b).
> Oh I see, sorry. Yes, that's exactly what I'm trying to do here.
>
> No, wait, I don't see. Where are these multiple widening multiplies
> you're talking about? I only see one multiply?
I meant one multiplication with multiple widening steps. Not clear at
all, sorry.
Paolo
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 15:54 ` Paolo Bonzini
@ 2011-07-01 18:18 ` Stubbs, Andrew
0 siblings, 0 replies; 107+ messages in thread
From: Stubbs, Andrew @ 2011-07-01 18:18 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael Matz, Richard Guenther, gcc-patches, patches
On 01/07/11 16:54, Paolo Bonzini wrote:
> On 07/01/2011 04:55 PM, Stubbs, Andrew wrote:
>>> >
>>> > What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to
>>> > (u128)c + (u128)((s8)a * (s8)b).
>> Oh I see, sorry. Yes, that's exactly what I'm trying to do here.
>>
>> No, wait, I don't see. Where are these multiple widening multiplies
>> you're talking about? I only see one multiply?
>
> I meant one multiplication with multiple widening steps. Not clear at
> all, sorry.
Yes, I see now, the whole purpose of my patch set is widening by more
than one mode.
The case of the multiply-and-accumulate is the only way there can be
more than one step though. Widening multiplies themselves are always
handled as one unit.
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-07-01 13:31 ` Stubbs, Andrew
2011-07-01 14:41 ` Paolo Bonzini
@ 2011-07-01 15:10 ` Stubbs, Andrew
1 sibling, 0 replies; 107+ messages in thread
From: Stubbs, Andrew @ 2011-07-01 15:10 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael Matz, Richard Guenther, gcc-patches, patches
On 01/07/11 14:30, Stubbs, Andrew wrote:
>> Got it now! Casts from signed to unsigned are not value-preserving, but
>> > they are "bit-preserving": s32->s64 obviously is, and s32->u64 has the
>> > same result bit-by-bit as the s64 result. The fact that s64 has an
>> > implicit 1111... in front, while an u64 has an implicit 0000... does not
>> > matter.
> But, the 1111... and 0000... are not implicit. They are very real, and
> if applied incorrectly will change the result, I think.
Wait, I'm clearly confused ....
When I try a s32->u64 conversion, the expand pass generates a
sign_extend insn.
Clearly it's the source type that determines the extension type, not the
destination type ... and I'm a dunce!
Thanks :)
Andrew
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-28 16:48 ` Andrew Stubbs
2011-06-28 17:09 ` Michael Matz
@ 2011-07-01 16:40 ` Bernd Schmidt
1 sibling, 0 replies; 107+ messages in thread
From: Bernd Schmidt @ 2011-07-01 16:40 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Michael Matz, Richard Guenther, gcc-patches, patches
On 06/28/11 18:14, Andrew Stubbs wrote:
> unsigned long long
> foo (unsigned long long a, unsigned char b, unsigned char c)
> {
> return a + b * c;
> }
>
> This appears to be entirely unsigned maths with plenty of spare
> precision, and therefore a dead cert for any SI->DI
> multiply-and-accumulate instruction, but not so - it is represented
> internally as:
>
> signed int tmp = (signed int)b * (signed int)c;
> unsigned long long result = a + (unsigned long long)tmp;
>
> Notice the unexpected signed int in the middle! I need to be able to get
> past that to optimize this properly.
Since both inputs are positive in a signed int (they must be, being cast
from a smaller unsigned value), you can infer that it does not matter
whether you treat the result of the multiplication as a signed or an
unsigned value. It is positive in any case.
So, I think the thing to test is: if the accumulate step requires
widening the result of the multiplication, either the cast must be value
preserving (widening unsigned to signed), or you must be able to prove
that the multiplication produces a positive result.
If the accumulate step just casts the multiplication result from signed
to unsigned, keeping the precision the same, you can ignore the cast
since the addition is unaffected by it.
Bernd
* Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
2011-06-23 14:42 ` [PATCH (3/7)] Widening multiply-and-accumulate pattern matching Andrew Stubbs
2011-06-23 16:28 ` Richard Guenther
@ 2011-06-23 21:55 ` Janis Johnson
1 sibling, 0 replies; 107+ messages in thread
From: Janis Johnson @ 2011-06-23 21:55 UTC (permalink / raw)
To: gcc-patches, Andrew Stubbs
On 06/23/2011 07:40 AM, Andrew Stubbs wrote:
+++ b/gcc/testsuite/gcc.target/arm/umlal-1.c
+/* { dg-final { scan-assembler "umlal" } } */
Don't use the name of the instruction as the test name or the scan
will always pass, because the file name shows up in assembly output.
See http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01823.html for a
proposed effective target that can be used in this test.
Janis
* [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
` (2 preceding siblings ...)
2011-06-23 14:42 ` [PATCH (3/7)] Widening multiply-and-accumulate pattern matching Andrew Stubbs
@ 2011-06-23 14:43 ` Andrew Stubbs
2011-06-28 13:28 ` Andrew Stubbs
2011-06-28 13:30 ` Paolo Bonzini
2011-06-23 14:44 ` [PATCH (5/7)] Widening multiplies for mis-matched mode inputs Andrew Stubbs
` (5 subsequent siblings)
9 siblings, 2 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-23 14:43 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 713 bytes --]
If one or both of the inputs to a widening multiply are of unsigned type
then the compiler will attempt to use usmul_widen_optab or
umul_widen_optab, respectively.
That works fine, but only if the target supports those operations
directly. Otherwise, it just bombs out and reverts to the normal
inefficient non-widening multiply.
This patch attempts to catch these cases and use an alternative signed
widening multiply instruction, if one of those is available.
I believe this should be legal as long as the top bit of both inputs is
guaranteed to be zero. The code achieves this guarantee by
zero-extending the inputs to a wider mode (which must still be narrower
than the output mode).
OK?
Andrew
[-- Attachment #2: widening-multiplies-4.patch --]
[-- Type: text/x-patch, Size: 7324 bytes --]
2011-06-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency.
* optabs.c (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this, and add new
argument 'found_mode'.
* optabs.h (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this.
(find_widening_optab_handler): New macro.
* tree-ssa-math-opts.c: Include langhooks.h
(build_and_insert_cast): New function.
(convert_mult_to_widen): Add new argument 'gsi'.
Convert unsupported unsigned multiplies to signed.
(convert_plusminus_to_widen): Likewise.
(execute_optimize_widening_mul): Pass gsi to convert_mult_to_widen.
gcc/testsuite/
* gcc.target/arm/smlalbb-1.c: New file.
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2672,7 +2672,8 @@ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \
tree-ssa-math-opts.o : tree-ssa-math-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(FLAGS_H) $(TREE_H) $(TREE_FLOW_H) $(TIMEVAR_H) \
$(TREE_PASS_H) alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H) \
- $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h
+ $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h \
+ langhooks.h
tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(TREE_INLINE_H) $(FLAGS_H) \
$(FUNCTION_H) $(TIMEVAR_H) convert.h $(TM_H) coretypes.h langhooks.h \
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -232,9 +232,10 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
non-widening optabs also. */
enum insn_code
-find_widening_optab_handler (optab op, enum machine_mode to_mode,
- enum machine_mode from_mode,
- int permit_non_widening)
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening,
+ enum machine_mode *found_mode)
{
for (; (permit_non_widening || from_mode != to_mode)
&& GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
@@ -245,7 +246,11 @@ find_widening_optab_handler (optab op, enum machine_mode to_mode,
from_mode);
if (handler != CODE_FOR_nothing)
- return handler;
+ {
+ if (found_mode)
+ *found_mode = from_mode;
+ return handler;
+ }
}
return CODE_FOR_nothing;
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -808,8 +808,13 @@ extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
/* Find a widening optab even if it doesn't widen as much as we want. */
-extern enum insn_code find_widening_optab_handler (optab, enum machine_mode,
- enum machine_mode, int);
+#define find_widening_optab_handler(A,B,C,D) \
+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+ enum machine_mode,
+ enum machine_mode,
+ int,
+ enum machine_mode *);
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/smlalbb-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+ return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -98,6 +98,7 @@ along with GCC; see the file COPYING3. If not see
#include "basic-block.h"
#include "target.h"
#include "gimple-pretty-print.h"
+#include "langhooks.h"
/* FIXME: RTL headers have to be included here for optabs. */
#include "rtl.h" /* Because optabs.h wants enum rtx_code. */
@@ -1086,6 +1087,21 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type,
return result;
}
+/* Build a gimple assignment to cast VAL to TYPE, and put the result in
+ TARGET. Insert the statement prior to GSI's current position, and
+ return the new SSA name. */
+
+static tree
+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
+ tree target, tree val, tree type)
+{
+ tree result = make_ssa_name (target, NULL);
+ gimple stmt = gimple_build_assign (result, fold_convert (type, val));
+ gimple_set_location (stmt, loc);
+ gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+ return result;
+}
+
/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC. If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -2047,7 +2063,7 @@ is_widening_mult_p (gimple stmt,
value is true iff we converted the statement. */
static bool
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
@@ -2075,7 +2091,31 @@ convert_mult_to_widen (gimple stmt)
handler = find_widening_optab_handler (op, to_mode, from_mode, 0);
if (handler == CODE_FOR_nothing)
- return false;
+ {
+ if (op != smul_widen_optab)
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+
+ op = smul_widen_optab;
+ handler = find_widening_optab_handler_and_mode (op, to_mode,
+ from_mode, 0,
+ &from_mode);
+
+ if (handler == CODE_FOR_nothing)
+ return false;
+
+ type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
+
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
+ }
+ else
+ return false;
+ }
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
@@ -2182,7 +2222,22 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
return false;
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
- return false;
+ {
+ enum machine_mode mode = TYPE_MODE (type1);
+ mode = GET_MODE_WIDER_MODE (mode);
+ if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+ {
+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
+ }
+ else
+ return false;
+ }
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
@@ -2410,7 +2465,7 @@ execute_optimize_widening_mul (void)
switch (code)
{
case MULT_EXPR:
- if (!convert_mult_to_widen (stmt)
+ if (!convert_mult_to_widen (stmt, &gsi)
&& convert_mult_to_fma (stmt,
gimple_assign_rhs1 (stmt),
gimple_assign_rhs2 (stmt)))
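As an aside for readers following the optabs.c hunk above: the renamed find_widening_optab_handler_and_mode walks from from_mode through successively wider modes until a handler is found, and optionally reports back which mode actually succeeded. A minimal Python model of that search (illustrative only; the mode names and the handlers table are invented stand-ins, not GCC data structures):

```python
# Hypothetical model of find_widening_optab_handler_and_mode: starting
# at from_mode, walk successively wider modes up to to_mode, returning
# the first (handler, mode) pair for which the optab has a handler.

MODE_SIZE = {"QI": 1, "HI": 2, "SI": 4, "DI": 8}
WIDER = {"QI": "HI", "HI": "SI", "SI": "DI", "DI": None}

def find_widening_handler_and_mode(handlers, to_mode, from_mode,
                                   permit_non_widening=False):
    """handlers maps (to_mode, from_mode) -> handler name (or absent)."""
    mode = from_mode
    while (mode is not None
           and (permit_non_widening or mode != to_mode)
           and MODE_SIZE[mode] <= MODE_SIZE[to_mode]):
        handler = handlers.get((to_mode, mode))
        if handler is not None:
            return handler, mode   # also report the mode actually found
        mode = WIDER[mode]
    return None, None

# Example: no QI->DI widening multiply exists, but HI->DI does,
# so the search skips past QImode and reports HImode.
handlers = {("DI", "HI"): "smulhidi3"}
print(find_widening_handler_and_mode(handlers, "DI", "QI"))
# prints ('smulhidi3', 'HI')
```

The NULL-passing macro in optabs.h corresponds to calling this with no interest in the reported mode, which is why the real function only stores through found_mode when the pointer is non-null.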
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-06-23 14:43 ` [PATCH (4/7)] Unsigned multiplies using wider signed multiplies Andrew Stubbs
@ 2011-06-28 13:28 ` Andrew Stubbs
2011-06-28 14:49 ` Andrew Stubbs
2011-06-28 13:30 ` Paolo Bonzini
1 sibling, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-28 13:28 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 832 bytes --]
On 23/06/11 15:41, Andrew Stubbs wrote:
> If one or both of the inputs to a widening multiply are of unsigned type
> then the compiler will attempt to use usmul_widen_optab or
> umul_widen_optab, respectively.
>
> That works fine, but only if the target supports those operations
> directly. Otherwise, it just bombs out and reverts to the normal
> inefficient non-widening multiply.
>
> This patch attempts to catch these cases and use an alternative signed
> widening multiply instruction, if one of those is available.
>
> I believe this should be legal as long as the top bit of both inputs is
> guaranteed to be zero. The code achieves this guarantee by
> zero-extending the inputs to a wider mode (which must still be narrower
> than the output mode).
>
> OK?
This update fixes the testsuite issue Janis pointed out.
Andrew
[-- Attachment #2: widening-multiplies-4.patch --]
[-- Type: text/x-patch, Size: 7316 bytes --]
2011-06-28 Andrew Stubbs <ams@codesourcery.com>
gcc/
* Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency.
* optabs.c (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this, and add new
argument 'found_mode'.
* optabs.h (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this.
(find_widening_optab_handler): New macro.
* tree-ssa-math-opts.c: Include langhooks.h.
(build_and_insert_cast): New function.
(convert_mult_to_widen): Add new argument 'gsi'.
Convert unsupported unsigned multiplies to signed.
(convert_plusminus_to_widen): Likewise.
(execute_optimize_widening_mul): Pass gsi to convert_mult_to_widen.
gcc/testsuite/
* gcc.target/arm/wmul-6.c: New file.
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2672,7 +2672,8 @@ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \
tree-ssa-math-opts.o : tree-ssa-math-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(FLAGS_H) $(TREE_H) $(TREE_FLOW_H) $(TIMEVAR_H) \
$(TREE_PASS_H) alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H) \
- $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h
+ $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h \
+ langhooks.h
tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(TREE_INLINE_H) $(FLAGS_H) \
$(FUNCTION_H) $(TIMEVAR_H) convert.h $(TM_H) coretypes.h langhooks.h \
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -232,9 +232,10 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
non-widening optabs also. */
enum insn_code
-find_widening_optab_handler (optab op, enum machine_mode to_mode,
- enum machine_mode from_mode,
- int permit_non_widening)
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening,
+ enum machine_mode *found_mode)
{
for (; (permit_non_widening || from_mode != to_mode)
&& GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
@@ -245,7 +246,11 @@ find_widening_optab_handler (optab op, enum machine_mode to_mode,
from_mode);
if (handler != CODE_FOR_nothing)
- return handler;
+ {
+ if (found_mode)
+ *found_mode = from_mode;
+ return handler;
+ }
}
return CODE_FOR_nothing;
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -808,8 +808,13 @@ extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
/* Find a widening optab even if it doesn't widen as much as we want. */
-extern enum insn_code find_widening_optab_handler (optab, enum machine_mode,
- enum machine_mode, int);
+#define find_widening_optab_handler(A,B,C,D) \
+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+ enum machine_mode,
+ enum machine_mode,
+ int,
+ enum machine_mode *);
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+ return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -98,6 +98,7 @@ along with GCC; see the file COPYING3. If not see
#include "basic-block.h"
#include "target.h"
#include "gimple-pretty-print.h"
+#include "langhooks.h"
/* FIXME: RTL headers have to be included here for optabs. */
#include "rtl.h" /* Because optabs.h wants enum rtx_code. */
@@ -1086,6 +1087,21 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type,
return result;
}
+/* Build a gimple assignment to cast VAL to TYPE, and put the result in
+ TARGET. Insert the statement prior to GSI's current position, and
+ return the new SSA name. */
+
+static tree
+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
+ tree target, tree val, tree type)
+{
+ tree result = make_ssa_name (target, NULL);
+ gimple stmt = gimple_build_assign (result, fold_convert (type, val));
+ gimple_set_location (stmt, loc);
+ gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+ return result;
+}
+
/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC. If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -2047,7 +2063,7 @@ is_widening_mult_p (gimple stmt,
value is true iff we converted the statement. */
static bool
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
@@ -2075,7 +2091,31 @@ convert_mult_to_widen (gimple stmt)
handler = find_widening_optab_handler (op, to_mode, from_mode, 0);
if (handler == CODE_FOR_nothing)
- return false;
+ {
+ if (op != smul_widen_optab)
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+
+ op = smul_widen_optab;
+ handler = find_widening_optab_handler_and_mode (op, to_mode,
+ from_mode, 0,
+ &from_mode);
+
+ if (handler == CODE_FOR_nothing)
+ return false;
+
+ type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
+
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
+ }
+ else
+ return false;
+ }
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
@@ -2165,7 +2205,22 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
return false;
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
- return false;
+ {
+ enum machine_mode mode = TYPE_MODE (type1);
+ mode = GET_MODE_WIDER_MODE (mode);
+ if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+ {
+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
+ }
+ else
+ return false;
+ }
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
@@ -2393,7 +2448,7 @@ execute_optimize_widening_mul (void)
switch (code)
{
case MULT_EXPR:
- if (!convert_mult_to_widen (stmt)
+ if (!convert_mult_to_widen (stmt, &gsi)
&& convert_mult_to_fma (stmt,
gimple_assign_rhs1 (stmt),
gimple_assign_rhs2 (stmt)))
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-06-28 13:28 ` Andrew Stubbs
@ 2011-06-28 14:49 ` Andrew Stubbs
2011-07-04 14:27 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-28 14:49 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 988 bytes --]
On 28/06/11 13:33, Andrew Stubbs wrote:
> On 23/06/11 15:41, Andrew Stubbs wrote:
>> If one or both of the inputs to a widening multiply are of unsigned type
>> then the compiler will attempt to use usmul_widen_optab or
>> umul_widen_optab, respectively.
>>
>> That works fine, but only if the target supports those operations
>> directly. Otherwise, it just bombs out and reverts to the normal
>> inefficient non-widening multiply.
>>
>> This patch attempts to catch these cases and use an alternative signed
>> widening multiply instruction, if one of those is available.
>>
>> I believe this should be legal as long as the top bit of both inputs is
>> guaranteed to be zero. The code achieves this guarantee by
>> zero-extending the inputs to a wider mode (which must still be narrower
>> than the output mode).
>>
>> OK?
>
> This update fixes the testsuite issue Janis pointed out.
And this one also fixes up the wmul-5.c testcase; the patch has changed
what the correct result is.
Andrew
[-- Attachment #2: widening-multiplies-4.patch --]
[-- Type: text/x-patch, Size: 7632 bytes --]
2011-06-28 Andrew Stubbs <ams@codesourcery.com>
gcc/
* Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency.
* optabs.c (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this, and add new
argument 'found_mode'.
* optabs.h (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this.
(find_widening_optab_handler): New macro.
* tree-ssa-math-opts.c: Include langhooks.h.
(build_and_insert_cast): New function.
(convert_mult_to_widen): Add new argument 'gsi'.
Convert unsupported unsigned multiplies to signed.
(convert_plusminus_to_widen): Likewise.
(execute_optimize_widening_mul): Pass gsi to convert_mult_to_widen.
gcc/testsuite/
* gcc.target/arm/wmul-5.c: Update expected result.
* gcc.target/arm/wmul-6.c: New file.
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2672,7 +2672,8 @@ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \
tree-ssa-math-opts.o : tree-ssa-math-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(FLAGS_H) $(TREE_H) $(TREE_FLOW_H) $(TIMEVAR_H) \
$(TREE_PASS_H) alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H) \
- $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h
+ $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h \
+ langhooks.h
tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(TREE_INLINE_H) $(FLAGS_H) \
$(FUNCTION_H) $(TIMEVAR_H) convert.h $(TM_H) coretypes.h langhooks.h \
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -232,9 +232,10 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
non-widening optabs also. */
enum insn_code
-find_widening_optab_handler (optab op, enum machine_mode to_mode,
- enum machine_mode from_mode,
- int permit_non_widening)
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening,
+ enum machine_mode *found_mode)
{
for (; (permit_non_widening || from_mode != to_mode)
&& GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
@@ -245,7 +246,11 @@ find_widening_optab_handler (optab op, enum machine_mode to_mode,
from_mode);
if (handler != CODE_FOR_nothing)
- return handler;
+ {
+ if (found_mode)
+ *found_mode = from_mode;
+ return handler;
+ }
}
return CODE_FOR_nothing;
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -808,8 +808,13 @@ extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
/* Find a widening optab even if it doesn't widen as much as we want. */
-extern enum insn_code find_widening_optab_handler (optab, enum machine_mode,
- enum machine_mode, int);
+#define find_widening_optab_handler(A,B,C,D) \
+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+ enum machine_mode,
+ enum machine_mode,
+ int,
+ enum machine_mode *);
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
--- a/gcc/testsuite/gcc.target/arm/wmul-5.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -7,4 +7,4 @@ foo (long long a, char *b, char *c)
return a + *b * *c;
}
-/* { dg-final { scan-assembler "umlal" } } */
+/* { dg-final { scan-assembler "smlalbb" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+ return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -98,6 +98,7 @@ along with GCC; see the file COPYING3. If not see
#include "basic-block.h"
#include "target.h"
#include "gimple-pretty-print.h"
+#include "langhooks.h"
/* FIXME: RTL headers have to be included here for optabs. */
#include "rtl.h" /* Because optabs.h wants enum rtx_code. */
@@ -1086,6 +1087,21 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type,
return result;
}
+/* Build a gimple assignment to cast VAL to TYPE, and put the result in
+ TARGET. Insert the statement prior to GSI's current position, and
+ return the new SSA name. */
+
+static tree
+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
+ tree target, tree val, tree type)
+{
+ tree result = make_ssa_name (target, NULL);
+ gimple stmt = gimple_build_assign (result, fold_convert (type, val));
+ gimple_set_location (stmt, loc);
+ gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+ return result;
+}
+
/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC. If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -2047,7 +2063,7 @@ is_widening_mult_p (gimple stmt,
value is true iff we converted the statement. */
static bool
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
@@ -2075,7 +2091,31 @@ convert_mult_to_widen (gimple stmt)
handler = find_widening_optab_handler (op, to_mode, from_mode, 0);
if (handler == CODE_FOR_nothing)
- return false;
+ {
+ if (op != smul_widen_optab)
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+
+ op = smul_widen_optab;
+ handler = find_widening_optab_handler_and_mode (op, to_mode,
+ from_mode, 0,
+ &from_mode);
+
+ if (handler == CODE_FOR_nothing)
+ return false;
+
+ type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
+
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
+ }
+ else
+ return false;
+ }
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
@@ -2165,7 +2205,22 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
return false;
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
- return false;
+ {
+ enum machine_mode mode = TYPE_MODE (type1);
+ mode = GET_MODE_WIDER_MODE (mode);
+ if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+ {
+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
+ }
+ else
+ return false;
+ }
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
@@ -2393,7 +2448,7 @@ execute_optimize_widening_mul (void)
switch (code)
{
case MULT_EXPR:
- if (!convert_mult_to_widen (stmt)
+ if (!convert_mult_to_widen (stmt, &gsi)
&& convert_mult_to_fma (stmt,
gimple_assign_rhs1 (stmt),
gimple_assign_rhs2 (stmt)))
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-06-28 14:49 ` Andrew Stubbs
@ 2011-07-04 14:27 ` Andrew Stubbs
2011-07-07 10:10 ` Richard Guenther
2011-07-12 14:10 ` Andrew Stubbs
0 siblings, 2 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-04 14:27 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 1165 bytes --]
On 28/06/11 15:14, Andrew Stubbs wrote:
> On 28/06/11 13:33, Andrew Stubbs wrote:
>> On 23/06/11 15:41, Andrew Stubbs wrote:
>>> If one or both of the inputs to a widening multiply are of unsigned type
>>> then the compiler will attempt to use usmul_widen_optab or
>>> umul_widen_optab, respectively.
>>>
>>> That works fine, but only if the target supports those operations
>>> directly. Otherwise, it just bombs out and reverts to the normal
>>> inefficient non-widening multiply.
>>>
>>> This patch attempts to catch these cases and use an alternative signed
>>> widening multiply instruction, if one of those is available.
>>>
>>> I believe this should be legal as long as the top bit of both inputs is
>>> guaranteed to be zero. The code achieves this guarantee by
>>> zero-extending the inputs to a wider mode (which must still be narrower
>>> than the output mode).
>>>
>>> OK?
>>
>> This update fixes the testsuite issue Janis pointed out.
>
> And this one also fixes up the wmul-5.c testcase; the patch has changed
> what the correct result is.
Here's an update for the context changed by the update to patch 3.
The content of the patch has not changed.
Andrew
[-- Attachment #2: widening-multiplies-4.patch --]
[-- Type: text/x-patch, Size: 7611 bytes --]
2011-07-04 Andrew Stubbs <ams@codesourcery.com>
gcc/
* Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency.
* optabs.c (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this, and add new
argument 'found_mode'.
* optabs.h (find_widening_optab_handler): Rename to ...
(find_widening_optab_handler_and_mode): ... this.
(find_widening_optab_handler): New macro.
* tree-ssa-math-opts.c: Include langhooks.h.
(build_and_insert_cast): New function.
(convert_mult_to_widen): Add new argument 'gsi'.
Convert unsupported unsigned multiplies to signed.
(convert_plusminus_to_widen): Likewise.
(execute_optimize_widening_mul): Pass gsi to convert_mult_to_widen.
gcc/testsuite/
* gcc.target/arm/wmul-5.c: Update expected result.
* gcc.target/arm/wmul-6.c: New file.
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2672,7 +2672,8 @@ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \
tree-ssa-math-opts.o : tree-ssa-math-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(FLAGS_H) $(TREE_H) $(TREE_FLOW_H) $(TIMEVAR_H) \
$(TREE_PASS_H) alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H) \
- $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h
+ $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h \
+ langhooks.h
tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(TREE_INLINE_H) $(FLAGS_H) \
$(FUNCTION_H) $(TIMEVAR_H) convert.h $(TM_H) coretypes.h langhooks.h \
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -232,9 +232,10 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
non-widening optabs also. */
enum insn_code
-find_widening_optab_handler (optab op, enum machine_mode to_mode,
- enum machine_mode from_mode,
- int permit_non_widening)
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+ enum machine_mode from_mode,
+ int permit_non_widening,
+ enum machine_mode *found_mode)
{
for (; (permit_non_widening || from_mode != to_mode)
&& GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
@@ -245,7 +246,11 @@ find_widening_optab_handler (optab op, enum machine_mode to_mode,
from_mode);
if (handler != CODE_FOR_nothing)
- return handler;
+ {
+ if (found_mode)
+ *found_mode = from_mode;
+ return handler;
+ }
}
return CODE_FOR_nothing;
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -808,8 +808,13 @@ extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
/* Find a widening optab even if it doesn't widen as much as we want. */
-extern enum insn_code find_widening_optab_handler (optab, enum machine_mode,
- enum machine_mode, int);
+#define find_widening_optab_handler(A,B,C,D) \
+ find_widening_optab_handler_and_mode (A, B, C, D, NULL)
+extern enum insn_code find_widening_optab_handler_and_mode (optab,
+ enum machine_mode,
+ enum machine_mode,
+ int,
+ enum machine_mode *);
/* An extra flag to control optab_for_tree_code's behavior. This is needed to
distinguish between machines with a vector shift that takes a scalar for the
--- a/gcc/testsuite/gcc.target/arm/wmul-5.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -7,4 +7,4 @@ foo (long long a, char *b, char *c)
return a + *b * *c;
}
-/* { dg-final { scan-assembler "umlal" } } */
+/* { dg-final { scan-assembler "smlalbb" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+ return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -98,6 +98,7 @@ along with GCC; see the file COPYING3. If not see
#include "basic-block.h"
#include "target.h"
#include "gimple-pretty-print.h"
+#include "langhooks.h"
/* FIXME: RTL headers have to be included here for optabs. */
#include "rtl.h" /* Because optabs.h wants enum rtx_code. */
@@ -1086,6 +1087,21 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type,
return result;
}
+/* Build a gimple assignment to cast VAL to TYPE, and put the result in
+ TARGET. Insert the statement prior to GSI's current position, and
+ return the new SSA name. */
+
+static tree
+build_and_insert_cast (gimple_stmt_iterator *gsi, location_t loc,
+ tree target, tree val, tree type)
+{
+ tree result = make_ssa_name (target, NULL);
+ gimple stmt = gimple_build_assign (result, fold_convert (type, val));
+ gimple_set_location (stmt, loc);
+ gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+ return result;
+}
+
/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
with location info LOC. If possible, create an equivalent and
less expensive sequence of statements prior to GSI, and return an
@@ -2047,7 +2063,7 @@ is_widening_mult_p (gimple stmt,
value is true iff we converted the statement. */
static bool
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
tree lhs, rhs1, rhs2, type, type1, type2;
enum insn_code handler;
@@ -2075,7 +2091,31 @@ convert_mult_to_widen (gimple stmt)
handler = find_widening_optab_handler (op, to_mode, from_mode, 0);
if (handler == CODE_FOR_nothing)
- return false;
+ {
+ if (op != smul_widen_optab)
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+
+ op = smul_widen_optab;
+ handler = find_widening_optab_handler_and_mode (op, to_mode,
+ from_mode, 0,
+ &from_mode);
+
+ if (handler == CODE_FOR_nothing)
+ return false;
+
+ type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
+
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
+ }
+ else
+ return false;
+ }
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
@@ -2256,7 +2296,22 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
return false;
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
- return false;
+ {
+ enum machine_mode mode = TYPE_MODE (type1);
+ mode = GET_MODE_WIDER_MODE (mode);
+ if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+ {
+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
+ }
+ else
+ return false;
+ }
/* Verify that the convertions between the mult and the add doesn't do
anything unexpected. */
@@ -2489,7 +2544,7 @@ execute_optimize_widening_mul (void)
switch (code)
{
case MULT_EXPR:
- if (!convert_mult_to_widen (stmt)
+ if (!convert_mult_to_widen (stmt, &gsi)
&& convert_mult_to_fma (stmt,
gimple_assign_rhs1 (stmt),
gimple_assign_rhs2 (stmt)))
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-07-04 14:27 ` Andrew Stubbs
@ 2011-07-07 10:10 ` Richard Guenther
2011-07-07 10:42 ` Andrew Stubbs
2011-07-12 14:10 ` Andrew Stubbs
1 sibling, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-07 10:10 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Mon, Jul 4, 2011 at 4:26 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 28/06/11 15:14, Andrew Stubbs wrote:
>>
>> On 28/06/11 13:33, Andrew Stubbs wrote:
>>>
>>> On 23/06/11 15:41, Andrew Stubbs wrote:
>>>>
>>>> If one or both of the inputs to a widening multiply are of unsigned type
>>>> then the compiler will attempt to use usmul_widen_optab or
>>>> umul_widen_optab, respectively.
>>>>
>>>> That works fine, but only if the target supports those operations
>>>> directly. Otherwise, it just bombs out and reverts to the normal
>>>> inefficient non-widening multiply.
>>>>
>>>> This patch attempts to catch these cases and use an alternative signed
>>>> widening multiply instruction, if one of those is available.
>>>>
>>>> I believe this should be legal as long as the top bit of both inputs is
>>>> guaranteed to be zero. The code achieves this guarantee by
>>>> zero-extending the inputs to a wider mode (which must still be narrower
>>>> than the output mode).
>>>>
>>>> OK?
>>>
>>> This update fixes the testsuite issue Janis pointed out.
>>
>> And this one also fixes up the wmul-5.c testcase; the patch has changed
>> what the correct result is.
>
> Here's an update for the context changed by the update to patch 3.
>
> The content of the patch has not changed.
+ gimple stmt = gimple_build_assign (result, fold_convert (type, val));
please use gimple_build_assign_with_ops
-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
The comment needs updating for the new parameter.
+ type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
don't use type_for_mode, use build_nonstandard_integer_type
(GET_MODE_PRECISION (from_mode), 0) instead.
Both types are equal, so please share the temporary variable you
create
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
rhs1, type1);
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
rhs2, type2);
here (CSE create_tmp_var).
+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
Likewise.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-07-07 10:10 ` Richard Guenther
@ 2011-07-07 10:42 ` Andrew Stubbs
2011-07-07 11:08 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-07 10:42 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
On 07/07/11 11:04, Richard Guenther wrote:
> Both types are equal, so please share the temporary variable you
> create
>
> + rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
> + create_tmp_var (type1, NULL),
> rhs1, type1);
> + rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
> + create_tmp_var (type2, NULL),
> rhs2, type2);
>
> here (CSE create_tmp_var).
I'm sorry, I don't understand this?
This takes code like this:
r1 = a;
r2 = b;
result = r1 + r2;
And transforms it to this:
r1 = a;
r2 = b;
t1 = (type1) r1;
t2 = (type2) r2;
result = t1 + t2;
Yes, type1 == type2, but r1 != r2, so t1 != t2.
I don't see where the common expression is here? But then, I am
something of a newbie to tree optimizations.
Andrew
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-07-07 10:42 ` Andrew Stubbs
@ 2011-07-07 11:08 ` Richard Guenther
0 siblings, 0 replies; 107+ messages in thread
From: Richard Guenther @ 2011-07-07 11:08 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jul 7, 2011 at 12:41 PM, Andrew Stubbs <andrew.stubbs@gmail.com> wrote:
> On 07/07/11 11:04, Richard Guenther wrote:
>>
>> Both types are equal, so please share the temporary variable you
>> create
>>
>> + rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
>> + create_tmp_var (type1, NULL),
>> rhs1, type1);
>> + rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
>> + create_tmp_var (type2, NULL),
>> rhs2, type2);
>>
>> here (CSE create_tmp_var).
>
> I'm sorry, I don't understand this?
>
> This takes code like this:
>
> r1 = a;
> r2 = b;
> result = r1 + r2;
>
> And transforms it to this:
>
> r1 = a;
> r2 = b;
> t1 = (type1) r1;
> t2 = (type2) r2;
> result = t1 + t2;
>
> Yes, type1 == type2, but r1 != r2, so t1 != t2.
>
> I don't see where the common expression is here? But then, I am something of
> a newbie to tree optimizations.
create_tmp_var creates a var-decl, build_and_insert_casts builds an
SSA name from it. You can build multiple SSA names from a single
VAR_DECL, so no need to waste two VAR_DECLs for temporaries
of the same type.
Richard.
> Andrew
>
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-07-04 14:27 ` Andrew Stubbs
2011-07-07 10:10 ` Richard Guenther
@ 2011-07-12 14:10 ` Andrew Stubbs
2011-07-14 14:28 ` Andrew Stubbs
1 sibling, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-12 14:10 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 1481 bytes --]
On 04/07/11 15:26, Andrew Stubbs wrote:
> On 28/06/11 15:14, Andrew Stubbs wrote:
>> On 28/06/11 13:33, Andrew Stubbs wrote:
>>> On 23/06/11 15:41, Andrew Stubbs wrote:
>>>> If one or both of the inputs to a widening multiply are of unsigned
>>>> type
>>>> then the compiler will attempt to use usmul_widen_optab or
>>>> umul_widen_optab, respectively.
>>>>
>>>> That works fine, but only if the target supports those operations
>>>> directly. Otherwise, it just bombs out and reverts to the normal
>>>> inefficient non-widening multiply.
>>>>
>>>> This patch attempts to catch these cases and use an alternative signed
>>>> widening multiply instruction, if one of those is available.
>>>>
>>>> I believe this should be legal as long as the top bit of both inputs is
>>>> guaranteed to be zero. The code achieves this guarantee by
>>>> zero-extending the inputs to a wider mode (which must still be narrower
>>>> than the output mode).
>>>>
>>>> OK?
>>>
>>> This update fixes the testsuite issue Janis pointed out.
>>
>> And this one also fixes up the wmul-5.c testcase. The patch changes
>> what the correct result should be.
>
> Here's an update for the context changed by the update to patch 3.
>
> The content of the patch has not changed.
This update does the same thing as before, but updated for the changes
earlier in the patch series. In particular, the build_and_insert_cast
function and find_widening_optab_handler_and_mode changes have been
moved up to patch 2.
OK?
Andrew
[-- Attachment #2: widening-multiplies-4.patch --]
[-- Type: text/x-patch, Size: 3075 bytes --]
2011-07-12 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_widen): Convert
unsupported unsigned multiplies to signed.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-6.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+ return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2071,6 +2071,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
enum insn_code handler;
enum machine_mode to_mode, from_mode;
optab op;
+ bool do_cast = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2094,9 +2095,32 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
0, &from_mode);
if (handler == CODE_FOR_nothing)
- return false;
+ {
+ if (op != smul_widen_optab)
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+
+ op = smul_widen_optab;
+ handler = find_widening_optab_handler_and_mode (op, to_mode,
+ from_mode, 0,
+ &from_mode);
- if (from_mode != TYPE_MODE (type1))
+ if (handler == CODE_FOR_nothing)
+ return false;
+
+ type1 = build_nonstandard_integer_type (
+ GET_MODE_PRECISION (from_mode),
+ 0);
+ type2 = type1;
+ do_cast = true;
+ }
+ else
+ return false;
+ }
+
+ if (from_mode != TYPE_MODE (type1) || do_cast)
{
location_t loc = gimple_location (stmt);
tree tmp1, tmp2;
@@ -2143,6 +2167,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum tree_code wmult_code;
enum insn_code handler;
enum machine_mode from_mode;
+ bool do_cast = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2234,8 +2259,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
else
return false;
+ /* We don't support usmadd yet, so try a wider signed mode. */
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
- return false;
+ {
+ enum machine_mode mode = TYPE_MODE (type1);
+ mode = GET_MODE_WIDER_MODE (mode);
+ if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+ {
+ type1 = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+ 0);
+ type2 = type1;
+ do_cast = true;
+ }
+ else
+ return false;
+ }
/* If there was a conversion between the multiply and addition
then we need to make sure it fits a multiply-and-accumulate.
@@ -2276,7 +2314,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (handler == CODE_FOR_nothing)
return false;
- if (TYPE_MODE (type1) != from_mode)
+ if (TYPE_MODE (type1) != from_mode || do_cast)
{
location_t loc = gimple_location (stmt);
tree tmp;
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-07-12 14:10 ` Andrew Stubbs
@ 2011-07-14 14:28 ` Andrew Stubbs
2011-07-14 14:31 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-14 14:28 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 581 bytes --]
On 12/07/11 15:07, Andrew Stubbs wrote:
> This update does the same thing as before, but updated for the changes
> earlier in the patch series. In particular, the build_and_insert_cast
> function and find_widening_optab_handler_and_mode changes have been
> moved up to patch 2.
And this update changes the way the casts are handled, partly because it
got unwieldy towards the end of the patch series, and partly because I
found a few bugs.
I've also made it check the precision of the types, rather than the
mode size, so that it is bitfield-safe.
OK?
Andrew
[-- Attachment #2: widening-multiplies-4.patch --]
[-- Type: text/x-patch, Size: 7001 bytes --]
2011-07-14 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_widen): Convert
unsupported unsigned multiplies to signed.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-6.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+ return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2067,12 +2067,13 @@ is_widening_mult_p (gimple stmt,
static bool
convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
- tree lhs, rhs1, rhs2, type, type1, type2, tmp;
+ tree lhs, rhs1, rhs2, type, type1, type2, tmp = NULL;
enum insn_code handler;
enum machine_mode to_mode, from_mode, actual_mode;
optab op;
int actual_precision;
location_t loc = gimple_location (stmt);
+ bool from_unsigned1, from_unsigned2;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2084,10 +2085,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
to_mode = TYPE_MODE (type);
from_mode = TYPE_MODE (type1);
+ from_unsigned1 = TYPE_UNSIGNED (type1);
+ from_unsigned2 = TYPE_UNSIGNED (type2);
- if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
+ if (from_unsigned1 && from_unsigned2)
op = umul_widen_optab;
- else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
+ else if (!from_unsigned1 && !from_unsigned2)
op = smul_widen_optab;
else
op = usmul_widen_optab;
@@ -2096,22 +2099,45 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
0, &actual_mode);
if (handler == CODE_FOR_nothing)
- return false;
+ {
+ if (op != smul_widen_optab)
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+
+ op = smul_widen_optab;
+ handler = find_widening_optab_handler_and_mode (op, to_mode,
+ from_mode, 0,
+ &actual_mode);
+
+ if (handler == CODE_FOR_nothing)
+ return false;
+
+ from_unsigned1 = from_unsigned2 = false;
+ }
+ else
+ return false;
+ }
/* Ensure that the inputs to the handler are in the correct precision
for the opcode. This will be the full mode size. */
actual_precision = GET_MODE_PRECISION (actual_mode);
- if (actual_precision != TYPE_PRECISION (type1))
+ if (actual_precision != TYPE_PRECISION (type1)
+ || from_unsigned1 != TYPE_UNSIGNED (type1))
{
tmp = create_tmp_var (build_nonstandard_integer_type
- (actual_precision, TYPE_UNSIGNED (type1)),
+ (actual_precision, from_unsigned1),
NULL);
rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
-
+ }
+ if (actual_precision != TYPE_PRECISION (type2)
+ || from_unsigned2 != TYPE_UNSIGNED (type2))
+ {
/* Reuse the same type info, if possible. */
- if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+ if (!tmp || from_unsigned1 != from_unsigned2)
tmp = create_tmp_var (build_nonstandard_integer_type
- (actual_precision, TYPE_UNSIGNED (type2)),
+ (actual_precision, from_unsigned2),
NULL);
rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
}
@@ -2136,7 +2162,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
{
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
- tree type, type1, type2, tmp;
+ tree type, type1, type2, optype, tmp = NULL;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
optab this_optab;
@@ -2145,6 +2171,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum machine_mode to_mode, from_mode, actual_mode;
location_t loc = gimple_location (stmt);
int actual_precision;
+ bool from_unsigned1, from_unsigned2;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2238,9 +2265,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
to_mode = TYPE_MODE (type);
from_mode = TYPE_MODE (type1);
+ from_unsigned1 = TYPE_UNSIGNED (type1);
+ from_unsigned2 = TYPE_UNSIGNED (type2);
- if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
- return false;
+ /* There's no such thing as a mixed sign madd yet, so use a wider mode. */
+ if (from_unsigned1 != from_unsigned2)
+ {
+ enum machine_mode mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_PRECISION (mode) < GET_MODE_PRECISION (to_mode))
+ {
+ from_mode = mode;
+ from_unsigned1 = from_unsigned2 = false;
+ }
+ else
+ return false;
+ }
/* If there was a conversion between the multiply and addition
then we need to make sure it fits a multiply-and-accumulate.
@@ -2248,6 +2287,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
value. */
if (conv_stmt)
{
+ /* We use the original, unmodified data types for this. */
tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
@@ -2272,7 +2312,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
- this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
+ optype = build_nonstandard_integer_type (from_mode, from_unsigned1);
+ this_optab = optab_for_tree_code (wmult_code, optype, optab_default);
handler = find_widening_optab_handler_and_mode (this_optab, to_mode,
from_mode, 0, &actual_mode);
@@ -2282,13 +2323,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
/* Ensure that the inputs to the handler are in the correct precision
for the opcode. This will be the full mode size. */
actual_precision = GET_MODE_PRECISION (actual_mode);
- if (actual_precision != TYPE_PRECISION (type1))
+ if (actual_precision != TYPE_PRECISION (type1)
+ || from_unsigned1 != TYPE_UNSIGNED (type1))
{
tmp = create_tmp_var (build_nonstandard_integer_type
- (actual_precision, TYPE_UNSIGNED (type1)),
+ (actual_precision, from_unsigned1),
NULL);
-
mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
+ }
+ if (actual_precision != TYPE_PRECISION (type2)
+ || from_unsigned2 != TYPE_UNSIGNED (type2))
+ {
+ if (!tmp || from_unsigned1 != from_unsigned2)
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, from_unsigned2),
+ NULL);
mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
}
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-07-14 14:28 ` Andrew Stubbs
@ 2011-07-14 14:31 ` Richard Guenther
2011-08-19 14:51 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-14 14:31 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jul 14, 2011 at 4:23 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 12/07/11 15:07, Andrew Stubbs wrote:
>>
>> This update does the same thing as before, but updated for the changes
>> earlier in the patch series. In particular, the build_and_insert_cast
>> function and find_widening_optab_handler_and_mode changes have been
>> moved up to patch 2.
>
> And this update changes the way the casts are handled, partly because it got
> unwieldy towards the end of the patch series, and partly because I found a
> few bugs.
>
> I've also ensured that it checks the precision of the types, rather than the
> mode size to ensure that it is bitfield safe.
>
> OK?
Ok.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-07-14 14:31 ` Richard Guenther
@ 2011-08-19 14:51 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 14:51 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 129 bytes --]
On 14/07/11 15:25, Richard Guenther wrote:
> Ok.
Committed, with no real changes. I just updated the testcase a little.
Andrew
[-- Attachment #2: widening-multiplies-4.patch --]
[-- Type: text/x-patch, Size: 7035 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_widen): Convert
unsupported unsigned multiplies to signed.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-6.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+ return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2068,12 +2068,13 @@ is_widening_mult_p (gimple stmt,
static bool
convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
- tree lhs, rhs1, rhs2, type, type1, type2, tmp;
+ tree lhs, rhs1, rhs2, type, type1, type2, tmp = NULL;
enum insn_code handler;
enum machine_mode to_mode, from_mode, actual_mode;
optab op;
int actual_precision;
location_t loc = gimple_location (stmt);
+ bool from_unsigned1, from_unsigned2;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2085,10 +2086,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
to_mode = TYPE_MODE (type);
from_mode = TYPE_MODE (type1);
+ from_unsigned1 = TYPE_UNSIGNED (type1);
+ from_unsigned2 = TYPE_UNSIGNED (type2);
- if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
+ if (from_unsigned1 && from_unsigned2)
op = umul_widen_optab;
- else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
+ else if (!from_unsigned1 && !from_unsigned2)
op = smul_widen_optab;
else
op = usmul_widen_optab;
@@ -2097,22 +2100,45 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
0, &actual_mode);
if (handler == CODE_FOR_nothing)
- return false;
+ {
+ if (op != smul_widen_optab)
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+
+ op = smul_widen_optab;
+ handler = find_widening_optab_handler_and_mode (op, to_mode,
+ from_mode, 0,
+ &actual_mode);
+
+ if (handler == CODE_FOR_nothing)
+ return false;
+
+ from_unsigned1 = from_unsigned2 = false;
+ }
+ else
+ return false;
+ }
/* Ensure that the inputs to the handler are in the correct precision
for the opcode. This will be the full mode size. */
actual_precision = GET_MODE_PRECISION (actual_mode);
- if (actual_precision != TYPE_PRECISION (type1))
+ if (actual_precision != TYPE_PRECISION (type1)
+ || from_unsigned1 != TYPE_UNSIGNED (type1))
{
tmp = create_tmp_var (build_nonstandard_integer_type
- (actual_precision, TYPE_UNSIGNED (type1)),
+ (actual_precision, from_unsigned1),
NULL);
rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
-
+ }
+ if (actual_precision != TYPE_PRECISION (type2)
+ || from_unsigned2 != TYPE_UNSIGNED (type2))
+ {
/* Reuse the same type info, if possible. */
- if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+ if (!tmp || from_unsigned1 != from_unsigned2)
tmp = create_tmp_var (build_nonstandard_integer_type
- (actual_precision, TYPE_UNSIGNED (type2)),
+ (actual_precision, from_unsigned2),
NULL);
rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
}
@@ -2137,7 +2163,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
{
gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
- tree type, type1, type2, tmp;
+ tree type, type1, type2, optype, tmp = NULL;
tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
optab this_optab;
@@ -2146,6 +2172,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
enum machine_mode to_mode, from_mode, actual_mode;
location_t loc = gimple_location (stmt);
int actual_precision;
+ bool from_unsigned1, from_unsigned2;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2239,9 +2266,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
to_mode = TYPE_MODE (type);
from_mode = TYPE_MODE (type1);
+ from_unsigned1 = TYPE_UNSIGNED (type1);
+ from_unsigned2 = TYPE_UNSIGNED (type2);
- if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
- return false;
+ /* There's no such thing as a mixed sign madd yet, so use a wider mode. */
+ if (from_unsigned1 != from_unsigned2)
+ {
+ enum machine_mode mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_PRECISION (mode) < GET_MODE_PRECISION (to_mode))
+ {
+ from_mode = mode;
+ from_unsigned1 = from_unsigned2 = false;
+ }
+ else
+ return false;
+ }
/* If there was a conversion between the multiply and addition
then we need to make sure it fits a multiply-and-accumulate.
@@ -2249,6 +2288,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
value. */
if (conv_stmt)
{
+ /* We use the original, unmodified data types for this. */
tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
@@ -2273,7 +2313,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
- this_optab = optab_for_tree_code (wmult_code, type1, optab_default);
+ optype = build_nonstandard_integer_type (from_mode, from_unsigned1);
+ this_optab = optab_for_tree_code (wmult_code, optype, optab_default);
handler = find_widening_optab_handler_and_mode (this_optab, to_mode,
from_mode, 0, &actual_mode);
@@ -2283,13 +2324,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
/* Ensure that the inputs to the handler are in the correct precision
for the opcode. This will be the full mode size. */
actual_precision = GET_MODE_PRECISION (actual_mode);
- if (actual_precision != TYPE_PRECISION (type1))
+ if (actual_precision != TYPE_PRECISION (type1)
+ || from_unsigned1 != TYPE_UNSIGNED (type1))
{
tmp = create_tmp_var (build_nonstandard_integer_type
- (actual_precision, TYPE_UNSIGNED (type1)),
+ (actual_precision, from_unsigned1),
NULL);
-
mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
+ }
+ if (actual_precision != TYPE_PRECISION (type2)
+ || from_unsigned2 != TYPE_UNSIGNED (type2))
+ {
+ if (!tmp || from_unsigned1 != from_unsigned2)
+ tmp = create_tmp_var (build_nonstandard_integer_type
+ (actual_precision, from_unsigned2),
+ NULL);
mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
}
* Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
2011-06-23 14:43 ` [PATCH (4/7)] Unsigned multiplies using wider signed multiplies Andrew Stubbs
2011-06-28 13:28 ` Andrew Stubbs
@ 2011-06-28 13:30 ` Paolo Bonzini
1 sibling, 0 replies; 107+ messages in thread
From: Paolo Bonzini @ 2011-06-28 13:30 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On 06/23/2011 04:41 PM, Andrew Stubbs wrote:
>
> I believe this should be legal as long as the top bit of both inputs is
> guaranteed to be zero. The code achieves this guarantee by
> zero-extending the inputs to a wider mode (which must still be narrower
> than the output mode).
Yes, that's correct.
Paolo
* [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
` (3 preceding siblings ...)
2011-06-23 14:43 ` [PATCH (4/7)] Unsigned multiplies using wider signed multiplies Andrew Stubbs
@ 2011-06-23 14:44 ` Andrew Stubbs
2011-06-28 15:44 ` Andrew Stubbs
2011-06-23 14:51 ` [PATCH (6/7)] More widening multiply-and-accumulate pattern matching Andrew Stubbs
` (4 subsequent siblings)
9 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-23 14:44 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 399 bytes --]
This patch removes the restriction that the inputs to a widening
multiply must be of the same mode.
It does this by extending the smaller of the two inputs to match the
larger; therefore, it remains the case that subsequent code (in the
expand pass, for example) can rely on the type of rhs1 being the input
type of the operation, and the gimple verification code is still valid.
OK?
Andrew
[-- Attachment #2: widening-multiplies-5.patch --]
[-- Type: text/x-patch, Size: 4152 bytes --]
2011-06-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
Ensure that the larger type is the first operand.
(convert_mult_to_widen): Insert cast if type2 is smaller than type1.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/smlalbb-2.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/smlalbb-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+unsigned long long
+foo (unsigned long long a, unsigned char *b, unsigned short *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2051,9 +2051,17 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
- /* FIXME: remove this restriction. */
- if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
- return false;
+ /* Ensure that the larger of the two operands comes first. */
+ if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
+ {
+ tree tmp;
+ tmp = *type1_out;
+ *type1_out = *type2_out;
+ *type2_out = tmp;
+ tmp = *rhs1_out;
+ *rhs1_out = *rhs2_out;
+ *rhs2_out = tmp;
+ }
return true;
}
@@ -2069,6 +2077,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
enum insn_code handler;
enum machine_mode to_mode, from_mode;
optab op;
+ int cast1 = false, cast2 = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2107,16 +2116,26 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
return false;
type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
-
- rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type1, NULL), rhs1, type1);
- rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type2, NULL), rhs2, type2);
+ cast1 = cast2 = true;
}
else
return false;
}
+ if (TYPE_MODE (type2) != from_mode)
+ {
+ type2 = lang_hooks.types.type_for_mode (from_mode,
+ TYPE_UNSIGNED (type2));
+ cast2 = true;
+ }
+
+ if (cast1)
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ if (cast2)
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
+
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2142,6 +2161,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
optab this_optab;
enum tree_code wmult_code;
enum insn_code handler;
+ int cast1 = false, cast2 = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2228,17 +2248,28 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
{
type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
- mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type1, NULL),
- mult_rhs1, type1);
- mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type2, NULL),
- mult_rhs2, type2);
+ cast1 = cast2 = true;
}
else
return false;
}
+ if (TYPE_MODE (type2) != TYPE_MODE (type1))
+ {
+ type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1),
+ TYPE_UNSIGNED (type2));
+ cast2 = true;
+ }
+
+ if (cast1)
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ if (cast2)
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
+
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
* Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
2011-06-23 14:44 ` [PATCH (5/7)] Widening multiplies for mis-matched mode inputs Andrew Stubbs
@ 2011-06-28 15:44 ` Andrew Stubbs
2011-07-04 14:29 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-28 15:44 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 507 bytes --]
On 23/06/11 15:41, Andrew Stubbs wrote:
> This patch removes the restriction that the inputs to a widening
> multiply must be of the same mode.
>
> It does this by extending the smaller of the two inputs to match the
> larger; therefore, it remains the case that subsequent code (in the
> expand pass, for example) can rely on the type of rhs1 being the input
> type of the operation, and the gimple verification code is still valid.
>
> OK?
This update fixes the testcase issue Janis highlighted.
Andrew
[-- Attachment #2: widening-multiplies-5.patch --]
[-- Type: text/x-patch, Size: 4144 bytes --]
2011-06-28 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
Ensure that the larger type is the first operand.
(convert_mult_to_widen): Insert cast if type2 is smaller than type1.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-7.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-7.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+unsigned long long
+foo (unsigned long long a, unsigned char *b, unsigned short *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2051,9 +2051,17 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
- /* FIXME: remove this restriction. */
- if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
- return false;
+ /* Ensure that the larger of the two operands comes first. */
+ if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
+ {
+ tree tmp;
+ tmp = *type1_out;
+ *type1_out = *type2_out;
+ *type2_out = tmp;
+ tmp = *rhs1_out;
+ *rhs1_out = *rhs2_out;
+ *rhs2_out = tmp;
+ }
return true;
}
@@ -2069,6 +2077,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
enum insn_code handler;
enum machine_mode to_mode, from_mode;
optab op;
+ int cast1 = false, cast2 = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2107,16 +2116,26 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
return false;
type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
-
- rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type1, NULL), rhs1, type1);
- rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type2, NULL), rhs2, type2);
+ cast1 = cast2 = true;
}
else
return false;
}
+ if (TYPE_MODE (type2) != from_mode)
+ {
+ type2 = lang_hooks.types.type_for_mode (from_mode,
+ TYPE_UNSIGNED (type2));
+ cast2 = true;
+ }
+
+ if (cast1)
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ if (cast2)
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
+
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2142,6 +2161,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
optab this_optab;
enum tree_code wmult_code;
enum insn_code handler;
+ int cast1 = false, cast2 = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2211,17 +2231,28 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
{
type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
- mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type1, NULL),
- mult_rhs1, type1);
- mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type2, NULL),
- mult_rhs2, type2);
+ cast1 = cast2 = true;
}
else
return false;
}
+ if (TYPE_MODE (type2) != TYPE_MODE (type1))
+ {
+ type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1),
+ TYPE_UNSIGNED (type2));
+ cast2 = true;
+ }
+
+ if (cast1)
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ if (cast2)
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
+
/* Verify that the machine can perform a widening multiply
accumulate in this mode/signedness combination, otherwise
this transformation is likely to pessimize code. */
* Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
2011-06-28 15:44 ` Andrew Stubbs
@ 2011-07-04 14:29 ` Andrew Stubbs
2011-07-07 10:11 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-04 14:29 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 671 bytes --]
On 28/06/11 16:08, Andrew Stubbs wrote:
> On 23/06/11 15:41, Andrew Stubbs wrote:
>> This patch removes the restriction that the inputs to a widening
>> multiply must be of the same mode.
>>
>> It does this by extending the smaller of the two inputs to match the
>> larger; therefore, it remains the case that subsequent code (in the
>> expand pass, for example) can rely on the type of rhs1 being the input
>> type of the operation, and the gimple verification code is still valid.
>>
>> OK?
>
> This update fixes the testcase issue Janis highlighted.
And this one updates the context changed by my update to patch 3.
The content of the patch has not changed.
Andrew
[-- Attachment #2: widening-multiplies-5.patch --]
[-- Type: text/x-patch, Size: 4121 bytes --]
2011-06-28 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
Ensure that the larger type is the first operand.
(convert_mult_to_widen): Insert cast if type2 is smaller than type1.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-7.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-7.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+unsigned long long
+foo (unsigned long long a, unsigned char *b, unsigned short *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2051,9 +2051,17 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
- /* FIXME: remove this restriction. */
- if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
- return false;
+ /* Ensure that the larger of the two operands comes first. */
+ if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
+ {
+ tree tmp;
+ tmp = *type1_out;
+ *type1_out = *type2_out;
+ *type2_out = tmp;
+ tmp = *rhs1_out;
+ *rhs1_out = *rhs2_out;
+ *rhs2_out = tmp;
+ }
return true;
}
@@ -2069,6 +2077,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
enum insn_code handler;
enum machine_mode to_mode, from_mode;
optab op;
+ int cast1 = false, cast2 = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2107,16 +2116,26 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
return false;
type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);
-
- rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type1, NULL), rhs1, type1);
- rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type2, NULL), rhs2, type2);
+ cast1 = cast2 = true;
}
else
return false;
}
+ if (TYPE_MODE (type2) != from_mode)
+ {
+ type2 = lang_hooks.types.type_for_mode (from_mode,
+ TYPE_UNSIGNED (type2));
+ cast2 = true;
+ }
+
+ if (cast1)
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ if (cast2)
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
+
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2215,6 +2234,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
optab this_optab;
enum tree_code wmult_code;
enum insn_code handler;
+ int cast1 = false, cast2 = false;
lhs = gimple_assign_lhs (stmt);
type = TREE_TYPE (lhs);
@@ -2302,17 +2322,28 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
{
type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
- mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type1, NULL),
- mult_rhs1, type1);
- mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
- create_tmp_var (type2, NULL),
- mult_rhs2, type2);
+ cast1 = cast2 = true;
}
else
return false;
}
+ if (TYPE_MODE (type2) != TYPE_MODE (type1))
+ {
+ type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1),
+ TYPE_UNSIGNED (type2));
+ cast2 = true;
+ }
+
+ if (cast1)
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ if (cast2)
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
+
+ /* Verify that the conversions between the mult and the add don't do
anything unexpected. */
if (!valid_types_for_madd_p (type1, type2, mult_rhs))
* Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
2011-07-04 14:29 ` Andrew Stubbs
@ 2011-07-07 10:11 ` Richard Guenther
2011-07-14 14:34 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-07 10:11 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Mon, Jul 4, 2011 at 4:29 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 28/06/11 16:08, Andrew Stubbs wrote:
>>
>> On 23/06/11 15:41, Andrew Stubbs wrote:
>>>
>>> This patch removes the restriction that the inputs to a widening
>>> multiply must be of the same mode.
>>>
>>> It does this by extending the smaller of the two inputs to match the
>>> larger; therefore, it remains the case that subsequent code (in the
>>> expand pass, for example) can rely on the type of rhs1 being the input
>>> type of the operation, and the gimple verification code is still valid.
>>>
>>> OK?
>>
>> This update fixes the testcase issue Janis highlighted.
>
> And this one updates the context changed by my update to patch 3.
>
> The content of the patch has not changed.
Similar to the previous patch
+ if (TYPE_MODE (type2) != from_mode)
+ {
+ type2 = lang_hooks.types.type_for_mode (from_mode,
+ TYPE_UNSIGNED (type2));
use build_nonstandard_integer_type.
+ if (cast1)
+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+ if (cast2)
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);
and CSE create_tmp_var - at this point type1 and type2 should be
the same, right? So I guess it would be a good place to assert
types_compatible_p (type1, type2).
gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));
and that's now seemingly redundant ... it should probably be
gimple_assign_set_rhs1 (stmt, rhs1);, no? A conversion isn't
a valid rhs1/2. Similar oddity in convert_plusminus_to_widen.
+ if (TYPE_MODE (type2) != TYPE_MODE (type1))
+ {
+ type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1),
+ TYPE_UNSIGNED (type2));
+ cast2 = true;
+ }
+
+ if (cast1)
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL),
+ mult_rhs1, type1);
+ if (cast2)
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL),
+ mult_rhs2, type2);
see above.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
2011-07-07 10:11 ` Richard Guenther
@ 2011-07-14 14:34 ` Andrew Stubbs
2011-07-14 14:35 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-14 14:34 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 161 bytes --]
I've updated this patch following the changes earlier in the patch
series. There isn't much left.
This should obviate all the review comments. :)
OK?
Andrew
[-- Attachment #2: widening-multiplies-5.patch --]
[-- Type: text/x-patch, Size: 1161 bytes --]
2011-07-14 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
Ensure that the larger type is the first operand.
gcc/testsuite/
* gcc.target/arm/wmul-7.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-7.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+unsigned long long
+foo (unsigned long long a, unsigned char *b, unsigned short *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2053,9 +2053,17 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
- /* FIXME: remove this restriction. */
- if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
- return false;
+ /* Ensure that the larger of the two operands comes first. */
+ if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
+ {
+ tree tmp;
+ tmp = *type1_out;
+ *type1_out = *type2_out;
+ *type2_out = tmp;
+ tmp = *rhs1_out;
+ *rhs1_out = *rhs2_out;
+ *rhs2_out = tmp;
+ }
return true;
}
* Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
2011-07-14 14:34 ` Andrew Stubbs
@ 2011-07-14 14:35 ` Richard Guenther
2011-08-19 14:54 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-14 14:35 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jul 14, 2011 at 4:28 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> I've updated this patch following the changes earlier in the patch series.
> There isn't much left.
>
> This should obviate all the review comments. :)
Indeed ;)
> OK?
Ok.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
2011-07-14 14:35 ` Richard Guenther
@ 2011-08-19 14:54 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 14:54 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 144 bytes --]
On 14/07/11 15:31, Richard Guenther wrote:
> Ok.
I've just committed this patch with no real changes, other than an
updated testcase.
Andrew
[-- Attachment #2: widening-multiplies-5.patch --]
[-- Type: text/x-patch, Size: 1193 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
Ensure that the larger type is the first operand.
gcc/testsuite/
* gcc.target/arm/wmul-7.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-7.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+unsigned long long
+foo (unsigned long long a, unsigned char *b, unsigned short *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2054,9 +2054,17 @@ is_widening_mult_p (gimple stmt,
*type2_out = *type1_out;
}
- /* FIXME: remove this restriction. */
- if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
- return false;
+ /* Ensure that the larger of the two operands comes first. */
+ if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
+ {
+ tree tmp;
+ tmp = *type1_out;
+ *type1_out = *type2_out;
+ *type2_out = tmp;
+ tmp = *rhs1_out;
+ *rhs1_out = *rhs2_out;
+ *rhs2_out = tmp;
+ }
return true;
}
* [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
` (4 preceding siblings ...)
2011-06-23 14:44 ` [PATCH (5/7)] Widening multiplies for mis-matched mode inputs Andrew Stubbs
@ 2011-06-23 14:51 ` Andrew Stubbs
2011-06-28 15:49 ` Andrew Stubbs
2011-06-23 14:54 ` [PATCH (7/7)] Mixed-sign multiplies using narrowest mode Andrew Stubbs
` (3 subsequent siblings)
9 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-23 14:51 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 790 bytes --]
This patch fixes the case where widening multiply-and-accumulate operations
were not recognised because the multiplication itself is not actually widening.
This can happen when you have "DI + SI * SI" - the multiplication will
be done in SImode as a non-widening multiply, and it's only the final
accumulate step that is widening.
This was not recognised for two reasons:
1. is_widening_mult_p inferred the output type from the multiply
statement, which is not useful in this case.
2. The inputs to the multiply instruction may not have been converted at
all (because they're not being widened), so the pattern match failed.
The patch fixes these issues by making the output type explicit, and by
permitting unconverted inputs (the types are still checked, so this is
safe).
OK?
Andrew
[-- Attachment #2: widening-multiplies-6.patch --]
[-- Type: text/x-patch, Size: 5025 bytes --]
2011-06-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
'type'.
Use 'type' from caller, not inferred from 'rhs'.
Don't reject non-conversion statements. Do return lhs in this case.
(is_widening_mult_p): Add new argument 'type'.
Use 'type' from caller, not inferred from 'stmt'.
Pass type to is_widening_mult_rhs_p.
(convert_mult_to_widen): Pass type to is_widening_mult_p.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/smlal-1.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/smlal-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1963,7 +1963,8 @@ struct gimple_opt_pass pass_optimize_bswap =
}
};
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+ assuming a target type of TYPE.
There are two cases:
- RHS makes some value at least twice as wide. Store that value
@@ -1973,32 +1974,32 @@ struct gimple_opt_pass pass_optimize_bswap =
but leave *TYPE_OUT untouched. */
static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+ tree *new_rhs_out)
{
gimple stmt;
- tree type, type1, rhs1;
+ tree type1, rhs1;
enum tree_code rhs_code;
if (TREE_CODE (rhs) == SSA_NAME)
{
- type = TREE_TYPE (rhs);
stmt = SSA_NAME_DEF_STMT (rhs);
if (!is_gimple_assign (stmt))
return false;
- rhs_code = gimple_assign_rhs_code (stmt);
- if (TREE_CODE (type) == INTEGER_TYPE
- ? !CONVERT_EXPR_CODE_P (rhs_code)
- : rhs_code != FIXED_CONVERT_EXPR)
- return false;
-
rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
- *new_rhs_out = rhs1;
+ rhs_code = gimple_assign_rhs_code (stmt);
+ if (TREE_CODE (type) == INTEGER_TYPE
+ ? !CONVERT_EXPR_CODE_P (rhs_code)
+ : rhs_code != FIXED_CONVERT_EXPR)
+ *new_rhs_out = gimple_assign_lhs (stmt);
+ else
+ *new_rhs_out = rhs1;
*type_out = type1;
return true;
}
@@ -2013,28 +2014,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
return false;
}
-/* Return true if STMT performs a widening multiplication. If so,
- store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
- respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting
- those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
- operands of the multiplication. */
+/* Return true if STMT performs a widening multiplication, assuming the
+ output type is TYPE. If so, store the unwidened types of the operands
+ in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and
+ *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+ and *TYPE2_OUT would give the operands of the multiplication. */
static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
tree *type1_out, tree *rhs1_out,
tree *type2_out, tree *rhs2_out)
{
- tree type;
-
- type = TREE_TYPE (gimple_assign_lhs (stmt));
if (TREE_CODE (type) != INTEGER_TYPE
&& TREE_CODE (type) != FIXED_POINT_TYPE)
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+ rhs1_out))
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
+ rhs2_out))
return false;
if (*type1_out == NULL)
@@ -2084,7 +2084,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
if (TREE_CODE (type) != INTEGER_TYPE)
return false;
- if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
+ if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2))
return false;
to_mode = TYPE_MODE (type);
@@ -2210,14 +2210,14 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
{
- if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
}
else if (rhs2_code == MULT_EXPR)
{
- if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
* Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-06-23 14:51 ` [PATCH (6/7)] More widening multiply-and-accumulate pattern matching Andrew Stubbs
@ 2011-06-28 15:49 ` Andrew Stubbs
2011-07-04 14:32 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-28 15:49 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 899 bytes --]
On 23/06/11 15:42, Andrew Stubbs wrote:
> This patch fixes the case where widening multiply-and-accumulate operations
> were not recognised because the multiplication itself is not actually widening.
>
> This can happen when you have "DI + SI * SI" - the multiplication will
> be done in SImode as a non-widening multiply, and it's only the final
> accumulate step that is widening.
>
> This was not recognised for two reasons:
>
> 1. is_widening_mult_p inferred the output type from the multiply
> statement, which is not useful in this case.
>
> 2. The inputs to the multiply instruction may not have been converted at
> all (because they're not being widened), so the pattern match failed.
>
> The patch fixes these issues by making the output type explicit, and by
> permitting unconverted inputs (the types are still checked, so this is
> safe).
>
> OK?
This update fixes Janis' testsuite issue.
Andrew
[-- Attachment #2: widening-multiplies-6.patch --]
[-- Type: text/x-patch, Size: 5023 bytes --]
2011-06-28 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
'type'.
Use 'type' from caller, not inferred from 'rhs'.
Don't reject non-conversion statements. Do return lhs in this case.
(is_widening_mult_p): Add new argument 'type'.
Use 'type' from caller, not inferred from 'stmt'.
Pass type to is_widening_mult_rhs_p.
(convert_mult_to_widen): Pass type to is_widening_mult_p.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-8.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1963,7 +1963,8 @@ struct gimple_opt_pass pass_optimize_bswap =
}
};
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+ assuming a target type of TYPE.
There are two cases:
- RHS makes some value at least twice as wide. Store that value
@@ -1973,32 +1974,32 @@ struct gimple_opt_pass pass_optimize_bswap =
but leave *TYPE_OUT untouched. */
static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+ tree *new_rhs_out)
{
gimple stmt;
- tree type, type1, rhs1;
+ tree type1, rhs1;
enum tree_code rhs_code;
if (TREE_CODE (rhs) == SSA_NAME)
{
- type = TREE_TYPE (rhs);
stmt = SSA_NAME_DEF_STMT (rhs);
if (!is_gimple_assign (stmt))
return false;
- rhs_code = gimple_assign_rhs_code (stmt);
- if (TREE_CODE (type) == INTEGER_TYPE
- ? !CONVERT_EXPR_CODE_P (rhs_code)
- : rhs_code != FIXED_CONVERT_EXPR)
- return false;
-
rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
- *new_rhs_out = rhs1;
+ rhs_code = gimple_assign_rhs_code (stmt);
+ if (TREE_CODE (type) == INTEGER_TYPE
+ ? !CONVERT_EXPR_CODE_P (rhs_code)
+ : rhs_code != FIXED_CONVERT_EXPR)
+ *new_rhs_out = gimple_assign_lhs (stmt);
+ else
+ *new_rhs_out = rhs1;
*type_out = type1;
return true;
}
@@ -2013,28 +2014,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
return false;
}
-/* Return true if STMT performs a widening multiplication. If so,
- store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
- respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting
- those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
- operands of the multiplication. */
+/* Return true if STMT performs a widening multiplication, assuming the
+ output type is TYPE. If so, store the unwidened types of the operands
+ in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and
+ *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+ and *TYPE2_OUT would give the operands of the multiplication. */
static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
tree *type1_out, tree *rhs1_out,
tree *type2_out, tree *rhs2_out)
{
- tree type;
-
- type = TREE_TYPE (gimple_assign_lhs (stmt));
if (TREE_CODE (type) != INTEGER_TYPE
&& TREE_CODE (type) != FIXED_POINT_TYPE)
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+ rhs1_out))
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
+ rhs2_out))
return false;
if (*type1_out == NULL)
@@ -2084,7 +2084,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
if (TREE_CODE (type) != INTEGER_TYPE)
return false;
- if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
+ if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2))
return false;
to_mode = TYPE_MODE (type);
@@ -2193,14 +2193,14 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
{
- if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
}
else if (rhs2_code == MULT_EXPR)
{
- if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
* Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-06-28 15:49 ` Andrew Stubbs
@ 2011-07-04 14:32 ` Andrew Stubbs
2011-07-07 10:20 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-04 14:32 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]
On 28/06/11 16:30, Andrew Stubbs wrote:
> On 23/06/11 15:42, Andrew Stubbs wrote:
>> This patch fixes the case where widening multiply-and-accumulate
>> operations were not recognised because the multiplication itself is not
>> actually widening.
>>
>> This can happen when you have "DI + SI * SI" - the multiplication will
>> be done in SImode as a non-widening multiply, and it's only the final
>> accumulate step that is widening.
>>
>> This was not recognised for two reasons:
>>
>> 1. is_widening_mult_p inferred the output type from the multiply
>> statement, which is not useful in this case.
>>
>> 2. The inputs to the multiply instruction may not have been converted at
>> all (because they're not being widened), so the pattern match failed.
>>
>> The patch fixes these issues by making the output type explicit, and by
>> permitting unconverted inputs (the types are still checked, so this is
>> safe).
>>
>> OK?
>
> This update fixes Janis' testsuite issue.
This updates the context changed by my update to patch 3.
The content of this patch has not changed.
Andrew
[-- Attachment #2: widening-multiplies-6.patch --]
[-- Type: text/x-patch, Size: 5113 bytes --]
2011-07-04 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
'type'.
Use 'type' from caller, not inferred from 'rhs'.
Don't reject non-conversion statements. Do return lhs in this case.
(is_widening_mult_p): Add new argument 'type'.
Use 'type' from caller, not inferred from 'stmt'.
Pass type to is_widening_mult_rhs_p.
(convert_mult_to_widen): Pass type to is_widening_mult_p.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-8.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1963,7 +1963,8 @@ struct gimple_opt_pass pass_optimize_bswap =
}
};
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+ assuming a target type of TYPE.
There are two cases:
- RHS makes some value at least twice as wide. Store that value
@@ -1973,32 +1974,32 @@ struct gimple_opt_pass pass_optimize_bswap =
but leave *TYPE_OUT untouched. */
static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+ tree *new_rhs_out)
{
gimple stmt;
- tree type, type1, rhs1;
+ tree type1, rhs1;
enum tree_code rhs_code;
if (TREE_CODE (rhs) == SSA_NAME)
{
- type = TREE_TYPE (rhs);
stmt = SSA_NAME_DEF_STMT (rhs);
if (!is_gimple_assign (stmt))
return false;
- rhs_code = gimple_assign_rhs_code (stmt);
- if (TREE_CODE (type) == INTEGER_TYPE
- ? !CONVERT_EXPR_CODE_P (rhs_code)
- : rhs_code != FIXED_CONVERT_EXPR)
- return false;
-
rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
- *new_rhs_out = rhs1;
+ rhs_code = gimple_assign_rhs_code (stmt);
+ if (TREE_CODE (type) == INTEGER_TYPE
+ ? !CONVERT_EXPR_CODE_P (rhs_code)
+ : rhs_code != FIXED_CONVERT_EXPR)
+ *new_rhs_out = gimple_assign_lhs (stmt);
+ else
+ *new_rhs_out = rhs1;
*type_out = type1;
return true;
}
@@ -2013,28 +2014,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
return false;
}
-/* Return true if STMT performs a widening multiplication. If so,
- store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
- respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting
- those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
- operands of the multiplication. */
+/* Return true if STMT performs a widening multiplication, assuming the
+ output type is TYPE. If so, store the unwidened types of the operands
+ in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and
+ *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+ and *TYPE2_OUT would give the operands of the multiplication. */
static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
tree *type1_out, tree *rhs1_out,
tree *type2_out, tree *rhs2_out)
{
- tree type;
-
- type = TREE_TYPE (gimple_assign_lhs (stmt));
if (TREE_CODE (type) != INTEGER_TYPE
&& TREE_CODE (type) != FIXED_POINT_TYPE)
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+ rhs1_out))
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
+ rhs2_out))
return false;
if (*type1_out == NULL)
@@ -2084,7 +2084,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
if (TREE_CODE (type) != INTEGER_TYPE)
return false;
- if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
+ if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2))
return false;
to_mode = TYPE_MODE (type);
@@ -2280,7 +2280,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
{
- if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
mult_rhs = rhs1;
@@ -2288,7 +2288,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
}
else if (rhs2_code == MULT_EXPR)
{
- if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
mult_rhs = rhs2;
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-07-04 14:32 ` Andrew Stubbs
@ 2011-07-07 10:20 ` Richard Guenther
2011-07-14 14:35 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-07 10:20 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Mon, Jul 4, 2011 at 4:31 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 28/06/11 16:30, Andrew Stubbs wrote:
>>
>> On 23/06/11 15:42, Andrew Stubbs wrote:
>>>
>>> This patch fixes the case where widening multiply-and-accumulate
>>> operations were not recognised because the multiplication itself is
>>> not actually widening.
>>>
>>> This can happen when you have "DI + SI * SI" - the multiplication will
>>> be done in SImode as a non-widening multiply, and it's only the final
>>> accumulate step that is widening.
>>>
>>> This was not recognised for two reasons:
>>>
>>> 1. is_widening_mult_p inferred the output type from the multiply
>>> statement, which is not useful in this case.
>>>
>>> 2. The inputs to the multiply instruction may not have been converted at
>>> all (because they're not being widened), so the pattern match failed.
>>>
>>> The patch fixes these issues by making the output type explicit, and by
>>> permitting unconverted inputs (the types are still checked, so this is
>>> safe).
>>>
>>> OK?
>>
>> This update fixes Janis' testsuite issue.
>
> This updates the context changed by my update to patch 3.
>
> The content of this patch has not changed.
Ok.
Thanks,
Richard.
> Andrew
>
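The "DI + SI * SI" shape discussed in this message can be sketched in C (function and variable names here are illustrative, not from the patch):

```c
/* acc is DImode (long long); *b and *c are SImode (int).  The
   multiply *b * *c is evaluated as a plain 32-bit, non-widening
   multiply; only the final accumulate into the 64-bit result
   widens, which is the case the old pattern matcher missed.  */
long long
mac (long long acc, int *b, int *c)
{
  return acc + *b * *c;
}
```

On targets with a widening multiply-accumulate (e.g. ARM `smlal`), this is the shape the pass now recognises.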
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-07-07 10:20 ` Richard Guenther
@ 2011-07-14 14:35 ` Andrew Stubbs
2011-07-14 14:41 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-14 14:35 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 527 bytes --]
On 07/07/11 11:13, Richard Guenther wrote:
>> This updates the context changed by my update to patch 3.
>> >
>> > The content of this patch has not changed.
> Ok.
I know this patch was already approved, but I discovered a bug in this
patch that missed optimizing the case where the input to the multiply did
not come from an assign statement (this can happen when the value comes
from a function parameter).
This patch fixes that case, and updates the context changed by my
updates earlier in the patch series.
OK?
Andrew
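A minimal sketch of the newly handled case (hypothetical function name): the multiply operands come straight from function parameters, so in the GIMPLE IL their SSA names are not defined by any assign statement at all.

```c
/* b and c are parameters: their SSA names have no defining
   conversion statement, which the previous version of
   is_widening_mult_rhs_p rejected even though the types still
   permit a widening multiply-accumulate.  */
long long
madd (long long acc, int b, int c)
{
  return acc + b * c;  /* 32-bit multiply, widening accumulate.  */
}
```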
[-- Attachment #2: widening-multiplies-6.patch --]
[-- Type: text/x-patch, Size: 5369 bytes --]
2011-07-14 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
'type'.
Use 'type' from caller, not inferred from 'rhs'.
Don't reject non-conversion statements. Do return lhs in this case.
(is_widening_mult_p): Add new argument 'type'.
Use 'type' from caller, not inferred from 'stmt'.
Pass type to is_widening_mult_rhs_p.
(convert_mult_to_widen): Pass type to is_widening_mult_p.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-8.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1965,7 +1965,8 @@ struct gimple_opt_pass pass_optimize_bswap =
}
};
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+ assuming a target type of TYPE.
There are two cases:
- RHS makes some value at least twice as wide. Store that value
@@ -1975,32 +1976,43 @@ struct gimple_opt_pass pass_optimize_bswap =
but leave *TYPE_OUT untouched. */
static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+ tree *new_rhs_out)
{
gimple stmt;
- tree type, type1, rhs1;
+ tree type1, rhs1;
enum tree_code rhs_code;
if (TREE_CODE (rhs) == SSA_NAME)
{
- type = TREE_TYPE (rhs);
stmt = SSA_NAME_DEF_STMT (rhs);
if (!is_gimple_assign (stmt))
- return false;
-
- rhs_code = gimple_assign_rhs_code (stmt);
- if (TREE_CODE (type) == INTEGER_TYPE
- ? !CONVERT_EXPR_CODE_P (rhs_code)
- : rhs_code != FIXED_CONVERT_EXPR)
- return false;
+ {
+ rhs1 = NULL;
+ type1 = TREE_TYPE (rhs);
+ }
+ else
+ {
+ rhs1 = gimple_assign_rhs1 (stmt);
+ type1 = TREE_TYPE (rhs1);
+ }
- rhs1 = gimple_assign_rhs1 (stmt);
- type1 = TREE_TYPE (rhs1);
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
- *new_rhs_out = rhs1;
+ if (rhs1)
+ {
+ rhs_code = gimple_assign_rhs_code (stmt);
+ if (TREE_CODE (type) == INTEGER_TYPE
+ ? !CONVERT_EXPR_CODE_P (rhs_code)
+ : rhs_code != FIXED_CONVERT_EXPR)
+ *new_rhs_out = rhs;
+ else
+ *new_rhs_out = rhs1;
+ }
+ else
+ *new_rhs_out = rhs;
*type_out = type1;
return true;
}
@@ -2015,28 +2027,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
return false;
}
-/* Return true if STMT performs a widening multiplication. If so,
- store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
- respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting
- those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
- operands of the multiplication. */
+/* Return true if STMT performs a widening multiplication, assuming the
+ output type is TYPE. If so, store the unwidened types of the operands
+ in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and
+ *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+ and *TYPE2_OUT would give the operands of the multiplication. */
static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
tree *type1_out, tree *rhs1_out,
tree *type2_out, tree *rhs2_out)
{
- tree type;
-
- type = TREE_TYPE (gimple_assign_lhs (stmt));
if (TREE_CODE (type) != INTEGER_TYPE
&& TREE_CODE (type) != FIXED_POINT_TYPE)
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+ rhs1_out))
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
+ rhs2_out))
return false;
if (*type1_out == NULL)
@@ -2088,7 +2099,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
if (TREE_CODE (type) != INTEGER_TYPE)
return false;
- if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
+ if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2))
return false;
to_mode = TYPE_MODE (type);
@@ -2254,7 +2265,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (code == PLUS_EXPR
&& (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
{
- if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
@@ -2262,7 +2273,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
}
else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
- if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
* Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-07-14 14:35 ` Andrew Stubbs
@ 2011-07-14 14:41 ` Richard Guenther
2011-08-19 15:03 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-14 14:41 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jul 14, 2011 at 4:34 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 07/07/11 11:13, Richard Guenther wrote:
>>>
>>> This updates the context changed by my update to patch 3.
>>> >
>>> > The content of this patch has not changed.
>>
>> Ok.
>
> I know this patch was already approved, but I discovered a bug in this patch
> that missed optimizing the case where the input to the multiply did not come
> from an assign statement (this can happen when the value comes from a
> function parameter).
>
> This patch fixes that case, and updates the context changed by my updates
> earlier in the patch series.
>
> OK?
Ok.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-07-14 14:41 ` Richard Guenther
@ 2011-08-19 15:03 ` Andrew Stubbs
2011-10-13 16:25 ` Matthew Gretton-Dann
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 15:03 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 366 bytes --]
On 14/07/11 15:35, Richard Guenther wrote:
> Ok.
I've just committed this updated patch.
I found bugs with VOIDmode constants that have caused me to recast my
patches to is_widening_mult_rhs_p. They should be logically the same for
non-VOIDmode cases, but work correctly for constants. I think the new
version is a bit easier to understand in any case.
Andrew
[-- Attachment #2: widening-multiplies-6.patch --]
[-- Type: text/x-patch, Size: 5194 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
'type'.
Use 'type' from caller, not inferred from 'rhs'.
Don't reject non-conversion statements. Do return lhs in this case.
(is_widening_mult_p): Add new argument 'type'.
Use 'type' from caller, not inferred from 'stmt'.
Pass type to is_widening_mult_rhs_p.
(convert_mult_to_widen): Pass type to is_widening_mult_p.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-8.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1966,7 +1966,8 @@ struct gimple_opt_pass pass_optimize_bswap =
}
};
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+ assuming a target type of TYPE.
There are two cases:
- RHS makes some value at least twice as wide. Store that value
@@ -1976,27 +1977,31 @@ struct gimple_opt_pass pass_optimize_bswap =
but leave *TYPE_OUT untouched. */
static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+ tree *new_rhs_out)
{
gimple stmt;
- tree type, type1, rhs1;
+ tree type1, rhs1;
enum tree_code rhs_code;
if (TREE_CODE (rhs) == SSA_NAME)
{
- type = TREE_TYPE (rhs);
stmt = SSA_NAME_DEF_STMT (rhs);
- if (!is_gimple_assign (stmt))
- return false;
-
- rhs_code = gimple_assign_rhs_code (stmt);
- if (TREE_CODE (type) == INTEGER_TYPE
- ? !CONVERT_EXPR_CODE_P (rhs_code)
- : rhs_code != FIXED_CONVERT_EXPR)
- return false;
+ if (is_gimple_assign (stmt))
+ {
+ rhs_code = gimple_assign_rhs_code (stmt);
+ if (TREE_CODE (type) == INTEGER_TYPE
+ ? !CONVERT_EXPR_CODE_P (rhs_code)
+ : rhs_code != FIXED_CONVERT_EXPR)
+ rhs1 = rhs;
+ else
+ rhs1 = gimple_assign_rhs1 (stmt);
+ }
+ else
+ rhs1 = rhs;
- rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
+
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
@@ -2016,28 +2021,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
return false;
}
-/* Return true if STMT performs a widening multiplication. If so,
- store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
- respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting
- those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
- operands of the multiplication. */
+/* Return true if STMT performs a widening multiplication, assuming the
+ output type is TYPE. If so, store the unwidened types of the operands
+ in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and
+ *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+ and *TYPE2_OUT would give the operands of the multiplication. */
static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
tree *type1_out, tree *rhs1_out,
tree *type2_out, tree *rhs2_out)
{
- tree type;
-
- type = TREE_TYPE (gimple_assign_lhs (stmt));
if (TREE_CODE (type) != INTEGER_TYPE
&& TREE_CODE (type) != FIXED_POINT_TYPE)
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+ rhs1_out))
return false;
- if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+ if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
+ rhs2_out))
return false;
if (*type1_out == NULL)
@@ -2089,7 +2093,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
if (TREE_CODE (type) != INTEGER_TYPE)
return false;
- if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
+ if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2))
return false;
to_mode = TYPE_MODE (type);
@@ -2255,7 +2259,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (code == PLUS_EXPR
&& (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
{
- if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs1_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs2;
@@ -2263,7 +2267,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
}
else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
- if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
+ if (!is_widening_mult_p (type, rhs2_stmt, &type1, &mult_rhs1,
&type2, &mult_rhs2))
return false;
add_rhs = rhs1;
* Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
2011-08-19 15:03 ` Andrew Stubbs
@ 2011-10-13 16:25 ` Matthew Gretton-Dann
0 siblings, 0 replies; 107+ messages in thread
From: Matthew Gretton-Dann @ 2011-10-13 16:25 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
This patch seems to have caused PR50717:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50717
Thanks,
Matt
On 19/08/11 15:49, Andrew Stubbs wrote:
> On 14/07/11 15:35, Richard Guenther wrote:
>> Ok.
>
> I've just committed this updated patch.
>
> I found bugs with VOIDmode constants that have caused me to recast my
> patches to is_widening_mult_rhs_p. They should be logically the same for
> non-VOIDmode cases, but work correctly for constants. I think the new
> version is a bit easier to understand in any case.
>
> Andrew
>
>
> widening-multiplies-6.patch
>
>
> 2011-08-19 Andrew Stubbs<ams@codesourcery.com>
>
> gcc/
> * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
> 'type'.
> Use 'type' from caller, not inferred from 'rhs'.
> Don't reject non-conversion statements. Do return lhs in this case.
> (is_widening_mult_p): Add new argument 'type'.
> Use 'type' from caller, not inferred from 'stmt'.
> Pass type to is_widening_mult_rhs_p.
> (convert_mult_to_widen): Pass type to is_widening_mult_p.
> (convert_plusminus_to_widen): Likewise.
>
> gcc/testsuite/
> * gcc.target/arm/wmul-8.c: New file.
>
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-require-effective-target arm_dsp } */
> +
> +long long
> +foo (long long a, int *b, int *c)
> +{
> + return a + *b * *c;
> +}
> +
> +/* { dg-final { scan-assembler "smlal" } } */
> --- a/gcc/tree-ssa-math-opts.c
> +++ b/gcc/tree-ssa-math-opts.c
> @@ -1966,7 +1966,8 @@ struct gimple_opt_pass pass_optimize_bswap =
> }
> };
>
> -/* Return true if RHS is a suitable operand for a widening multiplication.
> +/* Return true if RHS is a suitable operand for a widening multiplication,
> + assuming a target type of TYPE.
> There are two cases:
>
> - RHS makes some value at least twice as wide. Store that value
> @@ -1976,27 +1977,31 @@ struct gimple_opt_pass pass_optimize_bswap =
> but leave *TYPE_OUT untouched. */
>
> static bool
> -is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
> +is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
> + tree *new_rhs_out)
> {
> gimple stmt;
> - tree type, type1, rhs1;
> + tree type1, rhs1;
> enum tree_code rhs_code;
>
> if (TREE_CODE (rhs) == SSA_NAME)
> {
> - type = TREE_TYPE (rhs);
> stmt = SSA_NAME_DEF_STMT (rhs);
> - if (!is_gimple_assign (stmt))
> - return false;
> -
> - rhs_code = gimple_assign_rhs_code (stmt);
> - if (TREE_CODE (type) == INTEGER_TYPE
> - ? !CONVERT_EXPR_CODE_P (rhs_code)
> - : rhs_code != FIXED_CONVERT_EXPR)
> - return false;
> + if (is_gimple_assign (stmt))
> + {
> + rhs_code = gimple_assign_rhs_code (stmt);
> + if (TREE_CODE (type) == INTEGER_TYPE
> + ? !CONVERT_EXPR_CODE_P (rhs_code)
> + : rhs_code != FIXED_CONVERT_EXPR)
> + rhs1 = rhs;
> + else
> + rhs1 = gimple_assign_rhs1 (stmt);
> + }
> + else
> + rhs1 = rhs;
>
> - rhs1 = gimple_assign_rhs1 (stmt);
> type1 = TREE_TYPE (rhs1);
> +
> if (TREE_CODE (type1) != TREE_CODE (type)
> || TYPE_PRECISION (type1) * 2> TYPE_PRECISION (type))
> return false;
> @@ -2016,28 +2021,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
> return false;
> }
>
> -/* Return true if STMT performs a widening multiplication. If so,
> - store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
> - respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting
> - those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
> - operands of the multiplication. */
> +/* Return true if STMT performs a widening multiplication, assuming the
> + output type is TYPE. If so, store the unwidened types of the operands
> + in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and
> + *RHS2_OUT such that converting those operands to types *TYPE1_OUT
> + and *TYPE2_OUT would give the operands of the multiplication. */
>
> static bool
> -is_widening_mult_p (gimple stmt,
> +is_widening_mult_p (tree type, gimple stmt,
> tree *type1_out, tree *rhs1_out,
> tree *type2_out, tree *rhs2_out)
> {
> - tree type;
> -
> - type = TREE_TYPE (gimple_assign_lhs (stmt));
> if (TREE_CODE (type) != INTEGER_TYPE
> && TREE_CODE (type) != FIXED_POINT_TYPE)
> return false;
>
> - if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
> + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
> + rhs1_out))
> return false;
>
> - if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
> + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
> + rhs2_out))
> return false;
>
> if (*type1_out == NULL)
> @@ -2089,7 +2093,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
> if (TREE_CODE (type) != INTEGER_TYPE)
> return false;
>
> - if (!is_widening_mult_p (stmt,&type1,&rhs1,&type2,&rhs2))
> + if (!is_widening_mult_p (type, stmt,&type1,&rhs1,&type2,&rhs2))
> return false;
>
> to_mode = TYPE_MODE (type);
> @@ -2255,7 +2259,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
> if (code == PLUS_EXPR
> && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
> {
> - if (!is_widening_mult_p (rhs1_stmt,&type1,&mult_rhs1,
> + if (!is_widening_mult_p (type, rhs1_stmt,&type1,&mult_rhs1,
> &type2,&mult_rhs2))
> return false;
> add_rhs = rhs2;
> @@ -2263,7 +2267,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
> }
> else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
> {
> - if (!is_widening_mult_p (rhs2_stmt,&type1,&mult_rhs1,
> + if (!is_widening_mult_p (type, rhs2_stmt,&type1,&mult_rhs1,
> &type2,&mult_rhs2))
> return false;
> add_rhs = rhs1;
--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd
* [PATCH (7/7)] Mixed-sign multiplies using narrowest mode
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
` (5 preceding siblings ...)
2011-06-23 14:51 ` [PATCH (6/7)] More widening multiply-and-accumulate pattern matching Andrew Stubbs
@ 2011-06-23 14:54 ` Andrew Stubbs
2011-06-28 17:02 ` Andrew Stubbs
2011-06-25 16:14 ` [PATCH (0/7)] Improve use of Widening Multiplies Bernd Schmidt
` (2 subsequent siblings)
9 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-23 14:54 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 733 bytes --]
Patch 4 introduced support for using signed multiplies to code unsigned
multiplies in a narrower mode. Patch 5 then introduced support for
mis-matched input modes.
These two combined mean that there is a case where only the smaller of two
inputs is unsigned, and yet it still tries to use a mode wider than the
larger, signed input. This is bad because it means unnecessary extends
and because the wider operation might not exist.
This patch catches that case, and ensures that the smaller, unsigned
input is zero-extended to match the mode of the larger, signed input.
Of course, both inputs may still have to be extended to fit the nearest
available instruction, so it doesn't make a difference every time.
OK?
Andrew
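The mixed-sign case described above can be sketched in C (names are illustrative; the attachment's test uses plain `char`, which is unsigned by default on ARM):

```c
/* Only the smaller operand (*c, 8-bit) is unsigned; *b (16-bit) is
   signed.  Zero-extending *c to the 16-bit width of *b lets a signed
   16x16->64 multiply-accumulate (e.g. ARM smlalbb) be used, rather
   than widening both operands to 32 bits.  */
long long
mixed_mac (long long a, short *b, unsigned char *c)
{
  return a + *b * *c;
}
```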
[-- Attachment #2: widening-multiplies-7.patch --]
[-- Type: text/x-patch, Size: 2437 bytes --]
2011-06-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_widen): Better handle
unsigned inputs of different modes.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/smlalbb-3.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/smlalbb-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, short *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2103,9 +2103,17 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
if (op != smul_widen_optab)
{
- from_mode = GET_MODE_WIDER_MODE (from_mode);
- if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
- return false;
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if (TYPE_UNSIGNED (type1)
+ || (TYPE_UNSIGNED (type2)
+ && TYPE_MODE (type2) == from_mode))
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+ }
op = smul_widen_optab;
handler = find_widening_optab_handler_and_mode (op, to_mode,
@@ -2244,14 +2252,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
{
enum machine_mode mode = TYPE_MODE (type1);
- mode = GET_MODE_WIDER_MODE (mode);
- if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if (TYPE_UNSIGNED (type1)
+ || (TYPE_UNSIGNED (type2)
+ && TYPE_MODE (type2) == mode))
{
- type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
- cast1 = cast2 = true;
+ mode = GET_MODE_WIDER_MODE (mode);
+ if (GET_MODE_SIZE (mode) >= GET_MODE_SIZE (TYPE_MODE (type)))
+ return false;
}
- else
- return false;
+
+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ cast1 = cast2 = true;
}
if (TYPE_MODE (type2) != TYPE_MODE (type1))
* Re: [PATCH (7/7)] Mixed-sign multiplies using narrowest mode
2011-06-23 14:54 ` [PATCH (7/7)] Mixed-sign multiplies using narrowest mode Andrew Stubbs
@ 2011-06-28 17:02 ` Andrew Stubbs
2011-07-14 14:44 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-28 17:02 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 845 bytes --]
On 23/06/11 15:43, Andrew Stubbs wrote:
> Patch 4 introduced support for using signed multiplies to code unsigned
> multiplies in a narrower mode. Patch 5 then introduced support for
> mis-matched input modes.
>
> These two combined mean that there is a case where only the smaller of two
> inputs is unsigned, and yet it still tries to use a mode wider than the
> larger, signed input. This is bad because it means unnecessary extends
> and because the wider operation might not exist.
>
> This patch catches that case, and ensures that the smaller, unsigned
> input is zero-extended to match the mode of the larger, signed input.
>
> Of course, both inputs may still have to be extended to fit the nearest
> available instruction, so it doesn't make a difference every time.
>
> OK?
This update fixes Janis' issue with the testsuite.
Andrew
[-- Attachment #2: widening-multiplies-7.patch --]
[-- Type: text/x-patch, Size: 2431 bytes --]
2011-06-24 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_widen): Better handle
unsigned inputs of different modes.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-9.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-9.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, short *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2103,9 +2103,17 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
if (op != smul_widen_optab)
{
- from_mode = GET_MODE_WIDER_MODE (from_mode);
- if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
- return false;
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if (TYPE_UNSIGNED (type1)
+ || (TYPE_UNSIGNED (type2)
+ && TYPE_MODE (type2) == from_mode))
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+ }
op = smul_widen_optab;
handler = find_widening_optab_handler_and_mode (op, to_mode,
@@ -2227,14 +2235,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
{
enum machine_mode mode = TYPE_MODE (type1);
- mode = GET_MODE_WIDER_MODE (mode);
- if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if (TYPE_UNSIGNED (type1)
+ || (TYPE_UNSIGNED (type2)
+ && TYPE_MODE (type2) == mode))
{
- type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
- cast1 = cast2 = true;
+ mode = GET_MODE_WIDER_MODE (mode);
+ if (GET_MODE_SIZE (mode) >= GET_MODE_SIZE (TYPE_MODE (type)))
+ return false;
}
- else
- return false;
+
+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ cast1 = cast2 = true;
}
if (TYPE_MODE (type2) != TYPE_MODE (type1))
* Re: [PATCH (7/7)] Mixed-sign multiplies using narrowest mode
2011-06-28 17:02 ` Andrew Stubbs
@ 2011-07-14 14:44 ` Andrew Stubbs
2011-07-14 14:48 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-14 14:44 UTC (permalink / raw)
Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 1085 bytes --]
On 28/06/11 17:23, Andrew Stubbs wrote:
> On 23/06/11 15:43, Andrew Stubbs wrote:
>> Patch 4 introduced support for using signed multiplies to code unsigned
>> multiplies in a narrower mode. Patch 5 then introduced support for
>> mis-matched input modes.
>>
>> These two combined mean that there is a case where only the smaller of two
>> inputs is unsigned, and yet it still tries to use a mode wider than the
>> larger, signed input. This is bad because it means unnecessary extends
>> and because the wider operation might not exist.
>>
>> This patch catches that case, and ensures that the smaller, unsigned
>> input is zero-extended to match the mode of the larger, signed input.
>>
>> Of course, both inputs may still have to be extended to fit the nearest
>> available instruction, so it doesn't make a difference every time.
>>
>> OK?
>
> This update fixes Janis' issue with the testsuite.
And this version is updated to fit the changes made earlier in the
series, and also to use the precision, instead of the mode-size, in
order to better optimize bitfields.
OK?
Andrew
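The bitfield point can be illustrated with a sketch mirroring the wmul-bitfield-2.c test in the attached patch (function name is illustrative):

```c
struct bf
{
  int a : 3;
  unsigned int b : 15;
  int c : 3;
};

/* b.b has TYPE_PRECISION 15, so even though its machine mode is
   word-sized, every value it can hold fits in a *signed* 16-bit
   operand.  Comparing precisions instead of mode sizes therefore
   allows the signed 16x16->64 multiply-accumulate here too.  */
long long
bf_mac (long long a, struct bf b, struct bf c)
{
  return a + b.b * c.c;
}
```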
[-- Attachment #2: widening-multiplies-7.patch --]
[-- Type: text/x-patch, Size: 3017 bytes --]
2011-06-24 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_widen): Better handle
unsigned inputs of different modes.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-9.c: New file.
* gcc.target/arm/wmul-bitfield-2.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-9.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, short *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+struct bf
+{
+ int a : 3;
+ unsigned int b : 15;
+ int c : 3;
+};
+
+long long
+foo (long long a, struct bf b, struct bf c)
+{
+ return a + b.b * c.c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2121,9 +2121,18 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
if (op != smul_widen_optab)
{
- from_mode = GET_MODE_WIDER_MODE (from_mode);
- if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
- return false;
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if ((TYPE_UNSIGNED (type1)
+ && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
+ || (TYPE_UNSIGNED (type2)
+ && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+ }
op = smul_widen_optab;
handler = find_widening_optab_handler_and_mode (op, to_mode,
@@ -2290,14 +2299,20 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
/* There's no such thing as a mixed sign madd yet, so use a wider mode. */
if (from_unsigned1 != from_unsigned2)
{
- enum machine_mode mode = GET_MODE_WIDER_MODE (from_mode);
- if (GET_MODE_PRECISION (mode) < GET_MODE_PRECISION (to_mode))
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if ((from_unsigned1
+ && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
+ || (from_unsigned2
+ && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
{
- from_mode = mode;
- from_unsigned1 = from_unsigned2 = false;
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (from_mode) >= GET_MODE_SIZE (to_mode))
+ return false;
}
- else
- return false;
+
+ from_unsigned1 = from_unsigned2 = false;
}
/* If there was a conversion between the multiply and addition
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [PATCH (7/7)] Mixed-sign multiplies using narrowest mode
2011-07-14 14:44 ` Andrew Stubbs
@ 2011-07-14 14:48 ` Richard Guenther
2011-08-19 15:56 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-14 14:48 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jul 14, 2011 at 4:38 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 28/06/11 17:23, Andrew Stubbs wrote:
>>
>> On 23/06/11 15:43, Andrew Stubbs wrote:
>>>
>>> Patch 4 introduced support for using signed multiplies to code unsigned
>>> multiplies in a narrower mode. Patch 5 then introduced support for
>>> mis-matched input modes.
>>>
>>> These two combined mean that there is a case where only the smaller of two
>>> inputs is unsigned, and yet it still tries to use a mode wider than the
>>> larger, signed input. This is bad because it means unnecessary extends
>>> and because the wider operation might not exist.
>>>
>>> This patch catches that case and ensures that the smaller, unsigned
>>> input is zero-extended to match the mode of the larger, signed input.
>>>
>>> Of course, both inputs may still have to be extended to fit the nearest
>>> available instruction, so it doesn't make a difference every time.
>>>
>>> OK?
>>
>> This update fixes Janis' issue with the testsuite.
>
> And this version is updated to fit the changes made earlier in the series,
> and also to use the precision, instead of the mode-size, in order to better
> optimize bitfields.
>
> OK?
Ok.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (7/7)] Mixed-sign multiplies using narrowest mode
2011-07-14 14:48 ` Richard Guenther
@ 2011-08-19 15:56 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 15:56 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 105 bytes --]
On 14/07/11 15:41, Richard Guenther wrote:
> Ok.
Committed, unchanged apart from the test case.
Andrew
[-- Attachment #2: widening-multiplies-7.patch --]
[-- Type: text/x-patch, Size: 3081 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_widen): Better handle
unsigned inputs of different modes.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.target/arm/wmul-9.c: New file.
* gcc.target/arm/wmul-bitfield-2.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-9.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, short *b, char *c)
+{
+ return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+struct bf
+{
+ int a : 3;
+ unsigned int b : 15;
+ int c : 3;
+};
+
+long long
+foo (long long a, struct bf b, struct bf c)
+{
+ return a + b.b * c.c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2115,9 +2115,18 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
{
if (op != smul_widen_optab)
{
- from_mode = GET_MODE_WIDER_MODE (from_mode);
- if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
- return false;
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if ((TYPE_UNSIGNED (type1)
+ && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
+ || (TYPE_UNSIGNED (type2)
+ && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
+ {
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+ return false;
+ }
op = smul_widen_optab;
handler = find_widening_optab_handler_and_mode (op, to_mode,
@@ -2284,14 +2293,20 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
/* There's no such thing as a mixed sign madd yet, so use a wider mode. */
if (from_unsigned1 != from_unsigned2)
{
- enum machine_mode mode = GET_MODE_WIDER_MODE (from_mode);
- if (GET_MODE_PRECISION (mode) < GET_MODE_PRECISION (to_mode))
+ /* We can use a signed multiply with unsigned types as long as
+ there is a wider mode to use, or it is the smaller of the two
+ types that is unsigned. Note that type1 >= type2, always. */
+ if ((from_unsigned1
+ && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
+ || (from_unsigned2
+ && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
{
- from_mode = mode;
- from_unsigned1 = from_unsigned2 = false;
+ from_mode = GET_MODE_WIDER_MODE (from_mode);
+ if (GET_MODE_SIZE (from_mode) >= GET_MODE_SIZE (to_mode))
+ return false;
}
- else
- return false;
+
+ from_unsigned1 = from_unsigned2 = false;
}
/* If there was a conversion between the multiply and addition
* Re: [PATCH (0/7)] Improve use of Widening Multiplies
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
` (6 preceding siblings ...)
2011-06-23 14:54 ` [PATCH (7/7)] Mixed-sign multiplies using narrowest mode Andrew Stubbs
@ 2011-06-25 16:14 ` Bernd Schmidt
2011-06-27 9:16 ` Andrew Stubbs
2011-07-18 14:34 ` [PATCH (8/7)] Fix a bug in multiply-and-accumulate Andrew Stubbs
2011-07-21 13:14 ` [PATCH (9/7)] Widening multiplies with constant inputs Andrew Stubbs
9 siblings, 1 reply; 107+ messages in thread
From: Bernd Schmidt @ 2011-06-25 16:14 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches
On 06/23/11 16:34, Andrew Stubbs wrote:
> The patches provide a number of improvements:
>
> * Support for instructions that widen by more than one mode
> (e.g. from HImode to DImode).
>
> * Use of widening multiplies even when the input mode is narrower than
> the instruction uses. (e.g. Use HI->DI to do QI->DI).
>
> * Use of signed widening multiplies (of a larger mode) where unsigned
> multiplies are not available.
>
> * Support for input operands with mis-matched signedness, with or
> without usmul_widen_optab.
>
> * Support for input operands with mis-matched mode [1].
>
> * Improved pattern matching in the widening_mult pass.
> * Recognition of true types, even if obscured by a cast.
> * Insertion of extra gimple statements where the existing code was
> incompatible with widening multiplies.
> * Recognition of widening multiply-and-accumulate even where the
> multiply expression was not widening.
That all sounds good, but missing from this list is something that
occurs on many CPUs - widening from the high part of a register. The
current machinery only recognizes lowxlow widening multiplication, but
hardware often exists for highxlow and highxhigh. For example, Blackfin
has "<su_optab>hisi_lh"/hl/hh instruction patterns; C6X also has a full
set; ARM has mulhisi3tb/bt/tt.
Do you think it will be possible to extend your new framework to handle
this case as well?
Bernd
* Re: [PATCH (0/7)] Improve use of Widening Multiplies
2011-06-25 16:14 ` [PATCH (0/7)] Improve use of Widening Multiplies Bernd Schmidt
@ 2011-06-27 9:16 ` Andrew Stubbs
0 siblings, 0 replies; 107+ messages in thread
From: Andrew Stubbs @ 2011-06-27 9:16 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: gcc-patches
On 25/06/11 15:12, Bernd Schmidt wrote:
> That all sounds good, but missing from this list is something that
> occurs on many CPUs - widening from the high part of a register. The
> current machinery only recognizes lowxlow widening multiplication, but
> hardware often exists for highxlow and highxhigh. For example, Blackfin
> has "<su_optab>hisi_lh"/hl/hh instruction patterns; C6X also has a full
> set; ARM has mulhisi3tb/bt/tt.
>
> Do you think it will be possible to extend your new framework to handle
> this case as well?
No, I can't think of a way to implement widening from the high part
using anything like my framework.
I mean, what I've done is add a new dimension to the optab table, but
not changed the meaning of that optab. The expand pass has to look at
the input types to know what insn to use, but it doesn't need to look
any further than that. If I added yet another dimension to cover expand
from high part, then we could detect that in convert_mult_to_widen (and
maybe clean it up), but the expander would still have to re-detect it
later on.
I would think that the best way to implement that would still be to add
a new optab entry, new tree code, etc., etc., and then fix up all the
"if (optab == smul_widen_optab)" and such that would need to consider it.
In any case, on ARM at any rate, the combine pass already combines shift
and widening-mult patterns quite reliably (I committed a patch for
that not so long ago).
Andrew
* [PATCH (8/7)] Fix a bug in multiply-and-accumulate
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
` (7 preceding siblings ...)
2011-06-25 16:14 ` [PATCH (0/7)] Improve use of Widening Multiplies Bernd Schmidt
@ 2011-07-18 14:34 ` Andrew Stubbs
2011-07-18 16:09 ` Richard Guenther
2011-07-21 13:14 ` [PATCH (9/7)] Widening multiplies with constant inputs Andrew Stubbs
9 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-18 14:34 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 479 bytes --]
As far as I can tell, the patch series so far works great as long as the
input type of the accumulate value is the same as the output type.
Unfortunately you get an ICE otherwise .... doh!
This patch should fix the problem.
I could have inserted this fix into the correct spot in the existing
series, but I've already regenerated the whole lot several times, it's
getting confusing, and they're all approved already, so I'm just going
to tack this one on the end.
Andrew
[-- Attachment #2: widening-multiplies-8.patch --]
[-- Type: text/x-patch, Size: 1091 bytes --]
2011-07-18 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Convert add_rhs
to the correct type.
gcc/testsuite/
* gcc.target/arm/wmul-10.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-10.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+unsigned long long
+foo (unsigned short a, unsigned short *b, unsigned short *c)
+{
+ return (unsigned)a + (unsigned long long)*b * (unsigned long long)*c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2375,6 +2375,10 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
}
+ if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (add_rhs)))
+ add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL),
+ add_rhs);
+
gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));
* Re: [PATCH (8/7)] Fix a bug in multiply-and-accumulate
2011-07-18 14:34 ` [PATCH (8/7)] Fix a bug in multiply-and-accumulate Andrew Stubbs
@ 2011-07-18 16:09 ` Richard Guenther
2011-07-21 13:48 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-18 16:09 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches
On Mon, Jul 18, 2011 at 3:14 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> As far as I can tell, the patch series so far works great as long as the
> input type of the accumulate value is the same as the output type.
> Unfortunately you get an ICE otherwise .... doh!
>
> This patch should fix the problem.
>
> I could have inserted this fix into the correct spot in the existing series,
> but I've already regenerated the whole lot several times, it's getting
> confusing, and they're all approved already, so I'm just going to tack this
> one on the end.
Will signedness always be the same? Usually the canonical check to
use would be !useless_type_conversion_p (type, TREE_TYPE (add_rhs)).
Ok if you use that.
Thanks,
Richard.
> Andrew
>
* Re: [PATCH (8/7)] Fix a bug in multiply-and-accumulate
2011-07-18 16:09 ` Richard Guenther
@ 2011-07-21 13:48 ` Andrew Stubbs
2011-08-19 16:22 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-21 13:48 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 540 bytes --]
On 18/07/11 15:46, Richard Guenther wrote:
> Will signedness always be the same? Usually the canonical check to
> use would be !useless_type_conversion_p (type, TREE_TYPE (add_rhs)).
The signedness ought to be unimportant - any extend will be based on the
source type, and the signedness should not affect the addition
operation. That said, it really ought to remain correct or else bad
things could happen in later optimizations ....
Here is the patch I plan to commit, when patch 1 is approved, and my
testing is complete.
Andrew
[-- Attachment #2: widening-multiplies-8.patch --]
[-- Type: text/x-patch, Size: 1118 bytes --]
2011-07-21 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Convert add_rhs
to the correct type.
gcc/testsuite/
* gcc.target/arm/wmul-10.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-10.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+
+unsigned long long
+foo (unsigned short a, unsigned short *b, unsigned short *c)
+{
+ return (unsigned)a + (unsigned long long)*b * (unsigned long long)*c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2375,6 +2375,10 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
mult_rhs2 = build_and_insert_cast (gsi, loc, tmp, mult_rhs2);
}
+ if (!useless_type_conversion_p (type, TREE_TYPE (add_rhs)))
+ add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL),
+ add_rhs);
+
gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));
* [PATCH (9/7)] Widening multiplies with constant inputs
2011-06-23 14:38 [PATCH (0/7)] Improve use of Widening Multiplies Andrew Stubbs
` (8 preceding siblings ...)
2011-07-18 14:34 ` [PATCH (8/7)] Fix a bug in multiply-and-accumulate Andrew Stubbs
@ 2011-07-21 13:14 ` Andrew Stubbs
2011-07-21 14:34 ` Richard Guenther
9 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-21 13:14 UTC (permalink / raw)
To: gcc-patches; +Cc: patches
[-- Attachment #1: Type: text/plain, Size: 871 bytes --]
This patch is part bug fix, part better optimization.
Firstly, my initial patch series introduced a bug that caused an
internal compiler error when the input to a multiply was a constant.
This was caused by the gimple verification rejecting such things. I'm
not totally clear how this ever worked, but I've corrected it by
inserting a temporary SSA_NAME between the constant and the multiply.
I also discovered that widening multiply-and-accumulate operations were
not recognised if any one of the three inputs were a constant. I've
corrected this by adjusting the pattern matching. This also required
inserting new SSA_NAMEs to make it work.
In order to insert the new SSA_NAME, I've simply reused the existing
type conversion code - the only difference is that the conversion may be
a no-op, so it just generates a straightforward assignment.
OK?
Andrew
[-- Attachment #2: widening-multiplies-9.patch --]
[-- Type: text/x-patch, Size: 4353 bytes --]
2011-07-21 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Handle constants
beyond conversions.
(convert_mult_to_widen): Create SSA_NAME for constant inputs.
(convert_plusminus_to_widen): Don't automatically reject inputs that are
not an SSA_NAME.
Create SSA_NAME for constant inputs.
gcc/testsuite/
* gcc.target/arm/wmul-11.c: New file.
* gcc.target/arm/wmul-12.c: New file.
* gcc.target/arm/wmul-13.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-11.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b)
+{
+ return 10 * (long long)*b;
+}
+
+/* { dg-final { scan-assembler "smull" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-12.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b, int *c)
+{
+ int tmp = *b * *c;
+ return 10 + (long long)tmp;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-13.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *a, int *b)
+{
+ return *a + (long long)*b * 10;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1997,6 +1997,13 @@ is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
type1 = TREE_TYPE (rhs1);
}
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ {
+ *new_rhs_out = rhs1;
+ *type_out = NULL;
+ return true;
+ }
+
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
@@ -2152,7 +2159,8 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
for the opcode. This will be the full mode size. */
actual_precision = GET_MODE_PRECISION (actual_mode);
if (actual_precision != TYPE_PRECISION (type1)
- || from_unsigned1 != TYPE_UNSIGNED (type1))
+ || from_unsigned1 != TYPE_UNSIGNED (type1)
+ || TREE_CODE (rhs1) != SSA_NAME)
{
tmp = create_tmp_var (build_nonstandard_integer_type
(actual_precision, from_unsigned1),
@@ -2160,7 +2168,8 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
}
if (actual_precision != TYPE_PRECISION (type2)
- || from_unsigned2 != TYPE_UNSIGNED (type2))
+ || from_unsigned2 != TYPE_UNSIGNED (type2)
+ || TREE_CODE (rhs2) != SSA_NAME)
{
/* Reuse the same type info, if possible. */
if (!tmp || from_unsigned1 != from_unsigned2)
@@ -2221,8 +2230,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs1_stmt))
rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
}
- else
- return false;
if (TREE_CODE (rhs2) == SSA_NAME)
{
@@ -2230,8 +2237,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs2_stmt))
rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
}
- else
- return false;
/* Allow for one conversion statement between the multiply
and addition/subtraction statement. If there are more than
@@ -2358,7 +2363,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
for the opcode. This will be the full mode size. */
actual_precision = GET_MODE_PRECISION (actual_mode);
if (actual_precision != TYPE_PRECISION (type1)
- || from_unsigned1 != TYPE_UNSIGNED (type1))
+ || from_unsigned1 != TYPE_UNSIGNED (type1)
+ || TREE_CODE (mult_rhs1) != SSA_NAME)
{
tmp = create_tmp_var (build_nonstandard_integer_type
(actual_precision, from_unsigned1),
@@ -2366,7 +2372,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
}
if (actual_precision != TYPE_PRECISION (type2)
- || from_unsigned2 != TYPE_UNSIGNED (type2))
+ || from_unsigned2 != TYPE_UNSIGNED (type2)
+ || TREE_CODE (mult_rhs2) != SSA_NAME)
{
if (!tmp || from_unsigned1 != from_unsigned2)
tmp = create_tmp_var (build_nonstandard_integer_type
* Re: [PATCH (9/7)] Widening multiplies with constant inputs
2011-07-21 13:14 ` [PATCH (9/7)] Widening multiplies with constant inputs Andrew Stubbs
@ 2011-07-21 14:34 ` Richard Guenther
2011-07-22 12:28 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-21 14:34 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Thu, Jul 21, 2011 at 2:53 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> This patch is part bug fix, part better optimization.
>
> Firstly, my initial patch series introduced a bug that caused an internal
> compiler error when the input to a multiply was a constant. This was caused
> by the gimple verification rejecting such things. I'm not totally clear how
> this ever worked, but I've corrected it by inserting a temporary SSA_NAME
> between the constant and the multiply.
Huh? Constant operands should be perfectly fine. What was the error
you got?
> I also discovered that widening multiply-and-accumulate operations were not
> recognised if any one of the three inputs were a constant. I've corrected
> this by adjusting the pattern matching. This also required inserting new
> SSA_NAMEs to make it work.
See above.
> In order to insert the new SSA_NAME, I've simply reused the existing type
> conversion code - the only difference is that the conversion may be a no-op,
> so it just generates a straightforward assignment.
>
> OK?
Nope. I suppose you forgot to adjust the constant's type? Just
fold-convert it before using it as input to a macc.
Richard.
> Andrew
>
* Re: [PATCH (9/7)] Widening multiplies with constant inputs
2011-07-21 14:34 ` Richard Guenther
@ 2011-07-22 12:28 ` Andrew Stubbs
2011-07-22 12:32 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-22 12:28 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
On 21/07/11 14:22, Richard Guenther wrote:
> On Thu, Jul 21, 2011 at 2:53 PM, Andrew Stubbs<ams@codesourcery.com> wrote:
>> This patch is part bug fix, part better optimization.
>>
>> Firstly, my initial patch series introduced a bug that caused an internal
>> compiler error when the input to a multiply was a constant. This was caused
>> by the gimple verification rejecting such things. I'm not totally clear how
>> this ever worked, but I've corrected it by inserting a temporary SSA_NAME
>> between the constant and the multiply.
>
> Huh? Constant operands should be perfectly fine. What was the error
> you got?
Ok, so it seems that the fold_convert we thought was redundant in patch
5 (now moved to patch 2) was in fact responsible for making constants
the right type.
I've rewritten it to use fold_convert to change the constant.
>> I also discovered that widening multiply-and-accumulate operations were not
>> recognised if any one of the three inputs were a constant. I've corrected
>> this by adjusting the pattern matching. This also required inserting new
>> SSA_NAMEs to make it work.
>
> See above.
The pattern matching stuff remains the same, but the constant
conversions have been updated.
>> In order to insert the new SSA_NAME, I've simply reused the existing type
>> conversion code - the only difference is that the conversion may be a no-op,
>> so it just generates a straight forward assignment.
>>
>> OK?
>
> Nope. I suppose you forgot to adjust the constant's type? Just
> fold-convert it before using it as input to a macc.
Done.
OK?
Andrew
* Re: [PATCH (9/7)] Widening multiplies with constant inputs
2011-07-22 12:28 ` Andrew Stubbs
@ 2011-07-22 12:32 ` Andrew Stubbs
2011-07-22 12:34 ` Richard Guenther
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-22 12:32 UTC (permalink / raw)
Cc: Richard Guenther, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 1721 bytes --]
ENOPATCH ....
On 22/07/11 12:57, Andrew Stubbs wrote:
> On 21/07/11 14:22, Richard Guenther wrote:
>> On Thu, Jul 21, 2011 at 2:53 PM, Andrew Stubbs<ams@codesourcery.com>
>> wrote:
>>> This patch is part bug fix, part better optimization.
>>>
>>> Firstly, my initial patch series introduced a bug that caused an
>>> internal
>>> compiler error when the input to a multiply was a constant. This was
>>> caused
>>> by the gimple verification rejecting such things. I'm not totally
>>> clear how
>>> this ever worked, but I've corrected it by inserting a temporary
>>> SSA_NAME
>>> between the constant and the multiply.
>>
>> Huh? Constant operands should be perfectly fine. What was the error
>> you got?
>
> Ok, so it seems that the fold_convert we thought was redundant in patch
> 5 (now moved to patch 2) was in fact responsible for making constants
> the right type.
>
> I've rewritten it to use fold_convert to change the constant.
>
>>> I also discovered that widening multiply-and-accumulate operations
>>> were not
>>> recognised if any one of the three inputs were a constant. I've
>>> corrected
>>> this by adjusting the pattern matching. This also required inserting new
>>> SSA_NAMEs to make it work.
>>
>> See above.
>
> The pattern matching stuff remains the same, but the constant
> conversions have been updated.
>
>>> In order to insert the new SSA_NAME, I've simply reused the existing
>>> type
>>> conversion code - the only difference is that the conversion may be a
>>> no-op,
>>> so it just generates a straightforward assignment.
>>>
>>> OK?
>>
>> Nope. I suppose you forget to adjust the constants type? Just
>> fold-convert it before using it as input to a macc.
>
> Done.
>
> OK?
>
> Andrew
>
[-- Attachment #2: widening-multiplies-9.patch --]
[-- Type: text/x-patch, Size: 3431 bytes --]
2011-07-22 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Handle constants
beyond conversions.
(convert_mult_to_widen): Convert constant inputs to the right type.
(convert_plusminus_to_widen): Don't automatically reject inputs that
are not an SSA_NAME.
Convert constant inputs to the right type.
gcc/testsuite/
* gcc.target/arm/wmul-11.c: New file.
* gcc.target/arm/wmul-12.c: New file.
* gcc.target/arm/wmul-13.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-11.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b)
+{
+ return 10 * (long long)*b;
+}
+
+/* { dg-final { scan-assembler "smull" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-12.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b, int *c)
+{
+ int tmp = *b * *c;
+ return 10 + (long long)tmp;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-13.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *a, int *b)
+{
+ return *a + (long long)*b * 10;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1997,6 +1997,13 @@ is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
type1 = TREE_TYPE (rhs1);
}
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ {
+ *new_rhs_out = rhs1;
+ *type_out = NULL;
+ return true;
+ }
+
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
@@ -2170,6 +2177,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
}
+ /* Handle constants. */
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ rhs1 = fold_convert (type1, rhs1);
+ if (TREE_CODE (rhs2) == INTEGER_CST)
+ rhs2 = fold_convert (type2, rhs2);
+
gimple_assign_set_rhs1 (stmt, rhs1);
gimple_assign_set_rhs2 (stmt, rhs2);
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2221,8 +2234,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs1_stmt))
rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
}
- else
- return false;
if (TREE_CODE (rhs2) == SSA_NAME)
{
@@ -2230,8 +2241,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs2_stmt))
rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
}
- else
- return false;
/* Allow for one conversion statement between the multiply
and addition/subtraction statement. If there are more than
@@ -2379,6 +2388,12 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL),
add_rhs);
+ /* Handle constants. */
+ if (TREE_CODE (mult_rhs1) == INTEGER_CST)
+ mult_rhs1 = fold_convert (type1, mult_rhs1);
+ if (TREE_CODE (mult_rhs2) == INTEGER_CST)
+ mult_rhs2 = fold_convert (type2, mult_rhs2);
+
gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));
* Re: [PATCH (9/7)] Widening multiplies with constant inputs
2011-07-22 12:32 ` Andrew Stubbs
@ 2011-07-22 12:34 ` Richard Guenther
2011-07-22 16:06 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Richard Guenther @ 2011-07-22 12:34 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: gcc-patches, patches
On Fri, Jul 22, 2011 at 2:07 PM, Andrew Stubbs <ams@codesourcery.com> wrote:
> ENOPATCH ....
>
> On 22/07/11 12:57, Andrew Stubbs wrote:
>>
>> On 21/07/11 14:22, Richard Guenther wrote:
>>>
>>> On Thu, Jul 21, 2011 at 2:53 PM, Andrew Stubbs<ams@codesourcery.com>
>>> wrote:
>>>>
>>>> This patch is part bug fix, part better optimization.
>>>>
>>>> Firstly, my initial patch series introduced a bug that caused an
>>>> internal
>>>> compiler error when the input to a multiply was a constant. This was
>>>> caused
>>>> by the gimple verification rejecting such things. I'm not totally
>>>> clear how
>>>> this ever worked, but I've corrected it by inserting a temporary
>>>> SSA_NAME
>>>> between the constant and the multiply.
>>>
>>> Huh? Constant operands should be perfectly fine. What was the error
>>> you got?
>>
>> Ok, so it seems that the fold_convert we thought was redundant in patch
>> 5 (now moved to patch 2) was in fact responsible for making constants
>> the right type.
>>
>> I've rewritten it to use fold_convert to change the constant.
>>
>>>> I also discovered that widening multiply-and-accumulate operations
>>>> were not recognised if any one of the three inputs was a constant.
>>>> I've corrected this by adjusting the pattern matching. This also
>>>> required inserting new SSA_NAMEs to make it work.
>>>
>>> See above.
>>
>> The pattern matching stuff remains the same, but the constant
>> conversions have been updated.
>>
>>>> In order to insert the new SSA_NAME, I've simply reused the existing
>>>> type conversion code - the only difference is that the conversion may
>>>> be a no-op, so it just generates a straightforward assignment.
>>>>
>>>> OK?
>>>
>>> Nope. I suppose you forget to adjust the constants type? Just
>>> fold-convert it before using it as input to a macc.
>>
>> Done.
>>
>> OK?
Ok.
Thanks,
Richard.
>> Andrew
>>
>
>
* Re: [PATCH (9/7)] Widening multiplies with constant inputs
2011-07-22 12:34 ` Richard Guenther
@ 2011-07-22 16:06 ` Andrew Stubbs
2011-08-19 16:24 ` Andrew Stubbs
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-07-22 16:06 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 197 bytes --]
On 22/07/11 13:17, Richard Guenther wrote:
> Ok.
I found a NULL-pointer dereference bug.
Fixed in the attached. I'll commit this version when the rest of my
testing is complete.
Thanks
Andrew
[-- Attachment #2: widening-multiplies-9.patch --]
[-- Type: text/x-patch, Size: 3439 bytes --]
2011-07-22 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Handle constants
beyond conversions.
(convert_mult_to_widen): Convert constant inputs to the right type.
(convert_plusminus_to_widen): Don't automatically reject inputs that
are not an SSA_NAME.
Convert constant inputs to the right type.
gcc/testsuite/
* gcc.target/arm/wmul-11.c: New file.
* gcc.target/arm/wmul-12.c: New file.
* gcc.target/arm/wmul-13.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-11.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b)
+{
+ return 10 * (long long)*b;
+}
+
+/* { dg-final { scan-assembler "smull" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-12.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b, int *c)
+{
+ int tmp = *b * *c;
+ return 10 + (long long)tmp;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-13.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *a, int *b)
+{
+ return *a + (long long)*b * 10;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1997,6 +1997,13 @@ is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
type1 = TREE_TYPE (rhs1);
}
+ if (rhs1 && TREE_CODE (rhs1) == INTEGER_CST)
+ {
+ *new_rhs_out = rhs1;
+ *type_out = NULL;
+ return true;
+ }
+
if (TREE_CODE (type1) != TREE_CODE (type)
|| TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
@@ -2170,6 +2177,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
}
+ /* Handle constants. */
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ rhs1 = fold_convert (type1, rhs1);
+ if (TREE_CODE (rhs2) == INTEGER_CST)
+ rhs2 = fold_convert (type2, rhs2);
+
gimple_assign_set_rhs1 (stmt, rhs1);
gimple_assign_set_rhs2 (stmt, rhs2);
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2221,8 +2234,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs1_stmt))
rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
}
- else
- return false;
if (TREE_CODE (rhs2) == SSA_NAME)
{
@@ -2230,8 +2241,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs2_stmt))
rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
}
- else
- return false;
/* Allow for one conversion statement between the multiply
and addition/subtraction statement. If there are more than
@@ -2379,6 +2388,12 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL),
add_rhs);
+ /* Handle constants. */
+ if (TREE_CODE (mult_rhs1) == INTEGER_CST)
+ mult_rhs1 = fold_convert (type1, mult_rhs1);
+ if (TREE_CODE (mult_rhs2) == INTEGER_CST)
+ mult_rhs2 = fold_convert (type2, mult_rhs2);
+
gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));
* Re: [PATCH (9/7)] Widening multiplies with constant inputs
2011-07-22 16:06 ` Andrew Stubbs
@ 2011-08-19 16:24 ` Andrew Stubbs
2011-08-19 16:52 ` H.J. Lu
0 siblings, 1 reply; 107+ messages in thread
From: Andrew Stubbs @ 2011-08-19 16:24 UTC (permalink / raw)
Cc: Richard Guenther, gcc-patches, patches
[-- Attachment #1: Type: text/plain, Size: 196 bytes --]
On 22/07/11 16:38, Andrew Stubbs wrote:
> Fixed in the attached. I'll commit this version when the rest of my
> testing is complete.
Now committed. Here's the patch with updated context.
Andrew
[-- Attachment #2: widening-multiplies-9.patch --]
[-- Type: text/x-patch, Size: 3474 bytes --]
2011-08-19 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Handle constants
beyond conversions.
(convert_mult_to_widen): Convert constant inputs to the right type.
(convert_plusminus_to_widen): Don't automatically reject inputs that
are not an SSA_NAME.
Convert constant inputs to the right type.
gcc/testsuite/
* gcc.target/arm/wmul-11.c: New file.
* gcc.target/arm/wmul-12.c: New file.
* gcc.target/arm/wmul-13.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-11.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b)
+{
+ return 10 * (long long)*b;
+}
+
+/* { dg-final { scan-assembler "smull" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-12.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b, int *c)
+{
+ int tmp = *b * *c;
+ return 10 + (long long)tmp;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-13.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *a, int *b)
+{
+ return *a + (long long)*b * 10;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1995,7 +1995,16 @@ is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
: rhs_code != FIXED_CONVERT_EXPR)
rhs1 = rhs;
else
- rhs1 = gimple_assign_rhs1 (stmt);
+ {
+ rhs1 = gimple_assign_rhs1 (stmt);
+
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ {
+ *new_rhs_out = rhs1;
+ *type_out = NULL;
+ return true;
+ }
+ }
}
else
rhs1 = rhs;
@@ -2164,6 +2173,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
}
+ /* Handle constants. */
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ rhs1 = fold_convert (type1, rhs1);
+ if (TREE_CODE (rhs2) == INTEGER_CST)
+ rhs2 = fold_convert (type2, rhs2);
+
gimple_assign_set_rhs1 (stmt, rhs1);
gimple_assign_set_rhs2 (stmt, rhs2);
gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2215,8 +2230,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs1_stmt))
rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
}
- else
- return false;
if (TREE_CODE (rhs2) == SSA_NAME)
{
@@ -2224,8 +2237,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
if (is_gimple_assign (rhs2_stmt))
rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
}
- else
- return false;
/* Allow for one conversion statement between the multiply
and addition/subtraction statement. If there are more than
@@ -2373,6 +2384,12 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL),
add_rhs);
+ /* Handle constants. */
+ if (TREE_CODE (mult_rhs1) == INTEGER_CST)
+ mult_rhs1 = fold_convert (type1, mult_rhs1);
+ if (TREE_CODE (mult_rhs2) == INTEGER_CST)
+ mult_rhs2 = fold_convert (type2, mult_rhs2);
+
gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
add_rhs);
update_stmt (gsi_stmt (*gsi));