From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Richard Sandiford <richard.sandiford@arm.com>,
Bill Schmidt <wschmidt@linux.ibm.com>,
GCC Patches <gcc-patches@gcc.gnu.org>,
Uros Bizjak <ubizjak@gmail.com>,
Segher Boessenkool <segher@kernel.crashing.org>
Subject: [PATCH v4] vect: Recog mul_highpart pattern
Date: Fri, 16 Jul 2021 13:33:21 +0800 [thread overview]
Message-ID: <af73a46c-7c35-0f69-5c14-c030721bb653@linux.ibm.com> (raw)
In-Reply-To: <CAFiYyc3umJbhRFbCLzXWacXpqb7j6He6aUTb1SYhZDAtP1mg7A@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4958 bytes --]
on 2021/7/15 下午7:58, Richard Biener wrote:
> On Thu, Jul 15, 2021 at 10:41 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> on 2021/7/15 下午4:04, Kewen.Lin via Gcc-patches wrote:
>>> Hi Uros,
>>>
>>> on 2021/7/15 下午3:17, Uros Bizjak wrote:
>>>> On Thu, Jul 15, 2021 at 9:07 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>
>>>>> on 2021/7/14 下午3:45, Kewen.Lin via Gcc-patches wrote:
>>>>>> on 2021/7/14 下午2:38, Richard Biener wrote:
>>>>>>> On Tue, Jul 13, 2021 at 4:59 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>>>
>>>>>>>> on 2021/7/13 下午8:42, Richard Biener wrote:
>>>>>>>>> On Tue, Jul 13, 2021 at 12:25 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>>>>>
>>>>>>>> I guess the proposed IFN would be directly mapped for [us]mul_highpart?
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>
>>>>>> Thanks for confirming! The related patch v2 is attached and the testing
>>>>>> is ongoing.
>>>>>>
>>>>>
>>>>> It's bootstrapped & regtested on powerpc64le-linux-gnu P9 and
>>>>> aarch64-linux-gnu. But on x86_64-redhat-linux there are XPASSes as below:
>>>>>
>>>>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw
>>>>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw
>>>>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw
>>>>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw
>>>>
>>>> These XFAILs should be removed after your patch.
>>>>
>>> I'm curious whether it's intentional not to specify -fno-vect-cost-model
>>> for this test case. As noted above, this case is sensitive on how we
>>> cost mult_highpart. Without cost modeling, the XFAILs can be removed
>>> only with this mul_highpart pattern support, no matter how we model it
>>> (x86 part of this patch exists or not).
>>>
>>>> This is PR100696 [1], we want PMULH.W here, so x86 part of the patch
>>>> is actually not needed.
>>>>
>>>
>>> Thanks for the information! The justification for the x86 part is that:
>>> the IFN_MULH essentially covers MULT_HIGHPART_EXPR with mul_highpart
>>> optab support, i386 port has already customized costing for
>>> MULT_HIGHPART_EXPR (should mean/involve the case with mul_highpart optab
>>> support), if we don't follow the same way for IFN_MULH, I'm worried that
>>> we may cost the IFN_MULH wrongly. If taking IFN_MULH as normal stmt is
>>> a right thing (we shouldn't cost it specially), it at least means we
>>> have to adjust ix86_multiplication_cost for MULT_HIGHPART_EXPR when it
>>> has direct mul_highpart optab support, I think they should be costed
>>> consistently. Does it sound reasonable?
>>>
>>
>> Hi Richard(s),
>>
>> This possibly inconsistent handling problem seems like a counter example
>> better to use a new IFN rather than the existing tree_code, it seems hard
>> to maintain (should remember to keep consistent for its handlings). ;)
>> From this perspective, maybe it's better to move backward to use tree_code
>> and guard it under can_mult_highpart_p == 1 (just like IFN and avoid
>> costing issue Richi pointed out before)?
>>
>> What do you think?
>
> No, whenever we want to do code generation based on machine
> capabilities the canonical way to test for those is to look at optabs
> and then it's most natural to keep that 1:1 relation and emit
> internal function calls which directly map to supported optabs
> instead of going back to some tree codes.
>
> When targets "lie" and provide expanders for something they can
> only emulate then they have to compensate in their costing.
> But as I understand this isn't the case for x86 here.
>
> Now, in this case we already have the MULT_HIGHPART_EXPR tree,
> so yes, it might make sense to use that instead of introducing an
> alternate way via the direct internal function. Somebody decided
> that MULT_HIGHPART is generic enough to warrant this - but I
> see that expand_mult_highpart can fail unless can_mult_highpart_p
> and this is exactly one of the cases we want to avoid - either
> we can handle something generally in which case it can be a
> tree code or we can't, then it should be 1:1 tied to optabs at best
> (mult_highpart has scalar support only for the direct optab,
> vector support also for widen_mult).
>
Thanks for the detailed explanation! The attached v4 follows the
preferred IFN way like v3, just with extra test case updates.
Bootstrapped & regtested again on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu.
Is it ok for trunk?
BR,
Kewen
-----
gcc/ChangeLog:
PR tree-optimization/100696
* internal-fn.c (first_commutative_argument): Add info for IFN_MULH.
* internal-fn.def (IFN_MULH): New internal function.
* tree-vect-patterns.c (vect_recog_mulhs_pattern): Add support to
recog normal multiply highpart as IFN_MULH.
* config/i386/i386.c (ix86_add_stmt_cost): Adjust for combined
function CFN_MULH.
gcc/testsuite/ChangeLog:
PR tree-optimization/100696
* gcc.target/i386/pr100637-3w.c: Adjust for mul_highpart recog.
[-- Attachment #2: vect-recog-hmul-v4.patch --]
[-- Type: text/plain, Size: 5204 bytes --]
---
gcc/config/i386/i386.c | 3 ++
gcc/internal-fn.c | 1 +
gcc/internal-fn.def | 2 ++
gcc/testsuite/gcc.target/i386/pr100637-3w.c | 6 ++--
gcc/tree-vect-patterns.c | 38 +++++++++++++++------
5 files changed, 37 insertions(+), 13 deletions(-)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a93128fa0a4..1dd9108353c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22559,6 +22559,9 @@ ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
mode == SFmode ? ix86_cost->fmass
: ix86_cost->fmasd);
break;
+ case CFN_MULH:
+ stmt_cost = ix86_multiplication_cost (ix86_cost, mode);
+ break;
default:
break;
}
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index fb8b43d1ce2..b1b4289357c 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3703,6 +3703,7 @@ first_commutative_argument (internal_fn fn)
case IFN_FNMS:
case IFN_AVG_FLOOR:
case IFN_AVG_CEIL:
+ case IFN_MULH:
case IFN_MULHS:
case IFN_MULHRS:
case IFN_FMIN:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index c3b8e730960..ed6d7de1680 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -169,6 +169,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, first,
DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first,
savg_ceil, uavg_ceil, binary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (MULH, ECF_CONST | ECF_NOTHROW, first,
+ smul_highpart, umul_highpart, binary)
DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | ECF_NOTHROW, first,
smulhs, umulhs, binary)
DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-3w.c b/gcc/testsuite/gcc.target/i386/pr100637-3w.c
index b951f30f571..4ea467b4af5 100644
--- a/gcc/testsuite/gcc.target/i386/pr100637-3w.c
+++ b/gcc/testsuite/gcc.target/i386/pr100637-3w.c
@@ -1,6 +1,6 @@
/* PR target/100637 */
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -msse4" } */
+/* { dg-options "-O2 -ftree-vectorize -msse4 -fno-vect-cost-model" } */
short r[2], a[2], b[2];
unsigned short ur[2], ua[2], ub[2];
@@ -13,7 +13,7 @@ void mulh (void)
r[i] = ((int) a[i] * b[i]) >> 16;
}
-/* { dg-final { scan-assembler "pmulhw" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler "pmulhw" } } */
void mulhu (void)
{
@@ -23,7 +23,7 @@ void mulhu (void)
ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
}
-/* { dg-final { scan-assembler "pmulhuw" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler "pmulhuw" } } */
void mulhrs (void)
{
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index b2e7fc2cc7a..ada89d7060b 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -1896,8 +1896,15 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
1) Multiply high with scaling
TYPE res = ((TYPE) a * (TYPE) b) >> c;
+ Here, c is bitsize (TYPE) / 2 - 1.
+
2) ... or also with rounding
TYPE res = (((TYPE) a * (TYPE) b) >> d + 1) >> 1;
+ Here, d is bitsize (TYPE) / 2 - 2.
+
+ 3) Normal multiply high
+ TYPE res = ((TYPE) a * (TYPE) b) >> e;
+ Here, e is bitsize (TYPE) / 2.
where only the bottom half of res is used. */
@@ -1942,7 +1949,6 @@ vect_recog_mulhs_pattern (vec_info *vinfo,
stmt_vec_info mulh_stmt_info;
tree scale_term;
internal_fn ifn;
- unsigned int expect_offset;
/* Check for the presence of the rounding term. */
if (gimple_assign_rhs_code (rshift_input_stmt) == PLUS_EXPR)
@@ -1991,25 +1997,37 @@ vect_recog_mulhs_pattern (vec_info *vinfo,
/* Get the scaling term. */
scale_term = gimple_assign_rhs2 (plus_input_stmt);
+ /* Check that the scaling factor is correct. */
+ if (TREE_CODE (scale_term) != INTEGER_CST)
+ return NULL;
+
+ /* Check pattern 2). */
+ if (wi::to_widest (scale_term) + target_precision + 2
+ != TYPE_PRECISION (lhs_type))
+ return NULL;
- expect_offset = target_precision + 2;
ifn = IFN_MULHRS;
}
else
{
mulh_stmt_info = rshift_input_stmt_info;
scale_term = gimple_assign_rhs2 (last_stmt);
+ /* Check that the scaling factor is correct. */
+ if (TREE_CODE (scale_term) != INTEGER_CST)
+ return NULL;
- expect_offset = target_precision + 1;
- ifn = IFN_MULHS;
+ /* Check for pattern 1). */
+ if (wi::to_widest (scale_term) + target_precision + 1
+ == TYPE_PRECISION (lhs_type))
+ ifn = IFN_MULHS;
+ /* Check for pattern 3). */
+ else if (wi::to_widest (scale_term) + target_precision
+ == TYPE_PRECISION (lhs_type))
+ ifn = IFN_MULH;
+ else
+ return NULL;
}
- /* Check that the scaling factor is correct. */
- if (TREE_CODE (scale_term) != INTEGER_CST
- || wi::to_widest (scale_term) + expect_offset
- != TYPE_PRECISION (lhs_type))
- return NULL;
-
/* Check whether the scaling input term can be seen as two widened
inputs multiplied together. */
vect_unpromoted_value unprom_mult[2];
--
2.17.1
next prev parent reply other threads:[~2021-07-16 5:33 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-13 8:52 [RFC/PATCH] " Kewen.Lin
2021-07-13 8:58 ` [PATCH] rs6000: Support [u]mul<mode>3_highpart for vector Kewen.Lin
2021-07-13 22:07 ` Segher Boessenkool
2021-07-14 2:12 ` Kewen.Lin
2021-07-14 18:38 ` Segher Boessenkool
2021-07-13 9:35 ` [RFC/PATCH] vect: Recog mul_highpart pattern Richard Biener
2021-07-13 9:40 ` Richard Sandiford
2021-07-13 9:44 ` Richard Biener
2021-07-13 10:11 ` Richard Sandiford
2021-07-13 10:25 ` Kewen.Lin
2021-07-13 12:42 ` Richard Biener
2021-07-13 14:59 ` Kewen.Lin
2021-07-14 6:38 ` Richard Biener
2021-07-14 7:45 ` Kewen.Lin
2021-07-14 8:38 ` Richard Sandiford
2021-07-14 10:02 ` Kewen.Lin
2021-07-14 11:32 ` Richard Sandiford
2021-07-14 19:32 ` Segher Boessenkool
2021-07-15 1:40 ` Kewen.Lin
2021-07-15 23:08 ` Segher Boessenkool
2021-07-15 1:37 ` Kewen.Lin
2021-07-15 7:06 ` [PATCH v3] " Kewen.Lin
2021-07-15 7:17 ` Uros Bizjak
2021-07-15 8:04 ` Kewen.Lin
2021-07-15 8:23 ` Uros Bizjak
2021-07-15 8:49 ` Kewen.Lin
2021-07-15 9:41 ` Uros Bizjak
2021-07-15 8:40 ` Kewen.Lin
2021-07-15 11:58 ` Richard Biener
2021-07-16 5:33 ` Kewen.Lin [this message]
2021-07-19 10:35 ` [PATCH v4] " Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=af73a46c-7c35-0f69-5c14-c030721bb653@linux.ibm.com \
--to=linkw@linux.ibm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=richard.guenther@gmail.com \
--cc=richard.sandiford@arm.com \
--cc=segher@kernel.crashing.org \
--cc=ubizjak@gmail.com \
--cc=wschmidt@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).