From: Sudakshina Das <sudi.das@arm.com>
To: Christophe Lyon <christophe.lyon@linaro.org>
Cc: James Greenhalgh <james.greenhalgh@arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
nd <nd@arm.com>, Richard Earnshaw <Richard.Earnshaw@arm.com>,
Marcus Shawcroft <Marcus.Shawcroft@arm.com>
Subject: Re: [PATCH PR81647][AARCH64] Fix handling of Unordered Comparisons in aarch64-simd.md
Date: Tue, 20 Mar 2018 10:58:00 -0000 [thread overview]
Message-ID: <d29a367b-6614-5a4f-21e8-13135243298f@arm.com> (raw)
In-Reply-To: <CAKdteOb1Ue_RDgUO839+_KBZVkZqUXcE-f9KG6iLN8ChwxQwFA@mail.gmail.com>
Hi
On 20/03/18 08:13, Christophe Lyon wrote:
> On 19 March 2018 at 19:55, Sudakshina Das <sudi.das@arm.com> wrote:
>> Hi
>>
>>
>> On 19/03/18 14:29, James Greenhalgh wrote:
>>>
>>> On Fri, Dec 15, 2017 at 11:57:46AM +0000, Sudi Das wrote:
>>>>
>>>> Hi
>>>>
>>>> This patch fixes the inconsistent behavior observed at -O3 for the
>>>> unordered comparisons. According to the online docs
>>>>
>>>> (https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gccint/Unary-and-Binary-Expressions.html),
>>>> all of the following should not raise an FP exception:
>>>> - UNGE_EXPR
>>>> - UNGT_EXPR
>>>> - UNLE_EXPR
>>>> - UNLT_EXPR
>>>> - UNEQ_EXPR
>>>> Also ORDERED_EXPR and UNORDERED_EXPR should only return zero or one.
>>>>
>>>> The aarch64-simd.md handling of these were generating exception raising
>>>> instructions such as fcmgt. This patch changes the instructions that are
>>>> emitted to in order to not give out the exceptions. We first check each
>>>> operand for NaNs and force any elements containing NaN to zero before
>>>> using them in the compare.
>>>>
>>>> Example: UN<cc> (a, b) -> UNORDERED (a, b) | (cm<cc> (isnan (a) ? 0.0 :
>>>> a, isnan (b) ? 0.0 : b))
>>>>
>>>>
>>>> The ORDERED_EXPR is now handled as (cmeq (a, a) & cmeq (b, b)) and
>>>> UNORDERED_EXPR as ~ORDERED_EXPR and UNEQ as (~ORDERED_EXPR | cmeq (a,b)).
>>>>
>>>> Testing done: Checked for regressions on bootstrapped
>>>> aarch64-none-linux-gnu and added a new test case.
>>>>
>>>> Is this ok for trunk? This will probably need a back-port to
>>>> gcc-7-branch as well.
>>>
>>>
>>> OK.
>>>
>>> Let it soak on trunk for a while before the backport.
>>
>>
>> Thanks. Committed to trunk as r258653. Will wait a week before backport.
>>
>
> Hi,
>
> As the test failed to compile on aarch64 bare-metal targets, I added
> /* { dg-require-effective-target fenv_exceptions } */
> as obvious (r258672).
>
> 2018-03-20 Christophe Lyon <christophe.lyon@linaro.org>
>
> PR target/81647
> * gcc.target/aarch64/pr81647.c: Require fenv_exceptions.
>
> Index: testsuite/gcc.target/aarch64/pr81647.c
> ===================================================================
> --- testsuite/gcc.target/aarch64/pr81647.c (revision 258671)
> +++ testsuite/gcc.target/aarch64/pr81647.c (revision 258672)
> @@ -1,5 +1,6 @@
> /* { dg-do run } */
> /* { dg-options "-O3 -fdump-tree-ssa" } */
> +/* { dg-require-effective-target fenv_exceptions } */
>
> #include <fenv.h>
>
>
>
> Christophe
Thanks for fixing this and apologies for missing it on the first place!
Sudi
>
>> Sudi
>>
>>
>>>
>>> Thanks,
>>> James
>>>
>>>> ChangeLog Entries:
>>>>
>>>> *** gcc/ChangeLog ***
>>>>
>>>> 2017-12-15 Sudakshina Das <sudi.das@arm.com>
>>>>
>>>> PR target/81647
>>>> * config/aarch64/aarch64-simd.md (vec_cmp<mode><v_int_equiv>):
>>>> Modify
>>>> instructions for
>>>> UNLT, UNLE, UNGT, UNGE, UNEQ, UNORDERED and ORDERED.
>>>>
>>>> *** gcc/testsuite/ChangeLog ***
>>>>
>>>> 2017-12-15 Sudakshina Das <sudi.das@arm.com>
>>>>
>>>> PR target/81647
>>>> * gcc.target/aarch64/pr81647.c: New.
>>>
>>>
>>>> diff --git a/gcc/config/aarch64/aarch64-simd.md
>>>> b/gcc/config/aarch64/aarch64-simd.md
>>>> index
>>>> f90f74fe7fd5990a97b9f4eb68f5735b7d4fb9aa..acff06c753b3e3aaa5775632929909afa4d3294b
>>>> 100644
>>>> --- a/gcc/config/aarch64/aarch64-simd.md
>>>> +++ b/gcc/config/aarch64/aarch64-simd.md
>>>> @@ -2731,10 +2731,10 @@
>>>> break;
>>>> }
>>>> /* Fall through. */
>>>> - case UNGE:
>>>> + case UNLT:
>>>> std::swap (operands[2], operands[3]);
>>>> /* Fall through. */
>>>> - case UNLE:
>>>> + case UNGT:
>>>> case GT:
>>>> comparison = gen_aarch64_cmgt<mode>;
>>>> break;
>>>> @@ -2745,10 +2745,10 @@
>>>> break;
>>>> }
>>>> /* Fall through. */
>>>> - case UNGT:
>>>> + case UNLE:
>>>> std::swap (operands[2], operands[3]);
>>>> /* Fall through. */
>>>> - case UNLT:
>>>> + case UNGE:
>>>> case GE:
>>>> comparison = gen_aarch64_cmge<mode>;
>>>> break;
>>>> @@ -2771,21 +2771,35 @@
>>>> case UNGT:
>>>> case UNLE:
>>>> case UNLT:
>>>> - case NE:
>>>> - /* FCM returns false for lanes which are unordered, so if we use
>>>> - the inverse of the comparison we actually want to emit, then
>>>> - invert the result, we will end up with the correct result.
>>>> - Note that a NE NaN and NaN NE b are true for all a, b.
>>>> -
>>>> - Our transformations are:
>>>> - a UNGE b -> !(b GT a)
>>>> - a UNGT b -> !(b GE a)
>>>> - a UNLE b -> !(a GT b)
>>>> - a UNLT b -> !(a GE b)
>>>> - a NE b -> !(a EQ b) */
>>>> - gcc_assert (comparison != NULL);
>>>> - emit_insn (comparison (operands[0], operands[2], operands[3]));
>>>> - emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
>>>> + {
>>>> + /* All of the above must not raise any FP exceptions. Thus we
>>>> first
>>>> + check each operand for NaNs and force any elements containing
>>>> NaN to
>>>> + zero before using them in the compare.
>>>> + Example: UN<cc> (a, b) -> UNORDERED (a, b) |
>>>> + (cm<cc> (isnan (a) ? 0.0 : a,
>>>> + isnan (b) ? 0.0 : b))
>>>> + We use the following transformations for doing the
>>>> comparisions:
>>>> + a UNGE b -> a GE b
>>>> + a UNGT b -> a GT b
>>>> + a UNLE b -> b GE a
>>>> + a UNLT b -> b GT a. */
>>>> +
>>>> + rtx tmp0 = gen_reg_rtx (<V_INT_EQUIV>mode);
>>>> + rtx tmp1 = gen_reg_rtx (<V_INT_EQUIV>mode);
>>>> + rtx tmp2 = gen_reg_rtx (<V_INT_EQUIV>mode);
>>>> + emit_insn (gen_aarch64_cmeq<mode> (tmp0, operands[2],
>>>> operands[2]));
>>>> + emit_insn (gen_aarch64_cmeq<mode> (tmp1, operands[3],
>>>> operands[3]));
>>>> + emit_insn (gen_and<v_int_equiv>3 (tmp2, tmp0, tmp1));
>>>> + emit_insn (gen_and<v_int_equiv>3 (tmp0, tmp0,
>>>> + lowpart_subreg (<V_INT_EQUIV>mode, operands[2],
>>>> <MODE>mode)));
>>>> + emit_insn (gen_and<v_int_equiv>3 (tmp1, tmp1,
>>>> + lowpart_subreg (<V_INT_EQUIV>mode, operands[3],
>>>> <MODE>mode)));
>>>> + gcc_assert (comparison != NULL);
>>>> + emit_insn (comparison (operands[0],
>>>> + lowpart_subreg (<MODE>mode, tmp0,
>>>> <V_INT_EQUIV>mode),
>>>> + lowpart_subreg (<MODE>mode, tmp1,
>>>> <V_INT_EQUIV>mode)));
>>>> + emit_insn (gen_orn<v_int_equiv>3 (operands[0], tmp2,
>>>> operands[0]));
>>>> + }
>>>> break;
>>>> case LT:
>>>> @@ -2793,25 +2807,19 @@
>>>> case GT:
>>>> case GE:
>>>> case EQ:
>>>> + case NE:
>>>> /* The easy case. Here we emit one of FCMGE, FCMGT or FCMEQ.
>>>> As a LT b <=> b GE a && a LE b <=> b GT a. Our transformations
>>>> are:
>>>> a GE b -> a GE b
>>>> a GT b -> a GT b
>>>> a LE b -> b GE a
>>>> a LT b -> b GT a
>>>> - a EQ b -> a EQ b */
>>>> + a EQ b -> a EQ b
>>>> + a NE b -> ~(a EQ b) */
>>>> gcc_assert (comparison != NULL);
>>>> emit_insn (comparison (operands[0], operands[2], operands[3]));
>>>> - break;
>>>> -
>>>> - case UNEQ:
>>>> - /* We first check (a > b || b > a) which is !UNEQ, inverting
>>>> - this result will then give us (a == b || a UNORDERED b). */
>>>> - emit_insn (gen_aarch64_cmgt<mode> (operands[0],
>>>> - operands[2], operands[3]));
>>>> - emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3],
>>>> operands[2]));
>>>> - emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0], tmp));
>>>> - emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
>>>> + if (code == NE)
>>>> + emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0],
>>>> operands[0]));
>>>> break;
>>>> case LTGT:
>>>> @@ -2823,21 +2831,22 @@
>>>> emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0],
>>>> tmp));
>>>> break;
>>>> - case UNORDERED:
>>>> - /* Operands are ORDERED iff (a > b || b >= a), so we can compute
>>>> - UNORDERED as !ORDERED. */
>>>> - emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[2],
>>>> operands[3]));
>>>> - emit_insn (gen_aarch64_cmge<mode> (operands[0],
>>>> - operands[3], operands[2]));
>>>> - emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0], tmp));
>>>> - emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
>>>> - break;
>>>> -
>>>> case ORDERED:
>>>> - emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[2],
>>>> operands[3]));
>>>> - emit_insn (gen_aarch64_cmge<mode> (operands[0],
>>>> - operands[3], operands[2]));
>>>> - emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0], tmp));
>>>> + case UNORDERED:
>>>> + case UNEQ:
>>>> + /* cmeq (a, a) & cmeq (b, b). */
>>>> + emit_insn (gen_aarch64_cmeq<mode> (operands[0],
>>>> + operands[2], operands[2]));
>>>> + emit_insn (gen_aarch64_cmeq<mode> (tmp, operands[3],
>>>> operands[3]));
>>>> + emit_insn (gen_and<v_int_equiv>3 (operands[0], operands[0], tmp));
>>>> +
>>>> + if (code == UNORDERED)
>>>> + emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0],
>>>> operands[0]));
>>>> + else if (code == UNEQ)
>>>> + {
>>>> + emit_insn (gen_aarch64_cmeq<mode> (tmp, operands[2],
>>>> operands[3]));
>>>> + emit_insn (gen_orn<v_int_equiv>3 (operands[0], operands[0],
>>>> tmp));
>>>> + }
>>>> break;
>>>> default:
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/pr81647.c
>>>> b/gcc/testsuite/gcc.target/aarch64/pr81647.c
>>>> new file mode 100644
>>>> index
>>>> 0000000000000000000000000000000000000000..f60dfba49d538e3b2164b11392ab8dbfdba6546e
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/pr81647.c
>>>> @@ -0,0 +1,44 @@
>>>> +/* { dg-do run } */
>>>> +/* { dg-options "-O3 -fdump-tree-ssa" } */
>>>> +
>>>> +#include <fenv.h>
>>>> +
>>>> +double x[28], y[28];
>>>> +int res[28];
>>>> +
>>>> +int
>>>> +main (void)
>>>> +{
>>>> + int i;
>>>> + for (i = 0; i < 28; ++i)
>>>> + {
>>>> + x[i] = __builtin_nan ("");
>>>> + y[i] = i;
>>>> + }
>>>> + __asm__ volatile ("" ::: "memory");
>>>> + feclearexcept (FE_ALL_EXCEPT);
>>>> + for (i = 0; i < 4; ++i)
>>>> + res[i] = __builtin_isgreater (x[i], y[i]);
>>>> + for (i = 4; i < 8; ++i)
>>>> + res[i] = __builtin_isgreaterequal (x[i], y[i]);
>>>> + for (i = 8; i < 12; ++i)
>>>> + res[i] = __builtin_isless (x[i], y[i]);
>>>> + for (i = 12; i < 16; ++i)
>>>> + res[i] = __builtin_islessequal (x[i], y[i]);
>>>> + for (i = 16; i < 20; ++i)
>>>> + res[i] = __builtin_islessgreater (x[i], y[i]);
>>>> + for (i = 20; i < 24; ++i)
>>>> + res[i] = __builtin_isunordered (x[i], y[i]);
>>>> + for (i = 24; i < 28; ++i)
>>>> + res[i] = !(__builtin_isunordered (x[i], y[i]));
>>>> + __asm__ volatile ("" ::: "memory");
>>>> + return fetestexcept (FE_ALL_EXCEPT) != 0;
>>>> +}
>>>> +
>>>> +/* { dg-final { scan-tree-dump " u> " "ssa" } } */
>>>> +/* { dg-final { scan-tree-dump " u>= " "ssa" } } */
>>>> +/* { dg-final { scan-tree-dump " u< " "ssa" } } */
>>>> +/* { dg-final { scan-tree-dump " u<= " "ssa" } } */
>>>> +/* { dg-final { scan-tree-dump " u== " "ssa" } } */
>>>> +/* { dg-final { scan-tree-dump " unord " "ssa" } } */
>>>> +/* { dg-final { scan-tree-dump " ord " "ssa" } } */
>>>
>>>
>>
next prev parent reply other threads:[~2018-03-20 10:57 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-15 11:57 Sudakshina Das
2018-01-05 11:47 ` [PATCH PR81647][AARCH64] PING " Sudakshina Das
2018-02-16 10:54 ` [PATCH PR81647][AARCH64][PING] " Sudakshina Das
2018-03-19 14:32 ` [PATCH PR81647][AARCH64] " James Greenhalgh
2018-03-19 19:31 ` Sudakshina Das
2018-03-20 8:19 ` Christophe Lyon
2018-03-20 10:58 ` Sudakshina Das [this message]
2018-03-28 11:37 ` Sudakshina Das
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d29a367b-6614-5a4f-21e8-13135243298f@arm.com \
--to=sudi.das@arm.com \
--cc=Marcus.Shawcroft@arm.com \
--cc=Richard.Earnshaw@arm.com \
--cc=christophe.lyon@linaro.org \
--cc=gcc-patches@gcc.gnu.org \
--cc=james.greenhalgh@arm.com \
--cc=nd@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).