From: Tejas Belagod <tejas.belagod@arm.com>
To: Christophe Lyon <christophe.lyon@linaro.org>
Cc: Marcus Shawcroft <marcus.shawcroft@gmail.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [[ARM/AArch64][testsuite] 03/36] Add vmax, vmin, vhadd, vhsub and vrhadd tests.
Date: Mon, 26 Jan 2015 13:23:00 -0000
Message-ID: <54C62EC7.2030702@arm.com>
In-Reply-To: <CAKdteObyBRRiYN9PSxwKUhm7iKQycTLsynFiviAu5txx3dDhOg@mail.gmail.com>
On 25/01/15 21:05, Christophe Lyon wrote:
> On 23 January 2015 at 14:44, Christophe Lyon <christophe.lyon@linaro.org> wrote:
>> On 23 January 2015 at 12:42, Christophe Lyon <christophe.lyon@linaro.org> wrote:
>>> On 23 January 2015 at 11:18, Tejas Belagod <tejas.belagod@arm.com> wrote:
>>>> On 22/01/15 21:31, Christophe Lyon wrote:
>>>>>
>>>>> On 22 January 2015 at 16:22, Tejas Belagod <tejas.belagod@arm.com> wrote:
>>>>>>
>>>>>> On 22/01/15 14:28, Christophe Lyon wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 22 January 2015 at 12:19, Tejas Belagod <tejas.belagod@arm.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 21/01/15 15:07, Christophe Lyon wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 19 January 2015 at 17:54, Marcus Shawcroft
>>>>>>>>> <marcus.shawcroft@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 19 January 2015 at 15:43, Christophe Lyon
>>>>>>>>>> <christophe.lyon@linaro.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 19 January 2015 at 14:29, Marcus Shawcroft
>>>>>>>>>>> <marcus.shawcroft@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 16 January 2015 at 17:52, Christophe Lyon
>>>>>>>>>>>> <christophe.lyon@linaro.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>> OK provided, as per the previous couple, that we don't regress
>>>>>>>>>>>>>> or introduce new fails on aarch64[_be] or aarch32.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch shows failures on aarch64 and aarch64_be for vmax and
>>>>>>>>>>>>> vmin
>>>>>>>>>>>>> when the input is -NaN.
>>>>>>>>>>>>> It's a corner case, and my reading of the ARM ARM is that the
>>>>>>>>>>>>> result should be the same as on aarch32.
>>>>>>>>>>>>> I haven't had time to look at it in more detail though.
>>>>>>>>>>>>> So, not OK?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> They should have the same behaviour in aarch32 and aarch64. Did you
>>>>>>>>>>>> test on HW or a model?
>>>>>>>>>>>>
>>>>>>>>>>> I ran the tests on qemu for aarch32 and aarch64-linux, and on the
>>>>>>>>>>> foundation model for aarch64*-elf.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Leave this one out until we understand why it fails. /Marcus
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I've looked at this a bit more.
>>>>>>>>> We have
>>>>>>>>> fmax v0.4s, v0.4s, v1.4s
>>>>>>>>> where v0 is a vector of -NaN (0xffc00000) and v1 is a vector of 1.
>>>>>>>>>
>>>>>>>>> The output is still -NaN (0xffc00000), while the test expects
>>>>>>>>> defaultNaN (0x7fc00000).
>>>>>>>>>
>>>>>>>>
>>>>>>>> In the AArch32 execution state, Advanced SIMD FP arithmetic always uses
>>>>>>>> the DefaultNaN setting regardless of the DN-bit value in the FPSCR. In
>>>>>>>> the AArch64 execution state, the result of Advanced SIMD FP arithmetic
>>>>>>>> operations depends on the value of the DN bit, i.e. it either propagates
>>>>>>>> the input NaN or generates DefaultNaN depending on the value of DN.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Maybe I'm using an outdated doc. On page 2282 of ARMv8 ARM rev C, I
>>>>>>> can see only the latter (no diff between aarch32 and aarch64 in
>>>>>>> FPProcessNan pseudo-code)
>>>>>>>
>>>>>>
>>>>>> If you see pg. 4005 in the same doc(rev C), you'll see the FPSCR spec -
>>>>>> under DN:
>>>>>>
>>>>>> "The value of this bit only controls scalar floating-point arithmetic.
>>>>>> Advanced SIMD arithmetic always uses the Default NaN setting, regardless
>>>>>> of
>>>>>> the value of the DN bit."
>>>>>>
>>>>>> Also on page 3180 for the description of VMAX(vector FP), it says:
>>>>>> "
>>>>>> * max(+0.0, -0.0) = +0.0
>>>>>> * If any input is a NaN, the corresponding result element is the default
>>>>>> NaN.
>>>>>> "
>>>>>>
>>>>> Oops I was looking at FMAX (vector) pg 936.
>>>>>
>>>>>> The pseudocode for VMAX on pg. 3180 passes StandardFPSCRValue() to
>>>>>> FPMax(), which is on pg. 2285
>>>>>>
>>>>>> // StandardFPSCRValue()
>>>>>> // ====================
>>>>>> FPCRType StandardFPSCRValue()
>>>>>> return ‘00000’ : FPSCR.AHP : ‘11000000000000000000000000’
>>>>>>
>>>>>> Here bit-25(FPSCR.DN) is set to 1.
>>>>>>
>>>>>
>>>>> So, we should get defaultNaN too on aarch64, and no need to try to
>>>>> force DN to 1 in gdb?
>>>>>
>>>>> What can be wrong?
>>>>>
>>>>
>>>> On pg 3180, I see VMAX(FPSIMD) for A32/T32, not A64. I hope we're reading
>>>> the same document.
>>>>
>>>> Regardless of the page number, if you see the pseudocode for VMAX(FPSIMD)
>>>> for AArch32, StandardFPSCRValue() (i.e. DN = 1) is passed to FPMax() which
>>>> means generate DefaultNaN() regardless.
>>>>
>>>> OTOH, on pg 936, you have FMAX(vector) for A64 where FPMax() in the
>>>> pseudocode gets just FPCR.
>>>>
>>>>
>>> Ok, that was my initial understanding but our discussion confused me.
>>>
>>> And that's why I tried to force DN = 1 in gdb before single-stepping over
>>> fmax v0.4s, v0.4s, v1.4s
>>>
>>> but it changed nothing :-(
>>> Hence my question about a possible gdb bug or misuse.
>>
>> Hmm... user error, I missed one bit
>> set $fpcr=0x2000000
>> works under gdb.
>>
>>> I'll try modifying the test to have it force DN=1.
>>>
>> Forcing DN=1 in the test makes it pass.
>>
>> I am going to look at adding that cleanly to my test, and resubmit it.
>>
>> Thanks, and sorry for the noise.
>>
> Here is the updated version:
> - Now I set DN=1 on AArch64 in clean_results, as it is the main
> initialization function.
> - I removed the double negative :-)
> - I removed the useless [u]int64 and poly variants
>
> Christophe.
>
> 2015-01-25 Christophe Lyon <christophe.lyon@linaro.org>
>
> * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
> (_ARM_FPSCR): Add DN and AHP fields.
> (clean_results): Force DN=1 on AArch64.
> * gcc.target/aarch64/advsimd-intrinsics/binary_op_no64.inc: New file.
> * gcc.target/aarch64/advsimd-intrinsics/vhadd.c: New file.
> * gcc.target/aarch64/advsimd-intrinsics/vhsub.c: New file.
> * gcc.target/aarch64/advsimd-intrinsics/vmax.c: New file.
> * gcc.target/aarch64/advsimd-intrinsics/vmin.c: New file.
> * gcc.target/aarch64/advsimd-intrinsics/vrhadd.c: New file.
>
I guess you don't need the fake dependency fix here, since this is
mostly called only once?

+ _ARM_FPSCR _afpscr_for_dn;
+ asm volatile ("mrs %0,fpcr" : "=r" (_afpscr_for_dn));
+ _afpscr_for_dn.b.DN = 1;
+ asm volatile ("msr fpcr,%0" : : "r" (_afpscr_for_dn));

Otherwise, your patch looks OK to me (but I can't approve it).

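
For reference, the read-modify-write on FPCR quoted above can also be
written with an explicit mask instead of the bitfield. This is only a
rough sketch, not what the patch does: the helper names (fpcr_with_dn,
read_fpcr, write_fpcr, force_default_nan) are made up for illustration
and are not part of arm-neon-ref.h. The mask is the same 0x2000000
value that worked under gdb, i.e. bit 25. On hosts other than AArch64
the register accessors are stubbed out so that the file still builds.

```c
#include <stdint.h>

/* DN (Default NaN) is bit 25 of FPCR, matching "set $fpcr=0x2000000". */
#define FPCR_DN_BIT  25
#define FPCR_DN_MASK (UINT64_C(1) << FPCR_DN_BIT)

/* Hypothetical helper: return the given FPCR value with DN forced to 1. */
static inline uint64_t fpcr_with_dn(uint64_t fpcr)
{
  return fpcr | FPCR_DN_MASK;
}

/* Hypothetical helper: read FPCR (stubbed to 0 when not on AArch64). */
static inline uint64_t read_fpcr(void)
{
#if defined(__aarch64__)
  uint64_t value;
  __asm__ volatile ("mrs %0, fpcr" : "=r" (value));
  return value;
#else
  return 0; /* assumption: no FPCR on this host, pretend it reads as zero */
#endif
}

/* Hypothetical helper: write FPCR (no-op when not on AArch64). */
static inline void write_fpcr(uint64_t value)
{
#if defined(__aarch64__)
  __asm__ volatile ("msr fpcr, %0" : : "r" (value));
#else
  (void) value;
#endif
}

/* Read FPCR, set DN, write it back: the same sequence as in the patch. */
static inline void force_default_nan(void)
{
  write_fpcr(fpcr_with_dn(read_fpcr()));
}
```

Calling force_default_nan() once from an initialization function would
correspond to what clean_results does in the patch.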
Thanks,
Tejas.
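
P.S. To summarise the NaN behaviour discussed in this thread, here is a
small software model of a single fmax lane. It is a simplified sketch
under my own assumptions, not the ARM ARM pseudocode: model_fmax and
is_nan_bits are hypothetical names, and with DN=0 the model simply
propagates the first NaN operand (quieted), ignoring the architecture's
rule that signalling NaNs take priority. With DN=1 any NaN input yields
the default NaN, 0x7fc00000.

```c
#include <stdint.h>
#include <string.h>

#define DEFAULT_NAN_BITS 0x7fc00000u /* positive quiet NaN */
#define QUIET_BIT        0x00400000u

/* A single-precision value is a NaN if its exponent is all ones and
   its fraction is non-zero. */
static int is_nan_bits(uint32_t bits)
{
  return (bits & 0x7f800000u) == 0x7f800000u && (bits & 0x007fffffu) != 0;
}

/* Simplified model of one fmax lane; dn is the FPCR.DN value (0 or 1). */
static uint32_t model_fmax(uint32_t a, uint32_t b, int dn)
{
  if (is_nan_bits(a) || is_nan_bits(b))
    {
      if (dn)
        return DEFAULT_NAN_BITS;
      /* Simplification: propagate the first NaN operand, quieted. */
      return (is_nan_bits(a) ? a : b) | QUIET_BIT;
    }

  float fa, fb;
  memcpy(&fa, &a, sizeof fa);
  memcpy(&fb, &b, sizeof fb);

  /* Equal values are either identical bit patterns or +0.0 and -0.0;
     ANDing the bits gives +0.0 unless both inputs are -0.0. */
  if (fa == fb)
    return a & b;
  return fa > fb ? a : b;
}
```

With v0 = -NaN (0xffc00000) and v1 = 1.0 (0x3f800000), the model
returns 0xffc00000 for DN=0 and 0x7fc00000 for DN=1, which matches
what the test observed before and after forcing DN.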