On 23 January 2015 at 14:44, Christophe Lyon wrote:
> On 23 January 2015 at 12:42, Christophe Lyon wrote:
>> On 23 January 2015 at 11:18, Tejas Belagod wrote:
>>> On 22/01/15 21:31, Christophe Lyon wrote:
>>>>
>>>> On 22 January 2015 at 16:22, Tejas Belagod wrote:
>>>>>
>>>>> On 22/01/15 14:28, Christophe Lyon wrote:
>>>>>>
>>>>>> On 22 January 2015 at 12:19, Tejas Belagod wrote:
>>>>>>>
>>>>>>> On 21/01/15 15:07, Christophe Lyon wrote:
>>>>>>>>
>>>>>>>> On 19 January 2015 at 17:54, Marcus Shawcroft wrote:
>>>>>>>>>
>>>>>>>>> On 19 January 2015 at 15:43, Christophe Lyon wrote:
>>>>>>>>>>
>>>>>>>>>> On 19 January 2015 at 14:29, Marcus Shawcroft wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 16 January 2015 at 17:52, Christophe Lyon wrote:
>>>>>>>>>>>
>>>>>>>>>>>>> OK provided, as per the previous couple, that we don't
>>>>>>>>>>>>> regress or introduce new fails on aarch64[_be] or aarch32.
>>>>>>>>>>>>
>>>>>>>>>>>> This patch shows failures on aarch64 and aarch64_be for vmax
>>>>>>>>>>>> and vmin when the input is -NaN.
>>>>>>>>>>>> It's a corner case, and my reading of the ARM ARM is that the
>>>>>>>>>>>> result should be the same as on aarch32.
>>>>>>>>>>>> I haven't had time to look at it in more detail though.
>>>>>>>>>>>> So, not OK?
>>>>>>>>>>>
>>>>>>>>>>> They should have the same behaviour in aarch32 and aarch64. Did
>>>>>>>>>>> you test on HW or a model?
>>>>>>>>>>>
>>>>>>>>>> I ran the tests on qemu for aarch32 and aarch64-linux, and on the
>>>>>>>>>> foundation model for aarch64*-elf.
>>>>>>>>>
>>>>>>>>> Leave this one out until we understand why it fails.
>>>>>>>>> /Marcus
>>>>>>>>
>>>>>>>> I've looked at this a bit more.
>>>>>>>> We have
>>>>>>>>     fmax v0.4s, v0.4s, v1.4s
>>>>>>>> where v0 is a vector of -NaN (0xffc00000) and v1 is a vector of 1.
>>>>>>>>
>>>>>>>> The output is still -NaN (0xffc00000), while the test expects
>>>>>>>> defaultNaN (0x7fc00000).
>>>>>>>>
>>>>>>>
>>>>>>> In the AArch32 execution state, Advanced SIMD FP arithmetic always
>>>>>>> uses the DefaultNaN setting regardless of the DN-bit value in the
>>>>>>> FPSCR. In the AArch64 execution state, the result of Advanced SIMD FP
>>>>>>> arithmetic operations depends on the value of the DN-bit, i.e. they
>>>>>>> either propagate the input NaN or generate DefaultNaN depending on
>>>>>>> the value of DN.
>>>>>>
>>>>>> Maybe I'm using an outdated doc. On page 2282 of ARMv8 ARM rev C, I
>>>>>> can see only the latter (no diff between aarch32 and aarch64 in
>>>>>> FPProcessNan pseudo-code)
>>>>>>
>>>>>
>>>>> If you see pg. 4005 in the same doc (rev C), you'll see the FPSCR
>>>>> spec - under DN:
>>>>>
>>>>> "The value of this bit only controls scalar floating-point arithmetic.
>>>>> Advanced SIMD arithmetic always uses the Default NaN setting,
>>>>> regardless of the value of the DN bit."
>>>>>
>>>>> Also on page 3180 for the description of VMAX(vector FP), it says:
>>>>> "
>>>>> * max(+0.0, -0.0) = +0.0
>>>>> * If any input is a NaN, the corresponding result element is the
>>>>>   default NaN.
>>>>> "
>>>>>
>>>> Oops I was looking at FMAX (vector) pg 936.
>>>>
>>>>> The pseudocode for FPMax () on pg. 3180 passes StandardFPSCRValue() to
>>>>> FPMax() which is on pg. 2285
>>>>>
>>>>> // StandardFPSCRValue()
>>>>> // ====================
>>>>> FPCRType StandardFPSCRValue()
>>>>>     return '00000' : FPSCR.AHP : '11000000000000000000000000'
>>>>>
>>>>> Here bit-25(FPSCR.DN) is set to 1.
>>>>>
>>>>
>>>> So, we should get defaultNaN too on aarch64, and no need to try to
>>>> force DN to 1 in gdb?
>>>>
>>>> What can be wrong?
>>>>
>>>
>>> On pg 3180, I see VMAX(FPSIMD) for A32/T32, not A64.
>>> I hope we're reading the same document.
>>>
>>> Regardless of the page number, if you see the pseudocode for VMAX(FPSIMD)
>>> for AArch32, StandardFPSCRValue() (i.e. DN = 1) is passed to FPMax() which
>>> means generate DefaultNaN() regardless.
>>>
>>> OTOH, on pg 936, you have FMAX(vector) for A64 where FPMax() in the
>>> pseudocode gets just FPCR.
>>>
>> Ok, that was my initial understanding but our discussion confused me.
>>
>> And that's why I tried to force DN = 1 in gdb before single-stepping over
>>     fmax v0.4s, v0.4s, v1.4s
>>
>> but it changed nothing :-(
>> Hence my question about a possible gdb bug or misuse.
>
> Hmm... user error, I missed one bit
>     set $fpcr=0x2000000
> works under gdb.
>
>> I'll try modifying the test to have it force DN=1.
>>
> Forcing DN=1 in the test makes it pass.
>
> I am going to look at adding that cleanly to my test, and resubmit it.
>
> Thanks, and sorry for the noise.
>

Here is the updated version:
- Now I set DN=1 on AArch64 in clean_results, as it is the main
  initialization function.
- I removed the double negative :-)
- I removed the useless [u]int64 and poly variants

Christophe.

2015-01-25  Christophe Lyon

	* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
	(_ARM_FPSRC): Add DN and AHP fields.
	(clean_results): Force DN=1 on AArch64.
	* gcc.target/aarch64/advsimd-intrinsics/binary_op_no64.inc: New file.
	* gcc.target/aarch64/advsimd-intrinsics/vhadd.c: New file.
	* gcc.target/aarch64/advsimd-intrinsics/vhsub.c: New file.
	* gcc.target/aarch64/advsimd-intrinsics/vmax.c: New file.
	* gcc.target/aarch64/advsimd-intrinsics/vmin.c: New file.
	* gcc.target/aarch64/advsimd-intrinsics/vrhadd.c: New file.

>>> Thanks,
>>> Tejas.
>>>
>>>>> Thanks,
>>>>> Tejas.
>>>>>
>>>>>>> If you're running your test in the AArch64 execution state, you'd
>>>>>>> want to define the DN bit and modify the expected results
>>>>>>> accordingly or have the test poll at runtime what the DN-bit is set
>>>>>>> to and check expected results dynamically.
>>>>>>
>>>>>> Makes sense, I hadn't noticed the different aarch64 spec here.
>>>>>>
>>>>>>> I think the test already has expected behaviour for AArch32
>>>>>>> execution state by expecting DefaultNaN regardless.
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>>> I have executed the test under GDB on AArch64 HW, and noticed that
>>>>>>>> fpcr was 0.
>>>>>>>> I forced it to have DN==1:
>>>>>>>>     set $fpcr=0x1000000
>>>>>>>> but this didn't change the result.
>>>>>>>>
>>>>>>>> Does setting fpcr.dn under gdb actually work?
>>>>>>>>
>>>>>>>
>>>>>>> It should. Possibly a bug, patches welcome :-).
>>>>>>>
>>>>>> :-)
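
For anyone following the DN-bit discussion in this thread, here is a minimal C sketch of the behaviour described above. It is not the code from the patch. The helper names (`fpcr_with_dn`, `nan_lane_result`, `force_dn_on_hw`) are hypothetical, the NaN model is a simplified reading of the pseudocode quoted in the thread (only the NaN path of a single lane is modelled), and the inline-asm part assumes a GCC-compatible compiler targeting AArch64.

```c
#include <assert.h>
#include <stdint.h>

/* DN is bit 25, as noted in the thread ("bit-25(FPSCR.DN)"). */
#define FP_DN_BIT    (UINT32_C(1) << 25)      /* 0x2000000 */
#define QUIET_BIT    UINT32_C(0x00400000)     /* top fraction bit of a float */
#define DEFAULT_NAN  UINT32_C(0x7fc00000)     /* value the test expects */

/* Hypothetical helper: return an FPCR value with DN set (1) or cleared (0). */
static inline uint32_t fpcr_with_dn(uint32_t fpcr, int dn)
{
  assert(dn == 0 || dn == 1);
  return dn ? (fpcr | FP_DN_BIT) : (fpcr & ~FP_DN_BIT);
}

static inline int is_nan_bits(uint32_t x)
{
  return (x & UINT32_C(0x7f800000)) == UINT32_C(0x7f800000)
         && (x & UINT32_C(0x007fffff)) != 0;
}

static inline int is_snan_bits(uint32_t x)
{
  return is_nan_bits(x) && (x & QUIET_BIT) == 0;
}

/* Simplified model of the NaN path for one lane of a vector max/min.
   Returns 1 and stores the result bits in *out when an input is a NaN,
   0 otherwise (the ordinary comparison is not modelled here).  */
static inline int nan_lane_result(uint32_t a, uint32_t b, uint32_t fpcr,
                                  uint32_t *out)
{
  uint32_t nan;

  if (is_snan_bits(a))      nan = a;   /* signalling NaNs take priority */
  else if (is_snan_bits(b)) nan = b;
  else if (is_nan_bits(a))  nan = a;
  else if (is_nan_bits(b))  nan = b;
  else return 0;

  /* DN=1: default NaN.  DN=0: propagate the (quieted) input NaN.  */
  *out = (fpcr & FP_DN_BIT) ? DEFAULT_NAN : (nan | QUIET_BIT);
  return 1;
}

#if defined(__aarch64__)
/* Assumption: GCC-style inline asm.  Sets DN in the real FPCR.  */
static inline void force_dn_on_hw(void)
{
  uint64_t fpcr;
  __asm__ volatile ("mrs %0, fpcr" : "=r" (fpcr));
  fpcr |= FP_DN_BIT;
  __asm__ volatile ("msr fpcr, %0" : : "r" (fpcr));
}
#endif
```

With `a = 0xffc00000` (-NaN) and `b = 0x3f800000` (1.0f), the model yields 0xffc00000 when DN is clear and 0x7fc00000 when DN is set, which matches what was observed on AArch64 before and after forcing DN=1. It also shows why the first gdb attempt had no effect: 0x1000000 is bit 24, so `0x1000000 & FP_DN_BIT` is zero.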