public inbox for gcc-patches@gcc.gnu.org
* [PATCH][AArch64] Fix argument types for some high_lane* intrinsics implemented in assembly
@ 2014-07-09 14:37 Kyrill Tkachov
  2014-07-17  9:57 ` Marcus Shawcroft
From: Kyrill Tkachov @ 2014-07-09 14:37 UTC
  To: GCC Patches; +Cc: Marcus Shawcroft

[-- Attachment #1: Type: text/plain, Size: 908 bytes --]

Hi all,

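These intrinsics are implemented as macros that expand to inline asm, but
the types they accept are inconsistent with the ACLE spec: the lane-vector
argument (c) is declared as a 128-bit vector where ACLE specifies the
64-bit type (for example int16x8_t instead of int16x4_t). This patch fixes
them, although they should be reimplemented properly in C in the future.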
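
For readers unfamiliar with these intrinsics, the following is a minimal
portable sketch of what vmlal_high_lane_s16 is expected to compute once the
types are right. It is a scalar model only, not the arm_neon.h
implementation: the model_* types and functions are hypothetical names
introduced for illustration so that the example builds on any host. It
assumes the usual semantics of multiplying the high half of b by one lane
of the 64-bit vector c and accumulating into a.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the NEON vector types (illustration only).  */
typedef struct { int16_t v[4]; } model_int16x4;   /* models int16x4_t */
typedef struct { int16_t v[8]; } model_int16x8;   /* models int16x8_t */
typedef struct { int32_t v[4]; } model_int32x4;   /* models int32x4_t */

/* Scalar model: result[i] = a[i] + b[4 + i] * c[lane].
   c is the 64-bit, 4-lane vector, so lane must be in the range 0..3.  */
model_int32x4
model_vmlal_high_lane_s16 (model_int32x4 a, model_int16x8 b,
                           model_int16x4 c, int lane)
{
  model_int32x4 result;
  assert (lane >= 0 && lane < 4);
  for (int i = 0; i < 4; i++)
    result.v[i] = a.v[i] + (int32_t) b.v[4 + i] * (int32_t) c.v[lane];
  return result;
}

/* Sample inputs: only the high half of b (lanes 4..7) takes part.  */
model_int32x4
model_example (void)
{
  model_int32x4 a = { { 1, 2, 3, 4 } };
  model_int16x8 b = { { 0, 0, 0, 0, 10, 20, 30, 40 } };
  model_int16x4 c = { { 2, 3, 4, 5 } };
  return model_vmlal_high_lane_s16 (a, b, c, 1);
}
```

With these sample inputs, model_example returns the lanes 31, 62, 93, 124:
each accumulator lane plus the matching high-half element of b times c[1].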

This is a bugfix and it applies cleanly to trunk, 4.9 and 4.8.
I know we're close to the 4.9.1 release, but this is not an ABI-breaking 
change so it's the aarch64 maintainers' call on whether it should be 
backported.

Tested on aarch64-none-elf.

Ok?

Thanks,
Kyrill

2014-07-09  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/arm_neon.h (vmlal_high_lane_s16): Fix type.
     (vmlal_high_lane_s32): Likewise.
     (vmlal_high_lane_u16): Likewise.
     (vmlal_high_lane_u32): Likewise.
     (vmlsl_high_lane_s16): Likewise.
     (vmlsl_high_lane_s32): Likewise.
     (vmlsl_high_lane_u16): Likewise.
     (vmlsl_high_lane_u32): Likewise.

[-- Attachment #2: aarch64-assembly-intrinsics-types.patch --]
[-- Type: text/x-patch; name=aarch64-assembly-intrinsics-types.patch, Size: 5839 bytes --]

commit 991893519ceea282bfaf696b88d5c9291ce2e3a0
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Thu Jun 26 13:59:19 2014 +0100

    [AArch64] Fix types for some assembly intrinsics

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 7807181..9e8d15a 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -6735,7 +6735,7 @@ vmla_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlal_high_lane_s16(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       int16x8_t c_ = (c);                                              \
+       int16x4_t c_ = (c);                                              \
        int16x8_t b_ = (b);                                              \
        int32x4_t a_ = (a);                                              \
        int32x4_t result;                                                \
@@ -6749,7 +6749,7 @@ vmla_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlal_high_lane_s32(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       int32x4_t c_ = (c);                                              \
+       int32x2_t c_ = (c);                                              \
        int32x4_t b_ = (b);                                              \
        int64x2_t a_ = (a);                                              \
        int64x2_t result;                                                \
@@ -6763,7 +6763,7 @@ vmla_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlal_high_lane_u16(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       uint16x8_t c_ = (c);                                             \
+       uint16x4_t c_ = (c);                                             \
        uint16x8_t b_ = (b);                                             \
        uint32x4_t a_ = (a);                                             \
        uint32x4_t result;                                               \
@@ -6777,7 +6777,7 @@ vmla_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlal_high_lane_u32(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       uint32x4_t c_ = (c);                                             \
+       uint32x2_t c_ = (c);                                             \
        uint32x4_t b_ = (b);                                             \
        uint64x2_t a_ = (a);                                             \
        uint64x2_t result;                                               \
@@ -7423,7 +7423,7 @@ vmls_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlsl_high_lane_s16(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       int16x8_t c_ = (c);                                              \
+       int16x4_t c_ = (c);                                              \
        int16x8_t b_ = (b);                                              \
        int32x4_t a_ = (a);                                              \
        int32x4_t result;                                                \
@@ -7437,7 +7437,7 @@ vmls_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlsl_high_lane_s32(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       int32x4_t c_ = (c);                                              \
+       int32x2_t c_ = (c);                                              \
        int32x4_t b_ = (b);                                              \
        int64x2_t a_ = (a);                                              \
        int64x2_t result;                                                \
@@ -7451,7 +7451,7 @@ vmls_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlsl_high_lane_u16(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       uint16x8_t c_ = (c);                                             \
+       uint16x4_t c_ = (c);                                             \
        uint16x8_t b_ = (b);                                             \
        uint32x4_t a_ = (a);                                             \
        uint32x4_t result;                                               \
@@ -7465,7 +7465,7 @@ vmls_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)
 #define vmlsl_high_lane_u32(a, b, c, d)                                 \
   __extension__                                                         \
     ({                                                                  \
-       uint32x4_t c_ = (c);                                             \
+       uint32x2_t c_ = (c);                                             \
        uint32x4_t b_ = (b);                                             \
        uint64x2_t a_ = (a);                                             \
        uint64x2_t result;                                               \


* Re: [PATCH][AArch64] Fix argument types for some high_lane* intrinsics implemented in assembly
  2014-07-09 14:37 [PATCH][AArch64] Fix argument types for some high_lane* intrinsics implemented in assembly Kyrill Tkachov
@ 2014-07-17  9:57 ` Marcus Shawcroft
  0 siblings, 0 replies; 2+ messages in thread
From: Marcus Shawcroft @ 2014-07-17  9:57 UTC
  To: Kyrill Tkachov; +Cc: GCC Patches

On 9 July 2014 15:37, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> Hi all,
>
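> These intrinsics are implemented as macros that expand to inline asm, but
> the types they accept are inconsistent with the ACLE spec: the lane-vector
> argument (c) is declared as a 128-bit vector where ACLE specifies the
> 64-bit type (for example int16x8_t instead of int16x4_t). This patch fixes
> them, although they should be reimplemented properly in C in the future.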
>
> This is a bugfix and it applies cleanly to trunk, 4.9 and 4.8.
> I know we're close to the 4.9.1 release, but this is not an ABI-breaking
> change so it's the aarch64 maintainers' call on whether it should be
> backported.
>
> Tested on aarch64-none-elf.
>
> Ok?
>
> Thanks,
> Kyrill
>
> 2014-07-09  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/arm_neon.h (vmlal_high_lane_s16): Fix type.
>     (vmlal_high_lane_s32): Likewise.
>     (vmlal_high_lane_u16): Likewise.
>     (vmlal_high_lane_u32): Likewise.
>     (vmlsl_high_lane_s16): Likewise.
>     (vmlsl_high_lane_s32): Likewise.
>     (vmlsl_high_lane_u16): Likewise.
>     (vmlsl_high_lane_u32): Likewise.

OK thanks.

/Marcus
