public inbox for gcc-patches@gcc.gnu.org
* [AArch64] Cheap fix for argument types of vmull_high_lane_{us}{16,32}
@ 2014-09-11 12:15 James Greenhalgh
  2014-09-11 14:26 ` Marcus Shawcroft
From: James Greenhalgh @ 2014-09-11 12:15 UTC (permalink / raw)
  To: gcc-patches; +Cc: marcus.shawcroft

[-- Attachment #1: Type: text/plain, Size: 680 bytes --]


Hi,

I'd been putting this patch off in the hope that I might find
time to move these intrinsics to a C/builtin implementation, but it
is probably better to get them right for now and come back to improving
them later.

All four of these suffer from the same problem: their "lane" argument
should be a 64-bit rather than a 128-bit vector.

Fix it the obvious way.

Tested cross on aarch64-none-eabi.

OK?

Thanks,
James

---
2014-09-11  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/arm_neon.h (vmull_high_lane_s16): Fix argument
	types.
	(vmull_high_lane_s32): Likewise.
	(vmull_high_lane_u16): Likewise.
	(vmull_high_lane_u32): Likewise.

[-- Attachment #2: 0001-AArch64-Cheap-fix-for-argument-types-of-vmull_high_l.patch --]
[-- Type: text/x-patch;  name=0001-AArch64-Cheap-fix-for-argument-types-of-vmull_high_l.patch, Size: 2855 bytes --]

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index c31f7e3..77e3688 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -8249,7 +8249,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_s16(a, b, c)                                    \
   __extension__                                                         \
     ({                                                                  \
-       int16x8_t b_ = (b);                                              \
+       int16x4_t b_ = (b);                                              \
        int16x8_t a_ = (a);                                              \
        int32x4_t result;                                                \
        __asm__ ("smull2 %0.4s, %1.8h, %2.h[%3]"                         \
@@ -8262,7 +8262,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_s32(a, b, c)                                    \
   __extension__                                                         \
     ({                                                                  \
-       int32x4_t b_ = (b);                                              \
+       int32x2_t b_ = (b);                                              \
        int32x4_t a_ = (a);                                              \
        int64x2_t result;                                                \
        __asm__ ("smull2 %0.2d, %1.4s, %2.s[%3]"                         \
@@ -8275,7 +8275,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_u16(a, b, c)                                    \
   __extension__                                                         \
     ({                                                                  \
-       uint16x8_t b_ = (b);                                             \
+       uint16x4_t b_ = (b);                                             \
        uint16x8_t a_ = (a);                                             \
        uint32x4_t result;                                               \
        __asm__ ("umull2 %0.4s, %1.8h, %2.h[%3]"                         \
@@ -8288,7 +8288,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_u32(a, b, c)                                    \
   __extension__                                                         \
     ({                                                                  \
-       uint32x4_t b_ = (b);                                             \
+       uint32x2_t b_ = (b);                                             \
        uint32x4_t a_ = (a);                                             \
        uint64x2_t result;                                               \
        __asm__ ("umull2 %0.2d, %1.4s, %2.s[%3]"                         \
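
For readers unfamiliar with these intrinsics, the following is a minimal
scalar sketch of what vmull_high_lane_s16 is expected to compute, assuming
the usual SMULL2-by-element semantics: the upper four 16-bit elements of the
128-bit operand are each multiplied by one selected element of the 64-bit
"lane" operand, widening to 32 bits. The struct types and the function name
model_vmull_high_lane_s16 are hypothetical stand-ins for illustration only;
this is not the arm_neon.h implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical plain-C stand-ins for the NEON vector types.  */
typedef struct { int16_t v[8]; } model_int16x8; /* 128-bit: 8 x 16-bit */
typedef struct { int16_t v[4]; } model_int16x4; /* 64-bit:  4 x 16-bit */
typedef struct { int32_t v[4]; } model_int32x4; /* 128-bit: 4 x 32-bit */

/* Scalar model (an assumption, not GCC's code): multiply the high half
   of A by element LANE of B, widening each product to 32 bits.  B is a
   64-bit vector, so LANE must be in the range 0..3.  */
static model_int32x4
model_vmull_high_lane_s16 (model_int16x8 a, model_int16x4 b, int lane)
{
  model_int32x4 result;
  assert (lane >= 0 && lane < 4);
  for (int i = 0; i < 4; i++)
    result.v[i] = (int32_t) a.v[4 + i] * (int32_t) b.v[lane];
  return result;
}
```

With the second parameter modelled as a four-element vector, only lanes 0
to 3 are valid, which is the constraint the corrected argument types are
meant to express.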
