public inbox for gcc-patches@gcc.gnu.org
* [PATCH, rs6000] 1/3 Add x86 SSE <xmmintrin.h> intrinsics to GCC PPC64LE target
@ 2017-08-16 20:55 Steven Munroe
  2017-08-17  3:25 ` Segher Boessenkool
From: Steven Munroe @ 2017-08-16 20:55 UTC
  To: gcc-patches; +Cc: Segher Boessenkool, David Edelsohn

This is the third major contribution of X86 intrinsic equivalent
headers for PPC64LE.

X86 SSE technology was the second SIMD extension, which added wider
128-bit vector (XMM) registers and single precision float capability.
It also addressed missing MMX capabilities and provided transfer (move,
pack, unpack) operations between MMX and XMM registers. This was
embodied in the <xmmintrin.h> header (in part 2/3). The implementation
also provided the mm_malloc.h API to allow for correct 16-byte alignment
where the system malloc may only provide 8-byte alignment. PowerPC64LE
can assume the PowerPC quadword (16-byte) alignment but we provide this
header and API to ease the application porting process. The mm_malloc.h
header is implicitly included by xmmintrin.h.
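
As a rough illustration of the allocation strategy described above (this
is not the code in the patch, which appears below), the following sketch
raises any requested alignment to at least 16 bytes and then falls back
on posix_memalign. The names sketch_aligned_alloc and sketch_is_aligned
are hypothetical and exist only for this example; the real API is
_mm_malloc and _mm_free.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: allocate SIZE bytes
   aligned to at least 16 bytes (quadword), or to ALIGNMENT if that
   is larger.  Returns NULL on failure.  */
static void *
sketch_aligned_alloc (size_t size, size_t alignment)
{
  void *ptr;

  /* posix_memalign requires a power-of-two alignment.  */
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);

  if (alignment < 16)
    alignment = 16;
  if (posix_memalign (&ptr, alignment, size) != 0)
    return NULL;
  return ptr;
}

/* Return nonzero if PTR is a multiple of ALIGNMENT.  */
static int
sketch_is_aligned (const void *ptr, size_t alignment)
{
  return ((uintptr_t) ptr % alignment) == 0;
}
```

Memory obtained this way is released with plain free, which is also all
that _mm_free does in the header below.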

In general the SSE (__m128) intrinsics are a better match to the
PowerISA VMX/VSX 128-bit vector facilities. This allows direct mapping
of the __m128 type to PowerPC __vector float and allows natural handling
of parameter passing, return values, and SIMD float operations.

However, while both ISAs support float scalars in vector registers, the
X86_64 and PowerPC64LE use different formats (and bits within the vector
register) for float scalars. This requires extra PowerISA operations to
exactly match the X86 scalar float (intrinsics ending in *_ss)
semantics. The intent is to provide a functionally correct
implementation at some reduction in performance.
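
To make the difference between the packed and scalar forms concrete,
here is a small sketch using the generic GCC vector extension as a
portable stand-in for __m128 / __vector float. The type sketch_v4sf and
both functions are hypothetical names for illustration only; they are
not the implementation in part 2/3. The point is only the semantics a
*_ss operation has to reproduce: element 0 is computed, elements 1
through 3 are passed through from the first operand.

```c
/* Portable stand-in for a 128-bit vector of four floats.  This is an
   assumption for illustration; the real headers use the target's own
   vector types.  */
typedef float sketch_v4sf __attribute__ ((vector_size (16)));

/* Packed add: all four lanes are summed.  */
static sketch_v4sf
sketch_add_ps (sketch_v4sf a, sketch_v4sf b)
{
  return a + b;
}

/* Scalar add in the style of the *_ss intrinsics: only element 0 is
   summed, elements 1..3 are copied unchanged from A.  */
static sketch_v4sf
sketch_add_ss (sketch_v4sf a, sketch_v4sf b)
{
  sketch_v4sf result = a;
  result[0] = a[0] + b[0];
  return result;
}
```

The packed form is a single vector operation, while the scalar form
needs the extra step of merging the computed element back into the
first operand; that extra work is the kind of cost mentioned above.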

This patch just adds the mm_malloc.h header, which will be needed by
xmmintrin.h and cleans up some noisy warnings from the previous MMX
commit.

Part 2 adds the xmmintrin.h include and associated config.gcc and
x86intrin.h changes.

Part 3 adds the associated DG test cases.


./gcc/ChangeLog:

2017-08-16  Steven Munroe  <munroesj@gcc.gnu.org>

	* config/rs6000/mm_malloc.h: New file.


[gcc/testsuite]

2017-07-21  Steven Munroe  <munroesj@gcc.gnu.org>

	* gcc.target/powerpc/mmx-packuswb-1.c [NO_WARN_X86_INTRINSICS]:
	Define. Suppress warning during tests.


Index: gcc/testsuite/gcc.target/powerpc/mmx-packuswb-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mmx-packuswb-1.c	(revision 250986)
+++ gcc/testsuite/gcc.target/powerpc/mmx-packuswb-1.c	(working copy)
@@ -3,6 +3,8 @@
 /* { dg-require-effective-target lp64 } */
 /* { dg-require-effective-target p8vector_hw } */
 
+#define NO_WARN_X86_INTRINSICS 1
+
 #ifndef CHECK_H
 #define CHECK_H "mmx-check.h"
 #endif
Index: gcc/config/rs6000/mm_malloc.h
===================================================================
--- gcc/config/rs6000/mm_malloc.h	(revision 0)
+++ gcc/config/rs6000/mm_malloc.h	(revision 0)
@@ -0,0 +1,62 @@
+/* Copyright (C) 2004-2017 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MM_MALLOC_H_INCLUDED
+#define _MM_MALLOC_H_INCLUDED
+
+#include <stdlib.h>
+
+/* We can't depend on <stdlib.h> since the prototype of posix_memalign
+   may not be visible.  */
+#ifndef __cplusplus
+extern int posix_memalign (void **, size_t, size_t);
+#else
+extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+#endif
+
+static __inline void *
+_mm_malloc (size_t size, size_t alignment)
+{
+  /* PowerPC64 ELF V2 ABI requires quadword alignment. */
+  size_t vec_align = sizeof (__vector float);
+  /* Linux GLIBC malloc alignment is at least 2 X ptr size.  */
+  size_t malloc_align = (sizeof (void *) + sizeof (void *));
+  void *ptr;
+
+  if (alignment == malloc_align && alignment == vec_align)
+    return malloc (size);
+  if (alignment < vec_align)
+    alignment = vec_align;
+  if (posix_memalign (&ptr, alignment, size) == 0)
+    return ptr;
+  else
+    return NULL;
+}
+
+static __inline void
+_mm_free (void * ptr)
+{
+  free (ptr);
+}
+
+#endif /* _MM_MALLOC_H_INCLUDED */



* Re: [PATCH, rs6000] 1/3 Add x86 SSE <xmmintrin.h> intrinsics to GCC PPC64LE target
  2017-08-16 20:55 [PATCH, rs6000] 1/3 Add x86 SSE <xmmintrin.h> intrinsics to GCC PPC64LE target Steven Munroe
@ 2017-08-17  3:25 ` Segher Boessenkool
From: Segher Boessenkool @ 2017-08-17  3:25 UTC
  To: Steven Munroe; +Cc: gcc-patches, David Edelsohn

Hi!

On Wed, Aug 16, 2017 at 02:11:31PM -0500, Steven Munroe wrote:
> This is the third major contribution of X86 intrinsic equivalent
> headers for PPC64LE.

> This patch just adds the mm_malloc.h header, which will be needed by
> xmmintrin.h and cleans up some noisy warnings from the previous MMX
> commit.

> +static __inline void *
> +_mm_malloc (size_t size, size_t alignment)
> +{
> +  /* PowerPC64 ELF V2 ABI requires quadword alignment. */

(Two spaces after full stop; there's one in the changelog as well).

Okay for trunk, thanks,


Segher
