public inbox for gcc-patches@gcc.gnu.org
* [PATCH] [libgomp] Add a stream communication framework to GOMP
@ 2008-04-25 11:24 Antoniu Pop
  2008-04-25 11:27 ` Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Antoniu Pop @ 2008-04-25 11:24 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2746 bytes --]

Hi,

This patch extends libgomp with stream communication primitives and
adds the corresponding builtins. More information on the stream extension
and the streamization pass will be presented at the GCC Summit '08
(http://www.gccsummit.org/2008/view_abstract.php?content_key=19).
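
For readers skimming the thread, the sliding-window behavior of the stream
in stream.c can be modeled in a few lines of C.  This is a single-threaded
sketch under stated assumptions, not the code in the patch: the model_*
names and the tiny window size are made up for illustration, and where the
libgomp functions would wait, the model returns false or NULL instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes chosen to keep the example small: each window
   holds two ints, and the circular buffer holds four windows.  */
#define WINDOW (2 * sizeof (int))
#define CAPACITY (4 * WINDOW)

struct model_stream
{
  size_t read_window;   /* Start offset of the window being read.  */
  size_t read_index;    /* Offset of the next element to read.  */
  size_t write_window;  /* Start offset of the window being written.  */
  size_t write_index;   /* Offset of the next free slot.  */
  size_t size_elt;      /* Size in bytes of one element.  */
  char buffer[CAPACITY];
};

void
model_init (struct model_stream *s, size_t size_elt)
{
  assert (size_elt > 0 && size_elt <= WINDOW);
  memset (s, 0, sizeof *s);
  s->size_elt = size_elt;
}

static size_t
model_next_window (size_t window)
{
  size_t next = window + WINDOW;
  return next >= CAPACITY ? 0 : next;
}

/* Copy ELT into the stream.  Returns false where the real producer
   would wait, i.e. when the next window is still being read.  */
bool
model_push (struct model_stream *s, const void *elt)
{
  if (s->write_index - s->write_window + s->size_elt > WINDOW)
    {
      size_t next = model_next_window (s->write_window);
      if (next == s->read_window)
        return false;
      s->write_window = next;
      s->write_index = next;
    }
  memcpy (s->buffer + s->write_index, elt, s->size_elt);
  s->write_index += s->size_elt;
  return true;
}

/* Returns the oldest unread element, or NULL where the real consumer
   would wait, i.e. while reader and writer share a window.  */
const void *
model_head (const struct model_stream *s)
{
  if (s->read_window == s->write_window)
    return NULL;
  return s->buffer + s->read_index;
}

/* Release the element returned by model_head.  */
void
model_pop (struct model_stream *s)
{
  if (s->read_index - s->read_window + 2 * s->size_elt > WINDOW)
    {
      s->read_window = model_next_window (s->read_window);
      s->read_index = s->read_window;
    }
  else
    s->read_index += s->size_elt;
}
```

With int elements, the first two pushes fill the first window and
model_head still returns NULL; only the third push slides the write window
and makes the first two elements visible to the reader.  This is also why
gomp_stream_set_eos in the patch slides the write window when something
has been written.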

This patch is needed for the automatic loop streamization pass
(subsequent patches).

Trunk plus patch have been bootstrapped and tested on amd64-linux.

Antoniu

ChangeLog:

2008-04-22  Antoniu Pop  <antoniu.pop@gmail.com>
            Sebastian Pop  <sebastian.pop@amd.com>

libgomp/
	* stream.c: New.

	* libgomp.h (gomp_stream, gomp_stream_create,
	gomp_stream_push, gomp_stream_head, gomp_stream_pop,
	gomp_stream_eos_p, gomp_stream_set_eos, gomp_stream_destroy,
	gomp_stream_align_push, gomp_stream_align_pop): Declared.

	* libgomp_g.h (GOMP_stream_create, GOMP_stream_push,
	GOMP_stream_head, GOMP_stream_pop, GOMP_stream_eos_p,
	GOMP_stream_set_eos, GOMP_stream_destroy, GOMP_stream_align_push,
	GOMP_stream_align_pop): Declared.

	* Makefile.am: Added stream.c to libgomp_la_SOURCES.

	* Makefile.in: Regenerate.

gcc/
	* builtin-types.def (BT_FN_BOOL_PTR, BT_FN_PTR_SIZE_UINT,
	BT_FN_VOID_PTR_PTR_INT): New types.

	* fortran/types.def (BT_FN_BOOL_PTR, BT_FN_PTR_SIZE_UINT,
	BT_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_INT, BT_FN_VOID_PTR_PTR_INT):
	New types.

	* omp-builtins.def (BUILT_IN_GOMP_STREAM_CREATE,
	BUILT_IN_GOMP_STREAM_PUSH, BUILT_IN_GOMP_STREAM_HEAD,
	BUILT_IN_GOMP_STREAM_POP, BUILT_IN_GOMP_STREAM_EOS_P,
	BUILT_IN_GOMP_STREAM_SET_EOS, BUILT_IN_GOMP_STREAM_DESTROY,
	BUILT_IN_GOMP_STREAM_ALIGN_PUSH, BUILT_IN_GOMP_STREAM_ALIGN_POP):
	New builtins.


---------- Forwarded message ----------
From:  <apop@gcc12.fsffrance.org>
Date: Thu, Apr 24, 2008 at 6:12 PM
Subject: [regtest] Results for 2008_04_23_17_03_25_gomp_streams.diff
on x86_64-unknown-linux-gnu
To: antoniu.pop@gmail.com


Checker: (2008_04_24_16_12_40): (cat /home/apop/results/testing/patched/report
 there are no regressions with your patch.
 Checker: (2008_04_24_16_12_40): tac)
 Checker: (2008_04_24_16_12_40): FAILs with patched version:
 Checker: (2008_04_24_16_12_40): (cat /home/apop/results/testing/patched/failed
 gcc.sum gcc.dg/vect/vect-vfa-slp.c
 libstdc++.sum 22_locale/time_get/get_date/wchar_t/4.cc
 Checker: (2008_04_24_16_12_40): tac)
 Checker: (2008_04_24_16_12_40): FAILs with pristine version:
 Checker: (2008_04_24_16_12_40): (cat /home/apop/results//trunk/134625/failed
 gcc.sum gcc.dg/vect/vect-vfa-slp.c
 libstdc++.sum 22_locale/time_get/get_date/wchar_t/4.cc
 Checker: (2008_04_24_16_12_40): tac)
 Checker: (2008_04_24_16_12_40): The files used for the validation of
your patch are stored in
/home/apop/results/patched/2008_04_24_16_12_40 on the tester machine.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 2008_04_23_17_03_25_gomp_streams.diff --]
[-- Type: text/x-diff; name=2008_04_23_17_03_25_gomp_streams.diff, Size: 20015 bytes --]

email:antoniu.pop@gmail.com
branch:trunk
revision:HEAD
configure:
make:
check:

Index: libgomp/Makefile.in
===================================================================
--- libgomp/Makefile.in	(revision 134583)
+++ libgomp/Makefile.in	(working copy)
@@ -85,7 +85,7 @@ libgomp_la_LIBADD =
 am_libgomp_la_OBJECTS = alloc.lo barrier.lo critical.lo env.lo \
 	error.lo iter.lo loop.lo ordered.lo parallel.lo sections.lo \
 	single.lo team.lo work.lo lock.lo mutex.lo proc.lo sem.lo \
-	bar.lo time.lo fortran.lo affinity.lo
+	bar.lo time.lo fortran.lo affinity.lo stream.lo
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 DEFAULT_INCLUDES = -I. -I$(srcdir) -I.
 depcomp = $(SHELL) $(top_srcdir)/../depcomp
@@ -219,12 +219,9 @@ USE_FORTRAN_TRUE = @USE_FORTRAN_TRUE@
 VERSION = @VERSION@
 XCFLAGS = @XCFLAGS@
 XLDFLAGS = @XLDFLAGS@
-ac_ct_AR = @ac_ct_AR@
 ac_ct_CC = @ac_ct_CC@
 ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
 ac_ct_FC = @ac_ct_FC@
-ac_ct_RANLIB = @ac_ct_RANLIB@
-ac_ct_STRIP = @ac_ct_STRIP@
 am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
 am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
 am__include = @am__include@
@@ -240,6 +237,9 @@ build_os = @build_os@
 build_vendor = @build_vendor@
 config_path = @config_path@
 datadir = @datadir@
+datarootdir = @datarootdir@
+docdir = @docdir@
+dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
@@ -248,6 +248,7 @@ host_alias = @host_alias@
 host_cpu = @host_cpu@
 host_os = @host_os@
 host_vendor = @host_vendor@
+htmldir = @htmldir@
 includedir = @includedir@
 infodir = @infodir@
 install_sh = @install_sh@
@@ -255,14 +256,17 @@ libdir = @libdir@
 libexecdir = @libexecdir@
 libtool_VERSION = @libtool_VERSION@
 link_gomp = @link_gomp@
+localedir = @localedir@
 localstatedir = @localstatedir@
 lt_ECHO = @lt_ECHO@
 mandir = @mandir@
 mkdir_p = @mkdir_p@
 multi_basedir = @multi_basedir@
 oldincludedir = @oldincludedir@
+pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
+psdir = @psdir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 sysconfdir = @sysconfdir@
@@ -290,7 +294,8 @@ libgomp_version_info = -version-info $(l
 libgomp_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script)
 libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
 	loop.c ordered.c parallel.c sections.c single.c team.c work.c \
-	lock.c mutex.c proc.c sem.c bar.c time.c fortran.c affinity.c
+	lock.c mutex.c proc.c sem.c bar.c time.c fortran.c affinity.c \
+	stream.c
 
 nodist_noinst_HEADERS = libgomp_f.h
 nodist_libsubinclude_HEADERS = omp.h
@@ -435,6 +440,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stream.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/team.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/time.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/work.Plo@am__quote@
@@ -543,10 +549,13 @@ dist-info: $(INFO_DEPS)
 	    $(srcdir)/*) base=`echo "$$base" | sed "s|^$$srcdirstrip/||"`;; \
 	  esac; \
 	  if test -f $$base; then d=.; else d=$(srcdir); fi; \
-	  for file in $$d/$$base*; do \
-	    relfile=`expr "$$file" : "$$d/\(.*\)"`; \
-	    test -f $(distdir)/$$relfile || \
-	      cp -p $$file $(distdir)/$$relfile; \
+	  base_i=`echo "$$base" | sed 's|\.info$$||;s|$$|.i|'`; \
+	  for file in $$d/$$base $$d/$$base-[0-9] $$d/$$base-[0-9][0-9] $$d/$$base_i[0-9] $$d/$$base_i[0-9][0-9]; do \
+	    if test -f $$file; then \
+	      relfile=`expr "$$file" : "$$d/\(.*\)"`; \
+	      test -f $(distdir)/$$relfile || \
+		cp -p $$file $(distdir)/$$relfile; \
+	    else :; fi; \
 	  done; \
 	done
 
Index: libgomp/libgomp_g.h
===================================================================
--- libgomp/libgomp_g.h	(revision 134583)
+++ libgomp/libgomp_g.h	(working copy)
@@ -108,4 +108,15 @@ extern bool GOMP_single_start (void);
 extern void *GOMP_single_copy_start (void);
 extern void GOMP_single_copy_end (void *);
 
+/* stream.c */
+extern void *GOMP_stream_create (size_t, unsigned);
+extern void GOMP_stream_push (void *, void *);
+extern void *GOMP_stream_head (void *);
+extern void GOMP_stream_pop (void *);
+extern bool GOMP_stream_eos_p (void *);
+extern void GOMP_stream_set_eos (void *);
+extern void GOMP_stream_destroy (void *);
+extern void GOMP_stream_align_push (void *, void *, int);
+extern void GOMP_stream_align_pop (void *, int);
+
 #endif /* LIBGOMP_G_H */
Index: libgomp/stream.c
===================================================================
--- libgomp/stream.c	(revision 0)
+++ libgomp/stream.c	(revision 0)
@@ -0,0 +1,276 @@
+/* Copyright (C) 2008 Free Software Foundation, Inc.
+   Contributed by Antoniu Pop <antoniu.pop@gmail.com> 
+   and Sebastian Pop <sebastian.pop@amd.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License 
+   along with libgomp; see the file COPYING.LIB.  If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License.  This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License.  */
+
+/* This file handles streams.  */
+
+#include "libgomp.h"
+#include <stdlib.h>
+#include <string.h>
+#include <sched.h>
+
+/* Set to the L1 cache line size, in bytes.  */
+#define SIZE_LOCAL_BUFFER 64
+
+/* Returns a new stream whose circular buffer holds COUNT windows of
+   SIZE_LOCAL_BUFFER bytes.  Each element is of size SIZE bytes.  Returns
+   NULL when the allocation fails or when COUNT is less than 2.  */
+
+gomp_stream
+gomp_stream_create (size_t size, unsigned count)
+{
+  gomp_stream s;
+
+  /* There should be enough room for two sliding windows.  */
+  if (count < 2)
+    return NULL;
+
+  s = (gomp_stream) gomp_malloc (sizeof (struct gomp_stream));
+
+  if (!s)
+    return NULL;
+
+  s->eos_p = false;
+  s->read_buffer_index = 0;
+  s->write_buffer_index = 0;
+  s->write_index = 0;
+  s->read_index = 0;
+  s->size_elt = size;
+  s->size_local_buffer = SIZE_LOCAL_BUFFER;
+  s->capacity = count * s->size_local_buffer;
+  s->buffer = (char *) gomp_malloc (s->capacity);
+
+  if (!s->buffer)
+    {
+      free (s);
+      return NULL;
+    }
+
+  return s;
+}
+
+static inline unsigned
+next_window (gomp_stream s, unsigned index)
+{
+  unsigned next = index + s->size_local_buffer;
+  return ((next >= s->capacity) ? 0 : next);
+}
+
+static inline void 
+slide_read_window (gomp_stream s)
+{
+  unsigned next = next_window (s, s->read_buffer_index);
+
+  s->read_buffer_index = next;
+  s->read_index = next;
+}
+
+static inline void
+slide_write_window (gomp_stream s)
+{
+  unsigned next = next_window (s, s->write_buffer_index);
+
+  while (s->read_buffer_index == next)
+    sched_yield ();
+
+  s->write_buffer_index = next;
+  s->write_index = next;
+}
+
+/* Returns the number of bytes already read in the read sliding window
+   of stream S.  */
+
+static inline unsigned
+read_bytes_in_read_window (gomp_stream s)
+{
+  return s->read_index - s->read_buffer_index;
+}
+
+/* Returns the number of bytes already written in the write sliding
+   window of stream S.  */
+
+static inline unsigned
+written_bytes_in_write_window (gomp_stream s)
+{
+  return s->write_index - s->write_buffer_index;
+}
+
+/* Push element ELT to stream S.  */
+
+void
+gomp_stream_push (gomp_stream s, char *elt)
+{
+  if (written_bytes_in_write_window (s) + s->size_elt > s->size_local_buffer)
+    slide_write_window (s);
+
+  memcpy (s->buffer + s->write_index, elt, s->size_elt);
+  s->write_index += s->size_elt;
+}
+
+/* Release from stream S the next element.  */
+
+void
+gomp_stream_pop (gomp_stream s)
+{
+  if (read_bytes_in_read_window (s) + 2 * s->size_elt > s->size_local_buffer)
+    slide_read_window (s);
+  else
+    s->read_index += s->size_elt;
+}
+
+/* Wait until the producer has slid the write window in stream S.  */
+
+static inline void
+wait_used_space (gomp_stream s)
+{
+  while (s->read_buffer_index == s->write_buffer_index)
+    sched_yield ();
+}
+
+/* Returns the first element of the stream S.  Don't remove the
+   element: for that, a call to gomp_stream_pop is needed.  */
+
+char *
+gomp_stream_head (gomp_stream s)
+{
+  wait_used_space (s);
+  return s->buffer + s->read_index;
+}
+
+/* Returns true when there are no more elements to be read from the
+   stream S.  */
+
+bool
+gomp_stream_eos_p (gomp_stream s)
+{
+  return (s->eos_p && (s->read_index == s->write_index));
+}
+
+/* Producer can set End Of Stream to stream S.  The producer has to
+   slide the write window if it wrote something.  */
+
+void
+gomp_stream_set_eos (gomp_stream s)
+{
+  if (written_bytes_in_write_window (s) > 0)
+    slide_write_window (s);
+
+  s->eos_p = true;
+}
+
+/* Free stream S.  */
+
+void
+gomp_stream_destroy (gomp_stream s)
+{
+  /* No need to synchronize here: the consumer detects when eos is
+     set and, based on that, decides to destroy the stream.  */
+
+  free (s->buffer);
+  free (s);
+}
+
+/* Align the producer and consumer accesses by pushing in the stream
+   COUNT successive elements starting at address START.  */
+
+void
+gomp_stream_align_push (gomp_stream s, char *start, int count)
+{
+  int i;
+
+  for (i = 0; i < count; ++i)
+    {
+      gomp_stream_push (s, start);
+      start += s->size_elt;
+    }
+}
+
+/* Align the producer and consumer accesses by removing from the
+   stream COUNT elements.  */
+
+void
+gomp_stream_align_pop (gomp_stream s, int count)
+{
+  int i;
+
+  for (i = 0; i < count; ++i)
+    gomp_stream_pop (s);
+}
+
+void *
+GOMP_stream_create (size_t size, unsigned count)
+{
+  return gomp_stream_create (size, count);
+}
+
+void
+GOMP_stream_push (void *s, void *elt)
+{
+  gomp_stream_push ((gomp_stream) s, (char *) elt);
+}
+
+void *
+GOMP_stream_head (void *s)
+{
+  return gomp_stream_head ((gomp_stream) s);
+}
+
+void
+GOMP_stream_pop (void *s)
+{
+  gomp_stream_pop ((gomp_stream) s);
+}
+
+bool
+GOMP_stream_eos_p (void *s)
+{
+  return gomp_stream_eos_p ((gomp_stream) s);
+}
+
+void
+GOMP_stream_set_eos (void *s)
+{
+  gomp_stream_set_eos ((gomp_stream) s);
+}
+
+void
+GOMP_stream_destroy (void *s)
+{
+  gomp_stream_destroy ((gomp_stream) s);
+}
+
+void
+GOMP_stream_align_push (void *s, void *start, int offset)
+{
+  gomp_stream_align_push ((gomp_stream) s, (char *) start, offset);
+}
+
+void
+GOMP_stream_align_pop (void *s, int offset)
+{
+  gomp_stream_align_pop ((gomp_stream) s, offset);
+}
Index: libgomp/libgomp.h
===================================================================
--- libgomp/libgomp.h	(revision 134583)
+++ libgomp/libgomp.h	(working copy)
@@ -218,6 +218,40 @@ struct gomp_thread
   gomp_sem_t release;
 };
 
+/* This structure represents a stream between tasks.  */
+
+typedef struct gomp_stream
+{
+  /* Offset in bytes of the sliding reading window.  Read window is of
+     size SIZE_LOCAL_BUFFER bytes.  */
+  unsigned read_buffer_index;
+
+  /* Offset in bytes of the first used element in the stream.  */
+  unsigned read_index;
+
+  /* Offset in bytes of the sliding writing window.  Writing window is
+     of size SIZE_LOCAL_BUFFER bytes.  */
+  unsigned write_buffer_index;
+
+  /* Offset in bytes of the first empty element in the stream.  */
+  unsigned write_index;
+
+  /* Size in bytes of sub-buffers for unsynchronized reads and writes.  */
+  unsigned size_local_buffer;
+
+  /* End of stream: true when producer has finished inserting elements.  */
+  bool eos_p;
+
+  /* Size in bytes of an element in the stream.  */
+  size_t size_elt;
+
+  /* Number of bytes in the circular buffer.  */
+  unsigned capacity;
+
+  /* Circular buffer.  */
+  char *buffer;
+} *gomp_stream;
+
 /* ... and here is that TLS data.  */
 
 #ifdef HAVE_TLS
@@ -304,6 +338,18 @@ extern unsigned gomp_resolve_num_threads
 extern void gomp_init_num_threads (void);
 extern unsigned gomp_dynamic_max_threads (void);
 
+/* stream.c */
+
+extern gomp_stream gomp_stream_create (size_t, unsigned);
+extern void gomp_stream_push (gomp_stream, char *);
+extern char *gomp_stream_head (gomp_stream);
+extern void gomp_stream_pop (gomp_stream);
+extern bool gomp_stream_eos_p (gomp_stream);
+extern void gomp_stream_set_eos (gomp_stream);
+extern void gomp_stream_destroy (gomp_stream);
+extern void gomp_stream_align_push (gomp_stream, char *, int);
+extern void gomp_stream_align_pop (gomp_stream, int);
+
 /* team.c */
 
 extern void gomp_team_start (void (*) (void *), void *, unsigned,
Index: libgomp/testsuite/Makefile.in
===================================================================
--- libgomp/testsuite/Makefile.in	(revision 134583)
+++ libgomp/testsuite/Makefile.in	(working copy)
@@ -138,12 +138,9 @@ USE_FORTRAN_TRUE = @USE_FORTRAN_TRUE@
 VERSION = @VERSION@
 XCFLAGS = @XCFLAGS@
 XLDFLAGS = @XLDFLAGS@
-ac_ct_AR = @ac_ct_AR@
 ac_ct_CC = @ac_ct_CC@
 ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
 ac_ct_FC = @ac_ct_FC@
-ac_ct_RANLIB = @ac_ct_RANLIB@
-ac_ct_STRIP = @ac_ct_STRIP@
 am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
 am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
 am__include = @am__include@
@@ -159,6 +156,9 @@ build_os = @build_os@
 build_vendor = @build_vendor@
 config_path = @config_path@
 datadir = @datadir@
+datarootdir = @datarootdir@
+docdir = @docdir@
+dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
@@ -167,6 +167,7 @@ host_alias = @host_alias@
 host_cpu = @host_cpu@
 host_os = @host_os@
 host_vendor = @host_vendor@
+htmldir = @htmldir@
 includedir = @includedir@
 infodir = @infodir@
 install_sh = @install_sh@
@@ -174,14 +175,17 @@ libdir = @libdir@
 libexecdir = @libexecdir@
 libtool_VERSION = @libtool_VERSION@
 link_gomp = @link_gomp@
+localedir = @localedir@
 localstatedir = @localstatedir@
 lt_ECHO = @lt_ECHO@
 mandir = @mandir@
 mkdir_p = @mkdir_p@
 multi_basedir = @multi_basedir@
 oldincludedir = @oldincludedir@
+pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
+psdir = @psdir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 sysconfdir = @sysconfdir@
Index: libgomp/Makefile.am
===================================================================
--- libgomp/Makefile.am	(revision 134583)
+++ libgomp/Makefile.am	(working copy)
@@ -31,7 +31,8 @@ libgomp_la_LDFLAGS = $(libgomp_version_i
 
 libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
 	loop.c ordered.c parallel.c sections.c single.c team.c work.c \
-	lock.c mutex.c proc.c sem.c bar.c time.c fortran.c affinity.c
+	lock.c mutex.c proc.c sem.c bar.c time.c fortran.c affinity.c \
+	stream.c
 
 nodist_noinst_HEADERS = libgomp_f.h
 nodist_libsubinclude_HEADERS = omp.h
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def	(revision 134583)
+++ gcc/builtin-types.def	(working copy)
@@ -215,6 +215,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, 
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_PTR, BT_BOOL, BT_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -289,6 +290,7 @@ DEF_FUNCTION_TYPE_2 (BT_FN_INT_CONST_STR
 		     BT_INT, BT_CONST_STRING, BT_VALIST_ARG)
 DEF_FUNCTION_TYPE_2 (BT_FN_PTR_SIZE_SIZE,
 		     BT_PTR, BT_SIZE, BT_SIZE)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_SIZE_UINT, BT_PTR, BT_SIZE, BT_UINT)
 DEF_FUNCTION_TYPE_2 (BT_FN_PTR_PTR_SIZE,
 		     BT_PTR, BT_PTR, BT_SIZE)
 DEF_FUNCTION_TYPE_2 (BT_FN_COMPLEX_FLOAT_COMPLEX_FLOAT_COMPLEX_FLOAT,
@@ -374,6 +376,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_OMPFN_PT
 		     BT_PTR, BT_UINT)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_INT_SIZE, BT_PTR,
 		     BT_CONST_PTR, BT_INT, BT_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_PTR_INT, BT_VOID, BT_PTR, BT_PTR, BT_INT)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 		     BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
Index: gcc/fortran/types.def
===================================================================
--- gcc/fortran/types.def	(revision 134583)
+++ gcc/fortran/types.def	(working copy)
@@ -82,6 +82,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, 
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_PTR, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_PTR, BT_BOOL, BT_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -92,6 +93,9 @@ DEF_FUNCTION_TYPE_2 (BT_FN_I2_VPTR_I2, B
 DEF_FUNCTION_TYPE_2 (BT_FN_I4_VPTR_I4, BT_I4, BT_VOLATILE_PTR, BT_I4)
 DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_I8, BT_I8, BT_VOLATILE_PTR, BT_I8)
 DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_I16, BT_I16, BT_VOLATILE_PTR, BT_I16)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_SIZE_UINT, BT_PTR, BT_INT, BT_UINT)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_PTR, BT_VOID, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_INT, BT_VOID, BT_PTR, BT_INT)
 
 DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_VPTR_I1_I1, BT_BOOL, BT_VOLATILE_PTR,
                      BT_I1, BT_I1)
@@ -111,6 +115,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_I16_VPTR_I16_
 		     BT_I16, BT_I16)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_OMPFN_PTR_UINT, BT_VOID, BT_PTR_FN_VOID_PTR,
                      BT_PTR, BT_UINT)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_PTR_INT, BT_VOID, BT_PTR, BT_PTR, BT_INT)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_VOID_OMPFN_PTR_UINT_UINT,
                      BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT)
Index: gcc/omp-builtins.def
===================================================================
--- gcc/omp-builtins.def	(revision 134583)
+++ gcc/omp-builtins.def	(working copy)
@@ -149,3 +149,22 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_C
 		  BT_FN_PTR, ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_END, "GOMP_single_copy_end",
 		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_CREATE, "GOMP_stream_create",
+		  BT_FN_PTR_SIZE_UINT, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_PUSH, "GOMP_stream_push",
+		  BT_FN_VOID_PTR_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_HEAD, "GOMP_stream_head",
+		  BT_FN_PTR_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_POP, "GOMP_stream_pop",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_EOS_P, "GOMP_stream_eos_p",
+		  BT_FN_BOOL_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_SET_EOS, "GOMP_stream_set_eos",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_DESTROY, "GOMP_stream_destroy",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_ALIGN_PUSH, "GOMP_stream_align_push",
+		  BT_FN_VOID_PTR_PTR_INT, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_ALIGN_POP, "GOMP_stream_align_pop",
+		  BT_FN_VOID_PTR_INT, ATTR_NOTHROW_LIST)


* Re: [PATCH] [libgomp] Add a stream communication framework to GOMP
  2008-04-25 11:24 [PATCH] [libgomp] Add a stream communication framework to GOMP Antoniu Pop
@ 2008-04-25 11:27 ` Jakub Jelinek
  2008-04-30 19:23   ` Sebastian Pop
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2008-04-25 11:27 UTC (permalink / raw)
  To: Antoniu Pop; +Cc: gcc-patches

On Fri, Apr 25, 2008 at 10:06:56AM +0200, Antoniu Pop wrote:
> This patch extends libgomp with stream communication primitives and
> adds the respective builtins. More information on the stream extension
> and the streamization pass will be published in the GCC Summit '08
> (http://www.gccsummit.org/2008/view_abstract.php?content_key=19).
> 
> This patch is needed for the automatic loop streamization pass
> (subsequent patches).

+  s = (gomp_stream) gomp_malloc (sizeof (struct gomp_stream));
+
+  if (!s)
+    return NULL;

gomp_malloc is guaranteed to return non-NULL (similarly to e.g. xmalloc in
libiberty).  This is in several places.

+static inline void
+slide_write_window (gomp_stream s)
+{
+  unsigned next = next_window (s, s->write_buffer_index);
+
+  while (s->read_buffer_index == next)
+    sched_yield ();

Unconditional busy waiting and sched_yield is a bad idea.  What if the
number of threads in the team is bigger than the number of available CPUs
(which can be limited with affinity, etc.)?  Or even once we support
multiple concurrent non-nested parallel regions from different
pthread_create created threads?  IMHO you want a proper synchronization
primitive, with optional busy waiting.  See gomp-3_0-branch, where the user
can determine how long, if at all, threads should busy-wait through the
GOMP_SPINCOUNT and OMP_WAIT_POLICY env vars; throttling is used when the
number of active threads is bigger than the number of available CPUs, and
after that it falls back to futexes (on Linux; on other OSes, whatever is
supported).

sched_yield on Linux can result in terrible latency and performance,
because the thread is put at the end of the run queue.
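
The spin-then-block policy described above can be sketched with plain POSIX
threads and C11 atomics.  This is only an illustration under stated
assumptions: wait_flag and its functions are hypothetical names, the spin
count is passed explicitly rather than read from GOMP_SPINCOUNT, and a
condition variable stands in for the futex-based wait mentioned for Linux.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper, not part of libgomp: a flag that a waiter polls
   a bounded number of times before blocking.  */
struct wait_flag
{
  pthread_mutex_t lock;
  pthread_cond_t cond;
  atomic_bool ready;
};

void
wait_flag_init (struct wait_flag *f)
{
  int err = pthread_mutex_init (&f->lock, NULL);
  assert (err == 0);
  err = pthread_cond_init (&f->cond, NULL);
  assert (err == 0);
  (void) err;
  atomic_init (&f->ready, false);
}

/* Set the flag and wake every blocked waiter.  The store happens under
   the lock so that a waiter cannot miss the wakeup.  */
void
wait_flag_signal (struct wait_flag *f)
{
  pthread_mutex_lock (&f->lock);
  atomic_store (&f->ready, true);
  pthread_cond_broadcast (&f->cond);
  pthread_mutex_unlock (&f->lock);
}

/* Poll the flag at most SPIN_COUNT times, then block until it is set.
   Returns the number of polls performed in the spinning phase.  */
unsigned
wait_flag_wait (struct wait_flag *f, unsigned spin_count)
{
  unsigned polls = 0;

  while (polls < spin_count)
    {
      polls++;
      if (atomic_load (&f->ready))
        return polls;
    }

  pthread_mutex_lock (&f->lock);
  while (!atomic_load (&f->ready))
    pthread_cond_wait (&f->cond, &f->lock);
  pthread_mutex_unlock (&f->lock);
  return polls;
}
```

A spin count of 0 gives a purely blocking wait, while a large count
approximates busy waiting; unlike the sched_yield loop, a waiter that has
exhausted its spins sleeps and does not compete for a CPU.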

+void *
+GOMP_stream_create (size_t size, unsigned count)
+{
+  return gomp_stream_create (size, count);
+}

What's the point of these wrappers?  If you don't call the
gomp_* functions from elsewhere within libgomp, it is IMHO better
to just name the real functions GOMP_* and avoid the tail calls.

+typedef struct gomp_stream
+{
+  /* Offset in bytes of the sliding reading window.  Read window is of
+     size LOCAL_BUFFER_SIZE bytes.  */
+  unsigned read_buffer_index;
+
+  /* Offset in bytes of the first used element in the stream.  */
+  unsigned read_index;
...

Why don't you use size_t for indexes and object sizes?
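
With size_t, the capacity computation uses the same type that the allocator
takes, and the product can be checked for overflow before allocating.  A
minimal sketch, where stream_capacity is a hypothetical helper and not a
function from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute COUNT * WINDOW into *CAPACITY.  Returns false when fewer than
   two windows are requested, when WINDOW is zero, or when the product
   would not fit in a size_t.  */
bool
stream_capacity (size_t count, size_t window, size_t *capacity)
{
  assert (capacity != NULL);

  if (count < 2 || window == 0 || count > SIZE_MAX / window)
    return false;

  *capacity = count * window;
  return true;
}
```

For example, four windows of 64 bytes give a capacity of 256 bytes, while a
request for SIZE_MAX windows is rejected without wrapping around.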

Also, I don't see any changes to libgomp/libgomp.map, which means
the new symbols aren't exported.  How can they be called then
(unless you use libgomp.a)?

	Jakub


* Re: [PATCH] [libgomp] Add a stream communication framework to GOMP
  2008-04-25 11:27 ` Jakub Jelinek
@ 2008-04-30 19:23   ` Sebastian Pop
  2008-06-02 23:06     ` Antoniu Pop
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Pop @ 2008-04-30 19:23 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Antoniu Pop, gcc-patches

On Fri, Apr 25, 2008 at 3:39 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>  Unconditional busy waiting and sched_yield is a bad idea.  IMHO you
>  want a proper synchronization primitive, with optional busy
>  waiting.  See gomp-3_0-branch, where user can determine how long if
>  at all threads should be busy waiting through GOMP_SPINCOUNT and
>  OMP_WAIT_POLICY env vars, throttling is used for number of active
>  threads bigger than number of available CPUs and after that falls
>  back to futexes (on Linux, on other OSes whatever is supported).

If I understand correctly, we should use do_wait instead of
sched_yield, so we will have to work in the gomp-3_0-branch.

We started working on gomp-2.5, where we were quite limited by the
tasking capabilities: we used parallel sections instead of tasks.  The
goal, though, was to use the more flexible task support of gomp-3.0 so
that we can create tasks dynamically.

Sebastian


* Re: [PATCH] [libgomp] Add a stream communication framework to GOMP
  2008-04-30 19:23   ` Sebastian Pop
@ 2008-06-02 23:06     ` Antoniu Pop
  2008-06-04  5:18       ` Antoniu Pop
  2008-06-04  7:00       ` Jakub Jelinek
  0 siblings, 2 replies; 8+ messages in thread
From: Antoniu Pop @ 2008-06-02 23:06 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2323 bytes --]

Hi,

This is a revised version of the patch, targeting the gomp-3_0-branch.
The gomp-3_0-branch plus patch have been bootstrapped and tested on amd64-linux.

Antoniu

ChangeLog:

2008-04-22  Antoniu Pop  <antoniu.pop@gmail.com>
           Sebastian Pop  <sebastian.pop@amd.com>

libgomp/
       * libgomp.h (gomp_stream): Declared.

       * stream.c: New.
       * libgomp_g.h (GOMP_stream_create, GOMP_stream_push,
       GOMP_stream_head, GOMP_stream_pop, GOMP_stream_eos_p,
       GOMP_stream_set_eos, GOMP_stream_destroy, GOMP_stream_align_push,
       GOMP_stream_align_pop): Declared.
       * libgomp.map: Export GOMP_*.

       * Makefile.am: Added stream.c to libgomp_la_SOURCES.
       * Makefile.in: Regenerate.

gcc/
       * builtin-types.def (BT_FN_BOOL_PTR, BT_FN_VOID_PTR_PTR_SIZE):
       New types.
       * fortran/types.def (BT_FN_BOOL_PTR, BT_FN_PTR_SIZE_SIZE,
       BT_FN_VOID_PTR_SIZE, BT_FN_VOID_PTR_PTR_SIZE): New types.

       * omp-builtins.def (BUILT_IN_GOMP_STREAM_CREATE,
       BUILT_IN_GOMP_STREAM_PUSH, BUILT_IN_GOMP_STREAM_HEAD,
       BUILT_IN_GOMP_STREAM_POP, BUILT_IN_GOMP_STREAM_EOS_P,
       BUILT_IN_GOMP_STREAM_SET_EOS, BUILT_IN_GOMP_STREAM_DESTROY,
       BUILT_IN_GOMP_STREAM_ALIGN_PUSH, BUILT_IN_GOMP_STREAM_ALIGN_POP):
       New builtins.


On Wed, Apr 30, 2008 at 6:37 PM, Sebastian Pop <sebpop@gmail.com> wrote:
> On Fri, Apr 25, 2008 at 3:39 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>  Unconditional busy waiting and sched_yield is a bad idea.  IMHO you
>>  want a proper synchronization primitive, with optional busy
>>  waiting.  See gomp-3_0-branch, where user can determine how long if
>>  at all threads should be busy waiting through GOMP_SPINCOUNT and
>>  OMP_WAIT_POLICY env vars, throttling is used for number of active
>>  threads bigger than number of available CPUs and after that falls
>>  back to futexes (on Linux, on other OSes whatever is supported).
>
> If I understand correctly, we should use do_wait instead of
> sched_yield, so we will have to work in the gomp-3_0-branch.
>
> We started working on the gomp-2.5 and we were quite limited by the
> tasking capabilities: we used parallel sections instead of tasks; but
> the goal was to use the more flexible task support of gomp-3.0 for being
> able to dynamically create tasks.
>
> Sebastian
>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 2008_06_02_16_18_13_gstreams.diff --]
[-- Type: text/x-diff; name=2008_06_02_16_18_13_gstreams.diff, Size: 19579 bytes --]

email:antoniu.pop@gmail.com
branch:gomp-3_0-branch
revision:HEAD
configure:
make:
check:

Index: libgomp/Makefile.in
===================================================================
--- libgomp/Makefile.in	(revision 136152)
+++ libgomp/Makefile.in	(working copy)
@@ -86,7 +86,7 @@
 	error.lo iter.lo iter_ull.lo loop.lo loop_ull.lo ordered.lo \
 	parallel.lo sections.lo single.lo task.lo team.lo work.lo \
 	lock.lo mutex.lo proc.lo sem.lo bar.lo ptrlock.lo time.lo \
-	fortran.lo affinity.lo
+	fortran.lo affinity.lo stream.lo
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 DEFAULT_INCLUDES = -I. -I$(srcdir) -I.
 depcomp = $(SHELL) $(top_srcdir)/../depcomp
@@ -226,12 +226,9 @@
 VERSION = @VERSION@
 XCFLAGS = @XCFLAGS@
 XLDFLAGS = @XLDFLAGS@
-ac_ct_AR = @ac_ct_AR@
 ac_ct_CC = @ac_ct_CC@
 ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
 ac_ct_FC = @ac_ct_FC@
-ac_ct_RANLIB = @ac_ct_RANLIB@
-ac_ct_STRIP = @ac_ct_STRIP@
 am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
 am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
 am__include = @am__include@
@@ -247,6 +244,9 @@
 build_vendor = @build_vendor@
 config_path = @config_path@
 datadir = @datadir@
+datarootdir = @datarootdir@
+docdir = @docdir@
+dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
@@ -255,6 +255,7 @@
 host_cpu = @host_cpu@
 host_os = @host_os@
 host_vendor = @host_vendor@
+htmldir = @htmldir@
 includedir = @includedir@
 infodir = @infodir@
 install_sh = @install_sh@
@@ -262,14 +263,17 @@
 libexecdir = @libexecdir@
 libtool_VERSION = @libtool_VERSION@
 link_gomp = @link_gomp@
+localedir = @localedir@
 localstatedir = @localstatedir@
 lt_ECHO = @lt_ECHO@
 mandir = @mandir@
 mkdir_p = @mkdir_p@
 multi_basedir = @multi_basedir@
 oldincludedir = @oldincludedir@
+pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
+psdir = @psdir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 sysconfdir = @sysconfdir@
@@ -298,7 +302,7 @@
 libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
 	iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c single.c \
 	task.c team.c work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c \
-	time.c fortran.c affinity.c
+	time.c fortran.c affinity.c stream.c
 
 nodist_noinst_HEADERS = libgomp_f.h
 nodist_libsubinclude_HEADERS = omp.h
@@ -446,6 +450,7 @@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stream.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/task.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/team.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/time.Plo@am__quote@
@@ -555,10 +560,13 @@
 	    $(srcdir)/*) base=`echo "$$base" | sed "s|^$$srcdirstrip/||"`;; \
 	  esac; \
 	  if test -f $$base; then d=.; else d=$(srcdir); fi; \
-	  for file in $$d/$$base*; do \
-	    relfile=`expr "$$file" : "$$d/\(.*\)"`; \
-	    test -f $(distdir)/$$relfile || \
-	      cp -p $$file $(distdir)/$$relfile; \
+	  base_i=`echo "$$base" | sed 's|\.info$$||;s|$$|.i|'`; \
+	  for file in $$d/$$base $$d/$$base-[0-9] $$d/$$base-[0-9][0-9] $$d/$$base_i[0-9] $$d/$$base_i[0-9][0-9]; do \
+	    if test -f $$file; then \
+	      relfile=`expr "$$file" : "$$d/\(.*\)"`; \
+	      test -f $(distdir)/$$relfile || \
+		cp -p $$file $(distdir)/$$relfile; \
+	    else :; fi; \
 	  done; \
 	done
 
Index: libgomp/libgomp_g.h
===================================================================
--- libgomp/libgomp_g.h	(revision 136152)
+++ libgomp/libgomp_g.h	(working copy)
@@ -182,4 +182,15 @@
 extern void *GOMP_single_copy_start (void);
 extern void GOMP_single_copy_end (void *);
 
+/* stream.c */
+extern void *GOMP_stream_create (size_t, size_t);
+extern void GOMP_stream_push (void *, void *);
+extern void *GOMP_stream_head (void *);
+extern void GOMP_stream_pop (void *);
+extern bool GOMP_stream_eos_p (void *);
+extern void GOMP_stream_set_eos (void *);
+extern void GOMP_stream_destroy (void *);
+extern void GOMP_stream_align_push (void *, void *, size_t);
+extern void GOMP_stream_align_pop (void *, size_t);
+
 #endif /* LIBGOMP_G_H */
Index: libgomp/libgomp.map
===================================================================
--- libgomp/libgomp.map	(revision 136152)
+++ libgomp/libgomp.map	(working copy)
@@ -167,4 +167,13 @@
 	GOMP_loop_ull_runtime_start;
 	GOMP_loop_ull_static_next;
 	GOMP_loop_ull_static_start;
+	GOMP_stream_create;
+	GOMP_stream_push;
+	GOMP_stream_head;
+	GOMP_stream_pop;
+ 	GOMP_stream_eos_p;
+ 	GOMP_stream_set_eos;
+ 	GOMP_stream_destroy;
+ 	GOMP_stream_align_push;
+ 	GOMP_stream_align_pop;
 } GOMP_1.0;
Index: libgomp/stream.c
===================================================================
--- libgomp/stream.c	(revision 0)
+++ libgomp/stream.c	(revision 0)
@@ -0,0 +1,227 @@
+/* Copyright (C) 2008 Free Software Foundation, Inc.
+   Contributed by Antoniu Pop <antoniu.pop@gmail.com> 
+   and Sebastian Pop <sebastian.pop@amd.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License 
+   along with libgomp; see the file COPYING.LIB.  If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License.  This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License.  */
+
+/* This file handles streams.  */
+
+#include "libgomp.h"
+#include "wait.h"
+#include <stdlib.h>
+#include <string.h>
+
+/* Set to the L1 cache line size.  */
+#define SIZE_LOCAL_BUFFER 64
+
+/* Returns a new stream whose circular buffer holds COUNT sliding
+   windows of SIZE_LOCAL_BUFFER bytes each.  Each element is of size
+   SIZE bytes.  Returns NULL when COUNT is less than 2.  */
+
+void *
+GOMP_stream_create (size_t size, size_t count)
+{
+  gomp_stream s;
+
+  /* There should be enough room for two sliding windows.  */
+  if (count < 2)
+    return NULL;
+
+  s = (gomp_stream) gomp_malloc (sizeof (struct gomp_stream));
+
+  s->eos_p = false;
+  s->read_buffer_index = 0;
+  s->write_buffer_index = 0;
+  s->write_index = 0;
+  s->read_index = 0;
+  s->size_elt = size;
+  s->size_local_buffer = SIZE_LOCAL_BUFFER;
+  s->capacity = count * s->size_local_buffer;
+  s->buffer = (char *) gomp_malloc (s->capacity);
+
+  return s;
+}
+
+static inline size_t
+next_window (gomp_stream s, size_t index)
+{
+  size_t next = index + s->size_local_buffer;
+  return ((next >= s->capacity) ? 0 : next);
+}
+
+static inline void 
+slide_read_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->read_buffer_index);
+
+  s->read_buffer_index = next;
+  s->read_index = next;
+}
+
+static inline void
+slide_write_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->write_buffer_index);
+
+  while (s->read_buffer_index == next)
+    do_wait ((int *) &s->read_buffer_index, next);
+
+  s->write_buffer_index = next;
+  s->write_index = next;
+}
+
+/* Returns the number of bytes already read in the read sliding
+   window of stream S.  */
+
+static inline size_t
+read_bytes_in_read_window (gomp_stream s)
+{
+  return s->read_index - s->read_buffer_index;
+}
+
+/* Returns the number of bytes already written in the write sliding
+   window of stream S.  */
+
+static inline size_t
+written_bytes_in_write_window (gomp_stream s)
+{
+  return s->write_index - s->write_buffer_index;
+}
+
+/* Push element ELT to stream S.  */
+
+static inline void
+gomp_stream_push (gomp_stream s, char *elt)
+{
+  if (written_bytes_in_write_window (s) + s->size_elt > s->size_local_buffer)
+    slide_write_window (s);
+
+  memcpy (s->buffer + s->write_index, elt, s->size_elt);
+  s->write_index += s->size_elt;
+}
+
+/* Release from stream S the next element.  */
+
+static inline void
+gomp_stream_pop (gomp_stream s)
+{
+  if (read_bytes_in_read_window (s) + 2 * s->size_elt > s->size_local_buffer)
+    slide_read_window (s);
+  else
+    s->read_index += s->size_elt;
+}
+
+/* Wait until the producer has slid the write window in stream S.  */
+
+static inline void
+wait_used_space (gomp_stream s)
+{
+  while (s->read_buffer_index == s->write_buffer_index)
+    do_wait ((int *) &s->read_buffer_index, s->write_buffer_index);
+}
+
+/* Returns the first element of the stream S.  Don't remove the
+   element: for that, a call to gomp_stream_pop is needed.  */
+
+void *
+GOMP_stream_head (void *s)
+{
+  wait_used_space ((gomp_stream) s);
+  return ((gomp_stream) s)->buffer + ((gomp_stream) s)->read_index;
+}
+
+/* Returns true when there are no more elements to be read from the
+   stream S.  */
+
+bool
+GOMP_stream_eos_p (void *s)
+{
+  return (((gomp_stream) s)->eos_p && 
+	  (((gomp_stream) s)->read_index == ((gomp_stream) s)->write_index));
+}
+
+/* Producer can set End Of Stream to stream S.  The producer has to
+   slide the write window if it wrote something.  */
+
+void
+GOMP_stream_set_eos (void *s)
+{
+  if (written_bytes_in_write_window ((gomp_stream) s) > 0)
+    slide_write_window ((gomp_stream) s);
+
+  ((gomp_stream) s)->eos_p = true;
+}
+
+/* Free stream S.  */
+
+void
+GOMP_stream_destroy (void *s)
+{
+  /* No need to synchronize here: the consumer detects when eos is
+     set and, based on that, decides to destroy the stream.  */
+
+  free (((gomp_stream) s)->buffer);
+  free ((gomp_stream) s);
+}
+
+/* Align the producer and consumer accesses by pushing in the stream
+   COUNT successive elements starting at address START.  */
+
+void
+GOMP_stream_align_push (void *s, void *start, size_t count)
+{
+  size_t i;
+
+  for (i = 0; i < count; ++i)
+    {
+      gomp_stream_push ((gomp_stream) s, (char *) start);
+      start += ((gomp_stream) s)->size_elt;
+    }
+}
+
+/* Align the producer and consumer accesses by removing from the
+   stream COUNT elements.  */
+
+void
+GOMP_stream_align_pop (void *s, size_t count)
+{
+  size_t i;
+
+  for (i = 0; i < count; ++i)
+    gomp_stream_pop ((gomp_stream) s);
+}
+
+
+void
+GOMP_stream_push (void *s, void *elt)
+{
+  gomp_stream_push ((gomp_stream) s, (char *) elt);
+}
+
+void
+GOMP_stream_pop (void *s)
+{
+  gomp_stream_pop ((gomp_stream) s);
+}
Index: libgomp/libgomp.h
===================================================================
--- libgomp/libgomp.h	(revision 136152)
+++ libgomp/libgomp.h	(working copy)
@@ -357,6 +357,67 @@
   gomp_barrier_t threads_dock;
 };
 
+/* This structure represents a stream between tasks.  Special care
+   needs to be taken to ensure proper cache behaviour.  We need to
+   separate groups of fields on independent cache lines according to
+   the usage of each group.  In the case of SMPs, this is a critical
+   requirement.  */
+
+typedef struct gomp_stream
+{
+  /* Read-only group.  These fields will be initialized or redefined
+     *very* sparsely (i.e., change of strategy in the runtime).
+     Therefore, the behaviour of the cache protocol for the cache
+     lines holding the following fields will be to stay in the Shared
+     state.  */
+
+  /* Circular buffer.  */
+  char *buffer;
+
+  /* Number of bytes in the circular buffer.  */
+  size_t capacity;
+
+  /* Size in bytes of an element in the stream.  */
+  size_t size_elt;
+
+  /* Size in bytes of sub-buffers for unsynchronized reads and writes.  */
+  size_t size_local_buffer;
+
+  /* End of stream: true when producer has finished inserting elements.  */
+  bool eos_p;
+
+  /* Producer private group.  The following fields are only touched by
+     the producer (read and written).  They should be on a separate
+     cache line, which will ensure that the line stays in the Modified
+     (or Exclusive) state and avoids ping-pong.  */
+
+  /* Offset in bytes of the first empty element in the stream.  */
+  size_t write_index __attribute__((aligned (64)));
+
+  /* Consumer private group.  The following fields are only touched by
+     the consumer (read and written).  Same remark as for the producer
+     private group.  */
+
+  /* Offset in bytes of the first used element in the stream.  */
+  size_t read_index __attribute__((aligned (64)));
+
+  /* Shared/Modified groups.  As a stream implicitly synchronizes
+     tasks/threads, it is natural that some fields will be
+     read/written by both producer and consumer.  It is therefore
+     critical to separate those on individual cache lines (obviously
+     same situation as locks).  */
+
+  /* Offset in bytes of the sliding writing window.  Writing window is
+     of size SIZE_LOCAL_BUFFER bytes.  Read/Written by producer, Read
+     by consumer.  */
+  size_t write_buffer_index __attribute__((aligned (64)));
+
+  /* Offset in bytes of the sliding reading window.  Read window is of
+     size SIZE_LOCAL_BUFFER bytes.  Read/Written by consumer, Read by
+     producer.  */
+  size_t read_buffer_index __attribute__((aligned (64)));
+} *gomp_stream;
+
 /* ... and here is that TLS data.  */
 
 #ifdef HAVE_TLS
Index: libgomp/testsuite/Makefile.in
===================================================================
--- libgomp/testsuite/Makefile.in	(revision 136152)
+++ libgomp/testsuite/Makefile.in	(working copy)
@@ -144,12 +144,9 @@
 VERSION = @VERSION@
 XCFLAGS = @XCFLAGS@
 XLDFLAGS = @XLDFLAGS@
-ac_ct_AR = @ac_ct_AR@
 ac_ct_CC = @ac_ct_CC@
 ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
 ac_ct_FC = @ac_ct_FC@
-ac_ct_RANLIB = @ac_ct_RANLIB@
-ac_ct_STRIP = @ac_ct_STRIP@
 am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
 am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
 am__include = @am__include@
@@ -165,6 +162,9 @@
 build_vendor = @build_vendor@
 config_path = @config_path@
 datadir = @datadir@
+datarootdir = @datarootdir@
+docdir = @docdir@
+dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
@@ -173,6 +173,7 @@
 host_cpu = @host_cpu@
 host_os = @host_os@
 host_vendor = @host_vendor@
+htmldir = @htmldir@
 includedir = @includedir@
 infodir = @infodir@
 install_sh = @install_sh@
@@ -180,14 +181,17 @@
 libexecdir = @libexecdir@
 libtool_VERSION = @libtool_VERSION@
 link_gomp = @link_gomp@
+localedir = @localedir@
 localstatedir = @localstatedir@
 lt_ECHO = @lt_ECHO@
 mandir = @mandir@
 mkdir_p = @mkdir_p@
 multi_basedir = @multi_basedir@
 oldincludedir = @oldincludedir@
+pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
+psdir = @psdir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 sysconfdir = @sysconfdir@
Index: libgomp/Makefile.am
===================================================================
--- libgomp/Makefile.am	(revision 136152)
+++ libgomp/Makefile.am	(working copy)
@@ -32,7 +32,7 @@
 libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
 	iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c single.c \
 	task.c team.c work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c \
-	time.c fortran.c affinity.c
+	time.c fortran.c affinity.c stream.c
 
 nodist_noinst_HEADERS = libgomp_f.h
 nodist_libsubinclude_HEADERS = omp.h
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def	(revision 136152)
+++ gcc/builtin-types.def	(working copy)
@@ -216,6 +216,7 @@
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_PTR, BT_BOOL, BT_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -379,6 +380,7 @@
 		     BT_PTR, BT_UINT)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_INT_SIZE, BT_PTR,
 		     BT_CONST_PTR, BT_INT, BT_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_PTR_SIZE, BT_VOID, BT_PTR, BT_PTR, BT_INT)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 		     BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
Index: gcc/fortran/types.def
===================================================================
--- gcc/fortran/types.def	(revision 136152)
+++ gcc/fortran/types.def	(working copy)
@@ -85,6 +85,7 @@
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_PTR, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_PTR, BT_BOOL, BT_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -97,7 +98,9 @@
 DEF_FUNCTION_TYPE_2 (BT_FN_I4_VPTR_I4, BT_I4, BT_VOLATILE_PTR, BT_I4)
 DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_I8, BT_I8, BT_VOLATILE_PTR, BT_I8)
 DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_I16, BT_I16, BT_VOLATILE_PTR, BT_I16)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_SIZE_SIZE, BT_PTR, BT_INT, BT_UINT)
 DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_PTR, BT_VOID, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_SIZE, BT_VOID, BT_PTR, BT_INT)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
 
@@ -119,6 +122,7 @@
 		     BT_I16, BT_I16)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_OMPFN_PTR_UINT, BT_VOID, BT_PTR_FN_VOID_PTR,
                      BT_PTR, BT_UINT)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_PTR_SIZE, BT_VOID, BT_PTR, BT_PTR, BT_INT)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_VOID_OMPFN_PTR_UINT_UINT,
                      BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT)
Index: gcc/omp-builtins.def
===================================================================
--- gcc/omp-builtins.def	(revision 136152)
+++ gcc/omp-builtins.def	(working copy)
@@ -206,3 +206,22 @@
 		  BT_FN_PTR, ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_END, "GOMP_single_copy_end",
 		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_CREATE, "GOMP_stream_create",
+		  BT_FN_PTR_SIZE_SIZE, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_PUSH, "GOMP_stream_push",
+		  BT_FN_VOID_PTR_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_HEAD, "GOMP_stream_head",
+		  BT_FN_PTR_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_POP, "GOMP_stream_pop",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_EOS_P, "GOMP_stream_eos_p",
+		  BT_FN_BOOL_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_SET_EOS, "GOMP_stream_set_eos",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_DESTROY, "GOMP_stream_destroy",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_ALIGN_PUSH, "GOMP_stream_align_push",
+		  BT_FN_VOID_PTR_PTR_SIZE, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_ALIGN_POP, "GOMP_stream_align_pop",
+		  BT_FN_VOID_PTR_SIZE, ATTR_NOTHROW_LIST)


* Re: [PATCH] [libgomp] Add a stream communication framework to GOMP
  2008-06-02 23:06     ` Antoniu Pop
@ 2008-06-04  5:18       ` Antoniu Pop
  2008-06-04  7:00       ` Jakub Jelinek
  1 sibling, 0 replies; 8+ messages in thread
From: Antoniu Pop @ 2008-06-04  5:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek

Hi,

I have a correction to make on the previous patch. The code for the
function wait_used_space should read:

+/* Wait until the producer has slid the write window in stream S.  */
+
+static inline void
+wait_used_space (gomp_stream s)
+{
+  while (s->write_buffer_index == s->read_buffer_index)
+    do_wait ((int *) &s->write_buffer_index, s->read_buffer_index);
+}

instead of:

+/* Wait until the producer has slid the write window in stream S.  */
+
+static inline void
+wait_used_space (gomp_stream s)
+{
+  while (s->read_buffer_index == s->write_buffer_index)
+    do_wait ((int *) &s->read_buffer_index, s->write_buffer_index);
+}

The write_buffer_index and read_buffer_index were reversed. Although
the loop condition is equivalent either way, the do_wait call might not
be properly optimized, or might even be incorrect (more precisely, in
its __builtin_expect). The address where a value change allows the
consumer to continue is that of write_buffer_index, as the producer
needs to slide its write window.

Antoniu

On Tue, Jun 3, 2008 at 1:05 AM, Antoniu Pop <antoniu.pop@gmail.com> wrote:
> Hi,
>
> This is a revised version of the patch, targeting the gomp-3_0-branch.
> The gomp-3_0-branch plus patch have been bootstrapped and tested on amd64-linux.
>
> Antoniu
>
> ChangeLog:
>
> 2008-04-22  Antoniu Pop  <antoniu.pop@gmail.com>
>           Sebastian Pop  <sebastian.pop@amd.com>
>
> libgomp/
>       * libgomp.h (gomp_stream): Declared.
>
>       * stream.c: New.
>       * libgomp_g.h (GOMP_stream_create, GOMP_stream_push,
>       GOMP_stream_head, GOMP_stream_pop, GOMP_stream_eos_p,
>       GOMP_stream_set_eos, GOMP_stream_destroy, GOMP_stream_align_push,
>       GOMP_stream_align_pop): Declared.
>       * libgomp.map: Export GOMP_*.
>
>       * Makefile.am: Added stream.c to libgomp_la_SOURCES.
>       * Makefile.in: Regenerate.
>
> gcc/
>       * builtin-types.def (BT_FN_BOOL_PTR, BT_FN_VOID_PTR_PTR_SIZE):
>       New types.
>       * fortran/types.def (BT_FN_BOOL_PTR, BT_FN_PTR_SIZE_SIZE,
>       BT_FN_VOID_PTR_SIZE, BT_FN_VOID_PTR_PTR_SIZE): New types.
>
>       * omp-builtins.def (BUILT_IN_GOMP_STREAM_CREATE,
>       BUILT_IN_GOMP_STREAM_PUSH, BUILT_IN_GOMP_STREAM_HEAD,
>       BUILT_IN_GOMP_STREAM_POP, BUILT_IN_GOMP_STREAM_EOS_P,
>       BUILT_IN_GOMP_STREAM_SET_EOS, BUILT_IN_GOMP_STREAM_DESTROY,
>       BUILT_IN_GOMP_STREAM_ALIGN_PUSH, BUILT_IN_GOMP_STREAM_ALIGN_POP):
>       New builtins.
>
>
> On Wed, Apr 30, 2008 at 6:37 PM, Sebastian Pop <sebpop@gmail.com> wrote:
>> On Fri, Apr 25, 2008 at 3:39 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>  Unconditional busy waiting and sched_yield is a bad idea.  IMHO you
>>>  want a proper synchronization primitive, with optional busy
>>>  waiting.  See gomp-3_0-branch, where user can determine how long if
>>>  at all threads should be busy waiting through GOMP_SPINCOUNT and
>>>  OMP_WAIT_POLICY env vars, throttling is used for number of active
>>>  threads bigger than number of available CPUs and after that falls
>>>  back to futexes (on Linux, on other OSes whatever is supported).
>>
>> If I understand correctly, we should use do_wait instead of
>> sched_yield, so we will have to work in the gomp-3_0-branch.
>>
>> We started working on the gomp-2.5 and we were quite limited by the
>> tasking capabilities: we used parallel sections instead of tasks; but
>> the goal was to use the more flexible task support of gomp-3.0 for being
>> able to dynamically create tasks.
>>
>> Sebastian
>>
>


* Re: [PATCH] [libgomp] Add a stream communication framework to GOMP
  2008-06-02 23:06     ` Antoniu Pop
  2008-06-04  5:18       ` Antoniu Pop
@ 2008-06-04  7:00       ` Jakub Jelinek
  2008-06-04 22:49         ` Antoniu Pop
  1 sibling, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2008-06-04  7:00 UTC (permalink / raw)
  To: Antoniu Pop; +Cc: gcc-patches

On Tue, Jun 03, 2008 at 01:05:54AM +0200, Antoniu Pop wrote:
> This is a revised version of the patch, targeting the gomp-3_0-branch.
> The gomp-3_0-branch plus patch have been bootstrapped and tested on amd64-linux.


--- libgomp/libgomp.map	(revision 136152)
+++ libgomp/libgomp.map	(working copy)
@@ -167,4 +167,13 @@
 	GOMP_loop_ull_runtime_start;
 	GOMP_loop_ull_static_next;
 	GOMP_loop_ull_static_start;
+	GOMP_stream_create;
+	GOMP_stream_push;
+	GOMP_stream_head;
+	GOMP_stream_pop;
+ 	GOMP_stream_eos_p;
+ 	GOMP_stream_set_eos;
+ 	GOMP_stream_destroy;
+ 	GOMP_stream_align_push;
+ 	GOMP_stream_align_pop;
 } GOMP_1.0;

New symbols must go into GOMP_2.0 symver, not GOMP_1.0.

+static inline void 
+slide_read_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->read_buffer_index);
+
+  s->read_buffer_index = next;
+  s->read_index = next;

Are you sure you don't need any barrier in between the two?

+}
+
+static inline void
+slide_write_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->write_buffer_index);
+
+  while (s->read_buffer_index == next)
+    do_wait ((int *) &s->read_buffer_index, next);
+
+  s->write_buffer_index = next;
+  s->write_index = next;
+}

First of all, do_wait is config/linux/ specific.  So, either you
need to come up with new config/* inlines that stream.c will use,
or need to move stream.c to config/*/ and provide separate implementation
for linux and posix.  Note that config/posix/ never busy waits ATM,
it is an implementation that just needs to work somehow.  Can be tested
e.g. by configuring libgomp with --disable-linux-futex.
Another thing is that if it waits too long, do_wait will eventually
sleep on the futex.  So, you need a corresponding futex_wake on the
side which is changing read_buffer_index.
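
The wait/wake pairing described above can be sketched portably.  This
is only an illustration, not code from the patch: it uses a pthread
mutex and condition variable in place of the futex, and struct
window_index and its three helpers are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a stream index that one side sleeps on.  */
struct window_index
{
  pthread_mutex_t lock;
  pthread_cond_t changed;
  size_t value;
};

static void
window_index_init (struct window_index *w, size_t initial)
{
  int rc = pthread_mutex_init (&w->lock, NULL);
  assert (rc == 0);
  rc = pthread_cond_init (&w->changed, NULL);
  assert (rc == 0);
  (void) rc;
  w->value = initial;
}

/* Waiter side: sleep while the index still equals BLOCKED_AT, then
   return the value that was observed once it moved.  */
static size_t
window_index_wait_while_equal (struct window_index *w, size_t blocked_at)
{
  size_t seen;

  pthread_mutex_lock (&w->lock);
  while (w->value == blocked_at)
    pthread_cond_wait (&w->changed, &w->lock);
  seen = w->value;
  pthread_mutex_unlock (&w->lock);
  return seen;
}

/* Writer side: every store to the index is paired with a wake-up, so
   a sleeping waiter is never left behind.  */
static void
window_index_store (struct window_index *w, size_t next)
{
  pthread_mutex_lock (&w->lock);
  w->value = next;
  pthread_cond_broadcast (&w->changed);
  pthread_mutex_unlock (&w->lock);
}
```

The point is only that the side which changes the index also issues
the wake-up; a futex-based version would keep the fast path lock-free.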

+static inline void
+wait_used_space (gomp_stream s)
+{
+  while (s->read_buffer_index == s->write_buffer_index)
+    do_wait ((int *) &s->read_buffer_index, s->write_buffer_index);

Do you have guarantees s->write_buffer_index won't change?
Otherwise this is racy.  You need to read it into some
temporary variable, have some kind of barrier to ensure it isn't reread
from memory and then pass that to do_wait, otherwise if
it changes between the while and the do_wait call,
a different value is passed.
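
A minimal sketch of that snapshot pattern, assuming C11 atomics (which
the patch does not use).  wait_while_equal is a hypothetical stand-in
for do_wait, and wait_until_moved is an illustrative name:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for do_wait: returns once *ADDR no longer
   holds VAL.  A real implementation would spin briefly, then sleep.  */
static void
wait_while_equal (_Atomic size_t *addr, size_t val)
{
  while (atomic_load_explicit (addr, memory_order_acquire) == val)
    ;
}

/* Wait until *WATCHED differs from the value *REFERENCE held on entry,
   then return the value of *WATCHED read after the wait.  */
static size_t
wait_until_moved (_Atomic size_t *watched, _Atomic size_t *reference)
{
  /* Read the reference exactly once.  The atomic load keeps the
     compiler from rereading it, so the loop test and the wait helper
     both see the same value.  */
  size_t expected = atomic_load_explicit (reference, memory_order_relaxed);

  assert (watched != reference);
  wait_while_equal (watched, expected);
  return atomic_load_explicit (watched, memory_order_acquire);
}
```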

	Jakub


* Re: [PATCH] [libgomp] Add a stream communication framework to GOMP
  2008-06-04  7:00       ` Jakub Jelinek
@ 2008-06-04 22:49         ` Antoniu Pop
  2008-07-04 16:26           ` Antoniu Pop
  0 siblings, 1 reply; 8+ messages in thread
From: Antoniu Pop @ 2008-06-04 22:49 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek

> +static inline void
> +slide_read_window (gomp_stream s)
> +{
> +  size_t next = next_window (s, s->read_buffer_index);
> +
> +  s->read_buffer_index = next;
> +  s->read_index = next;
>
> Are you sure you don't need any barrier in between the two?

I am not sure why I would need one. Is there a possible problem
with this implementation? read_index is entirely private to the
consumer and read_buffer_index is only read by the producer, so the
thread executing this code is the only one writing to either of them.
If we later have multiple consumers on the same stream, we will indeed
need to write it differently. In this case I believe it is not an issue.
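
For illustration only, here is a sketch of the single-writer
arrangement with explicit ordering, assuming C11 atomics (the patch
uses plain stores).  struct spsc_windows and window_is_free are
hypothetical names; next_window and slide_read_window mirror the
helpers in stream.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct spsc_windows
{
  size_t capacity;                   /* Bytes in the circular buffer.  */
  size_t window;                     /* Bytes per sliding window.  */
  size_t read_index;                 /* Consumer-private.  */
  _Atomic size_t read_buffer_index;  /* Written by consumer, read by producer.  */
};

static size_t
next_window (const struct spsc_windows *s, size_t index)
{
  size_t next = index + s->window;

  assert (s->window > 0 && s->window <= s->capacity);
  return next >= s->capacity ? 0 : next;
}

/* Consumer only.  The release store keeps the consumer's earlier reads
   of the old window ordered before the new index becomes visible.  */
static void
slide_read_window (struct spsc_windows *s)
{
  size_t cur = atomic_load_explicit (&s->read_buffer_index,
                                     memory_order_relaxed);
  size_t next = next_window (s, cur);

  s->read_index = next;
  atomic_store_explicit (&s->read_buffer_index, next, memory_order_release);
}

/* Producer only.  True when the consumer is not in window CANDIDATE.  */
static int
window_is_free (struct spsc_windows *s, size_t candidate)
{
  return atomic_load_explicit (&s->read_buffer_index,
                               memory_order_acquire) != candidate;
}
```

With a single consumer, read_index needs no atomic access at all; only
the shared index carries ordering.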

> +}
> +
> +static inline void
> +slide_write_window (gomp_stream s)
> +{
> +  size_t next = next_window (s, s->write_buffer_index);
> +
> +  while (s->read_buffer_index == next)
> +    do_wait ((int *) &s->read_buffer_index, next);
> +
> +  s->write_buffer_index = next;
> +  s->write_index = next;
> +}
>
> First of all, do_wait is config/linux/ specific.  So, either you
> need to come up with new config/* inlines that stream.c will use,
> or need to move stream.c to config/*/ and provide separate implementation
> for linux and posix.  Note that config/posix/ never busy waits ATM,
> it is an implementation that just needs to work somehow.  Can be tested
> e.g. by configuring libgomp with --disable-linux-futex.
> Another thing is that if waiting too long,
> do_wait will eventually sleep on the futex.  So, you need a corresponding
> futex_wake on the side which is changing read_buffer_index.

I should indeed look into this, and will probably try to provide an
equivalent do_wait implementation, at least for posix.

> +static inline void
> +wait_used_space (gomp_stream s)
> +{
> +  while (s->read_buffer_index == s->write_buffer_index)
> +    do_wait ((int *) &s->read_buffer_index, s->write_buffer_index);
>
> Do you have guarantees s->write_buffer_index won't change?
> Otherwise this is racy.  You need to read it into some
> temporary variable, have some kind of barrier to ensure it isn't reread
> from memory and then pass that to do_wait, otherwise if
> it changes between the while and the do_wait call,
> a different value is passed.

This is an error I had noticed after posting the patch and I posted
the corrected version already. It's really s->read_buffer_index that
provides the reference value (as it is only written by the consumer
who also executes this code). The correct code is:

+  while (s->write_buffer_index == s->read_buffer_index)
+    do_wait ((int *) &s->write_buffer_index, s->read_buffer_index);

Antoniu


* Re: [PATCH] [libgomp] Add a stream communication framework to GOMP
  2008-06-04 22:49         ` Antoniu Pop
@ 2008-07-04 16:26           ` Antoniu Pop
  0 siblings, 0 replies; 8+ messages in thread
From: Antoniu Pop @ 2008-07-04 16:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1472 bytes --]

Hi,

This patch targets the trunk. It fixes the previous issues, providing
configuration-specific implementations of the synchronization
operations.
Trunk plus patch have been bootstrapped and tested on amd64-linux.

Antoniu

ChangeLog:

2008-04-22  Antoniu Pop  <antoniu.pop@gmail.com>
          Sebastian Pop  <sebastian.pop@amd.com>

libgomp/
      * config/linux/stream.c: New.
      * config/linux/stream.h: New.
      * config/posix/stream.c: New.
      * config/posix/stream.h: New.

      * libgomp.h: Include stream.h.
      * libgomp_g.h (GOMP_stream_create, GOMP_stream_push,
      GOMP_stream_head, GOMP_stream_pop, GOMP_stream_eos_p,
      GOMP_stream_set_eos, GOMP_stream_destroy, GOMP_stream_align_push,
      GOMP_stream_align_pop): Declared.
      * libgomp.map: Export GOMP_*.

      * Makefile.am: Added stream.c to libgomp_la_SOURCES.
      * Makefile.in: Regenerate.

gcc/
      * builtin-types.def (BT_FN_BOOL_PTR, BT_FN_VOID_PTR_PTR_SIZE,
      BT_FN_PTR_SIZE_SIZE_SIZE): New types.
      * fortran/types.def (BT_FN_BOOL_PTR, BT_FN_VOID_PTR_SIZE,
      BT_FN_VOID_PTR_PTR_SIZE, BT_FN_PTR_SIZE_SIZE_SIZE): New types.

      * omp-builtins.def (BUILT_IN_GOMP_STREAM_CREATE,
      BUILT_IN_GOMP_STREAM_PUSH, BUILT_IN_GOMP_STREAM_HEAD,
      BUILT_IN_GOMP_STREAM_POP, BUILT_IN_GOMP_STREAM_EOS_P,
      BUILT_IN_GOMP_STREAM_SET_EOS, BUILT_IN_GOMP_STREAM_DESTROY,
      BUILT_IN_GOMP_STREAM_ALIGN_PUSH, BUILT_IN_GOMP_STREAM_ALIGN_POP):
      New builtins.

[-- Attachment #2: 2008_07_04_15_56_16_gomp_streams.diff --]
[-- Type: text/x-diff; name=2008_07_04_15_56_16_gomp_streams.diff, Size: 34534 bytes --]

email:antoniu.pop@gmail.com
branch:trunk
revision:HEAD
configure:
make:
check:

Index: libgomp/Makefile.in
===================================================================
--- libgomp/Makefile.in	(revision 136948)
+++ libgomp/Makefile.in	(working copy)
@@ -86,7 +86,7 @@
 	error.lo iter.lo iter_ull.lo loop.lo loop_ull.lo ordered.lo \
 	parallel.lo sections.lo single.lo task.lo team.lo work.lo \
 	lock.lo mutex.lo proc.lo sem.lo bar.lo ptrlock.lo time.lo \
-	fortran.lo affinity.lo
+	fortran.lo affinity.lo stream.lo
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 DEFAULT_INCLUDES = -I. -I$(srcdir) -I.
 depcomp = $(SHELL) $(top_srcdir)/../depcomp
@@ -226,12 +226,9 @@
 VERSION = @VERSION@
 XCFLAGS = @XCFLAGS@
 XLDFLAGS = @XLDFLAGS@
-ac_ct_AR = @ac_ct_AR@
 ac_ct_CC = @ac_ct_CC@
 ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
 ac_ct_FC = @ac_ct_FC@
-ac_ct_RANLIB = @ac_ct_RANLIB@
-ac_ct_STRIP = @ac_ct_STRIP@
 am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
 am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
 am__include = @am__include@
@@ -247,6 +244,9 @@
 build_vendor = @build_vendor@
 config_path = @config_path@
 datadir = @datadir@
+datarootdir = @datarootdir@
+docdir = @docdir@
+dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
@@ -255,6 +255,7 @@
 host_cpu = @host_cpu@
 host_os = @host_os@
 host_vendor = @host_vendor@
+htmldir = @htmldir@
 includedir = @includedir@
 infodir = @infodir@
 install_sh = @install_sh@
@@ -262,14 +263,17 @@
 libexecdir = @libexecdir@
 libtool_VERSION = @libtool_VERSION@
 link_gomp = @link_gomp@
+localedir = @localedir@
 localstatedir = @localstatedir@
 lt_ECHO = @lt_ECHO@
 mandir = @mandir@
 mkdir_p = @mkdir_p@
 multi_basedir = @multi_basedir@
 oldincludedir = @oldincludedir@
+pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
+psdir = @psdir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 sysconfdir = @sysconfdir@
@@ -298,7 +302,7 @@
 libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
 	iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c single.c \
 	task.c team.c work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c \
-	time.c fortran.c affinity.c
+	time.c fortran.c affinity.c stream.c
 
 nodist_noinst_HEADERS = libgomp_f.h
 nodist_libsubinclude_HEADERS = omp.h
@@ -446,6 +450,7 @@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stream.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/task.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/team.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/time.Plo@am__quote@
@@ -555,10 +560,13 @@
 	    $(srcdir)/*) base=`echo "$$base" | sed "s|^$$srcdirstrip/||"`;; \
 	  esac; \
 	  if test -f $$base; then d=.; else d=$(srcdir); fi; \
-	  for file in $$d/$$base*; do \
-	    relfile=`expr "$$file" : "$$d/\(.*\)"`; \
-	    test -f $(distdir)/$$relfile || \
-	      cp -p $$file $(distdir)/$$relfile; \
+	  base_i=`echo "$$base" | sed 's|\.info$$||;s|$$|.i|'`; \
+	  for file in $$d/$$base $$d/$$base-[0-9] $$d/$$base-[0-9][0-9] $$d/$$base_i[0-9] $$d/$$base_i[0-9][0-9]; do \
+	    if test -f $$file; then \
+	      relfile=`expr "$$file" : "$$d/\(.*\)"`; \
+	      test -f $(distdir)/$$relfile || \
+		cp -p $$file $(distdir)/$$relfile; \
+	    else :; fi; \
 	  done; \
 	done
 
Index: libgomp/libgomp_g.h
===================================================================
--- libgomp/libgomp_g.h	(revision 136948)
+++ libgomp/libgomp_g.h	(working copy)
@@ -182,4 +182,15 @@
 extern void *GOMP_single_copy_start (void);
 extern void GOMP_single_copy_end (void *);
 
+/* stream.c */
+extern void *GOMP_stream_create (size_t, size_t, size_t);
+extern void GOMP_stream_push (void *, void *);
+extern void *GOMP_stream_head (void *);
+extern void GOMP_stream_pop (void *);
+extern bool GOMP_stream_eos_p (void *);
+extern void GOMP_stream_set_eos (void *);
+extern void GOMP_stream_destroy (void *);
+extern void GOMP_stream_align_push (void *, void *, size_t);
+extern void GOMP_stream_align_pop (void *, size_t);
+
 #endif /* LIBGOMP_G_H */
Index: libgomp/libgomp.map
===================================================================
--- libgomp/libgomp.map	(revision 136948)
+++ libgomp/libgomp.map	(working copy)
@@ -167,4 +167,13 @@
 	GOMP_loop_ull_runtime_start;
 	GOMP_loop_ull_static_next;
 	GOMP_loop_ull_static_start;
+	GOMP_stream_create;
+	GOMP_stream_push;
+	GOMP_stream_head;
+	GOMP_stream_pop;
+	GOMP_stream_eos_p;
+	GOMP_stream_set_eos;
+	GOMP_stream_destroy;
+	GOMP_stream_align_push;
+	GOMP_stream_align_pop;
 } GOMP_1.0;
Index: libgomp/libgomp.h
===================================================================
--- libgomp/libgomp.h	(revision 136948)
+++ libgomp/libgomp.h	(working copy)
@@ -51,8 +51,8 @@
 #include "mutex.h"
 #include "bar.h"
 #include "ptrlock.h"
+#include "stream.h"
 
-
 /* This structure contains the data to control one work-sharing construct,
    either a LOOP (FOR/DO) or a SECTIONS.  */
 
Index: libgomp/testsuite/Makefile.in
===================================================================
--- libgomp/testsuite/Makefile.in	(revision 136948)
+++ libgomp/testsuite/Makefile.in	(working copy)
@@ -144,12 +144,9 @@
 VERSION = @VERSION@
 XCFLAGS = @XCFLAGS@
 XLDFLAGS = @XLDFLAGS@
-ac_ct_AR = @ac_ct_AR@
 ac_ct_CC = @ac_ct_CC@
 ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
 ac_ct_FC = @ac_ct_FC@
-ac_ct_RANLIB = @ac_ct_RANLIB@
-ac_ct_STRIP = @ac_ct_STRIP@
 am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
 am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
 am__include = @am__include@
@@ -165,6 +162,9 @@
 build_vendor = @build_vendor@
 config_path = @config_path@
 datadir = @datadir@
+datarootdir = @datarootdir@
+docdir = @docdir@
+dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
@@ -173,6 +173,7 @@
 host_cpu = @host_cpu@
 host_os = @host_os@
 host_vendor = @host_vendor@
+htmldir = @htmldir@
 includedir = @includedir@
 infodir = @infodir@
 install_sh = @install_sh@
@@ -180,14 +181,17 @@
 libexecdir = @libexecdir@
 libtool_VERSION = @libtool_VERSION@
 link_gomp = @link_gomp@
+localedir = @localedir@
 localstatedir = @localstatedir@
 lt_ECHO = @lt_ECHO@
 mandir = @mandir@
 mkdir_p = @mkdir_p@
 multi_basedir = @multi_basedir@
 oldincludedir = @oldincludedir@
+pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
+psdir = @psdir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 sysconfdir = @sysconfdir@
Index: libgomp/config/linux/stream.h
===================================================================
--- libgomp/config/linux/stream.h	(revision 0)
+++ libgomp/config/linux/stream.h	(revision 0)
@@ -0,0 +1,97 @@
+/* Copyright (C) 2005 Free Software Foundation, Inc.
+   Contributed by Antoniu Pop <apop@cri.ensmp.fr>
+   and Sebastian Pop <sebastian.pop@amd.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License 
+   along with libgomp; see the file COPYING.LIB.  If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License.  This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License.  */
+
+/* This is a Linux specific implementation of a stream communication
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation relies on the futex syscall.  */
+
+#ifndef GOMP_STREAM_H
+#define GOMP_STREAM_H 1
+
+/* This structure represents a stream between tasks.  Special care
+   needs to be taken to ensure proper cache behaviour.  We need to
+   separate groups of fields on independent cache lines according to
+   the usage of each group.  In the case of SMPs, this is a critical
+   requirement.  */
+
+typedef struct gomp_stream
+{
+  /* Read-only group.  These fields will be initialized or redefined
+     *very* sparsely (i.e., change of strategy in the runtime).
+     Therefore, the behaviour of the cache protocol for the cache
+     lines holding the following fields will be to stay in the Shared
+     state.  */
+
+  /* Circular buffer.  */
+  char *buffer;
+
+  /* Number of bytes in the circular buffer.  */
+  size_t capacity;
+
+  /* Size in bytes of an element in the stream.  */
+  size_t size_elt;
+
+  /* Size in bytes of sub-buffers for unsynchronized reads and writes.  */
+  size_t size_local_buffer;
+
+  /* End of stream: true when producer has finished inserting elements.  */
+  bool eos_p;
+
+  /* Producer private group.  The following fields are only touched by
+     the producer (read and written).  They should be on a separate
+     cache line, which will ensure that the line stays in the Modified
+     (or Exclusive) state and avoids ping-pong.  */
+
+  /* Offset in bytes of the first empty element in the stream.  */
+  size_t write_index __attribute__((aligned (64)));
+
+  /* Consumer private group.  The following fields are only touched by
+     the consumer (read and written).  Same remark as for the producer
+     private group.  */
+
+  /* Offset in bytes of the first used element in the stream.  */
+  size_t read_index __attribute__((aligned (64)));
+
+  /* Shared/Modified groups.  As a stream implicitly synchronizes
+     tasks/threads, it is natural that some fields will be
+     read/written by both producer and consumer.  It is therefore
+     critical to separate those on individual cache lines (obviously
+     same situation as locks).  */
+
+  /* Offset in bytes of the sliding writing window.  Writing window is
+     of size SIZE_LOCAL_BUFFER bytes.  Read/Written by producer, Read
+     by consumer.  */
+  int write_buffer_index __attribute__((aligned (64)));
+
+  /* Offset in bytes of the sliding reading window.  Read window is of
+     size SIZE_LOCAL_BUFFER bytes.  Read/Written by consumer, Read by
+     producer.  */
+  int read_buffer_index __attribute__((aligned (64)));
+} *gomp_stream;
+
+#endif /* GOMP_STREAM_H */
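
The cache-line grouping described in the comments of stream.h above can be
checked with offsetof.  The following is only a sketch: struct demo_stream,
DEMO_CACHE_LINE and the demo_* helpers are hypothetical names, not part of the
patch, and the 64-byte line size is assumed from the aligned (64) attributes.

```c
#include <stddef.h>

#define DEMO_CACHE_LINE 64	/* Assumed line size, as in aligned (64).  */

/* Hypothetical reduced copy of the stream layout.  */
struct demo_stream
{
  /* Read-mostly group.  */
  char *buffer;
  size_t capacity;
  /* Producer private group.  */
  size_t write_index __attribute__((aligned (DEMO_CACHE_LINE)));
  /* Consumer private group.  */
  size_t read_index __attribute__((aligned (DEMO_CACHE_LINE)));
  /* Shared groups, one line each.  */
  int write_buffer_index __attribute__((aligned (DEMO_CACHE_LINE)));
  int read_buffer_index __attribute__((aligned (DEMO_CACHE_LINE)));
};

/* Index of the cache line, relative to the start of the struct, that
   holds the byte at OFFSET.  */
static size_t
demo_line_of (size_t offset)
{
  return offset / DEMO_CACHE_LINE;
}

/* Returns 1 when the five groups start on five different lines.  */
static int
demo_groups_on_distinct_lines (void)
{
  size_t lines[5];
  size_t i, j;

  lines[0] = demo_line_of (offsetof (struct demo_stream, buffer));
  lines[1] = demo_line_of (offsetof (struct demo_stream, write_index));
  lines[2] = demo_line_of (offsetof (struct demo_stream, read_index));
  lines[3] = demo_line_of (offsetof (struct demo_stream, write_buffer_index));
  lines[4] = demo_line_of (offsetof (struct demo_stream, read_buffer_index));

  for (i = 0; i < 5; ++i)
    for (j = i + 1; j < 5; ++j)
      if (lines[i] == lines[j])
	return 0;
  return 1;
}
```

The offsets are relative to the start of the object, so the groups only match
real cache lines when the object itself starts on a 64-byte boundary.  A plain
malloc-style allocation does not promise that alignment.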
Index: libgomp/config/linux/stream.c
===================================================================
--- libgomp/config/linux/stream.c	(revision 0)
+++ libgomp/config/linux/stream.c	(revision 0)
@@ -0,0 +1,247 @@
+/* Copyright (C) 2008 Free Software Foundation, Inc.
+   Contributed by Antoniu Pop <antoniu.pop@gmail.com> 
+   and Sebastian Pop <sebastian.pop@amd.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License 
+   along with libgomp; see the file COPYING.LIB.  If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License.  This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License.  */
+
+/* This is a Linux specific implementation of a stream communication
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation relies on the futex syscall.  */
+
+#include <stdlib.h>
+#include <string.h>
+
+#include "wait.h"
+#include "libgomp.h"
+
+/* Returns a new stream whose circular buffer holds COUNT windows of
+   WINDOW_SIZE bytes each.  Each element is of size SIZE bytes.  Returns
+   NULL when COUNT is less than 2; allocation failure is fatal.  */
+
+void *
+GOMP_stream_create (size_t size, size_t count, size_t window_size)
+{
+  gomp_stream s;
+
+  /* There should be enough room for two sliding windows.  */
+  if (count < 2)
+    return NULL;
+
+  s = (gomp_stream) gomp_malloc (sizeof (struct gomp_stream));
+
+  s->eos_p = false;
+  s->read_buffer_index = 0;
+  s->write_buffer_index = 0;
+  s->write_index = 0;
+  s->read_index = 0;
+  s->size_elt = size;
+  s->size_local_buffer = window_size;
+  s->capacity = count * s->size_local_buffer;
+  s->buffer = (char *) gomp_malloc (s->capacity);
+
+  return s;
+}
+
+static inline size_t
+next_window (gomp_stream s, size_t index)
+{
+  size_t next = index + s->size_local_buffer;
+  return ((next >= s->capacity) ? 0 : next);
+}
+
+static inline void 
+slide_read_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->read_buffer_index);
+
+  if (next_window (s, s->write_buffer_index) == s->read_buffer_index)
+    {
+      s->read_buffer_index = next;
+      futex_wake ((int *) &s->read_buffer_index, 1);
+    }
+  else
+    s->read_buffer_index = next;
+
+  s->read_index = next;
+}
+
+static inline void
+slide_write_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->write_buffer_index);
+
+  while (s->read_buffer_index == next)
+    {
+      futex_wake ((int *) &s->write_buffer_index, 1);
+      futex_wait ((int *) &s->read_buffer_index, next);
+    }
+
+  if (s->read_buffer_index == s->write_buffer_index)
+    {
+      s->write_buffer_index = next;
+      futex_wake ((int *) &s->write_buffer_index, 1);
+    }
+  else
+    s->write_buffer_index = next;
+
+  s->write_index = next;
+}
+
+/* Returns the number of bytes already read in the read sliding window
+   of stream S.  */
+
+static inline size_t
+read_bytes_in_read_window (gomp_stream s)
+{
+  return s->read_index - s->read_buffer_index;
+}
+
+/* Returns the number of bytes already written in the write sliding
+   window of stream S.  */
+
+static inline size_t
+written_bytes_in_write_window (gomp_stream s)
+{
+  return s->write_index - s->write_buffer_index;
+}
+
+/* Push element ELT to stream S.  */
+
+static inline void
+gomp_stream_push (gomp_stream s, char *elt)
+{
+  if (written_bytes_in_write_window (s) + s->size_elt > s->size_local_buffer)
+    slide_write_window (s);
+
+  memcpy (s->buffer + s->write_index, elt, s->size_elt);
+  s->write_index += s->size_elt;
+}
+
+/* Release from stream S the next element.  */
+
+static inline void
+gomp_stream_pop (gomp_stream s)
+{
+  if (read_bytes_in_read_window (s) + 2 * s->size_elt > s->size_local_buffer)
+    slide_read_window (s);
+  else
+    s->read_index += s->size_elt;
+}
+
+/* Wait until the producer has slid the write window in stream S.  */
+
+static inline void
+wait_used_space (gomp_stream s)
+{
+  while (s->write_buffer_index == s->read_buffer_index)
+    {
+      futex_wake ((int *) &s->read_buffer_index, 1);
+      futex_wait ((int *) &s->write_buffer_index, s->read_buffer_index);
+    }
+}
+
+/* Returns the first element of the stream S.  Doesn't remove the
+   element: for that, a call to GOMP_stream_pop is needed.  */
+
+void *
+GOMP_stream_head (void *s)
+{
+  wait_used_space ((gomp_stream) s);
+  return ((gomp_stream) s)->buffer + ((gomp_stream) s)->read_index;
+}
+
+/* Returns true when there are no more elements to be read from the
+   stream S.  */
+
+bool
+GOMP_stream_eos_p (void *s)
+{
+  return (((gomp_stream) s)->eos_p && 
+	  (((gomp_stream) s)->read_index == ((gomp_stream) s)->write_index));
+}
+
+/* Producer can set End Of Stream to stream S.  The producer has to
+   slide the write window if it wrote something.  */
+
+void
+GOMP_stream_set_eos (void *s)
+{
+  if (written_bytes_in_write_window ((gomp_stream) s) > 0)
+    slide_write_window ((gomp_stream) s);
+
+  ((gomp_stream) s)->eos_p = true;
+}
+
+/* Free stream S.  */
+
+void
+GOMP_stream_destroy (void *s)
+{
+  /* No need to synchronize here: the consumer detects when eos is
+     set and, based on that, decides to destroy the stream.  */
+
+  free (((gomp_stream) s)->buffer);
+  free ((gomp_stream) s);
+}
+
+/* Align the producer and consumer accesses by pushing in the stream
+   COUNT successive elements starting at address START.  */
+
+void
+GOMP_stream_align_push (void *s, void *start, size_t count)
+{
+  size_t i;
+
+  for (i = 0; i < count; ++i)
+    {
+      gomp_stream_push ((gomp_stream) s, (char *) start);
+      start = (char *) start + ((gomp_stream) s)->size_elt;
+    }
+}
+
+/* Align the producer and consumer accesses by removing from the
+   stream COUNT elements.  */
+
+void
+GOMP_stream_align_pop (void *s, size_t count)
+{
+  size_t i;
+
+  for (i = 0; i < count; ++i)
+    gomp_stream_pop ((gomp_stream) s);
+}
+
+
+void
+GOMP_stream_push (void *s, void *elt)
+{
+  gomp_stream_push ((gomp_stream) s, (char *) elt);
+}
+
+void
+GOMP_stream_pop (void *s)
+{
+  gomp_stream_pop ((gomp_stream) s);
+}
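
The window arithmetic used by next_window and gomp_stream_push above can be
followed with a single-threaded model.  This is a sketch only: struct
demo_ring and the demo_* functions are hypothetical, there is no consumer and
no futex call, and the producer is assumed never to block.

```c
#include <stddef.h>

/* Hypothetical model of the producer-side indexes, all in bytes.  */
struct demo_ring
{
  size_t capacity;	/* COUNT * WINDOW_SIZE.  */
  size_t window;	/* Size of one sliding window.  */
  size_t elt;		/* Size of one element.  */
  size_t write_index;	/* Next free byte.  */
  size_t write_window;	/* Start of the current write window.  */
};

/* Start of the window following INDEX, wrapping to 0 at the end of
   the buffer.  */
static size_t
demo_next_window (const struct demo_ring *r, size_t index)
{
  size_t next = index + r->window;
  return next >= r->capacity ? 0 : next;
}

/* Returns the byte offset where the next element would be stored,
   sliding the write window first when the element does not fit, and
   advances the write index past the element.  */
static size_t
demo_push_offset (struct demo_ring *r)
{
  size_t at;

  if (r->write_index - r->write_window + r->elt > r->window)
    {
      r->write_window = demo_next_window (r, r->write_window);
      r->write_index = r->write_window;
    }

  at = r->write_index;
  r->write_index += r->elt;
  return at;
}
```

With two 16-byte windows and 8-byte elements, successive pushes land at
offsets 0, 8, 16, 24 and then wrap to 0.  With 6-byte elements only two
elements fit in each window, so the last 4 bytes of every window stay unused
in this model.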
Index: libgomp/config/posix/stream.h
===================================================================
--- libgomp/config/posix/stream.h	(revision 0)
+++ libgomp/config/posix/stream.h	(revision 0)
@@ -0,0 +1,111 @@
+/* Copyright (C) 2005 Free Software Foundation, Inc.
+   Contributed by Antoniu Pop <antoniu.pop@gmail.com> 
+   and Sebastian Pop <sebastian.pop@amd.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License 
+   along with libgomp; see the file COPYING.LIB.  If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License.  This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License.  */
+
+/* This is the default implementation of a stream communication
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation is based entirely on the POSIX library.  */
+
+#ifndef GOMP_STREAM_H
+#define GOMP_STREAM_H 1
+
+/* This structure associates a POSIX condition variable to a buffer
+   index.  We will use the condition to wait and avoid spin-waiting or
+   sched-yielding.  */
+typedef struct buffer_index
+{
+  size_t index;
+  pthread_cond_t cond;
+} buffer_index __attribute__((aligned (64)));
+
+/* This structure represents a stream between tasks.  Special care
+   needs to be taken to ensure proper cache behaviour.  We need to
+   separate groups of fields on independent cache lines according to
+   the usage of each group.  In the case of SMPs, this is a critical
+   requirement.  */
+
+typedef struct gomp_stream
+{
+  /* Read-only group.  These fields will be initialized or redefined
+     *very* sparsely (i.e., change of strategy in the runtime).
+     Therefore, the behaviour of the cache protocol for the cache
+     lines holding the following fields will be to stay in the Shared
+     state.  */
+
+  /* Circular buffer.  */
+  char *buffer;
+
+  /* Number of bytes in the circular buffer.  */
+  size_t capacity;
+
+  /* Size in bytes of an element in the stream.  */
+  size_t size_elt;
+
+  /* Size in bytes of sub-buffers for unsynchronized reads and writes.  */
+  size_t size_local_buffer;
+
+  /* End of stream: true when producer has finished inserting elements.  */
+  bool eos_p;
+
+  /* Producer private group.  The following fields are only touched by
+     the producer (read and written).  They should be on a separate
+     cache line, which will ensure that the line stays in the Modified
+     (or Exclusive) state and avoids ping-pong.  */
+
+  /* Offset in bytes of the first empty element in the stream.  */
+  size_t write_index __attribute__((aligned (64)));
+
+  /* Consumer private group.  The following fields are only touched by
+     the consumer (read and written).  Same remark as for the producer
+     private group.  */
+
+  /* Offset in bytes of the first used element in the stream.  */
+  size_t read_index __attribute__((aligned (64)));
+
+  /* Shared/Modified groups.  As a stream implicitly synchronizes
+     tasks/threads, it is natural that some fields will be
+     read/written by both producer and consumer.  It is therefore
+     critical to separate those on individual cache lines (obviously
+     same situation as locks).  */
+
+  /* Offset in bytes of the sliding writing window.  Writing window is
+     of size SIZE_LOCAL_BUFFER bytes.  Read/Written by producer, Read
+     by consumer.  */
+  buffer_index write_buffer_index;
+
+  /* Offset in bytes of the sliding reading window.  Read window is of
+     size SIZE_LOCAL_BUFFER bytes.  Read/Written by consumer, Read by
+     producer.  */
+  buffer_index read_buffer_index;
+
+  /* Lock required for waiting on the conditions associated with the
+     two buffer indexes.  */
+  pthread_mutex_t buffer_index_mutex __attribute__((aligned (64)));
+
+} *gomp_stream;
+
+#endif /* GOMP_STREAM_H */
Index: libgomp/config/posix/stream.c
===================================================================
--- libgomp/config/posix/stream.c	(revision 0)
+++ libgomp/config/posix/stream.c	(revision 0)
@@ -0,0 +1,254 @@
+/* Copyright (C) 2008 Free Software Foundation, Inc.
+   Contributed by Antoniu Pop <antoniu.pop@gmail.com> 
+   and Sebastian Pop <sebastian.pop@amd.com>.
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Lesser General Public License as published by
+   the Free Software Foundation; either version 2.1 of the License, or
+   (at your option) any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for
+   more details.
+
+   You should have received a copy of the GNU Lesser General Public License 
+   along with libgomp; see the file COPYING.LIB.  If not, write to the
+   Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+/* As a special exception, if you link this library with other files, some
+   of which are compiled with GCC, to produce an executable, this library
+   does not by itself cause the resulting executable to be covered by the
+   GNU General Public License.  This exception does not however invalidate
+   any other reasons why the executable file might be covered by the GNU
+   General Public License.  */
+
+/* This is the default implementation of a stream communication
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation is based entirely on the POSIX library.  */
+
+#include "libgomp.h"
+#include <stdlib.h>
+#include <string.h>
+
+/* Returns a new stream whose circular buffer holds COUNT windows of
+   WINDOW_SIZE bytes each.  Each element is of size SIZE bytes.  Returns
+   NULL when COUNT is less than 2; allocation failure is fatal.  */
+
+void *
+GOMP_stream_create (size_t size, size_t count, size_t window_size)
+{
+  gomp_stream s;
+
+  /* There should be enough room for two sliding windows.  */
+  if (count < 2)
+    return NULL;
+
+  s = (gomp_stream) gomp_malloc (sizeof (struct gomp_stream));
+
+  s->eos_p = false;
+  s->write_index = 0;
+  s->read_index = 0;
+  s->read_buffer_index.index = 0;
+  pthread_cond_init (&s->read_buffer_index.cond, NULL);
+  s->write_buffer_index.index = 0;
+  pthread_cond_init (&s->write_buffer_index.cond, NULL);
+  pthread_mutex_init (&s->buffer_index_mutex, NULL);
+  s->size_elt = size;
+  s->size_local_buffer = window_size;
+  s->capacity = count * s->size_local_buffer;
+  s->buffer = (char *) gomp_malloc (s->capacity);
+
+  return s;
+}
+
+static inline size_t
+next_window (gomp_stream s, size_t index)
+{
+  size_t next = index + s->size_local_buffer;
+  return ((next >= s->capacity) ? 0 : next);
+}
+
+static inline void 
+slide_read_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->read_buffer_index.index);
+
+  if (next_window (s, s->write_buffer_index.index) == s->read_buffer_index.index)
+    {
+      s->read_buffer_index.index = next;
+      pthread_cond_signal (&s->read_buffer_index.cond);
+    }
+  else
+    s->read_buffer_index.index = next;
+
+  s->read_index = next;
+}
+
+static inline void
+slide_write_window (gomp_stream s)
+{
+  size_t next = next_window (s, s->write_buffer_index.index);
+
+  while (s->read_buffer_index.index == next) 
+    {
+      pthread_mutex_lock (&s->buffer_index_mutex);
+      pthread_cond_signal (&s->write_buffer_index.cond);
+      pthread_cond_wait (&s->read_buffer_index.cond, 
+			 &s->buffer_index_mutex);
+      pthread_mutex_unlock (&s->buffer_index_mutex);
+    }
+
+  if (s->read_buffer_index.index == s->write_buffer_index.index)
+    {
+      s->write_buffer_index.index = next;
+      pthread_cond_signal (&s->write_buffer_index.cond);
+    }
+  else
+    s->write_buffer_index.index = next;
+
+  s->write_index = next;
+}
+
+/* Returns the number of bytes already read in the read sliding window
+   of stream S.  */
+
+static inline size_t
+read_bytes_in_read_window (gomp_stream s)
+{
+  return s->read_index - s->read_buffer_index.index;
+}
+
+/* Returns the number of bytes already written in the write sliding
+   window of stream S.  */
+
+static inline size_t
+written_bytes_in_write_window (gomp_stream s)
+{
+  return s->write_index - s->write_buffer_index.index;
+}
+
+/* Push element ELT to stream S.  */
+
+static inline void
+gomp_stream_push (gomp_stream s, char *elt)
+{
+  if (written_bytes_in_write_window (s) + s->size_elt > s->size_local_buffer)
+    slide_write_window (s);
+
+  memcpy (s->buffer + s->write_index, elt, s->size_elt);
+  s->write_index += s->size_elt;
+}
+
+/* Release from stream S the next element.  */
+
+static inline void
+gomp_stream_pop (gomp_stream s)
+{
+  if (read_bytes_in_read_window (s) + 2 * s->size_elt > s->size_local_buffer)
+    slide_read_window (s);
+  else
+    s->read_index += s->size_elt;
+}
+
+/* Wait until the producer has slid the write window in stream S.  */
+
+static inline void
+wait_used_space (gomp_stream s)
+{
+  while (s->write_buffer_index.index == s->read_buffer_index.index)
+    {
+      pthread_mutex_lock (&s->buffer_index_mutex);
+      pthread_cond_signal (&s->read_buffer_index.cond);
+      pthread_cond_wait (&s->write_buffer_index.cond, 
+			 &s->buffer_index_mutex);
+      pthread_mutex_unlock (&s->buffer_index_mutex);
+    }
+}
+
+/* Returns the first element of the stream S.  Doesn't remove the
+   element: for that, a call to GOMP_stream_pop is needed.  */
+
+void *
+GOMP_stream_head (void *s)
+{
+  wait_used_space ((gomp_stream) s);
+  return ((gomp_stream) s)->buffer + ((gomp_stream) s)->read_index;
+}
+
+/* Returns true when there are no more elements to be read from the
+   stream S.  */
+
+bool
+GOMP_stream_eos_p (void *s)
+{
+  return (((gomp_stream) s)->eos_p && 
+	  (((gomp_stream) s)->read_index == ((gomp_stream) s)->write_index));
+}
+
+/* Producer can set End Of Stream to stream S.  The producer has to
+   slide the write window if it wrote something.  */
+
+void
+GOMP_stream_set_eos (void *s)
+{
+  if (written_bytes_in_write_window ((gomp_stream) s) > 0)
+    slide_write_window ((gomp_stream) s);
+
+  ((gomp_stream) s)->eos_p = true;
+}
+
+/* Free stream S.  */
+
+void
+GOMP_stream_destroy (void *s)
+{
+  /* No need to synchronize here: the consumer detects when eos is
+     set and, based on that, decides to destroy the stream.  */
+
+  free (((gomp_stream) s)->buffer);
+  free ((gomp_stream) s);
+}
+
+/* Align the producer and consumer accesses by pushing in the stream
+   COUNT successive elements starting at address START.  */
+
+void
+GOMP_stream_align_push (void *s, void *start, size_t count)
+{
+  size_t i;
+
+  for (i = 0; i < count; ++i)
+    {
+      gomp_stream_push ((gomp_stream) s, (char *) start);
+      start = (char *) start + ((gomp_stream) s)->size_elt;
+    }
+}
+
+/* Align the producer and consumer accesses by removing from the
+   stream COUNT elements.  */
+
+void
+GOMP_stream_align_pop (void *s, size_t count)
+{
+  size_t i;
+
+  for (i = 0; i < count; ++i)
+    gomp_stream_pop ((gomp_stream) s);
+}
+
+
+void
+GOMP_stream_push (void *s, void *elt)
+{
+  gomp_stream_push ((gomp_stream) s, (char *) elt);
+}
+
+void
+GOMP_stream_pop (void *s)
+{
+  gomp_stream_pop ((gomp_stream) s);
+}
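
In posix/stream.c above, the loops in slide_write_window and wait_used_space
test the buffer index before taking buffer_index_mutex, and the index updates
are made without holding it.  For comparison, here is a sketch of the
conventional condition-variable form, in which the predicate is tested and
changed under the mutex.  It is not the code of this patch: struct demo_index
and the demo_* functions are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical index guarded by its own mutex and condition.  */
struct demo_index
{
  size_t index;
  pthread_mutex_t mutex;
  pthread_cond_t cond;
};

/* Block until D->index differs from OLD and return the new value.
   The test is repeated under the mutex after every wakeup.  */
static size_t
demo_wait_change (struct demo_index *d, size_t old)
{
  size_t seen;

  pthread_mutex_lock (&d->mutex);
  while (d->index == old)
    pthread_cond_wait (&d->cond, &d->mutex);
  seen = d->index;
  pthread_mutex_unlock (&d->mutex);
  return seen;
}

/* Store VALUE and wake one waiter, both under the mutex.  */
static void
demo_publish (struct demo_index *d, size_t value)
{
  pthread_mutex_lock (&d->mutex);
  d->index = value;
  pthread_cond_signal (&d->cond);
  pthread_mutex_unlock (&d->mutex);
}

static void *
demo_producer (void *arg)
{
  demo_publish ((struct demo_index *) arg, 64);
  return NULL;
}

/* Start one producer thread, wait for its update and return the
   value seen, or 0 when the thread could not be created.  */
static size_t
demo_run (void)
{
  struct demo_index d;
  pthread_t thread;
  size_t seen = 0;

  d.index = 0;
  pthread_mutex_init (&d.mutex, NULL);
  pthread_cond_init (&d.cond, NULL);

  if (pthread_create (&thread, NULL, demo_producer, &d) == 0)
    {
      seen = demo_wait_change (&d, 0);
      pthread_join (thread, NULL);
    }

  pthread_cond_destroy (&d.cond);
  pthread_mutex_destroy (&d.mutex);
  return seen;
}
```

Because the waiter rechecks the index while holding the mutex, an update made
before it starts waiting is still observed.  The sketch takes the mutex on
every update, which the windowed design in the patch avoids on its fast path.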
Index: libgomp/Makefile.am
===================================================================
--- libgomp/Makefile.am	(revision 136948)
+++ libgomp/Makefile.am	(working copy)
@@ -32,7 +32,7 @@
 libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
 	iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c single.c \
 	task.c team.c work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c \
-	time.c fortran.c affinity.c
+	time.c fortran.c affinity.c stream.c
 
 nodist_noinst_HEADERS = libgomp_f.h
 nodist_libsubinclude_HEADERS = omp.h
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def	(revision 136948)
+++ gcc/builtin-types.def	(working copy)
@@ -218,6 +218,7 @@
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_PTR, BT_BOOL, BT_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -381,6 +382,9 @@
 		     BT_PTR, BT_UINT)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_INT_SIZE, BT_PTR,
 		     BT_CONST_PTR, BT_INT, BT_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_PTR_SIZE, BT_VOID, BT_PTR, BT_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_SIZE, 
+		     BT_PTR, BT_SIZE, BT_SIZE, BT_SIZE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 		     BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
Index: gcc/fortran/types.def
===================================================================
--- gcc/fortran/types.def	(revision 136948)
+++ gcc/fortran/types.def	(working copy)
@@ -85,6 +85,7 @@
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_PTR, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_PTR, BT_BOOL, BT_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -98,6 +99,7 @@
 DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_I8, BT_I8, BT_VOLATILE_PTR, BT_I8)
 DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_I16, BT_I16, BT_VOLATILE_PTR, BT_I16)
 DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_PTR, BT_VOID, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_SIZE, BT_VOID, BT_PTR, BT_INT)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
 
@@ -119,6 +121,9 @@
 		     BT_I16, BT_I16)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_OMPFN_PTR_UINT, BT_VOID, BT_PTR_FN_VOID_PTR,
                      BT_PTR, BT_UINT)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_PTR_SIZE, BT_VOID, BT_PTR, BT_PTR, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_SIZE,
+		     BT_PTR, BT_INT, BT_INT, BT_INT)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_VOID_OMPFN_PTR_UINT_UINT,
                      BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT)
Index: gcc/omp-builtins.def
===================================================================
--- gcc/omp-builtins.def	(revision 136948)
+++ gcc/omp-builtins.def	(working copy)
@@ -206,3 +206,22 @@
 		  BT_FN_PTR, ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_END, "GOMP_single_copy_end",
 		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_CREATE, "GOMP_stream_create",
+		  BT_FN_PTR_SIZE_SIZE_SIZE, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_PUSH, "GOMP_stream_push",
+		  BT_FN_VOID_PTR_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_HEAD, "GOMP_stream_head",
+		  BT_FN_PTR_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_POP, "GOMP_stream_pop",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_EOS_P, "GOMP_stream_eos_p",
+		  BT_FN_BOOL_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_SET_EOS, "GOMP_stream_set_eos",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_DESTROY, "GOMP_stream_destroy",
+		  BT_FN_VOID_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_ALIGN_PUSH, "GOMP_stream_align_push",
+		  BT_FN_VOID_PTR_PTR_SIZE, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_STREAM_ALIGN_POP, "GOMP_stream_align_pop",
+		  BT_FN_VOID_PTR_SIZE, ATTR_NOTHROW_LIST)

Thread overview: 8+ messages
2008-04-25 11:24 [PATCH] [libgomp] Add a stream communication framework to GOMP Antoniu Pop
2008-04-25 11:27 ` Jakub Jelinek
2008-04-30 19:23   ` Sebastian Pop
2008-06-02 23:06     ` Antoniu Pop
2008-06-04  5:18       ` Antoniu Pop
2008-06-04  7:00       ` Jakub Jelinek
2008-06-04 22:49         ` Antoniu Pop
2008-07-04 16:26           ` Antoniu Pop
