public inbox for gcc-patches@gcc.gnu.org
* [PATCH] New version of libmpx with new memmove wrapper
@ 2015-11-05 10:38 Aleksandra Tsvetkova
  2015-11-12 14:57 ` Ilya Enkovich
  0 siblings, 1 reply; 14+ messages in thread
From: Aleksandra Tsvetkova @ 2015-11-05 10:38 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 358 bytes --]

This patch adds a new version of libmpx. A new function, get_bd(),
returns the address of the bounds directory. The memmove wrapper has
been rewritten: it now moves the data first and then moves the
corresponding bounds directly from one bounds table to another. This
makes it possible to move unaligned pointers, and it also makes
memmove faster for sizes larger than 64 bytes.
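
To illustrate the addressing that moving bounds between tables relies on,
here is a minimal sketch, assuming the x86_64 layout from mpxrt.h in this
patch (3 ignored low bits, 17 bounds table index bits, 28 bounds directory
index bits). The sketch_* names are hypothetical and are not part of
libmpx; the wrapper itself extracts the same fields through a bit-field
union rather than shifts.

```c
#include <assert.h>
#include <stdint.h>

/* x86_64 values of NUM_IGN_BITS, NUM_L2_BITS and NUM_L1_BITS as they
   appear in mpxrt.h; the SKETCH_ prefix marks these copies as
   illustrative only.  */
#define SKETCH_IGN_BITS 3
#define SKETCH_L2_BITS 17
#define SKETCH_L1_BITS 28

/* Index of the bounds table entry that covers ADDR.  */
static inline uint64_t
sketch_bt_index (uint64_t addr)
{
  return (addr >> SKETCH_IGN_BITS) & ((UINT64_C (1) << SKETCH_L2_BITS) - 1);
}

/* Index of the bounds directory entry that covers ADDR.  */
static inline uint64_t
sketch_bd_index (uint64_t addr)
{
  return (addr >> (SKETCH_IGN_BITS + SKETCH_L2_BITS))
         & ((UINT64_C (1) << SKETCH_L1_BITS) - 1);
}

/* Number of bounds table entries touched by the N bytes starting at
   ADDR.  N must be nonzero.  */
static inline uint64_t
sketch_entries_spanned (uint64_t addr, uint64_t n)
{
  assert (n != 0);
  uint64_t first = addr >> SKETCH_IGN_BITS;
  uint64_t last = (addr + n - 1) >> SKETCH_IGN_BITS;
  return last - first + 1;
}
```

Because each entry covers one pointer-sized slot, a copy that starts at an
unaligned address can touch one more entry than its size alone suggests,
which is why the entry count is derived from the first and last byte
addresses.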

[-- Attachment #2: patch_v3.diff --]
[-- Type: text/plain, Size: 31557 bytes --]

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
old mode 100644
new mode 100755
index e693063..3026a05
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
+
+	* gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.
+
 2015-10-27  Alan Lawrence  <alan.lawrence@arm.com>
 
 	* gcc.dg/vect/vect-strided-shift-1.c: New.
diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove.c b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
new file mode 100755
index 0000000..3e2e61c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
@@ -0,0 +1,118 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt_index are bt indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt_index_end is the bt index of the last element of src.
+   pointers_inside determines whether the array being copied contains pointers.
+   src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+      int bd_index_end, int src_bt_index_end, int pointers_inside,
+      int src_align, int dst_align)
+{
+  const int n =
+    src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+    {
+      return 0;
+    }
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+    src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+    dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += dst_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+    }
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        {
+          if (dst[i] != arr + i)
+            abort ();
+          if (__bnd_get_ptr_lbound (dst[i]) != arr + i)
+            abort ();
+          if (__bnd_get_ptr_ubound (dst[i]) != arr + 2 * i)
+            abort ();
+        }
+    }
+  free (arr);
+  return 0;
+}
+
+/* Call testall to test common cases of memmove for MPX.  */
+void
+testall ()
+{
+  int align[3];
+  align[0] = 0;
+  align[1] = 1;
+  align[2] = 7;
+  test (0, 1, 2, 0, 2, 1, 0, 0);
+  for (int pointers_inside = 0; pointers_inside < 2; pointers_inside++)
+    for (int src_bigger_dst = 0; src_bigger_dst < 2; src_bigger_dst++)
+      for (int src_align = 0; src_align < 3; src_align ++)
+        for (int dst_align = 0; dst_align < 3; dst_align ++)
+          for (int pages = 0; pages < 4; pages++)
+            {
+              test (src_bigger_dst, 1, 2, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, 2, pages, 2, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 3, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, bt_num_of_elems - 2, pages, 2,
+                    pointers_inside, align[src_align], align[dst_align]);
+            }
+}
+
+int
+main ()
+{
+  testall ();
+  return 0;
+}
diff --git a/libmpx/ChangeLog b/libmpx/ChangeLog
old mode 100644
new mode 100755
index 5e5f77d..c186378
--- a/libmpx/ChangeLog
+++ b/libmpx/ChangeLog
@@ -1,3 +1,21 @@
+2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
+
+	* mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
+	* libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise.
+	* libmpx/Makefile.in: Regenerate.
+	* mpxrt/Makefile.in: Regenerate.
+	* libmpxwrap/Makefile.in: Regenerate.
+	* mpxrt/libtool-version: New version.
+	* libmpxwrap/libtool-version: Likewise.
+	* mpxrt/libmpx.map: Add new version and a new symbol.
+	* mpxrt/mpxrt.h: New file.
+	* mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
+	* mpxrt/mpxrt.c (REG_IP_IDX): Moved to mpxrt.h.
+	* mpxrt/mpxrt.c (REX_PREFIX): Moved to mpxrt.h.
+	* mpxrt/mpxrt.c (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
+	* mpxrt/mpxrt.c (MPX_L1_SIZE): Moved to mpxrt.h.
+	* libmpxwrap/mpx_wrappers.c: Rewrite __mpx_wrapper_memmove to make it faster.
+
 2015-10-15  Ilya Enkovich  <enkovich.gnu@gmail.com>
 
 	PR other/66887
diff --git a/libmpx/Makefile.in b/libmpx/Makefile.in
index ff36a7f..d644af3 100644
--- a/libmpx/Makefile.in
+++ b/libmpx/Makefile.in
@@ -228,7 +228,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
diff --git a/libmpx/mpxrt/Makefile.am b/libmpx/mpxrt/Makefile.am
old mode 100644
new mode 100755
index a00a808..3280b62
--- a/libmpx/mpxrt/Makefile.am
+++ b/libmpx/mpxrt/Makefile.am
@@ -13,7 +13,8 @@ libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 
 libmpx_la_CFLAGS = -fPIC
 libmpx_la_DEPENDENCIES = libmpx.map
-libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 mpxrt.lo: mpxrt-utils.h
 mpxrt-utils.lo: mpxrt-utils.h
diff --git a/libmpx/mpxrt/Makefile.in b/libmpx/mpxrt/Makefile.in
index 646f3a9..1fdb454 100644
--- a/libmpx/mpxrt/Makefile.in
+++ b/libmpx/mpxrt/Makefile.in
@@ -222,7 +222,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -257,7 +256,9 @@ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_CFLAGS = -fPIC
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_DEPENDENCIES = libmpx.map
-@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+@LIBMPX_SUPPORTED_TRUE@                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
diff --git a/libmpx/mpxrt/libmpx.map b/libmpx/mpxrt/libmpx.map
index 90093b7..1f0fc2c 100644
--- a/libmpx/mpxrt/libmpx.map
+++ b/libmpx/mpxrt/libmpx.map
@@ -3,3 +3,8 @@ LIBMPX_1.0
   local:
 	*;
 };
+LIBMPX_2.0
+{
+  global:
+    get_bd;
+} LIBMPX_1.0;
diff --git a/libmpx/mpxrt/libtool-version b/libmpx/mpxrt/libtool-version
index 5aa6ed7..7d99255 100644
--- a/libmpx/mpxrt/libtool-version
+++ b/libmpx/mpxrt/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
old mode 100644
new mode 100755
index c29c5d9..bcdd3a6
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -51,34 +51,11 @@
 #include <sys/prctl.h>
 #include <cpuid.h>
 #include "mpxrt-utils.h"
-
-#ifdef __i386__
-
-/* i386 directory size is 4MB */
-#define NUM_L1_BITS    20
-
-#define REG_IP_IDX      REG_EIP
-#define REX_PREFIX
-
-#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
-
-#else /* __i386__ */
-
-/* x86_64 directory size is 2GB */
-#define NUM_L1_BITS   28
-
-#define REG_IP_IDX    REG_RIP
-#define REX_PREFIX    "0x48, "
-
-#define XSAVE_OFFSET_IN_FPMEM    0
-
-#endif /* !__i386__ */
+#include "mpxrt.h"
 
 #define MPX_ENABLE_BIT_NO 0
 #define BNDPRESERVE_BIT_NO 1
 
-const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
-
 struct xsave_hdr_struct
 {
   uint64_t xstate_bv;
@@ -508,3 +485,10 @@ mpxrt_cleanup (void)
   __mpxrt_utils_free ();
   process_specific_finish ();
 }
+
+/* Get address of bounds directory.  */
+void *
+get_bd ()
+{
+  return l1base;
+}
diff --git a/libmpx/mpxrt/mpxrt.h b/libmpx/mpxrt/mpxrt.h
new file mode 100755
index 0000000..1a86888
--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2014, 2015, Intel Corporation
+ *  All rights reserved.
+ *
+ *  @copyright
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ *    * Redistributions of source code must retain the above copyright
+ *      notice, this list of conditions and the following disclaimer.
+ *    * Redistributions in binary form must reproduce the above copyright
+ *      notice, this list of conditions and the following disclaimer in
+ *      the documentation and/or other materials provided with the
+ *      distribution.
+ *    * Neither the name of Intel Corporation nor the names of its
+ *      contributors may be used to endorse or promote products derived
+ *      from this software without specific prior written permission.
+ *
+ *  @copyright
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+ *  HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ *  OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ *  AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
+ *  WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ *  POSSIBILITY OF SUCH DAMAGE.
+ *
+ **************************************************************************/
+#ifdef __i386__
+
+/* i386 directory size is 4MB.  */
+#define NUM_L1_BITS 20
+#define NUM_L2_BITS 10
+#define NUM_IGN_BITS 2
+const uintptr_t MPX_L1_ADDR_MASK = 0xfffff000UL;
+const uintptr_t MPX_L2_ADDR_MASK = 0xfffffffcUL;
+const uintptr_t MPX_L2_VALID_MASK = 0x00000001UL;
+
+#define REG_IP_IDX      REG_EIP
+#define REX_PREFIX
+
+#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
+
+#else /* __i386__ */
+
+/* x86_64 directory size is 2GB.  */
+#define NUM_L1_BITS 28
+#define NUM_L2_BITS 17
+#define NUM_IGN_BITS 3
+const uintptr_t MPX_L1_ADDR_MASK = 0xfffffffffffff000ULL;
+const uintptr_t MPX_L2_ADDR_MASK = 0xfffffffffffffff8ULL;
+const uintptr_t MPX_L2_VALID_MASK = 0x0000000000000001ULL;
+
+#define REG_IP_IDX    REG_RIP
+#define REX_PREFIX    "0x48, "
+
+#define XSAVE_OFFSET_IN_FPMEM 0
+
+#endif /* !__i386__ */
+
+const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
+
+/* Get address of bounds directory.  */
+void *
+get_bd ();
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
old mode 100644
new mode 100755
index 72abccf..f24cdc8
--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -6,7 +7,8 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 
diff --git a/libmpx/mpxwrap/Makefile.in b/libmpx/mpxwrap/Makefile.in
index 1612ebf..df1a334 100644
--- a/libmpx/mpxwrap/Makefile.in
+++ b/libmpx/mpxwrap/Makefile.in
@@ -221,7 +221,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -247,6 +246,7 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -254,7 +254,9 @@ libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 libmpxwrappers_la_SOURCES = mpx_wrappers.c
 
diff --git a/libmpx/mpxwrap/libtool-version b/libmpx/mpxwrap/libtool-version
old mode 100644
new mode 100755
index bfe84c8..fab30fb
--- a/libmpx/mpxwrap/libtool-version
+++ b/libmpx/mpxwrap/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxwrap/mpx_wrappers.c b/libmpx/mpxwrap/mpx_wrappers.c
old mode 100644
new mode 100755
index 58670aa..9965e06
--- a/libmpx/mpxwrap/mpx_wrappers.c
+++ b/libmpx/mpxwrap/mpx_wrappers.c
@@ -26,6 +26,8 @@
 #include "stdlib.h"
 #include "string.h"
 #include <sys/mman.h>
+#include <stdint.h>
+#include "mpxrt/mpxrt.h"
 
 void *
 __mpx_wrapper_malloc (size_t size)
@@ -88,75 +90,403 @@ __mpx_wrapper_bzero (void *dst, size_t len)
   __mpx_wrapper_memset (dst, 0, len);
 }
 
-void *
-__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+/* The mpx_pointer type is used for getting bits
+   for bt_index (index in bounds table) and
+   bd_index (index in bounds directory).  */
+typedef union
+{
+  struct
+  {
+    unsigned long ignored:NUM_IGN_BITS;
+    unsigned long l2entry:NUM_L2_BITS;
+    unsigned long l1index:NUM_L1_BITS;
+  };
+  void *pointer;
+} mpx_pointer;
+
+/* The mpx_bt_entry struct represents a cell in bounds table.
+   *lb is the lower bound, *ub is the upper bound,
+   *p is the stored pointer.  */
+struct mpx_bt_entry
 {
-  const char *s = (const char*)src;
-  char *d = (char*)dst;
-  void *ret = dst;
-  size_t offset_src = ((size_t) s) & (sizeof (void *) - 1);
-  size_t offset_dst = ((size_t) d) & (sizeof (void *) - 1);
+  void *lb;
+  void *ub;
+  void *p;
+  void *reserved;
+};
+
+/* Function alloc_bt is used for allocating bounds table
+   for the destination pointers if we don't have one.
+   We generate a bounds store for some pointer belonging
+   to that table and kernel allocates the table for us.  */
+static inline void
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}
 
-  if (n == 0)
-    return ret;
+/* get_bt returns address of bounds table that should
+   exist at BD[BD_INDEX].  If there is no address or the address is not valid,
+   we try to allocate a valid table.
+   If we succeed in getting bt, its address will be returned.
+   If we can't get a valid bt, NULL will be returned.  */
+__attribute__ ((bnd_legacy)) static inline struct mpx_bt_entry *
+get_bt (unsigned bd_index, struct mpx_bt_entry **bd)
+{
+  struct mpx_bt_entry *bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+                            & MPX_L2_ADDR_MASK);
+  if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    {
+      mpx_pointer ptr;
+      ptr.l1index = bd_index;
+      /* If we don't have BT, allocate it.  */
+      alloc_bt (ptr.pointer);
+      bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+            & MPX_L2_ADDR_MASK);
+      if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    return NULL;
+    }
+  return bt;
+}
 
-  __bnd_chk_ptr_bounds (dst, n);
-  __bnd_chk_ptr_bounds (src, n);
+/* Function copy_if_possible moves elements from *FROM to *TO.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   it copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible (int elems, int elems_to_copy, struct mpx_bt_entry *from,
+                  struct mpx_bt_entry *to)
+{
+  if (elems < elems_to_copy)
+    memmove (to, from, elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (to, from, elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
 
-  /* Different alignment means that even if
-     pointers exist in memory, we don't how
-     pointers are aligned and therefore cann't
-     copy bounds anyway.  */
-  if (offset_src != offset_dst)
-    memmove (dst, src, n);
+/* Function copy_if_possible_from_end moves elements ending at *SRC_END
+   to the place where they will end at *DST_END.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   function copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible_from_end (int elems, int elems_to_copy, struct mpx_bt_entry
+                           *src_end, struct mpx_bt_entry *dst_end)
+{
+  if (elems < elems_to_copy)
+    memmove (dst_end - elems, src_end - elems,
+             elems * sizeof (struct mpx_bt_entry));
   else
     {
-      if (s < d)
-	{
-	  d += n;
-	  s += n;
-	  offset_src = (offset_src + n) & (sizeof (void *) -1);
-	  while (n-- && offset_src--)
-	    *--d = *--s;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *--d1 = *--s1;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *--d = *--s;
-	}
+      memmove (dst_end - elems_to_copy,
+           src_end - elems_to_copy,
+           elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
+
+/* move_bounds function copies N bytes from SRC to DST.
+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table (SRC).
+   2) In a loop we copy whole bounds tables up to the second-to-last one.
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of the bounds tables in SRC doesn't exist,
+   we skip it because there are no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void *
+move_bounds (void *dst, const void *src, size_t n)
+{
+  if ((n < sizeof (void *)) || !((char*)src-(char*)dst))
+  /* Not enough bytes to hold a pointer, or SRC and DST coincide;
+     there are no bounds to move.  */
+    return dst;
+  struct mpx_bt_entry **bd = get_bd ();
+  if (!(bd))
+    return dst;
+
+  /* We get indexes for all tables and number of elements for BT.  */
+  unsigned long bt_num_of_elems = (1UL << NUM_L2_BITS);
+  mpx_pointer addr_src, addr_dst, addr_src_end, addr_dst_end;
+  addr_src.pointer = (char *) src;
+  addr_dst.pointer = (char *) dst;
+  addr_src_end.pointer = (char *) src + n - 1;
+  addr_dst_end.pointer = (char *) dst + n - 1;
+  unsigned dst_bd_index = addr_dst.l1index;
+  unsigned src_bd_index = addr_src.l1index;
+  unsigned dst_bt_index = addr_dst.l2entry;
+  unsigned src_bt_index = addr_src.l2entry;
+
+  unsigned dst_bd_index_end = addr_dst_end.l1index;
+  unsigned src_bd_index_end = addr_src_end.l1index;
+  unsigned dst_bt_index_end = addr_dst_end.l2entry;
+  unsigned src_bt_index_end = addr_src_end.l2entry;
+
+  int elems_to_copy = src_bt_index_end - src_bt_index + 1 + (src_bd_index_end
+                      - src_bd_index) * bt_num_of_elems;
+  struct mpx_bt_entry *bt_src, *bt_dst;
+  uintptr_t bt_valid;
+  /* size1 and size2 will be used to find out what portions
+     can be used to copy data.  */
+  int size1_elem, size2_elem, size1_bytes, size2_bytes;
+
+  /* Copy from the beginning.  */
+  if (((char *) src - (char *) dst) > 0)
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+      /* We can copy the whole preliminary piece of data.  */
+      if (src_bt_index > dst_bt_index)
+        {
+          size1_elem = src_bt_index - dst_bt_index;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return dst;
+              if (copy_if_possible (bt_num_of_elems - src_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return dst;
+            }
+          elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      /* We have to copy preliminary data in two parts.  */
+      else
+        {
+          size2_elem = dst_bt_index - src_bt_index;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return dst;
+
+              if (copy_if_possible (bt_num_of_elems - dst_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return dst;
+              elems_to_copy -= bt_num_of_elems - dst_bt_index;
+
+              dst_bd_index++;
+
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return dst;
+              if (copy_if_possible (size2_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[0])))
+                return dst;
+              elems_to_copy -= size2_elem;
+            }
+          else
+            elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      src_bd_index++;
+
+      /* For each bounds table check if it's valid and move it.  */
+      for (; src_bd_index < src_bd_index_end; src_bd_index++)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index++;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return dst;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return dst;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return dst;
+
+              if (copy_if_possible (size1_elem, elems_to_copy, &(bt_src[0]),
+                  &(bt_dst[size2_elem])))
+                return dst;
+
+              elems_to_copy -= size1_elem;
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return dst;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+
+            }
+        }
+    }
+  /* Copy from the end.  */
+  else
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+
+      if (src_bt_index_end <= dst_bt_index_end)
+      /* We can copy the whole preliminary piece of data.  */
+        {
+          size2_elem = dst_bt_index_end - src_bt_index_end;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return dst;
+
+              if (copy_if_possible_from_end (src_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return dst;
+            }
+          elems_to_copy -= src_bt_index_end + 1;
+        }
+      /* We have to copy preliminary data in two parts.  */
       else
-	{
-	  offset_src = sizeof (void *) - offset_src;
-	  while (n-- && offset_src--)
-	    *d++ = *s++;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *d1++ = *s1++;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *d++ = *s++;
-	}
+        {
+          size1_elem = src_bt_index_end - dst_bt_index_end;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return dst;
+              if (copy_if_possible_from_end (dst_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return dst;
+              elems_to_copy -= dst_bt_index_end + 1;
+
+              dst_bd_index_end--;
+
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return dst;
+              if (copy_if_possible_from_end (size1_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[bt_num_of_elems])))
+                return dst;
+
+              elems_to_copy -= size1_elem;
+            }
+          else
+            elems_to_copy -= src_bt_index_end + 1;
+        }
+      src_bd_index_end--;
+      /* For each bounds table we check if there are valid pointers inside.
+         If there are some, we copy table in pre-counted portions.  */
+      for (; src_bd_index_end > src_bd_index; src_bd_index_end--)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index_end--;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return dst;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return dst;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+          {
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return dst;
+            if (copy_if_possible_from_end (size2_elem, elems_to_copy,
+                &(bt_src[bt_num_of_elems]), &(bt_dst[size2_elem])))
+              return dst;
+
+            elems_to_copy -= size2_elem;
+            dst_bd_index_end--;
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return dst;
+            memmove (&(bt_dst[dst_bt_index]), &(bt_src[src_bt_index]),
+                     elems_to_copy * sizeof (struct mpx_bt_entry));
+          }
+        }
     }
-  return ret;
+  return dst;
+}
+
+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  move_bounds (dst, src, n);
+  return dst;
 }
 
 


* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-11-05 10:38 [PATCH] New version of libmpx with new memmove wrapper Aleksandra Tsvetkova
@ 2015-11-12 14:57 ` Ilya Enkovich
  2015-11-23 19:51   ` Aleksandra Tsvetkova
  0 siblings, 1 reply; 14+ messages in thread
From: Ilya Enkovich @ 2015-11-12 14:57 UTC (permalink / raw)
  To: Aleksandra Tsvetkova; +Cc: gcc-patches

2015-11-05 13:37 GMT+03:00 Aleksandra Tsvetkova <astsvetk@gmail.com>:
> New version of libmpx was added. There is a new function get_bd() that
> allows to get bounds directory. Wrapper for memmove was modified. Now
> it moves data and then moves corresponding bounds directly from one
> bounds table to another. This approach made moving unaligned pointers
> possible. It also makes memmove function faster on sizes bigger than
> 64 bytes.

+2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
+
+ * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.
+

Did you test it on different targets? It seems to me this test will
fail if you run it on a non-MPX target.  Please look at mpx-check.h
and at how other MPX run tests use it.

+ * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (REG_IP_IDX): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (REX_PREFIX): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (MPX_L1_SIZE): Moved to mpxrt.h.

No need to repeat the file name.

+ * libmpxwrap/mpx_wrappers.c: Rewrite __mpx_wrapper_memmove to make it faster.

You added new functions and types and modified an existing function.
Make the ChangeLog entry more detailed.

--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2014, 2015, Intel Corporation
+ *  All rights reserved.

2015 only

+const uintptr_t MPX_L1_ADDR_MASK = 0xfffff000UL;
+const uintptr_t MPX_L2_ADDR_MASK = 0xfffffffcUL;
+const uintptr_t MPX_L2_VALID_MASK = 0x00000001UL;

Use defines


--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)

This is not reflected in the ChangeLog.

+/* The mpx_bt_entry struct represents a cell in bounds table.
+   *lb is the lower bound, *ub is the upper bound,
+   *p is the stored pointer.  */

Bounds and pointer are in lb, ub, p, not in *lb, *ub, *p. Right?
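
A minimal sketch of the layout in question, for illustration only.  The
struct mirrors mpx_bt_entry from the patch; entry_covers is a hypothetical
helper that is not part of libmpx, and it assumes the bounds are stored as
plain inclusive addresses, which may differ from how the hardware encodes
them in a real bounds table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One cell of a bounds table, as in the patch.  The fields hold the
   bound and pointer values themselves; nothing here is dereferenced.  */
struct mpx_bt_entry
{
  void *lb;        /* Lower bound.  */
  void *ub;        /* Upper bound.  */
  void *p;         /* Pointer value the bounds belong to.  */
  void *reserved;
};

/* An entry is four pointer-sized fields.  */
_Static_assert (sizeof (struct mpx_bt_entry) == 4 * sizeof (void *),
                "unexpected padding in mpx_bt_entry");

/* Hypothetical helper: return nonzero if ADDR lies within [lb, ub].
   Assumes plain inclusive addresses in both fields.  */
static int
entry_covers (const struct mpx_bt_entry *e, const void *addr)
{
  assert (e != NULL);
  uintptr_t a = (uintptr_t) addr;
  return a >= (uintptr_t) e->lb && a <= (uintptr_t) e->ub;
}
```

Here lb, ub and p are read directly, which is why the comment should name
them without the leading asterisk.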

+static inline void
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}

This should be marked as bnd_legacy.

+/* move_bounds function copies N bytes from SRC to DST.

Really?

+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table SRC)

SRC is not a bounds table

+   2) In loop we copy whole bound tables till the second-last one
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of bound tables in SRC doesn't exist,
+   we skip it because there are no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void *
+move_bounds (void *dst, const void *src, size_t n)

What is the returned value for?
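
For reference, the index arithmetic behind the three parts listed in the
quoted comment can be sketched as below.  This is only an illustration:
the constants are the x86_64 values from mpxrt.h in the patch, the shifts
and masks are assumed to be equivalent to the mpx_pointer bit-field union,
and the helper names bt_index, bd_index and elems_covered are hypothetical.
The element count follows the same formula as elems_to_copy in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* x86_64 values from mpxrt.h in the patch.  */
#define NUM_L1_BITS 28
#define NUM_L2_BITS 17
#define NUM_IGN_BITS 3

/* Index of the entry inside its bounds table.  The low NUM_IGN_BITS
   bits of the address are ignored.  */
static uint64_t
bt_index (uint64_t addr)
{
  return (addr >> NUM_IGN_BITS) & ((UINT64_C (1) << NUM_L2_BITS) - 1);
}

/* Index of the bounds table inside the bounds directory.  */
static uint64_t
bd_index (uint64_t addr)
{
  return (addr >> (NUM_IGN_BITS + NUM_L2_BITS))
         & ((UINT64_C (1) << NUM_L1_BITS) - 1);
}

/* Number of bounds-table entries touched by N bytes starting at SRC.  */
static int64_t
elems_covered (uint64_t src, uint64_t n)
{
  assert (n > 0);
  uint64_t end = src + n - 1;
  return (int64_t) bt_index (end) - (int64_t) bt_index (src) + 1
         + ((int64_t) bd_index (end) - (int64_t) bd_index (src))
           * (INT64_C (1) << NUM_L2_BITS);
}
```

With these values one bounds table covers 1MB of address space, so a
16-byte range starting at 0xFFFF8 touches the last entry of one table and
the first entry of the next.  An 8-byte range starting at the unaligned
address 0x100001 also touches two entries.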

+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  move_bounds (dst, src, n);
+  return dst;
 }

You completely remove old algorithm which should be faster on small
sizes. __mpx_wrapper_memmove should become a dispatcher between old
and new implementations depending on target (32-bit or 64-bit) and N.
Since old version performs both data and bounds copy, BD check should
be moved into __mpx_wrapper_memmove to never call
it when MPX is disabled.
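
A rough sketch of the dispatcher shape suggested above, under stated
assumptions.  The names choose_move_path, dispatch_memmove and
SMALL_MOVE_LIMIT are hypothetical; the limit of 64 is taken only from the
patch description ("faster on sizes bigger than 64 bytes") and a real
cutoff would have to be measured per target.  All three paths use plain
memmove as a stand-in, so the sketch shows the selection logic only, not
the bounds handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff between the old and the new algorithm.  */
#define SMALL_MOVE_LIMIT 64

enum move_path { MOVE_DATA_ONLY, MOVE_OLD, MOVE_NEW };

/* Pick a path.  BD is the bounds directory address, NULL when MPX is
   disabled, so the check happens once, before either algorithm runs.  */
static enum move_path
choose_move_path (const void *bd, size_t n)
{
  if (bd == NULL)
    return MOVE_DATA_ONLY;
  return n <= SMALL_MOVE_LIMIT ? MOVE_OLD : MOVE_NEW;
}

static void *
dispatch_memmove (void *dst, const void *src, size_t n, const void *bd)
{
  assert (n == 0 || (dst != NULL && src != NULL));
  switch (choose_move_path (bd, n))
    {
    case MOVE_OLD:
      /* Stand-in for the old pointer-sized copy loop, which moves
         data and bounds together.  */
      return memmove (dst, src, n);
    case MOVE_NEW:
      /* Stand-in for memmove followed by move_bounds.  */
      return memmove (dst, src, n);
    default:
      /* MPX disabled: data only, no bounds work at all.  */
      return memmove (dst, src, n);
    }
}
```

Whether mixing the two algorithms is acceptable depends on them treating
unaligned pointers the same way, which this sketch does not address.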

Thanks,
Ilya


* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-11-12 14:57 ` Ilya Enkovich
@ 2015-11-23 19:51   ` Aleksandra Tsvetkova
  2015-11-24 12:35     ` Ilya Enkovich
  0 siblings, 1 reply; 14+ messages in thread
From: Aleksandra Tsvetkova @ 2015-11-23 19:51 UTC (permalink / raw)
  To: Ilya Enkovich; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2478 bytes --]

gcc/testsuite/ChangeLog
+2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
+
+ * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.

libmpx/ChangeLog
+2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
+
+ * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
+ * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
+ * libmpx/Makefile.in: Regenerate.
+ * mpxrt/Makefile.in: Regenerate.
+ * libmpxwrap/Makefile.in: Regenerate.
+ * mpxrt/libtool-version: New version.
+ * libmpxwrap/libtool-version: Likewise.
+ * mpxrt/libmpx.map: Add new version and a new symbol.
+ * mpxrt/mpxrt.h: New file.
+ * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
+                (REG_IP_IDX): Moved to mpxrt.h.
+                (REX_PREFIX): Moved to mpxrt.h.
+                (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
+                (MPX_L1_SIZE): Moved to mpxrt.h.
+ * libmpxwrap/mpx_wrappers.c: Rewrite __mpx_wrapper_memmove
+ to make it faster.
+ New types: mpx_pointer for extraction of indexes from pointer
+           mpx_bt_entry represents a cell in bounds table.
+ New functions: alloc_bt for allocating bounds table
+               get_bt to get address of bounds table
+   copy_if_possible and copy_if_possible_from_end move elements
+   of bounds table if we can
+   move_bounds moves bounds just like memmove


All fixed except for:

>>+static inline void
>>+alloc_bt (void *ptr)
>>+{
>>+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
>>+}
>
>This should be marked as bnd_legacy.

It will not work.

> +void *
> +__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
> +{
> +  if (n == 0)
> +    return dst;
> +
> +  __bnd_chk_ptr_bounds (dst, n);
> +  __bnd_chk_ptr_bounds (src, n);
> +
> +  memmove (dst, src, n);
> +  move_bounds (dst, src, n);
> +  return dst;
>  }
>
> You completely remove old algorithm which should be faster on small
> sizes. __mpx_wrapper_memmove should become a dispatcher between old
> and new implementations depending on target (32-bit or 64-bit) and N.
> Since old version performs both data and bounds copy, BD check should
> be moved into __mpx_wrapper_memmove to never call
> it when MPX is disabled.

Even though the old algorithm is faster on small sizes, it should not be used
with the new one because the new one supports unaligned pointers and the
old one does not. Different behavior may cause more problems.

Thanks,
Alexandra.

[-- Attachment #2: patch_v4.diff --]
[-- Type: text/plain, Size: 29939 bytes --]

diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove.c b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
new file mode 100755
index 0000000..57030a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
@@ -0,0 +1,119 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt_index are bt indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt_index_end is the bt index of the last element of src.
+   pointers_inside determines whether the array being copied includes
+   pointers.  src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+      int bd_index_end, int src_bt_index_end, int pointers_inside,
+      int src_align, int dst_align)
+{
+  const int n =
+    src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+    {
+      return 0;
+    }
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+    src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+    dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += dst_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+    }
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        {
+          if (dst[i] != arr + i)
+            abort ();
+          if (__bnd_get_ptr_lbound (dst[i]) != arr + i)
+            abort ();
+          if (__bnd_get_ptr_ubound (dst[i]) != arr + 2 * i)
+            abort ();
+        }
+    }
+  free (arr);
+  return 0;
+}
+
+/* Call testall to test common cases of memmove for MPX.  */
+void
+testall ()
+{
+  int align[3];
+  align[0] = 0;
+  align[1] = 1;
+  align[2] = 7;
+  for (int pointers_inside = 0; pointers_inside < 2; pointers_inside++)
+    for (int src_bigger_dst = 0; src_bigger_dst < 2; src_bigger_dst++)
+      for (int src_align = 0; src_align < 3; src_align ++)
+        for (int dst_align = 0; dst_align < 3; dst_align ++)
+          for (int pages = 0; pages < 4; pages++)
+            {
+              test (src_bigger_dst, 1, 2, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, 2, pages, 2, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 3, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, bt_num_of_elems - 2, pages, 2,
+                    pointers_inside, align[src_align], align[dst_align]);
+            }
+}
+
+int
+mpx_test (int argc, const char **argv)
+{
+  testall ();
+  return 0;
+}
diff --git a/libmpx/Makefile.in b/libmpx/Makefile.in
index ff36a7f..d644af3 100644
--- a/libmpx/Makefile.in
+++ b/libmpx/Makefile.in
@@ -228,7 +228,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
diff --git a/libmpx/mpxrt/Makefile.am b/libmpx/mpxrt/Makefile.am
old mode 100644
new mode 100755
index a00a808..3280b62
--- a/libmpx/mpxrt/Makefile.am
+++ b/libmpx/mpxrt/Makefile.am
@@ -13,7 +13,8 @@ libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 
 libmpx_la_CFLAGS = -fPIC
 libmpx_la_DEPENDENCIES = libmpx.map
-libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 mpxrt.lo: mpxrt-utils.h
 mpxrt-utils.lo: mpxrt-utils.h
diff --git a/libmpx/mpxrt/Makefile.in b/libmpx/mpxrt/Makefile.in
index 646f3a9..1fdb454 100644
--- a/libmpx/mpxrt/Makefile.in
+++ b/libmpx/mpxrt/Makefile.in
@@ -222,7 +222,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -257,7 +256,9 @@ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_CFLAGS = -fPIC
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_DEPENDENCIES = libmpx.map
-@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+@LIBMPX_SUPPORTED_TRUE@                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
diff --git a/libmpx/mpxrt/libmpx.map b/libmpx/mpxrt/libmpx.map
index 90093b7..1f0fc2c 100644
--- a/libmpx/mpxrt/libmpx.map
+++ b/libmpx/mpxrt/libmpx.map
@@ -3,3 +3,8 @@ LIBMPX_1.0
   local:
 	*;
 };
+LIBMPX_2.0
+{
+  global:
+    get_bd;
+} LIBMPX_1.0;
diff --git a/libmpx/mpxrt/libtool-version b/libmpx/mpxrt/libtool-version
index 5aa6ed7..7d99255 100644
--- a/libmpx/mpxrt/libtool-version
+++ b/libmpx/mpxrt/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
old mode 100644
new mode 100755
index c29c5d9..bcdd3a6
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -51,34 +51,11 @@
 #include <sys/prctl.h>
 #include <cpuid.h>
 #include "mpxrt-utils.h"
-
-#ifdef __i386__
-
-/* i386 directory size is 4MB */
-#define NUM_L1_BITS    20
-
-#define REG_IP_IDX      REG_EIP
-#define REX_PREFIX
-
-#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
-
-#else /* __i386__ */
-
-/* x86_64 directory size is 2GB */
-#define NUM_L1_BITS   28
-
-#define REG_IP_IDX    REG_RIP
-#define REX_PREFIX    "0x48, "
-
-#define XSAVE_OFFSET_IN_FPMEM    0
-
-#endif /* !__i386__ */
+#include "mpxrt.h"
 
 #define MPX_ENABLE_BIT_NO 0
 #define BNDPRESERVE_BIT_NO 1
 
-const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
-
 struct xsave_hdr_struct
 {
   uint64_t xstate_bv;
@@ -508,3 +485,10 @@ mpxrt_cleanup (void)
   __mpxrt_utils_free ();
   process_specific_finish ();
 }
+
+/* Get address of bounds directory.  */
+void *
+get_bd ()
+{
+  return l1base;
+}
diff --git a/libmpx/mpxrt/mpxrt.h b/libmpx/mpxrt/mpxrt.h
new file mode 100755
index 0000000..48cc7ae
--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2015, Intel Corporation
+ *  All rights reserved.
+ *
+ *  @copyright
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ *    * Redistributions of source code must retain the above copyright
+ *      notice, this list of conditions and the following disclaimer.
+ *    * Redistributions in binary form must reproduce the above copyright
+ *      notice, this list of conditions and the following disclaimer in
+ *      the documentation and/or other materials provided with the
+ *      distribution.
+ *    * Neither the name of Intel Corporation nor the names of its
+ *      contributors may be used to endorse or promote products derived
+ *      from this software without specific prior written permission.
+ *
+ *  @copyright
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+ *  HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ *  OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ *  AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
+ *  WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ *  POSSIBILITY OF SUCH DAMAGE.
+ *
+ **************************************************************************/
+#ifdef __i386__
+
+/* i386 directory size is 4MB.  */
+#define NUM_L1_BITS 20
+#define NUM_L2_BITS 10
+#define NUM_IGN_BITS 2
+#define MPX_L1_ADDR_MASK  0xfffff000UL
+#define MPX_L2_ADDR_MASK  0xfffffffcUL
+#define MPX_L2_VALID_MASK 0x00000001UL
+
+#define REG_IP_IDX      REG_EIP
+#define REX_PREFIX
+
+#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
+
+#else /* __i386__ */
+
+/* x86_64 directory size is 2GB.  */
+#define NUM_L1_BITS 28
+#define NUM_L2_BITS 17
+#define NUM_IGN_BITS 3
+#define MPX_L1_ADDR_MASK  0xfffffffffffff000ULL
+#define MPX_L2_ADDR_MASK  0xfffffffffffffff8ULL
+#define MPX_L2_VALID_MASK 0x0000000000000001ULL
+
+#define REG_IP_IDX    REG_RIP
+#define REX_PREFIX    "0x48, "
+
+#define XSAVE_OFFSET_IN_FPMEM 0
+
+#endif /* !__i386__ */
+
+const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
+
+/* Get address of bounds directory.  */
+void *
+get_bd ();
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
old mode 100644
new mode 100755
index 72abccf..f24cdc8
--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -6,7 +7,8 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 
diff --git a/libmpx/mpxwrap/Makefile.in b/libmpx/mpxwrap/Makefile.in
index 1612ebf..df1a334 100644
--- a/libmpx/mpxwrap/Makefile.in
+++ b/libmpx/mpxwrap/Makefile.in
@@ -221,7 +221,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -247,6 +246,7 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -254,7 +254,9 @@ libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 libmpxwrappers_la_SOURCES = mpx_wrappers.c
 
diff --git a/libmpx/mpxwrap/libtool-version b/libmpx/mpxwrap/libtool-version
old mode 100644
new mode 100755
index bfe84c8..fab30fb
--- a/libmpx/mpxwrap/libtool-version
+++ b/libmpx/mpxwrap/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxwrap/mpx_wrappers.c b/libmpx/mpxwrap/mpx_wrappers.c
old mode 100644
new mode 100755
index 58670aa..34985d8
--- a/libmpx/mpxwrap/mpx_wrappers.c
+++ b/libmpx/mpxwrap/mpx_wrappers.c
@@ -26,6 +26,8 @@
 #include "stdlib.h"
 #include "string.h"
 #include <sys/mman.h>
+#include <stdint.h>
+#include "mpxrt/mpxrt.h"
 
 void *
 __mpx_wrapper_malloc (size_t size)
@@ -88,75 +90,404 @@ __mpx_wrapper_bzero (void *dst, size_t len)
   __mpx_wrapper_memset (dst, 0, len);
 }
 
-void *
-__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+/* The mpx_pointer type is used for getting bits
+   for bt_index (index in bounds table) and
+   bd_index (index in bounds directory).  */
+typedef union
+{
+  struct
+  {
+    unsigned long ignored:NUM_IGN_BITS;
+    unsigned long l2entry:NUM_L2_BITS;
+    unsigned long l1index:NUM_L1_BITS;
+  };
+  void *pointer;
+} mpx_pointer;
+
+/* The mpx_bt_entry struct represents a cell in bounds table.
+   lb is the lower bound, ub is the upper bound,
+   p is the stored pointer.  */
+struct mpx_bt_entry
 {
-  const char *s = (const char*)src;
-  char *d = (char*)dst;
-  void *ret = dst;
-  size_t offset_src = ((size_t) s) & (sizeof (void *) - 1);
-  size_t offset_dst = ((size_t) d) & (sizeof (void *) - 1);
+  void *lb;
+  void *ub;
+  void *p;
+  void *reserved;
+};
+
+/* Function alloc_bt is used for allocating bounds table
+   for the destination pointers if we don't have one.
+   We generate a bounds store for some pointer belonging
+   to that table and kernel allocates the table for us.  */
+static inline void
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}
 
-  if (n == 0)
-    return ret;
+/* get_bt returns address of bounds table that should
+   exist at BD[BD_INDEX].  If there is no address or the address is not valid,
+   we try to allocate a valid table.
+   If we succeed in getting bt, its address will be returned.
+   If we can't get a valid bt, NULL will be returned.  */
+__attribute__ ((bnd_legacy)) static inline struct mpx_bt_entry *
+get_bt (unsigned bd_index, struct mpx_bt_entry **bd)
+{
+  struct mpx_bt_entry *bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+                            & MPX_L2_ADDR_MASK);
+  if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    {
+      mpx_pointer ptr;
+      ptr.l1index = bd_index;
+      /* If we don't have BT, allocate it.  */
+      alloc_bt (ptr.pointer);
+      bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+            & MPX_L2_ADDR_MASK);
+      if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+        return NULL;
+    }
+  return bt;
+}
 
-  __bnd_chk_ptr_bounds (dst, n);
-  __bnd_chk_ptr_bounds (src, n);
+/* Function copy_if_possible moves elements from *FROM to *TO.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   it copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible (int elems, int elems_to_copy, struct mpx_bt_entry *from,
+                  struct mpx_bt_entry *to)
+{
+  if (elems < elems_to_copy)
+    memmove (to, from, elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (to, from, elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
+
+/* Function copy_if_possible_from_end moves elements ending at *SRC_END
+   to the place where they will end at *DST_END.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   function copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible_from_end (int elems, int elems_to_copy, struct mpx_bt_entry
+                           *src_end, struct mpx_bt_entry *dst_end)
+{
+  if (elems < elems_to_copy)
+    memmove (dst_end - elems, src_end - elems,
+             elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (dst_end - elems_to_copy,
+           src_end - elems_to_copy,
+           elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
 
-  /* Different alignment means that even if
-     pointers exist in memory, we don't how
-     pointers are aligned and therefore cann't
-     copy bounds anyway.  */
-  if (offset_src != offset_dst)
-    memmove (dst, src, n);
+/* The move_bounds function moves the bounds of all pointers stored in the
+   N bytes at SRC from the bounds tables of SRC to those of DST.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table of SRC
+   2) In loop we copy whole bound tables till the second-last one
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of bound tables in SRC doesn't exist,
+   we skip it because there are no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void
+move_bounds (void *dst, const void *src, size_t n)
+{
+  if ((n < sizeof (void *)) || !((char *) src - (char *) dst))
+    /* Not enough bytes for a pointer, or SRC equals DST;
+       therefore, there is nothing to copy.  */
+    return;
+  struct mpx_bt_entry **bd = get_bd ();
+  if (!(bd))
+    return;
+
+  /* We get indexes for all tables and number of elements for BT.  */
+  unsigned long bt_num_of_elems = (1UL << NUM_L2_BITS);
+  mpx_pointer addr_src, addr_dst, addr_src_end, addr_dst_end;
+  addr_src.pointer = (char *) src;
+  addr_dst.pointer = (char *) dst;
+  addr_src_end.pointer = (char *) src + n - 1;
+  addr_dst_end.pointer = (char *) dst + n - 1;
+  unsigned dst_bd_index = addr_dst.l1index;
+  unsigned src_bd_index = addr_src.l1index;
+  unsigned dst_bt_index = addr_dst.l2entry;
+  unsigned src_bt_index = addr_src.l2entry;
+
+  unsigned dst_bd_index_end = addr_dst_end.l1index;
+  unsigned src_bd_index_end = addr_src_end.l1index;
+  unsigned dst_bt_index_end = addr_dst_end.l2entry;
+  unsigned src_bt_index_end = addr_src_end.l2entry;
+
+  int elems_to_copy = src_bt_index_end - src_bt_index + 1 + (src_bd_index_end
+                      - src_bd_index) * bt_num_of_elems;
+  struct mpx_bt_entry *bt_src, *bt_dst;
+  uintptr_t bt_valid;
+  /* size1 and size2 will be used to find out what portions
+     can be used to copy data.  */
+  int size1_elem, size2_elem, size1_bytes, size2_bytes;
+
+  /* Copy from the beginning.  */
+  if (((char *) src - (char *) dst) > 0)
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+      /* We can copy the whole preliminary piece of data.  */
+      if (src_bt_index > dst_bt_index)
+        {
+          size1_elem = src_bt_index - dst_bt_index;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (bt_num_of_elems - src_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+            }
+          elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      /* We have to copy preliminary data in two parts.  */
+      else
+        {
+          size2_elem = dst_bt_index - src_bt_index;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (bt_num_of_elems - dst_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+              elems_to_copy -= bt_num_of_elems - dst_bt_index;
+
+              dst_bd_index++;
+
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (size2_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[0])))
+                return;
+              elems_to_copy -= size2_elem;
+            }
+          else
+            elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      src_bd_index++;
+
+      /* For each bounds table, check if it is valid and move it.  */
+      for (; src_bd_index < src_bd_index_end; src_bd_index++)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index++;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (size1_elem, elems_to_copy, &(bt_src[0]),
+                  &(bt_dst[size2_elem])))
+                return;
+
+              elems_to_copy -= size1_elem;
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+
+            }
+        }
+    }
+  /* Copy from the end.  */
   else
     {
-      if (s < d)
-	{
-	  d += n;
-	  s += n;
-	  offset_src = (offset_src + n) & (sizeof (void *) -1);
-	  while (n-- && offset_src--)
-	    *--d = *--s;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *--d1 = *--s1;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *--d = *--s;
-	}
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+
+      if (src_bt_index_end <= dst_bt_index_end)
+      /* We can copy the whole preliminary piece of data.  */
+        {
+          size2_elem = dst_bt_index_end - src_bt_index_end;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible_from_end (src_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+            }
+          elems_to_copy -= src_bt_index_end + 1;
+        }
+      /* We have to copy preliminary data in two parts.  */
       else
-	{
-	  offset_src = sizeof (void *) - offset_src;
-	  while (n-- && offset_src--)
-	    *d++ = *s++;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *d1++ = *s1++;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *d++ = *s++;
-	}
+        {
+          size1_elem = src_bt_index_end - dst_bt_index_end;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (dst_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+              elems_to_copy -= dst_bt_index_end + 1;
+
+              dst_bd_index_end--;
+
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (size1_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[bt_num_of_elems])))
+                return;
+
+              elems_to_copy -= size1_elem;
+            }
+          else
+            elems_to_copy -= src_bt_index_end + 1;
+        }
+      src_bd_index_end--;
+      /* For each bounds table we check if there are valid pointers inside.
+         If there are some, we copy table in pre-counted portions.  */
+      for (; src_bd_index_end > src_bd_index; src_bd_index_end--)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index_end--;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+          {
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            if (copy_if_possible_from_end (size2_elem, elems_to_copy,
+                &(bt_src[bt_num_of_elems]), &(bt_dst[size2_elem])))
+              return;
+
+            elems_to_copy -= size2_elem;
+            dst_bd_index_end--;
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            memmove (&(bt_dst[dst_bt_index]), &(bt_src[src_bt_index]),
+                     elems_to_copy * sizeof (struct mpx_bt_entry));
+          }
+        }
     }
-  return ret;
+  return;
+}
+
+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  move_bounds (dst, src, n);
+
+  return dst;
 }
 
 


* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-11-23 19:51   ` Aleksandra Tsvetkova
@ 2015-11-24 12:35     ` Ilya Enkovich
  2015-11-25 15:43       ` Aleksandra Tsvetkova
  0 siblings, 1 reply; 14+ messages in thread
From: Ilya Enkovich @ 2015-11-24 12:35 UTC (permalink / raw)
  To: Aleksandra Tsvetkova; +Cc: gcc-patches

2015-11-23 22:44 GMT+03:00 Aleksandra Tsvetkova <astsvetk@gmail.com>:
> gcc/testsuite/ChangeLog
> +2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
> +
> + * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.
>
> libmpx/ChangeLog
> +2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
> +
> + * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
> + * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
> + * libmpx/Makefile.in: Regenerate.
> + * mpxrt/Makefile.in: Regenerate.
> + * libmpxwrap/Makefile.in: Regenerate.
> + * mpxrt/libtool-version: New version.
> + * libmpxwrap/libtool-version: Likewise.
> + * mpxrt/libmpx.map: Add new version and a new symbol.
> + * mpxrt/mpxrt.h: New file.
> + * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
> +                (REG_IP_IDX): Moved to mpxrt.h.
> +                (REX_PREFIX): Moved to mpxrt.h.
> +                (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
> +                (MPX_L1_SIZE): Moved to mpxrt.h.

Wrong indentation for the continuation lines.

> + * libmpxwrap/mpx_wrappers.c: Rewrite __mpx_wrapper_memmove
> + to make it faster.
> + New types: mpx_pointer for extraction of indexes from pointer
> +           mpx_bt_entry represents a cell in bounds table.
> + New functions: alloc_bt for allocatinn bounds table
> +               get_bt to get address of bounds table
> +   copy_if_possible and copy_if_possible_from_end move elements
> +   of bounds table if we can
> +   move_bounds moves bounds just like memmove

Format the ChangeLog for this file appropriately: one entry per function/type.
There are many examples to follow in the existing ChangeLogs.

        (mpx_pointer): New type.
        (alloc_bt): New function.

etc.

>
>
> All fixed except for:
>
>>>+static inline void
>>>+alloc_bt (void *ptr)
>>>+{
>>>+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
>>>+}
>>
>>This should be marked as bnd_legacy.
>
> It will not work.

Why?

>cat test.c
static inline void __attribute__((bnd_legacy))
asm_test (void *ptr)
{
  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
}

void
test (void *p1, void *p2)
{
  asm_test (p1);
  asm_test (p2);
}
>gcc test.c -S -mmpx -O2
>cat test.s
        .file   "test.c"
        .text
        .p2align 4,,15
        .globl  test
        .type   test, @function
test:
.LFB1:
        .cfi_startproc
#APP
# 4 "test.c" 1
        bndstx %bnd0, (%rdi,%rdi)
# 0 "" 2
# 4 "test.c" 1
        bndstx %bnd0, (%rsi,%rsi)
# 0 "" 2
#NO_APP
        ret
        .cfi_endproc

Seems to work for me...

>
>> +void *
>> +__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
>> +{
>> +  if (n == 0)
>> +    return dst;
>> +
>> +  __bnd_chk_ptr_bounds (dst, n);
>> +  __bnd_chk_ptr_bounds (src, n);
>> +
>> +  memmove (dst, src, n);
>> +  move_bounds (dst, src, n);
>> +  return dst;
>>  }
>>
>> You completely remove old algorithm which should be faster on small
>> sizes. __mpx_wrapper_memmove should become a dispatcher between old
>> and new implementations depending on target (32-bit or 64-bit) and N.
>> Since old version performs both data and bounds copy, BD check should
>> be moved into __mpx_wrapper_memmove to never call
>> it when MPX is disabled.
>
> Even though the old algorithm is faster on small sizes, it should not be used
> with the new one because the new one supports unaligned pointers and the
> old one does not. Different behavior may cause more problems.

OK, let's stick to the single algorithm and think about how to improve it
for small sizes.

+
+const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);

Use #define

+  if ((n < sizeof (void *)) || !((char*)src-(char*)dst))

Move this check into __mpx_wrapper_memmove.  You can also write it as || src == dst.
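
A rough sketch of what hoisting the check could look like. It assumes a
stand-in move_bounds_stub (hypothetical, it only counts calls) in place of
the real move_bounds, and it leaves out the __bnd_chk_ptr_bounds calls so
that it builds without -mmpx; wrapper_memmove_sketch is an illustrative name,
not the wrapper from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for move_bounds; it only counts calls.  */
static int move_bounds_calls;

static void
move_bounds_stub (void *dst, const void *src, size_t n)
{
  (void) dst;
  (void) src;
  (void) n;
  move_bounds_calls++;
}

/* Sketch of the wrapper with the early-out check hoisted into it.
   The real wrapper also calls __bnd_chk_ptr_bounds on DST and SRC.  */
void *
wrapper_memmove_sketch (void *dst, const void *src, size_t n)
{
  if (n == 0)
    return dst;

  memmove (dst, src, n);

  /* Fewer bytes than one pointer cannot hold a whole pointer, and
     moving a region onto itself leaves the bounds tables unchanged,
     so in both cases there are no bounds to move.  */
  if (n < sizeof (void *) || src == dst)
    return dst;

  move_bounds_stub (dst, src, n);
  return dst;
}
```

With the check in the wrapper, move_bounds is never entered for tiny or
self-overlapping copies, which is where the fixed cost of looking up the
bounds directory hurts most.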

You still haven't described the testing performed.


Ilya

>
> Thanks,
> Alexandra.


* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-11-24 12:35     ` Ilya Enkovich
@ 2015-11-25 15:43       ` Aleksandra Tsvetkova
  2015-11-25 15:54         ` Aleksandra Tsvetkova
  2015-11-26 10:49         ` Ilya Enkovich
  0 siblings, 2 replies; 14+ messages in thread
From: Aleksandra Tsvetkova @ 2015-11-25 15:43 UTC (permalink / raw)
  To: Ilya Enkovich; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1363 bytes --]

gcc/testsuite/ChangeLog
2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>

    * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.

libmpx/ChangeLog
2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>

    * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
    * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
    * libmpx/Makefile.in: Regenerate.
    * mpxrt/Makefile.in: Regenerate.
    * libmpxwrap/Makefile.in: Regenerate.
    * mpxrt/libtool-version: New version.
    * libmpxwrap/libtool-version: Likewise.
    * mpxrt/libmpx.map: Add new version and a new symbol.
    * mpxrt/mpxrt.h: New file.
    * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
    (REG_IP_IDX): Moved to mpxrt.h.
    (REX_PREFIX): Moved to mpxrt.h.
    (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
    (MPX_L1_SIZE): Moved to mpxrt.h.
    * libmpxwrap/mpx_wrappers.c (__mpx_wrapper_memmove): Rewritten.
    (mpx_pointer): New type.
    (mpx_bt_entry): New type.
    (alloc_bt): New function.
    (get_bt): New function.
    (copy_if_possible): New function.
    (copy_if_possible_from_end): New function.
    (move_bounds): New function.

Memmove became 2 times slower on 8 bytes. On larger sizes (>64 bytes) it became up to 3.5 times faster with pointers and up to 21 times faster without pointers.
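
For reference, a rough sketch of how the number of bounds-table entries to
move is counted. It assumes the x86_64 layout from mpxrt.h (3 ignored bits,
17 bits of table index, 28 bits of directory index). The helper names
bt_index, bd_index and entries_spanned are illustrative only and are not part
of the patch, which extracts the same fields through a bit-field union
rather than shifts.

```c
#include <stddef.h>
#include <stdint.h>

/* x86_64 values, as defined in mpxrt.h in the patch.  */
#define NUM_IGN_BITS 3
#define NUM_L2_BITS 17
#define NUM_L1_BITS 28

/* Index of the entry inside its bounds table.  */
static unsigned long
bt_index (uintptr_t addr)
{
  return (addr >> NUM_IGN_BITS) & ((1UL << NUM_L2_BITS) - 1);
}

/* Index of the bounds table inside the bounds directory.  */
static unsigned long
bd_index (uintptr_t addr)
{
  return (addr >> (NUM_IGN_BITS + NUM_L2_BITS)) & ((1UL << NUM_L1_BITS) - 1);
}

/* Number of bounds-table entries covered by N bytes starting at ADDR
   (N > 0), mirroring the elems_to_copy computation in move_bounds.  */
static long
entries_spanned (uintptr_t addr, size_t n)
{
  uintptr_t last = addr + n - 1;
  return (long) bt_index (last) - (long) bt_index (addr) + 1
         + ((long) bd_index (last) - (long) bd_index (addr))
           * (1L << NUM_L2_BITS);
}
```

With these values one bounds table covers 1 MB of address space, so a copy
that crosses a 1 MB boundary touches two tables, and an unaligned 8-byte
pointer straddles two entries.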

[-- Attachment #2: patch_v3.diff --]
[-- Type: text/plain, Size: 30045 bytes --]

diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove.c b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
new file mode 100755
index 0000000..57030a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
@@ -0,0 +1,119 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt_index are bt indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt_index_end is the bt index of the last element of src.
+   pointers_inside determines whether the array being copied includes
+   pointers.  src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+      int bd_index_end, int src_bt_index_end, int pointers_inside,
+      int src_align, int dst_align)
+{
+  const int n =
+    src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+    {
+      return 0;
+    }
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+    src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+    dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += dst_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+    }
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        {
+          if (dst[i] != arr + i)
+            abort ();
+          if (__bnd_get_ptr_lbound (dst[i]) != arr + i)
+            abort ();
+          if (__bnd_get_ptr_ubound (dst[i]) != arr + 2 * i)
+            abort ();
+        }
+    }
+  free (arr);
+  return 0;
+}
+
+/* Call testall to test common cases of memmove for MPX.  */
+void
+testall ()
+{
+  int align[3];
+  align[0] = 0;
+  align[1] = 1;
+  align[2] = 7;
+  for (int pointers_inside = 0; pointers_inside < 2; pointers_inside++)
+    for (int src_bigger_dst = 0; src_bigger_dst < 2; src_bigger_dst++)
+      for (int src_align = 0; src_align < 3; src_align ++)
+        for (int dst_align = 0; dst_align < 3; dst_align ++)
+          for (int pages = 0; pages < 4; pages++)
+            {
+              test (src_bigger_dst, 1, 2, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, 2, pages, 2, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 3, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, bt_num_of_elems - 2, pages, 2,
+                    pointers_inside, align[src_align], align[dst_align]);
+            }
+}
+
+int
+mpx_test (int argc, const char **argv)
+{
+  testall ();
+  return 0;
+}
diff --git a/libmpx/Makefile.in b/libmpx/Makefile.in
index ff36a7f..d644af3 100644
--- a/libmpx/Makefile.in
+++ b/libmpx/Makefile.in
@@ -228,7 +228,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
diff --git a/libmpx/mpxrt/Makefile.am b/libmpx/mpxrt/Makefile.am
old mode 100644
new mode 100755
index a00a808..3280b62
--- a/libmpx/mpxrt/Makefile.am
+++ b/libmpx/mpxrt/Makefile.am
@@ -13,7 +13,8 @@ libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 
 libmpx_la_CFLAGS = -fPIC
 libmpx_la_DEPENDENCIES = libmpx.map
-libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 mpxrt.lo: mpxrt-utils.h
 mpxrt-utils.lo: mpxrt-utils.h
diff --git a/libmpx/mpxrt/Makefile.in b/libmpx/mpxrt/Makefile.in
index 646f3a9..1fdb454 100644
--- a/libmpx/mpxrt/Makefile.in
+++ b/libmpx/mpxrt/Makefile.in
@@ -222,7 +222,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -257,7 +256,9 @@ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_CFLAGS = -fPIC
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_DEPENDENCIES = libmpx.map
-@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+@LIBMPX_SUPPORTED_TRUE@                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
diff --git a/libmpx/mpxrt/libmpx.map b/libmpx/mpxrt/libmpx.map
index 90093b7..1f0fc2c 100644
--- a/libmpx/mpxrt/libmpx.map
+++ b/libmpx/mpxrt/libmpx.map
@@ -3,3 +3,8 @@ LIBMPX_1.0
   local:
 	*;
 };
+LIBMPX_2.0
+{
+  global:
+    get_bd;
+} LIBMPX_1.0;
diff --git a/libmpx/mpxrt/libtool-version b/libmpx/mpxrt/libtool-version
index 5aa6ed7..7d99255 100644
--- a/libmpx/mpxrt/libtool-version
+++ b/libmpx/mpxrt/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
old mode 100644
new mode 100755
index c29c5d9..bcdd3a6
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -51,34 +51,11 @@
 #include <sys/prctl.h>
 #include <cpuid.h>
 #include "mpxrt-utils.h"
-
-#ifdef __i386__
-
-/* i386 directory size is 4MB */
-#define NUM_L1_BITS    20
-
-#define REG_IP_IDX      REG_EIP
-#define REX_PREFIX
-
-#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
-
-#else /* __i386__ */
-
-/* x86_64 directory size is 2GB */
-#define NUM_L1_BITS   28
-
-#define REG_IP_IDX    REG_RIP
-#define REX_PREFIX    "0x48, "
-
-#define XSAVE_OFFSET_IN_FPMEM    0
-
-#endif /* !__i386__ */
+#include "mpxrt.h"
 
 #define MPX_ENABLE_BIT_NO 0
 #define BNDPRESERVE_BIT_NO 1
 
-const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
-
 struct xsave_hdr_struct
 {
   uint64_t xstate_bv;
@@ -508,3 +485,10 @@ mpxrt_cleanup (void)
   __mpxrt_utils_free ();
   process_specific_finish ();
 }
+
+/* Get address of bounds directory.  */
+void *
+get_bd ()
+{
+  return l1base;
+}
diff --git a/libmpx/mpxrt/mpxrt.h b/libmpx/mpxrt/mpxrt.h
new file mode 100755
index 0000000..e825d7d
--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2015, Intel Corporation
+ *  All rights reserved.
+ *
+ *  @copyright
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ *    * Redistributions of source code must retain the above copyright
+ *      notice, this list of conditions and the following disclaimer.
+ *    * Redistributions in binary form must reproduce the above copyright
+ *      notice, this list of conditions and the following disclaimer in
+ *      the documentation and/or other materials provided with the
+ *      distribution.
+ *    * Neither the name of Intel Corporation nor the names of its
+ *      contributors may be used to endorse or promote products derived
+ *      from this software without specific prior written permission.
+ *
+ *  @copyright
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+ *  HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ *  OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ *  AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
+ *  WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ *  POSSIBILITY OF SUCH DAMAGE.
+ *
+ **************************************************************************/
+#ifdef __i386__
+
+/* i386 directory size is 4MB.  */
+#define NUM_L1_BITS 20
+#define NUM_L2_BITS 10
+#define NUM_IGN_BITS 2
+#define MPX_L1_ADDR_MASK  0xfffff000UL
+#define MPX_L2_ADDR_MASK  0xfffffffcUL
+#define MPX_L2_VALID_MASK 0x00000001UL
+
+#define REG_IP_IDX      REG_EIP
+#define REX_PREFIX
+
+#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
+
+#else /* __i386__ */
+
+/* x86_64 directory size is 2GB.  */
+#define NUM_L1_BITS 28
+#define NUM_L2_BITS 17
+#define NUM_IGN_BITS 3
+#define MPX_L1_ADDR_MASK  0xfffffffffffff000ULL
+#define MPX_L2_ADDR_MASK  0xfffffffffffffff8ULL
+#define MPX_L2_VALID_MASK 0x0000000000000001ULL
+
+#define REG_IP_IDX    REG_RIP
+#define REX_PREFIX    "0x48, "
+
+#define XSAVE_OFFSET_IN_FPMEM 0
+
+#endif /* !__i386__ */
+
+#define MPX_L1_SIZE ((1UL << NUM_L1_BITS) * sizeof (void *))
+
+/* Get address of bounds directory.  */
+void *
+get_bd ();
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
old mode 100644
new mode 100755
index 72abccf..f24cdc8
--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -6,7 +7,8 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 
diff --git a/libmpx/mpxwrap/Makefile.in b/libmpx/mpxwrap/Makefile.in
index 1612ebf..df1a334 100644
--- a/libmpx/mpxwrap/Makefile.in
+++ b/libmpx/mpxwrap/Makefile.in
@@ -221,7 +221,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -247,6 +246,7 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -254,7 +254,9 @@ libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 libmpxwrappers_la_SOURCES = mpx_wrappers.c
 
diff --git a/libmpx/mpxwrap/libtool-version b/libmpx/mpxwrap/libtool-version
old mode 100644
new mode 100755
index bfe84c8..fab30fb
--- a/libmpx/mpxwrap/libtool-version
+++ b/libmpx/mpxwrap/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxwrap/mpx_wrappers.c b/libmpx/mpxwrap/mpx_wrappers.c
old mode 100644
new mode 100755
index 58670aa..1b0fb7c
--- a/libmpx/mpxwrap/mpx_wrappers.c
+++ b/libmpx/mpxwrap/mpx_wrappers.c
@@ -26,6 +26,8 @@
 #include "stdlib.h"
 #include "string.h"
 #include <sys/mman.h>
+#include <stdint.h>
+#include "mpxrt/mpxrt.h"
 
 void *
 __mpx_wrapper_malloc (size_t size)
@@ -88,75 +90,406 @@ __mpx_wrapper_bzero (void *dst, size_t len)
   __mpx_wrapper_memset (dst, 0, len);
 }
 
-void *
-__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+/* The mpx_pointer type is used for getting bits
+   for bt_index (index in bounds table) and
+   bd_index (index in bounds directory).  */
+typedef union
+{
+  struct
+  {
+    unsigned long ignored:NUM_IGN_BITS;
+    unsigned long l2entry:NUM_L2_BITS;
+    unsigned long l1index:NUM_L1_BITS;
+  };
+  void *pointer;
+} mpx_pointer;
+
+/* The mpx_bt_entry struct represents a cell in bounds table.
+   lb is the lower bound, ub is the upper bound,
+   p is the stored pointer.  */
+struct mpx_bt_entry
 {
-  const char *s = (const char*)src;
-  char *d = (char*)dst;
-  void *ret = dst;
-  size_t offset_src = ((size_t) s) & (sizeof (void *) - 1);
-  size_t offset_dst = ((size_t) d) & (sizeof (void *) - 1);
+  void *lb;
+  void *ub;
+  void *p;
+  void *reserved;
+};
+
+/* A special type for bd is needed because bt addresses can be modified.  */
+typedef struct mpx_bt_entry * volatile * bd_type;
+
+/* Function alloc_bt is used for allocating bounds table
+   for the destination pointers if we don't have one.
+   We generate a bounds store for some pointer belonging
+   to that table and kernel allocates the table for us.  */
+static inline void __attribute__ ((bnd_legacy))
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}
 
-  if (n == 0)
-    return ret;
+/* get_bt returns address of bounds table that should
+   exist at BD[BD_INDEX].  If there is no address or the address is not valid,
+   we try to allocate a valid table.
+   If we succeed in getting bt, its address will be returned.
+   If we can't get a valid bt, NULL will be returned.  */
+__attribute__ ((bnd_legacy)) static inline struct mpx_bt_entry *
+get_bt (unsigned bd_index, bd_type bd)
+{
+  struct mpx_bt_entry *bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+                            & MPX_L2_ADDR_MASK);
+  if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    {
+      mpx_pointer ptr;
+      ptr.l1index = bd_index;
+      /* If we don't have BT, allocate it.  */
+      alloc_bt (ptr.pointer);
+      bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+            & MPX_L2_ADDR_MASK);
+      if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+        return NULL;
+    }
+  return bt;
+}
 
-  __bnd_chk_ptr_bounds (dst, n);
-  __bnd_chk_ptr_bounds (src, n);
+/* Function copy_if_possible moves elements from *FROM to *TO.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   it copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible (int elems, int elems_to_copy, struct mpx_bt_entry *from,
+                  struct mpx_bt_entry *to)
+{
+  if (elems < elems_to_copy)
+    memmove (to, from, elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (to, from, elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
+
+/* Function copy_if_possible_from_end moves elements ending at *SRC_END
+   to the place where they will end at *DST_END.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   function copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible_from_end (int elems, int elems_to_copy, struct mpx_bt_entry
+                           *src_end, struct mpx_bt_entry *dst_end)
+{
+  if (elems < elems_to_copy)
+    memmove (dst_end - elems, src_end - elems,
+             elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (dst_end - elems_to_copy,
+           src_end - elems_to_copy,
+           elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
 
-  /* Different alignment means that even if
-     pointers exist in memory, we don't how
-     pointers are aligned and therefore cann't
-     copy bounds anyway.  */
-  if (offset_src != offset_dst)
-    memmove (dst, src, n);
+/* move_bounds function copies bounds for N bytes from bt of SRC to bt of DST.
+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table of SRC
+   2) In loop we copy whole bound tables till the second-last one
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of bound tables in SRC doesn't exist,
+   we skip it because there are no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void
+move_bounds (void *dst, const void *src, size_t n)
+{
+  bd_type bd = get_bd ();
+  if (!(bd))
+    return;
+
+  /* We get indexes for all tables and number of elements for BT.  */
+  unsigned long bt_num_of_elems = (1UL << NUM_L2_BITS);
+  mpx_pointer addr_src, addr_dst, addr_src_end, addr_dst_end;
+  addr_src.pointer = (char *) src;
+  addr_dst.pointer = (char *) dst;
+  addr_src_end.pointer = (char *) src + n - 1;
+  addr_dst_end.pointer = (char *) dst + n - 1;
+  unsigned dst_bd_index = addr_dst.l1index;
+  unsigned src_bd_index = addr_src.l1index;
+  unsigned dst_bt_index = addr_dst.l2entry;
+  unsigned src_bt_index = addr_src.l2entry;
+
+  unsigned dst_bd_index_end = addr_dst_end.l1index;
+  unsigned src_bd_index_end = addr_src_end.l1index;
+  unsigned dst_bt_index_end = addr_dst_end.l2entry;
+  unsigned src_bt_index_end = addr_src_end.l2entry;
+
+  int elems_to_copy = src_bt_index_end - src_bt_index + 1 + (src_bd_index_end
+                      - src_bd_index) * bt_num_of_elems;
+  struct mpx_bt_entry *bt_src, *bt_dst;
+  uintptr_t bt_valid;
+  /* size1 and size2 will be used to find out what portions
+     can be used to copy data.  */
+  int size1_elem, size2_elem, size1_bytes, size2_bytes;
+
+  /* Copy from the beginning.  */
+  if (((char *) src - (char *) dst) > 0)
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+      /* We can copy the whole preliminary piece of data.  */
+      if (src_bt_index > dst_bt_index)
+        {
+          size1_elem = src_bt_index - dst_bt_index;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (bt_num_of_elems - src_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+            }
+          elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      /* We have to copy preliminary data in two parts.  */
+      else
+        {
+          size2_elem = dst_bt_index - src_bt_index;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (bt_num_of_elems - dst_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+              elems_to_copy -= bt_num_of_elems - dst_bt_index;
+
+              dst_bd_index++;
+
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (size2_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[0])))
+                return;
+              elems_to_copy -= size2_elem;
+            }
+          else
+            elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      src_bd_index++;
+
+      /* For each bounds table check if it's valid and move it.  */
+      for (; src_bd_index < src_bd_index_end; src_bd_index++)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index++;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (size1_elem, elems_to_copy, &(bt_src[0]),
+                  &(bt_dst[size2_elem])))
+                return;
+
+              elems_to_copy -= size1_elem;
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+
+            }
+        }
+    }
+  /* Copy from the end.  */
   else
     {
-      if (s < d)
-	{
-	  d += n;
-	  s += n;
-	  offset_src = (offset_src + n) & (sizeof (void *) -1);
-	  while (n-- && offset_src--)
-	    *--d = *--s;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *--d1 = *--s1;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *--d = *--s;
-	}
+      /* Copy everything down to the start of the last bounds table (src).  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+
+      if (src_bt_index_end <= dst_bt_index_end)
+      /* We can copy the whole preliminary piece of data.  */
+        {
+          size2_elem = dst_bt_index_end - src_bt_index_end;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible_from_end (src_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+            }
+          elems_to_copy -= src_bt_index_end + 1;
+        }
+      /* We have to copy preliminary data in two parts.  */
       else
-	{
-	  offset_src = sizeof (void *) - offset_src;
-	  while (n-- && offset_src--)
-	    *d++ = *s++;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *d1++ = *s1++;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *d++ = *s++;
-	}
+        {
+          size1_elem = src_bt_index_end - dst_bt_index_end;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (dst_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+              elems_to_copy -= dst_bt_index_end + 1;
+
+              dst_bd_index_end--;
+
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (size1_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[bt_num_of_elems])))
+                return;
+
+              elems_to_copy -= size1_elem;
+            }
+          else
+            elems_to_copy -= src_bt_index_end + 1;
+        }
+      src_bd_index_end--;
+      /* For each bounds table we check if there are valid pointers inside.
+         If there are some, we copy table in pre-counted portions.  */
+      for (; src_bd_index_end > src_bd_index; src_bd_index_end--)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index_end--;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+          {
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            if (copy_if_possible_from_end (size2_elem, elems_to_copy,
+                &(bt_src[bt_num_of_elems]), &(bt_dst[size2_elem])))
+              return;
+
+            elems_to_copy -= size2_elem;
+            dst_bd_index_end--;
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            memmove (&(bt_dst[dst_bt_index]), &(bt_src[src_bt_index]),
+                     elems_to_copy * sizeof (struct mpx_bt_entry));
+          }
+        }
     }
-  return ret;
+  return;
+}
+
+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  /* No MPX or not enough bytes for the pointer,
+    therefore, not necessary to copy.  */
+  if ((n > sizeof (void *)) || (src != dst))
+    move_bounds (dst, src, n);
+
+  return dst;
 }
 
 

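As a reading aid for the index arithmetic at the top of move_bounds, here
is a minimal sketch of how an address maps to a bounds directory index and
a bounds table index, and how the entry count is derived.  It assumes the
x86_64 values of NUM_IGN_BITS, NUM_L2_BITS and NUM_L1_BITS from mpxrt.h,
and it uses shifts and masks in place of the patch's mpx_pointer bit-field
union.  The names bt_index, bd_index and this standalone elems_to_copy are
hypothetical helpers, not part of libmpx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86_64 layout, matching the values in mpxrt.h.  */
#define NUM_IGN_BITS 3
#define NUM_L2_BITS 17
#define NUM_L1_BITS 28

/* Hypothetical helper: bounds table index (l2entry) for ADDR.  */
static uint64_t
bt_index (uint64_t addr)
{
  return (addr >> NUM_IGN_BITS) & ((UINT64_C (1) << NUM_L2_BITS) - 1);
}

/* Hypothetical helper: bounds directory index (l1index) for ADDR.  */
static uint64_t
bd_index (uint64_t addr)
{
  return (addr >> (NUM_IGN_BITS + NUM_L2_BITS))
         & ((UINT64_C (1) << NUM_L1_BITS) - 1);
}

/* Number of bounds table entries covered by N bytes starting at SRC,
   following the elems_to_copy expression in move_bounds.  N must be
   non-zero, as the wrapper returns early for n == 0.  */
int64_t
elems_to_copy (uint64_t src, size_t n)
{
  assert (n > 0);
  uint64_t bt_num_of_elems = UINT64_C (1) << NUM_L2_BITS;
  uint64_t end = src + n - 1;
  /* Unsigned wrap-around in the first difference is cancelled by the
     directory term, so the sum is the true entry count.  */
  return (int64_t) (bt_index (end) - bt_index (src) + 1
                    + (bd_index (end) - bd_index (src)) * bt_num_of_elems);
}
```

Under these assumptions, elems_to_copy (0x1001, 8) is 2, because the
unaligned 8-byte range straddles two 8-byte slots, while
elems_to_copy (0x1000, 8) is 1.
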
^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-11-25 15:43       ` Aleksandra Tsvetkova
@ 2015-11-25 15:54         ` Aleksandra Tsvetkova
  2015-11-26 10:49         ` Ilya Enkovich
  1 sibling, 0 replies; 14+ messages in thread
From: Aleksandra Tsvetkova @ 2015-11-25 15:54 UTC (permalink / raw)
  To: Ilya Enkovich; +Cc: gcc-patches

I ran make check (passed) and SPEC 2000, where 1 extra test (255.vortex) failed.

On Wed, Nov 25, 2015 at 6:41 PM, Aleksandra Tsvetkova
<astsvetk@gmail.com> wrote:
> gcc/testsuite/ChangeLog
> 2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
>
>     * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.
>
> libmpx/ChangeLog
> 2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
>
>     * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
>     * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
>     * libmpx/Makefile.in: Regenerate.
>     * mpxrt/Makefile.in: Regenerate.
>     * libmpxwrap/Makefile.in: Regenerate.
>     * mpxrt/libtool-version: New version.
>     * libmpxwrap/libtool-version: Likewise.
>     * mpxrt/libmpx.map: Add new version and a new symbol.
>     * mpxrt/mpxrt.h: New file.
>     * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
>     (REG_IP_IDX): Moved to mpxrt.h.
>     (REX_PREFIX): Moved to mpxrt.h.
>     (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
>     (MPX_L1_SIZE): Moved to mpxrt.h.
>     * libmpxwrap/mpx_wrappers.c: (__mpx_wrapper_memmove): Rewritten.
>     (mpx_pointer): New type.
>     (mpx_bt_entry): New type.
>     (alloc_bt): New function.
>     (get_bt): New function.
>     (copy_if_possible): New function.
>     (copy_if_possible_from_end): New function.
>     (move_bounds): New function.
>
> memmove became 2 times slower on 8 bytes. On larger sizes (>64 bytes) it became up to 3.5 times faster with pointers and up to 21 times faster without pointers.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-11-25 15:43       ` Aleksandra Tsvetkova
  2015-11-25 15:54         ` Aleksandra Tsvetkova
@ 2015-11-26 10:49         ` Ilya Enkovich
  2015-12-06 19:41           ` Aleksandra Tsvetkova
  1 sibling, 1 reply; 14+ messages in thread
From: Ilya Enkovich @ 2015-11-26 10:49 UTC (permalink / raw)
  To: Aleksandra Tsvetkova; +Cc: gcc-patches

2015-11-25 18:41 GMT+03:00 Aleksandra Tsvetkova <astsvetk@gmail.com>:
> gcc/testsuite/ChangeLog
> 2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
>
>     * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.
>
> libmpx/ChangeLog
> 2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
>
>     * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
>     * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
>     * libmpx/Makefile.in: Regenerate.
>     * mpxrt/Makefile.in: Regenerate.
>     * libmpxwrap/Makefile.in: Regenerate.
>     * mpxrt/libtool-version: New version.
>     * libmpxwrap/libtool-version: Likewise.
>     * mpxrt/libmpx.map: Add new version and a new symbol.
>     * mpxrt/mpxrt.h: New file.
>     * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
>     (REG_IP_IDX): Moved to mpxrt.h.
>     (REX_PREFIX): Moved to mpxrt.h.
>     (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
>     (MPX_L1_SIZE): Moved to mpxrt.h.
>     * libmpxwrap/mpx_wrappers.c: (__mpx_wrapper_memmove): Rewritten.
>     (mpx_pointer): New type.
>     (mpx_bt_entry): New type.
>     (alloc_bt): New function.
>     (get_bt): New function.
>     (copy_if_possible): New function.
>     (copy_if_possible_from_end): New function.
>     (move_bounds): New function.
>
> memmove became 2 times slower on 8 bytes. On larger sizes (>64 bytes) it became up to 3.5 times faster with pointers and up to 21 times faster without pointers.

+  bd_type bd = get_bd ();
+  if (!(bd))
+    return;

Add explicit typecast.

+  /* No MPX or not enough bytes for the pointer,
+    therefore, not necessary to copy.  */
+  if ((n > sizeof (void *)) || (src != dst))
+    move_bounds (dst, src, n);

The comment doesn't match the condition.  I believe the condition should be
((n >= sizeof (void *)) && (src != dst)).
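
To make the difference concrete, here is a small sketch that compares the
condition as posted with the one suggested above.  The function names
posted_condition, proposed_condition and check_conditions are hypothetical
and exist only for this illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Condition as posted in the patch under review.  */
static bool
posted_condition (const void *dst, const void *src, size_t n)
{
  return (n > sizeof (void *)) || (src != dst);
}

/* Condition suggested in the review.  */
static bool
proposed_condition (const void *dst, const void *src, size_t n)
{
  return (n >= sizeof (void *)) && (src != dst);
}

/* Hypothetical self-check covering the cases where the two forms differ.  */
void
check_conditions (void)
{
  char buf[4 * sizeof (void *)];
  char *other = buf + 2 * sizeof (void *);

  /* Same address: no bounds need to move, but the posted form says yes.  */
  assert (posted_condition (buf, buf, 2 * sizeof (void *)));
  assert (!proposed_condition (buf, buf, 2 * sizeof (void *)));

  /* Too few bytes to hold a pointer: again only the posted form says yes.  */
  assert (posted_condition (other, buf, 1));
  assert (!proposed_condition (other, buf, 1));

  /* Exactly one pointer moved to a different address: both say yes.  */
  assert (posted_condition (other, buf, sizeof (void *)));
  assert (proposed_condition (other, buf, sizeof (void *)));
}
```

With the posted form, move_bounds is skipped only when src == dst and
n <= sizeof (void *); the suggested form skips it whenever either test
fails, which is what the comment describes.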

Otherwise the patch looks good. We need to make sure the SPEC benchmark
failure is due to improved checking quality before committing it.

Thanks,
Ilya

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-11-26 10:49         ` Ilya Enkovich
@ 2015-12-06 19:41           ` Aleksandra Tsvetkova
  2015-12-07 16:00             ` Ilya Enkovich
  0 siblings, 1 reply; 14+ messages in thread
From: Aleksandra Tsvetkova @ 2015-12-06 19:41 UTC (permalink / raw)
  To: Ilya Enkovich; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 50 bytes --]

All fixed.
There are now no new failures on SPEC 2000.

[-- Attachment #2: patch_v5.diff --]
[-- Type: text/plain, Size: 30061 bytes --]

diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove.c b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
new file mode 100755
index 0000000..57030a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
@@ -0,0 +1,119 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt_index are bt indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt_index_end is bt index of the last element of src.
+   pointers_inside determines if array being copied includes pointers.
+   src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+      int bd_index_end, int src_bt_index_end, int pointers_inside,
+      int src_align, int dst_align)
+{
+  const int n =
+    src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+    {
+      return 0;
+    }
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+    src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+    dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += dst_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+    }
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        {
+          if (dst[i] != arr + i)
+            abort ();
+          if (__bnd_get_ptr_lbound (dst[i]) != arr + i)
+            abort ();
+          if (__bnd_get_ptr_ubound (dst[i]) != arr + 2 * i)
+            abort ();
+        }
+    }
+  free (arr);
+  return 0;
+}
+
+/* Call testall to test common cases of memmove for MPX.  */
+void
+testall ()
+{
+  int align[3];
+  align[0] = 0;
+  align[1] = 1;
+  align[2] = 7;
+  for (int pointers_inside = 0; pointers_inside < 2; pointers_inside++)
+    for (int src_bigger_dst = 0; src_bigger_dst < 2; src_bigger_dst++)
+      for (int src_align = 0; src_align < 3; src_align ++)
+        for (int dst_align = 0; dst_align < 3; dst_align ++)
+          for (int pages = 0; pages < 4; pages++)
+            {
+              test (src_bigger_dst, 1, 2, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, 2, pages, 2, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 3, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, bt_num_of_elems - 2, pages, 2,
+                    pointers_inside, align[src_align], align[dst_align]);
+            }
+}
+
+int
+mpx_test (int argc, const char **argv)
+{
+  testall ();
+  return 0;
+}
diff --git a/libmpx/Makefile.in b/libmpx/Makefile.in
index ff36a7f..d644af3 100644
--- a/libmpx/Makefile.in
+++ b/libmpx/Makefile.in
@@ -228,7 +228,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
diff --git a/libmpx/mpxrt/Makefile.am b/libmpx/mpxrt/Makefile.am
old mode 100644
new mode 100755
index a00a808..3280b62
--- a/libmpx/mpxrt/Makefile.am
+++ b/libmpx/mpxrt/Makefile.am
@@ -13,7 +13,8 @@ libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 
 libmpx_la_CFLAGS = -fPIC
 libmpx_la_DEPENDENCIES = libmpx.map
-libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 mpxrt.lo: mpxrt-utils.h
 mpxrt-utils.lo: mpxrt-utils.h
diff --git a/libmpx/mpxrt/Makefile.in b/libmpx/mpxrt/Makefile.in
index 646f3a9..1fdb454 100644
--- a/libmpx/mpxrt/Makefile.in
+++ b/libmpx/mpxrt/Makefile.in
@@ -222,7 +222,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -257,7 +256,9 @@ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_CFLAGS = -fPIC
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_DEPENDENCIES = libmpx.map
-@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+@LIBMPX_SUPPORTED_TRUE@                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
diff --git a/libmpx/mpxrt/libmpx.map b/libmpx/mpxrt/libmpx.map
index 90093b7..1f0fc2c 100644
--- a/libmpx/mpxrt/libmpx.map
+++ b/libmpx/mpxrt/libmpx.map
@@ -3,3 +3,8 @@ LIBMPX_1.0
   local:
 	*;
 };
+LIBMPX_2.0
+{
+  global:
+    get_bd;
+} LIBMPX_1.0;
diff --git a/libmpx/mpxrt/libtool-version b/libmpx/mpxrt/libtool-version
index 5aa6ed7..7d99255 100644
--- a/libmpx/mpxrt/libtool-version
+++ b/libmpx/mpxrt/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
old mode 100644
new mode 100755
index c29c5d9..bcdd3a6
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -51,34 +51,11 @@
 #include <sys/prctl.h>
 #include <cpuid.h>
 #include "mpxrt-utils.h"
-
-#ifdef __i386__
-
-/* i386 directory size is 4MB */
-#define NUM_L1_BITS    20
-
-#define REG_IP_IDX      REG_EIP
-#define REX_PREFIX
-
-#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
-
-#else /* __i386__ */
-
-/* x86_64 directory size is 2GB */
-#define NUM_L1_BITS   28
-
-#define REG_IP_IDX    REG_RIP
-#define REX_PREFIX    "0x48, "
-
-#define XSAVE_OFFSET_IN_FPMEM    0
-
-#endif /* !__i386__ */
+#include "mpxrt.h"
 
 #define MPX_ENABLE_BIT_NO 0
 #define BNDPRESERVE_BIT_NO 1
 
-const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
-
 struct xsave_hdr_struct
 {
   uint64_t xstate_bv;
@@ -508,3 +485,10 @@ mpxrt_cleanup (void)
   __mpxrt_utils_free ();
   process_specific_finish ();
 }
+
+/* Get address of bounds directory.  */
+void *
+get_bd ()
+{
+  return l1base;
+}
diff --git a/libmpx/mpxrt/mpxrt.h b/libmpx/mpxrt/mpxrt.h
new file mode 100755
index 0000000..e825d7d
--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2015, Intel Corporation
+ *  All rights reserved.
+ *
+ *  @copyright
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ *    * Redistributions of source code must retain the above copyright
+ *      notice, this list of conditions and the following disclaimer.
+ *    * Redistributions in binary form must reproduce the above copyright
+ *      notice, this list of conditions and the following disclaimer in
+ *      the documentation and/or other materials provided with the
+ *      distribution.
+ *    * Neither the name of Intel Corporation nor the names of its
+ *      contributors may be used to endorse or promote products derived
+ *      from this software without specific prior written permission.
+ *
+ *  @copyright
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+ *  HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ *  OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ *  AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
+ *  WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ *  POSSIBILITY OF SUCH DAMAGE.
+ *
+ **************************************************************************/
+#ifdef __i386__
+
+/* i386 directory size is 4MB.  */
+#define NUM_L1_BITS 20
+#define NUM_L2_BITS 10
+#define NUM_IGN_BITS 2
+#define MPX_L1_ADDR_MASK  0xfffff000UL
+#define MPX_L2_ADDR_MASK  0xfffffffcUL
+#define MPX_L2_VALID_MASK 0x00000001UL
+
+#define REG_IP_IDX      REG_EIP
+#define REX_PREFIX
+
+#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
+
+#else /* __i386__ */
+
+/* x86_64 directory size is 2GB.  */
+#define NUM_L1_BITS 28
+#define NUM_L2_BITS 17
+#define NUM_IGN_BITS 3
+#define MPX_L1_ADDR_MASK  0xfffffffffffff000ULL
+#define MPX_L2_ADDR_MASK  0xfffffffffffffff8ULL
+#define MPX_L2_VALID_MASK 0x0000000000000001ULL
+
+#define REG_IP_IDX    REG_RIP
+#define REX_PREFIX    "0x48, "
+
+#define XSAVE_OFFSET_IN_FPMEM 0
+
+#endif /* !__i386__ */
+
+#define MPX_L1_SIZE ((1UL << NUM_L1_BITS) * sizeof (void *))
+
+/* Get address of bounds directory.  */
+void *
+get_bd ();
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
old mode 100644
new mode 100755
index 72abccf..f24cdc8
--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -6,7 +7,8 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 
diff --git a/libmpx/mpxwrap/Makefile.in b/libmpx/mpxwrap/Makefile.in
index 1612ebf..df1a334 100644
--- a/libmpx/mpxwrap/Makefile.in
+++ b/libmpx/mpxwrap/Makefile.in
@@ -221,7 +221,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -247,6 +246,7 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -254,7 +254,9 @@ libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 libmpxwrappers_la_SOURCES = mpx_wrappers.c
 
diff --git a/libmpx/mpxwrap/libtool-version b/libmpx/mpxwrap/libtool-version
old mode 100644
new mode 100755
index bfe84c8..fab30fb
--- a/libmpx/mpxwrap/libtool-version
+++ b/libmpx/mpxwrap/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxwrap/mpx_wrappers.c b/libmpx/mpxwrap/mpx_wrappers.c
old mode 100644
new mode 100755
index 58670aa..7991f48
--- a/libmpx/mpxwrap/mpx_wrappers.c
+++ b/libmpx/mpxwrap/mpx_wrappers.c
@@ -26,6 +26,8 @@
 #include "stdlib.h"
 #include "string.h"
 #include <sys/mman.h>
+#include <stdint.h>
+#include "mpxrt/mpxrt.h"
 
 void *
 __mpx_wrapper_malloc (size_t size)
@@ -88,75 +90,406 @@ __mpx_wrapper_bzero (void *dst, size_t len)
   __mpx_wrapper_memset (dst, 0, len);
 }
 
-void *
-__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+/* The mpx_pointer type is used for getting bits
+   for bt_index (index in bounds table) and
+   bd_index (index in bounds directory).  */
+typedef union
+{
+  struct
+  {
+    unsigned long ignored:NUM_IGN_BITS;
+    unsigned long l2entry:NUM_L2_BITS;
+    unsigned long l1index:NUM_L1_BITS;
+  };
+  void *pointer;
+} mpx_pointer;
+
+/* The mpx_bt_entry struct represents a cell in bounds table.
+   lb is the lower bound, ub is the upper bound,
+   p is the stored pointer.  */
+struct mpx_bt_entry
 {
-  const char *s = (const char*)src;
-  char *d = (char*)dst;
-  void *ret = dst;
-  size_t offset_src = ((size_t) s) & (sizeof (void *) - 1);
-  size_t offset_dst = ((size_t) d) & (sizeof (void *) - 1);
+  void *lb;
+  void *ub;
+  void *p;
+  void *reserved;
+};
+
+/* A special type for bd is needed because bt addresses can be modified.  */
+typedef struct mpx_bt_entry * volatile * bd_type;
+
+/* Function alloc_bt is used for allocating bounds table
+   for the destination pointers if we don't have one.
+   We generate a bounds store for some pointer belonging
+   to that table and kernel allocates the table for us.  */
+static inline void __attribute__ ((bnd_legacy))
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}
 
-  if (n == 0)
-    return ret;
+/* get_bt returns address of bounds table that should
+   exist at BD[BD_INDEX].  If there is no address or the address is not valid,
+   we try to allocate a valid table.
+   If we succeed in getting bt, its address will be returned.
+   If we can't get a valid bt, NULL will be returned.  */
+__attribute__ ((bnd_legacy)) static inline struct mpx_bt_entry *
+get_bt (unsigned bd_index, bd_type bd)
+{
+  struct mpx_bt_entry *bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+                            & MPX_L2_ADDR_MASK);
+  if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    {
+      mpx_pointer ptr;
+      ptr.l1index = bd_index;
+      /* If we don't have BT, allocate it.  */
+      alloc_bt (ptr.pointer);
+      bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+            & MPX_L2_ADDR_MASK);
+      if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+        return NULL;
+    }
+  return bt;
+}
 
-  __bnd_chk_ptr_bounds (dst, n);
-  __bnd_chk_ptr_bounds (src, n);
+/* Function copy_if_possible moves elements from *FROM to *TO.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   it copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible (int elems, int elems_to_copy, struct mpx_bt_entry *from,
+                  struct mpx_bt_entry *to)
+{
+  if (elems < elems_to_copy)
+    memmove (to, from, elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (to, from, elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
+
+/* Function copy_if_possible_from_end moves elements ending at *SRC_END
+   to the place where they will end at *DST_END.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   function copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible_from_end (int elems, int elems_to_copy, struct mpx_bt_entry
+                           *src_end, struct mpx_bt_entry *dst_end)
+{
+  if (elems < elems_to_copy)
+    memmove (dst_end - elems, src_end - elems,
+             elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (dst_end - elems_to_copy,
+               src_end - elems_to_copy,
+               elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
 
-  /* Different alignment means that even if
-     pointers exist in memory, we don't how
-     pointers are aligned and therefore cann't
-     copy bounds anyway.  */
-  if (offset_src != offset_dst)
-    memmove (dst, src, n);
+/* move_bounds function copies bounds for N bytes from bt of SRC to bt of DST.
+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything up to the end of the first bounds table of SRC.
+   2) In a loop we copy whole bounds tables up to the second-to-last one.
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of the bounds tables in SRC doesn't exist,
+   we skip it because it holds no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void
+move_bounds (void *dst, const void *src, size_t n)
+{
+  bd_type bd = (bd_type) get_bd ();
+  if (!(bd))
+    return;
+
+  /* We get indexes for all tables and number of elements for BT.  */
+  unsigned long bt_num_of_elems = (1UL << NUM_L2_BITS);
+  mpx_pointer addr_src, addr_dst, addr_src_end, addr_dst_end;
+  addr_src.pointer = (char *) src;
+  addr_dst.pointer = (char *) dst;
+  addr_src_end.pointer = (char *) src + n - 1;
+  addr_dst_end.pointer = (char *) dst + n - 1;
+  unsigned dst_bd_index = addr_dst.l1index;
+  unsigned src_bd_index = addr_src.l1index;
+  unsigned dst_bt_index = addr_dst.l2entry;
+  unsigned src_bt_index = addr_src.l2entry;
+
+  unsigned dst_bd_index_end = addr_dst_end.l1index;
+  unsigned src_bd_index_end = addr_src_end.l1index;
+  unsigned dst_bt_index_end = addr_dst_end.l2entry;
+  unsigned src_bt_index_end = addr_src_end.l2entry;
+
+  int elems_to_copy = src_bt_index_end - src_bt_index + 1 + (src_bd_index_end
+                      - src_bd_index) * bt_num_of_elems;
+  struct mpx_bt_entry *bt_src, *bt_dst;
+  uintptr_t bt_valid;
+  /* size1 and size2 will be used to find out what portions
+     can be used to copy data.  */
+  int size1_elem, size2_elem, size1_bytes, size2_bytes;
+
+  /* Copy from the beginning.  */
+  if (((char *) src - (char *) dst) > 0)
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+      /* We can copy the whole preliminary piece of data.  */
+      if (src_bt_index > dst_bt_index)
+        {
+          size1_elem = src_bt_index - dst_bt_index;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (bt_num_of_elems - src_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+            }
+          elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      /* We have to copy preliminary data in two parts.  */
+      else
+        {
+          size2_elem = dst_bt_index - src_bt_index;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (bt_num_of_elems - dst_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+              elems_to_copy -= bt_num_of_elems - dst_bt_index;
+
+              dst_bd_index++;
+
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (size2_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[0])))
+                return;
+              elems_to_copy -= size2_elem;
+            }
+          else
+            elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      src_bd_index++;
+
+      /* For each bounds table check if it's valid and move it.  */
+      for (; src_bd_index < src_bd_index_end; src_bd_index++)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index++;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (size1_elem, elems_to_copy, &(bt_src[0]),
+                  &(bt_dst[size2_elem])))
+                return;
+
+              elems_to_copy -= size1_elem;
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+
+            }
+        }
+    }
+  /* Copy from the end.  */
   else
     {
-      if (s < d)
-	{
-	  d += n;
-	  s += n;
-	  offset_src = (offset_src + n) & (sizeof (void *) -1);
-	  while (n-- && offset_src--)
-	    *--d = *--s;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *--d1 = *--s1;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *--d = *--s;
-	}
+      /* Copy everything back to the start of the last bounds table (src).  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+
+      if (src_bt_index_end <= dst_bt_index_end)
+      /* We can copy the whole preliminary piece of data.  */
+        {
+          size2_elem = dst_bt_index_end - src_bt_index_end;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible_from_end (src_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+            }
+          elems_to_copy -= src_bt_index_end + 1;
+        }
+      /* We have to copy preliminary data in two parts.  */
       else
-	{
-	  offset_src = sizeof (void *) - offset_src;
-	  while (n-- && offset_src--)
-	    *d++ = *s++;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *d1++ = *s1++;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *d++ = *s++;
-	}
+        {
+          size1_elem = src_bt_index_end - dst_bt_index_end;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (dst_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+              elems_to_copy -= dst_bt_index_end + 1;
+
+              dst_bd_index_end--;
+
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (size1_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[bt_num_of_elems])))
+                return;
+
+              elems_to_copy -= size1_elem;
+            }
+          else
+            elems_to_copy -= src_bt_index_end + 1;
+        }
+      src_bd_index_end--;
+      /* For each bounds table we check if there are valid pointers inside.
+         If there are some, we copy table in pre-counted portions.  */
+      for (; src_bd_index_end > src_bd_index; src_bd_index_end--)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index_end--;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+          {
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            if (copy_if_possible_from_end (size2_elem, elems_to_copy,
+                &(bt_src[bt_num_of_elems]), &(bt_dst[size2_elem])))
+              return;
+
+            elems_to_copy -= size2_elem;
+            dst_bd_index_end--;
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            memmove (&(bt_dst[dst_bt_index]), &(bt_src[src_bt_index]),
+                     elems_to_copy * sizeof (struct mpx_bt_entry));
+          }
+        }
     }
-  return ret;
+  return;
+}
+
+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  /* Not necessary to copy bounds if size is less than size of pointer
+     or SRC == DST.  */
+  if ((n >= sizeof (void *)) && (src != dst))
+    move_bounds (dst, src, n);
+
+  return dst;
 }
 
 

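For readers following move_bounds, here is a minimal sketch of how an address splits into a bounds directory index and a bounds table index, and how elems_to_copy follows from the two ends of the range. It uses shifts and masks instead of the mpx_pointer bit-field union, and it assumes the x86_64 values from mpxrt.h and that the bit-fields are allocated starting from the least significant bit. The sketch_* names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* x86_64 values taken from mpxrt.h in this patch.  */
#define SKETCH_NUM_IGN_BITS 3
#define SKETCH_NUM_L2_BITS 17
#define SKETCH_NUM_L1_BITS 28

/* Index inside a bounds table (the l2entry field of mpx_pointer).
   Assumption: "ignored" occupies the lowest bits of the address.  */
static uint64_t
sketch_bt_index (uint64_t addr)
{
  return (addr >> SKETCH_NUM_IGN_BITS)
         & ((UINT64_C (1) << SKETCH_NUM_L2_BITS) - 1);
}

/* Index inside the bounds directory (the l1index field of mpx_pointer).  */
static uint64_t
sketch_bd_index (uint64_t addr)
{
  return (addr >> (SKETCH_NUM_IGN_BITS + SKETCH_NUM_L2_BITS))
         & ((UINT64_C (1) << SKETCH_NUM_L1_BITS) - 1);
}

/* Number of bounds table entries covering the byte range [first, last],
   using the same formula as elems_to_copy in move_bounds.  */
static int64_t
sketch_elems_to_copy (uint64_t first, uint64_t last)
{
  int64_t bt_num_of_elems = INT64_C (1) << SKETCH_NUM_L2_BITS;

  return (int64_t) sketch_bt_index (last)
         - (int64_t) sketch_bt_index (first) + 1
         + ((int64_t) sketch_bd_index (last)
            - (int64_t) sketch_bd_index (first)) * bt_num_of_elems;
}
```

Under these assumptions, a move of five pointers whose first byte is at address 80 and last byte at address 119 covers five entries, all within one bounds table.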

* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-12-06 19:41           ` Aleksandra Tsvetkova
@ 2015-12-07 16:00             ` Ilya Enkovich
  2015-12-08 10:46               ` Aleksandra Tsvetkova
  0 siblings, 1 reply; 14+ messages in thread
From: Ilya Enkovich @ 2015-12-07 16:00 UTC (permalink / raw)
  To: Aleksandra Tsvetkova; +Cc: gcc-patches

2015-12-06 22:41 GMT+03:00 Aleksandra Tsvetkova <astsvetk@gmail.com>:
> Fixed all.
> Now there are no new fails on spec2000

If you made some fix in your algorithm to pass SPEC benchmarks, you
need to extend your tests to cover this fix.

Thanks,
Ilya


* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-12-07 16:00             ` Ilya Enkovich
@ 2015-12-08 10:46               ` Aleksandra Tsvetkova
  2015-12-08 10:54                 ` Aleksandra Tsvetkova
  0 siblings, 1 reply; 14+ messages in thread
From: Aleksandra Tsvetkova @ 2015-12-08 10:46 UTC (permalink / raw)
  To: Ilya Enkovich; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1275 bytes --]

gcc/testsuite/ChangeLog
2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>

    * gcc.target/i386/mpx/memmove-1.c: New test for __mpx_wrapper_memmove.
    * gcc.target/i386/mpx/memmove-2.c: New test covering fail on spec.

libmpx/ChangeLog
2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>

    * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
    * mpxwrap/Makefile.am (libmpxwrappers_la_LDFLAGS): Likewise.
    (AM_CPPFLAGS): New.
    * Makefile.in: Regenerate.
    * mpxrt/Makefile.in: Regenerate.
    * mpxwrap/Makefile.in: Regenerate.
    * mpxrt/libtool-version: New version.
    * mpxwrap/libtool-version: Likewise.
    * mpxrt/libmpx.map: Add new version and a new symbol.
    * mpxrt/mpxrt.h: New file.
    * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
    (REG_IP_IDX): Moved to mpxrt.h.
    (REX_PREFIX): Moved to mpxrt.h.
    (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
    (MPX_L1_SIZE): Moved to mpxrt.h.
    * mpxwrap/mpx_wrappers.c (__mpx_wrapper_memmove): Rewritten.
    (mpx_pointer): New type.
    (mpx_bt_entry): New type.
    (alloc_bt): New function.
    (get_bt): New function.
    (copy_if_possible): New function.
    (copy_if_possible_from_end): New function.
    (move_bounds): New function.
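
As a reading aid for move_bounds, here is a small sketch of the step that splits one full source bounds table across two consecutive destination tables when the source index is ahead of the destination index (the forward loop in the patch). The table size of 8 and the sketch_* names are illustrative assumptions; real tables have 1 << NUM_L2_BITS entries of struct mpx_bt_entry.

```c
#include <assert.h>
#include <string.h>

/* Tiny table size for illustration only.  */
#define SKETCH_BT_ELEMS 8

/* Stand-in for struct mpx_bt_entry (hypothetical field types).  */
struct sketch_entry
{
  int lb;
  int ub;
  int p;
  int reserved;
};

/* Copy a whole source table SRC into two consecutive destination tables.
   SHIFT is src_bt_index - dst_bt_index, with 0 < SHIFT < SKETCH_BT_ELEMS.
   The first SHIFT entries land at the tail of DST_LO, the rest at the
   head of DST_HI, mirroring size1_elem and size2_elem in move_bounds.  */
static void
sketch_split_copy (const struct sketch_entry *src,
                   struct sketch_entry *dst_lo,
                   struct sketch_entry *dst_hi, int shift)
{
  int size1_elem = shift;
  int size2_elem = SKETCH_BT_ELEMS - size1_elem;

  memmove (&dst_lo[size2_elem], &src[0], size1_elem * sizeof (*src));
  memmove (&dst_hi[0], &src[size1_elem], size2_elem * sizeof (*src));
}
```

With a shift of 3, source entries 0..2 go to slots 5..7 of the lower destination table and source entries 3..7 go to slots 0..4 of the next one.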

[-- Attachment #2: patch_v3.diff --]
[-- Type: text/plain, Size: 31489 bytes --]

diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c b/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c
new file mode 100755
index 0000000..d0ada25
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   This test checks that a bug making spec2000 fail with
+   SEGFAULT is fixed.  */
+
+int
+mpx_test (int argc, const char **argv)
+{
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           2 * bt_num_of_elems * sizeof (void *));
+  void **src = arr, **dst = arr, **ptr = arr;
+  src += 10;
+  dst += 1;
+  ptr += bt_num_of_elems + 100;
+  ptr[0] = __bnd_set_ptr_bounds (arr + 1, sizeof (void *) + 1);
+  memmove (dst, src, 5 * sizeof (void *));
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove.c b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
new file mode 100755
index 0000000..57030a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
@@ -0,0 +1,119 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt_index are bt indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt_index_end is the bt index of the last element of src.
+   pointers_inside determines if the array being copied includes pointers.
+   src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+      int bd_index_end, int src_bt_index_end, int pointers_inside,
+      int src_align, int dst_align)
+{
+  const int n =
+    src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+    {
+      return 0;
+    }
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+    src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+    dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += dst_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+    }
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        {
+          if (dst[i] != arr + i)
+            abort ();
+          if (__bnd_get_ptr_lbound (dst[i]) != arr + i)
+            abort ();
+          if (__bnd_get_ptr_ubound (dst[i]) != arr + 2 * i)
+            abort ();
+        }
+    }
+  free (arr);
+  return 0;
+}
+
+/* Call testall to test common cases of memmove for MPX.  */
+void
+testall ()
+{
+  int align[3];
+  align[0] = 0;
+  align[1] = 1;
+  align[2] = 7;
+  for (int pointers_inside = 0; pointers_inside < 2; pointers_inside++)
+    for (int src_bigger_dst = 0; src_bigger_dst < 2; src_bigger_dst++)
+      for (int src_align = 0; src_align < 3; src_align++)
+        for (int dst_align = 0; dst_align < 3; dst_align++)
+          for (int pages = 0; pages < 4; pages++)
+            {
+              test (src_bigger_dst, 1, 2, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, 2, pages, 2, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 3, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, bt_num_of_elems - 2, pages, 2,
+                    pointers_inside, align[src_align], align[dst_align]);
+            }
+}
+
+int
+mpx_test (int argc, const char **argv)
+{
+  testall ();
+  return 0;
+}
diff --git a/libmpx/Makefile.in b/libmpx/Makefile.in
index ff36a7f..d644af3 100644
--- a/libmpx/Makefile.in
+++ b/libmpx/Makefile.in
@@ -228,7 +228,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
diff --git a/libmpx/mpxrt/Makefile.am b/libmpx/mpxrt/Makefile.am
old mode 100644
new mode 100755
index a00a808..3280b62
--- a/libmpx/mpxrt/Makefile.am
+++ b/libmpx/mpxrt/Makefile.am
@@ -13,7 +13,8 @@ libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 
 libmpx_la_CFLAGS = -fPIC
 libmpx_la_DEPENDENCIES = libmpx.map
-libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 mpxrt.lo: mpxrt-utils.h
 mpxrt-utils.lo: mpxrt-utils.h
diff --git a/libmpx/mpxrt/Makefile.in b/libmpx/mpxrt/Makefile.in
index 646f3a9..1fdb454 100644
--- a/libmpx/mpxrt/Makefile.in
+++ b/libmpx/mpxrt/Makefile.in
@@ -222,7 +222,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -257,7 +256,9 @@ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_CFLAGS = -fPIC
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_DEPENDENCIES = libmpx.map
-@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+@LIBMPX_SUPPORTED_TRUE@                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
diff --git a/libmpx/mpxrt/libmpx.map b/libmpx/mpxrt/libmpx.map
index 90093b7..1f0fc2c 100644
--- a/libmpx/mpxrt/libmpx.map
+++ b/libmpx/mpxrt/libmpx.map
@@ -3,3 +3,8 @@ LIBMPX_1.0
   local:
 	*;
 };
+LIBMPX_2.0
+{
+  global:
+    get_bd;
+} LIBMPX_1.0;
diff --git a/libmpx/mpxrt/libtool-version b/libmpx/mpxrt/libtool-version
index 5aa6ed7..7d99255 100644
--- a/libmpx/mpxrt/libtool-version
+++ b/libmpx/mpxrt/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
old mode 100644
new mode 100755
index c29c5d9..bcdd3a6
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -51,34 +51,11 @@
 #include <sys/prctl.h>
 #include <cpuid.h>
 #include "mpxrt-utils.h"
-
-#ifdef __i386__
-
-/* i386 directory size is 4MB */
-#define NUM_L1_BITS    20
-
-#define REG_IP_IDX      REG_EIP
-#define REX_PREFIX
-
-#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
-
-#else /* __i386__ */
-
-/* x86_64 directory size is 2GB */
-#define NUM_L1_BITS   28
-
-#define REG_IP_IDX    REG_RIP
-#define REX_PREFIX    "0x48, "
-
-#define XSAVE_OFFSET_IN_FPMEM    0
-
-#endif /* !__i386__ */
+#include "mpxrt.h"
 
 #define MPX_ENABLE_BIT_NO 0
 #define BNDPRESERVE_BIT_NO 1
 
-const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
-
 struct xsave_hdr_struct
 {
   uint64_t xstate_bv;
@@ -508,3 +485,10 @@ mpxrt_cleanup (void)
   __mpxrt_utils_free ();
   process_specific_finish ();
 }
+
+/* Get address of bounds directory.  */
+void *
+get_bd (void)
+{
+  return l1base;
+}
diff --git a/libmpx/mpxrt/mpxrt.h b/libmpx/mpxrt/mpxrt.h
new file mode 100755
index 0000000..e825d7d
--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2015, Intel Corporation
+ *  All rights reserved.
+ *
+ *  @copyright
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ *    * Redistributions of source code must retain the above copyright
+ *      notice, this list of conditions and the following disclaimer.
+ *    * Redistributions in binary form must reproduce the above copyright
+ *      notice, this list of conditions and the following disclaimer in
+ *      the documentation and/or other materials provided with the
+ *      distribution.
+ *    * Neither the name of Intel Corporation nor the names of its
+ *      contributors may be used to endorse or promote products derived
+ *      from this software without specific prior written permission.
+ *
+ *  @copyright
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+ *  HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ *  OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ *  AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
+ *  WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ *  POSSIBILITY OF SUCH DAMAGE.
+ *
+ **************************************************************************/
+#ifdef __i386__
+
+/* i386 directory size is 4MB.  */
+#define NUM_L1_BITS 20
+#define NUM_L2_BITS 10
+#define NUM_IGN_BITS 2
+#define MPX_L1_ADDR_MASK  0xfffff000UL
+#define MPX_L2_ADDR_MASK  0xfffffffcUL
+#define MPX_L2_VALID_MASK 0x00000001UL
+
+#define REG_IP_IDX      REG_EIP
+#define REX_PREFIX
+
+#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
+
+#else /* __i386__ */
+
+/* x86_64 directory size is 2GB.  */
+#define NUM_L1_BITS 28
+#define NUM_L2_BITS 17
+#define NUM_IGN_BITS 3
+#define MPX_L1_ADDR_MASK  0xfffffffffffff000ULL
+#define MPX_L2_ADDR_MASK  0xfffffffffffffff8ULL
+#define MPX_L2_VALID_MASK 0x0000000000000001ULL
+
+#define REG_IP_IDX    REG_RIP
+#define REX_PREFIX    "0x48, "
+
+#define XSAVE_OFFSET_IN_FPMEM 0
+
+#endif /* !__i386__ */
+
+#define MPX_L1_SIZE ((1UL << NUM_L1_BITS) * sizeof (void *))
+
+/* Get address of bounds directory.  */
+void *
+get_bd (void);
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
old mode 100644
new mode 100755
index 72abccf..f24cdc8
--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -6,7 +7,8 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 
diff --git a/libmpx/mpxwrap/Makefile.in b/libmpx/mpxwrap/Makefile.in
index 1612ebf..df1a334 100644
--- a/libmpx/mpxwrap/Makefile.in
+++ b/libmpx/mpxwrap/Makefile.in
@@ -221,7 +221,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -247,6 +246,7 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -254,7 +254,9 @@ libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 libmpxwrappers_la_SOURCES = mpx_wrappers.c
 
diff --git a/libmpx/mpxwrap/libtool-version b/libmpx/mpxwrap/libtool-version
old mode 100644
new mode 100755
index bfe84c8..fab30fb
--- a/libmpx/mpxwrap/libtool-version
+++ b/libmpx/mpxwrap/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxwrap/mpx_wrappers.c b/libmpx/mpxwrap/mpx_wrappers.c
old mode 100644
new mode 100755
index 58670aa..7991f48
--- a/libmpx/mpxwrap/mpx_wrappers.c
+++ b/libmpx/mpxwrap/mpx_wrappers.c
@@ -26,6 +26,8 @@
 #include "stdlib.h"
 #include "string.h"
 #include <sys/mman.h>
+#include <stdint.h>
+#include "mpxrt/mpxrt.h"
 
 void *
 __mpx_wrapper_malloc (size_t size)
@@ -88,75 +90,406 @@ __mpx_wrapper_bzero (void *dst, size_t len)
   __mpx_wrapper_memset (dst, 0, len);
 }
 
-void *
-__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+/* The mpx_pointer type is used for getting bits
+   for bt_index (index in bounds table) and
+   bd_index (index in bounds directory).  */
+typedef union
+{
+  struct
+  {
+    unsigned long ignored:NUM_IGN_BITS;
+    unsigned long l2entry:NUM_L2_BITS;
+    unsigned long l1index:NUM_L1_BITS;
+  };
+  void *pointer;
+} mpx_pointer;
+
+/* The mpx_bt_entry struct represents a cell in bounds table.
+   lb is the lower bound, ub is the upper bound,
+   p is the stored pointer.  */
+struct mpx_bt_entry
 {
-  const char *s = (const char*)src;
-  char *d = (char*)dst;
-  void *ret = dst;
-  size_t offset_src = ((size_t) s) & (sizeof (void *) - 1);
-  size_t offset_dst = ((size_t) d) & (sizeof (void *) - 1);
+  void *lb;
+  void *ub;
+  void *p;
+  void *reserved;
+};
+
+/* A special type for bd is needed because bt addresses can be modified.  */
+typedef struct mpx_bt_entry * volatile * bd_type;
+
+/* Function alloc_bt is used for allocating bounds table
+   for the destination pointers if we don't have one.
+   We generate a bounds store for some pointer belonging
+   to that table and kernel allocates the table for us.  */
+static inline void __attribute__ ((bnd_legacy))
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}
 
-  if (n == 0)
-    return ret;
+/* get_bt returns address of bounds table that should
+   exist at BD[BD_INDEX].  If there is no address or the address is not valid,
+   we try to allocate a valid table.
+   If we succeed in getting bt, its address will be returned.
+   If we can't get a valid bt, NULL will be returned.  */
+__attribute__ ((bnd_legacy)) static inline struct mpx_bt_entry *
+get_bt (unsigned bd_index, bd_type bd)
+{
+  struct mpx_bt_entry *bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+                            & MPX_L2_ADDR_MASK);
+  if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    {
+      mpx_pointer ptr;
+      ptr.l1index = bd_index;
+      /* If we don't have BT, allocate it.  */
+      alloc_bt (ptr.pointer);
+      bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+            & MPX_L2_ADDR_MASK);
+      if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+        return NULL;
+    }
+  return bt;
+}
 
-  __bnd_chk_ptr_bounds (dst, n);
-  __bnd_chk_ptr_bounds (src, n);
+/* Function copy_if_possible moves elements from *FROM to *TO.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   it copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible (int elems, int elems_to_copy, struct mpx_bt_entry *from,
+                  struct mpx_bt_entry *to)
+{
+  if (elems < elems_to_copy)
+    memmove (to, from, elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (to, from, elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
+
+/* Function copy_if_possible_from_end moves elements ending at *SRC_END
+   to the place where they will end at *DST_END.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   function copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible_from_end (int elems, int elems_to_copy, struct mpx_bt_entry
+                           *src_end, struct mpx_bt_entry *dst_end)
+{
+  if (elems < elems_to_copy)
+    memmove (dst_end - elems, src_end - elems,
+             elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (dst_end - elems_to_copy,
+           src_end - elems_to_copy,
+           elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
 
-  /* Different alignment means that even if
-     pointers exist in memory, we don't how
-     pointers are aligned and therefore cann't
-     copy bounds anyway.  */
-  if (offset_src != offset_dst)
-    memmove (dst, src, n);
+/* move_bounds function copies bounds for N bytes from bt of SRC to bt of DST.
+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table of SRC
+   2) In a loop we copy whole bounds tables up to the second-to-last one
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of the bounds tables in SRC doesn't exist,
+   we skip it because it holds no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void
+move_bounds (void *dst, const void *src, size_t n)
+{
+  bd_type bd = (bd_type)get_bd ();
+  if (!(bd))
+    return;
+
+  /* We get indexes for all tables and number of elements for BT.  */
+  unsigned long bt_num_of_elems = (1UL << NUM_L2_BITS);
+  mpx_pointer addr_src, addr_dst, addr_src_end, addr_dst_end;
+  addr_src.pointer = (char *) src;
+  addr_dst.pointer = (char *) dst;
+  addr_src_end.pointer = (char *) src + n - 1;
+  addr_dst_end.pointer = (char *) dst + n - 1;
+  unsigned dst_bd_index = addr_dst.l1index;
+  unsigned src_bd_index = addr_src.l1index;
+  unsigned dst_bt_index = addr_dst.l2entry;
+  unsigned src_bt_index = addr_src.l2entry;
+
+  unsigned dst_bd_index_end = addr_dst_end.l1index;
+  unsigned src_bd_index_end = addr_src_end.l1index;
+  unsigned dst_bt_index_end = addr_dst_end.l2entry;
+  unsigned src_bt_index_end = addr_src_end.l2entry;
+
+  int elems_to_copy = src_bt_index_end - src_bt_index + 1 + (src_bd_index_end
+                      - src_bd_index) * bt_num_of_elems;
+  struct mpx_bt_entry *bt_src, *bt_dst;
+  uintptr_t bt_valid;
+  /* size1 and size2 will be used to find out what portions
+     can be used to copy data.  */
+  int size1_elem, size2_elem, size1_bytes, size2_bytes;
+
+  /* Copy from the beginning.  */
+  if (((char *) src - (char *) dst) > 0)
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+      /* We can copy the whole preliminary piece of data.  */
+      if (src_bt_index > dst_bt_index)
+        {
+          size1_elem = src_bt_index - dst_bt_index;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (bt_num_of_elems - src_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+            }
+          elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      /* We have to copy preliminary data in two parts.  */
+      else
+        {
+          size2_elem = dst_bt_index - src_bt_index;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (bt_num_of_elems - dst_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+              elems_to_copy -= bt_num_of_elems - dst_bt_index;
+
+              dst_bd_index++;
+
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (size2_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[0])))
+                return;
+              elems_to_copy -= size2_elem;
+            }
+          else
+            elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      src_bd_index++;
+
+      /* For each bounds table check if it's valid and move it.  */
+      for (; src_bd_index < src_bd_index_end; src_bd_index++)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index++;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (size1_elem, elems_to_copy, &(bt_src[0]),
+                  &(bt_dst[size2_elem])))
+                return;
+
+              elems_to_copy -= size1_elem;
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+
+            }
+        }
+    }
+  /* Copy from the end.  */
   else
     {
-      if (s < d)
-	{
-	  d += n;
-	  s += n;
-	  offset_src = (offset_src + n) & (sizeof (void *) -1);
-	  while (n-- && offset_src--)
-	    *--d = *--s;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *--d1 = *--s1;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *--d = *--s;
-	}
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+
+      if (src_bt_index_end <= dst_bt_index_end)
+      /* We can copy the whole preliminary piece of data.  */
+        {
+          size2_elem = dst_bt_index_end - src_bt_index_end;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible_from_end (src_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+            }
+          elems_to_copy -= src_bt_index_end + 1;
+        }
+      /* We have to copy preliminary data in two parts.  */
       else
-	{
-	  offset_src = sizeof (void *) - offset_src;
-	  while (n-- && offset_src--)
-	    *d++ = *s++;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *d1++ = *s1++;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *d++ = *s++;
-	}
+        {
+          size1_elem = src_bt_index_end - dst_bt_index_end;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (dst_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+              elems_to_copy -= dst_bt_index_end + 1;
+
+              dst_bd_index_end--;
+
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (size1_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[bt_num_of_elems])))
+                return;
+
+              elems_to_copy -= size1_elem;
+            }
+          else
+            elems_to_copy -= src_bt_index_end + 1;
+        }
+      src_bd_index_end--;
+      /* For each bounds table we check if there are valid pointers inside.
+         If there are some, we copy table in pre-counted portions.  */
+      for (; src_bd_index_end > src_bd_index; src_bd_index_end--)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index_end--;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (size2_elem, elems_to_copy,
+                  &(bt_src[bt_num_of_elems]), &(bt_dst[size2_elem])))
+                return;
+
+              elems_to_copy -= size2_elem;
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[dst_bt_index]), &(bt_src[src_bt_index]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+            }
+        }
     }
-  return ret;
+  return;
+}
+
+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  /* Not necessary to copy bounds if size is less than size of pointer
+     or SRC==DST.  */
+  if ((n >= sizeof (void *)) && (src != dst))
+    move_bounds (dst, src, n);
+
+  return dst;
 }
 
 

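For readers following the index arithmetic in move_bounds, here is a
minimal sketch of how an address splits into a bounds directory index and
a bounds table index, and how the number of table entries covered by a
byte range follows from that. It assumes the x86_64 values from mpxrt.h
(3 ignored bits, 17 L2 bits, 28 L1 bits). The sketch_* names are
hypothetical and not part of libmpx, and explicit shifts stand in for the
mpx_pointer bit-fields.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 layout, mirroring NUM_IGN_BITS, NUM_L2_BITS and
   NUM_L1_BITS from mpxrt.h.  Hypothetical names, for illustration only.  */
#define SKETCH_IGN_BITS 3
#define SKETCH_L2_BITS 17
#define SKETCH_L1_BITS 28

/* Index of the entry inside its bounds table.  */
static uint64_t
sketch_bt_index (uint64_t addr)
{
  return (addr >> SKETCH_IGN_BITS) & ((UINT64_C (1) << SKETCH_L2_BITS) - 1);
}

/* Index of the bounds table inside the bounds directory.  */
static uint64_t
sketch_bd_index (uint64_t addr)
{
  return (addr >> (SKETCH_IGN_BITS + SKETCH_L2_BITS))
         & ((UINT64_C (1) << SKETCH_L1_BITS) - 1);
}

/* Number of bounds table entries covered by the bytes
   [addr, addr + n - 1], for n > 0.  Unsigned wrap-around makes the
   result correct even when the range crosses into the next table.  */
static uint64_t
sketch_elems_to_copy (uint64_t addr, uint64_t n)
{
  uint64_t end = addr + n - 1;
  return sketch_bt_index (end) - sketch_bt_index (addr) + 1
         + (sketch_bd_index (end) - sketch_bd_index (addr))
           * (UINT64_C (1) << SKETCH_L2_BITS);
}

void
sketch_self_check (void)
{
  /* An aligned 8-byte pointer occupies exactly one entry.  */
  assert (sketch_elems_to_copy (0, 8) == 1);
  /* The same 8 bytes at offset 1 straddle two entries.  */
  assert (sketch_elems_to_copy (1, 8) == 2);
}
```

The second check is the unaligned case: the byte range touches one more
entry than the number of pointers it holds, which is why the count is
derived from the first and last byte addresses and not from n alone.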

* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-12-08 10:46               ` Aleksandra Tsvetkova
@ 2015-12-08 10:54                 ` Aleksandra Tsvetkova
  2015-12-11 14:35                   ` Ilya Enkovich
  0 siblings, 1 reply; 14+ messages in thread
From: Aleksandra Tsvetkova @ 2015-12-08 10:54 UTC (permalink / raw)
  To: Ilya Enkovich; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1453 bytes --]

The wrong version of the patch was attached.

On Tue, Dec 8, 2015 at 1:46 PM, Aleksandra Tsvetkova <astsvetk@gmail.com> wrote:
> gcc/testsuite/ChangeLog
> 2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
>
>     * gcc.target/i386/mpx/memmove-1.c: New test for __mpx_wrapper_memmove.
>     * gcc.target/i386/mpx/memmove-2.c: New test covering fail on spec.
>
> libmpx/ChangeLog
> 2015-10-28  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
>
>     * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
>     * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
>     * libmpx/Makefile.in: Regenerate.
>     * mpxrt/Makefile.in: Regenerate.
>     * libmpxwrap/Makefile.in: Regenerate.
>     * mpxrt/libtool-version: New version.
>     * libmpxwrap/libtool-version: Likewise.
>     * mpxrt/libmpx.map: Add new version and a new symbol.
>     * mpxrt/mpxrt.h: New file.
>     * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
>     (REG_IP_IDX): Moved to mpxrt.h.
>     (REX_PREFIX): Moved to mpxrt.h.
>     (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
>     (MPX_L1_SIZE): Moved to mpxrt.h.
>     * libmpxwrap/mpx_wrappers.c: (__mpx_wrapper_memmove): Rewritten.
>     (mpx_pointer): New type.
>     (mpx_bt_entry): New type.
>     (alloc_bt): New function.
>     (get_bt): New function.
>     (copy_if_possible): New function.
>     (copy_if_possible_from_end): New function.
>     (move_bounds): New function.
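
The contract of copy_if_possible listed above can be sketched as follows.
This is an illustrative restatement, not the code from the patch:
sketch_entry and sketch_copy_chunk are hypothetical names, and size_t
counts replace the int parameters.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as a bounds table entry: lower bound, upper bound,
   stored pointer value and a reserved word.  Hypothetical name.  */
struct sketch_entry
{
  void *lb;
  void *ub;
  void *p;
  void *reserved;
};

/* CHUNK is how many entries fit in the current table portion,
   REMAINING is how many entries are still left to move overall.
   Copy the smaller of the two and return 1 when that finishes the
   whole request, 0 when more portions are still needed.  */
int
sketch_copy_chunk (size_t chunk, size_t remaining,
                   const struct sketch_entry *from,
                   struct sketch_entry *to)
{
  size_t count = chunk < remaining ? chunk : remaining;
  memmove (to, from, count * sizeof (struct sketch_entry));
  return chunk >= remaining;
}
```

A caller would stop as soon as the function returns 1 and otherwise
subtract the portion size from the remaining count before moving on to
the next table, which is the pattern move_bounds follows.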

[-- Attachment #2: patch_v3.diff --]
[-- Type: text/plain, Size: 31511 bytes --]

diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove-1.c b/gcc/testsuite/gcc.target/i386/mpx/memmove-1.c
new file mode 100755
index 0000000..57030a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove-1.c
@@ -0,0 +1,119 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt_index are bt indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt_index_end is the bt index of the last element of src.
+   pointers_inside determines if the array being copied includes pointers.
+   src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+      int bd_index_end, int src_bt_index_end, int pointers_inside,
+      int src_align, int dst_align)
+{
+  const int n =
+    src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+    {
+      return 0;
+    }
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+    src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+    dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += dst_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+    }
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        {
+          if (dst[i] != arr + i)
+            abort ();
+          if (__bnd_get_ptr_lbound (dst[i]) != arr + i)
+            abort ();
+          if (__bnd_get_ptr_ubound (dst[i]) != arr + 2 * i)
+            abort ();
+        }
+    }
+  free (arr);
+  return 0;
+}
+
+/* Call testall to test common cases of memmove for MPX.  */
+void
+testall ()
+{
+  int align[3];
+  align[0] = 0;
+  align[1] = 1;
+  align[2] = 7;
+  for (int pointers_inside = 0; pointers_inside < 2; pointers_inside++)
+    for (int src_bigger_dst = 0; src_bigger_dst < 2; src_bigger_dst++)
+      for (int src_align = 0; src_align < 3; src_align ++)
+        for (int dst_align = 0; dst_align < 3; dst_align ++)
+          for (int pages = 0; pages < 4; pages++)
+            {
+              test (src_bigger_dst, 1, 2, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, 2, pages, 2, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 3, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, bt_num_of_elems - 2, pages, 2,
+                    pointers_inside, align[src_align], align[dst_align]);
+            }
+}
+
+int
+mpx_test (int argc, const char **argv)
+{
+  testall ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c b/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c
new file mode 100755
index 0000000..0adfa4c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   This test checks that a bug making spec2000 test 255.vortex
+   fail with SEGFAULT is fixed.  */
+
+int
+mpx_test (int argc, const char **argv)
+{
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           2 * bt_num_of_elems * sizeof (void *));
+  void **src = arr, **dst = arr, **ptr = arr;
+  src += 10;
+  dst += 1;
+  ptr += bt_num_of_elems + 100;
+  ptr[0] = __bnd_set_ptr_bounds (arr + 1, sizeof (void *) + 1);
+  memmove (dst, src, 5 * sizeof (void *));
+  return 0;
+}
diff --git a/libmpx/Makefile.in b/libmpx/Makefile.in
index ff36a7f..d644af3 100644
--- a/libmpx/Makefile.in
+++ b/libmpx/Makefile.in
@@ -228,7 +228,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
diff --git a/libmpx/mpxrt/Makefile.am b/libmpx/mpxrt/Makefile.am
old mode 100644
new mode 100755
index a00a808..3280b62
--- a/libmpx/mpxrt/Makefile.am
+++ b/libmpx/mpxrt/Makefile.am
@@ -13,7 +13,8 @@ libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 
 libmpx_la_CFLAGS = -fPIC
 libmpx_la_DEPENDENCIES = libmpx.map
-libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 mpxrt.lo: mpxrt-utils.h
 mpxrt-utils.lo: mpxrt-utils.h
diff --git a/libmpx/mpxrt/Makefile.in b/libmpx/mpxrt/Makefile.in
index 646f3a9..1fdb454 100644
--- a/libmpx/mpxrt/Makefile.in
+++ b/libmpx/mpxrt/Makefile.in
@@ -222,7 +222,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -257,7 +256,9 @@ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_CFLAGS = -fPIC
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_DEPENDENCIES = libmpx.map
-@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+@LIBMPX_SUPPORTED_TRUE@                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
diff --git a/libmpx/mpxrt/libmpx.map b/libmpx/mpxrt/libmpx.map
index 90093b7..1f0fc2c 100644
--- a/libmpx/mpxrt/libmpx.map
+++ b/libmpx/mpxrt/libmpx.map
@@ -3,3 +3,8 @@ LIBMPX_1.0
   local:
 	*;
 };
+LIBMPX_2.0
+{
+  global:
+    get_bd;
+} LIBMPX_1.0;
diff --git a/libmpx/mpxrt/libtool-version b/libmpx/mpxrt/libtool-version
index 5aa6ed7..7d99255 100644
--- a/libmpx/mpxrt/libtool-version
+++ b/libmpx/mpxrt/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
old mode 100644
new mode 100755
index c29c5d9..bcdd3a6
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -51,34 +51,11 @@
 #include <sys/prctl.h>
 #include <cpuid.h>
 #include "mpxrt-utils.h"
-
-#ifdef __i386__
-
-/* i386 directory size is 4MB */
-#define NUM_L1_BITS    20
-
-#define REG_IP_IDX      REG_EIP
-#define REX_PREFIX
-
-#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
-
-#else /* __i386__ */
-
-/* x86_64 directory size is 2GB */
-#define NUM_L1_BITS   28
-
-#define REG_IP_IDX    REG_RIP
-#define REX_PREFIX    "0x48, "
-
-#define XSAVE_OFFSET_IN_FPMEM    0
-
-#endif /* !__i386__ */
+#include "mpxrt.h"
 
 #define MPX_ENABLE_BIT_NO 0
 #define BNDPRESERVE_BIT_NO 1
 
-const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
-
 struct xsave_hdr_struct
 {
   uint64_t xstate_bv;
@@ -508,3 +485,10 @@ mpxrt_cleanup (void)
   __mpxrt_utils_free ();
   process_specific_finish ();
 }
+
+/* Get address of bounds directory.  */
+void *
+get_bd (void)
+{
+  return l1base;
+}
diff --git a/libmpx/mpxrt/mpxrt.h b/libmpx/mpxrt/mpxrt.h
new file mode 100755
index 0000000..e825d7d
--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2015, Intel Corporation
+ *  All rights reserved.
+ *
+ *  @copyright
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ *    * Redistributions of source code must retain the above copyright
+ *      notice, this list of conditions and the following disclaimer.
+ *    * Redistributions in binary form must reproduce the above copyright
+ *      notice, this list of conditions and the following disclaimer in
+ *      the documentation and/or other materials provided with the
+ *      distribution.
+ *    * Neither the name of Intel Corporation nor the names of its
+ *      contributors may be used to endorse or promote products derived
+ *      from this software without specific prior written permission.
+ *
+ *  @copyright
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+ *  HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ *  OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ *  AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
+ *  WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ *  POSSIBILITY OF SUCH DAMAGE.
+ *
+ **************************************************************************/
+#ifdef __i386__
+
+/* i386 directory size is 4MB.  */
+#define NUM_L1_BITS 20
+#define NUM_L2_BITS 10
+#define NUM_IGN_BITS 2
+#define MPX_L1_ADDR_MASK  0xfffff000UL
+#define MPX_L2_ADDR_MASK  0xfffffffcUL
+#define MPX_L2_VALID_MASK 0x00000001UL
+
+#define REG_IP_IDX      REG_EIP
+#define REX_PREFIX
+
+#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
+
+#else /* __i386__ */
+
+/* x86_64 directory size is 2GB.  */
+#define NUM_L1_BITS 28
+#define NUM_L2_BITS 17
+#define NUM_IGN_BITS 3
+#define MPX_L1_ADDR_MASK  0xfffffffffffff000ULL
+#define MPX_L2_ADDR_MASK  0xfffffffffffffff8ULL
+#define MPX_L2_VALID_MASK 0x0000000000000001ULL
+
+#define REG_IP_IDX    REG_RIP
+#define REX_PREFIX    "0x48, "
+
+#define XSAVE_OFFSET_IN_FPMEM 0
+
+#endif /* !__i386__ */
+
+#define MPX_L1_SIZE ((1UL << NUM_L1_BITS) * sizeof (void *))
+
+/* Get address of bounds directory.  */
+void *
+get_bd (void);
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
old mode 100644
new mode 100755
index 72abccf..f24cdc8
--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -6,7 +7,8 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 
diff --git a/libmpx/mpxwrap/Makefile.in b/libmpx/mpxwrap/Makefile.in
index 1612ebf..df1a334 100644
--- a/libmpx/mpxwrap/Makefile.in
+++ b/libmpx/mpxwrap/Makefile.in
@@ -221,7 +221,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -247,6 +246,7 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -254,7 +254,9 @@ libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 libmpxwrappers_la_SOURCES = mpx_wrappers.c
 
diff --git a/libmpx/mpxwrap/libtool-version b/libmpx/mpxwrap/libtool-version
old mode 100644
new mode 100755
index bfe84c8..fab30fb
--- a/libmpx/mpxwrap/libtool-version
+++ b/libmpx/mpxwrap/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxwrap/mpx_wrappers.c b/libmpx/mpxwrap/mpx_wrappers.c
old mode 100644
new mode 100755
index 58670aa..7991f48
--- a/libmpx/mpxwrap/mpx_wrappers.c
+++ b/libmpx/mpxwrap/mpx_wrappers.c
@@ -26,6 +26,8 @@
 #include "stdlib.h"
 #include "string.h"
 #include <sys/mman.h>
+#include <stdint.h>
+#include "mpxrt/mpxrt.h"
 
 void *
 __mpx_wrapper_malloc (size_t size)
@@ -88,75 +90,406 @@ __mpx_wrapper_bzero (void *dst, size_t len)
   __mpx_wrapper_memset (dst, 0, len);
 }
 
-void *
-__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+/* The mpx_pointer type is used for getting bits
+   for bt_index (index in bounds table) and
+   bd_index (index in bounds directory).  */
+typedef union
+{
+  struct
+  {
+    unsigned long ignored:NUM_IGN_BITS;
+    unsigned long l2entry:NUM_L2_BITS;
+    unsigned long l1index:NUM_L1_BITS;
+  };
+  void *pointer;
+} mpx_pointer;
+
+/* The mpx_bt_entry struct represents a cell in bounds table.
+   lb is the lower bound, ub is the upper bound,
+   p is the stored pointer.  */
+struct mpx_bt_entry
 {
-  const char *s = (const char*)src;
-  char *d = (char*)dst;
-  void *ret = dst;
-  size_t offset_src = ((size_t) s) & (sizeof (void *) - 1);
-  size_t offset_dst = ((size_t) d) & (sizeof (void *) - 1);
+  void *lb;
+  void *ub;
+  void *p;
+  void *reserved;
+};
+
+/* A special type for bd is needed because bt addresses can be modified.  */
+typedef struct mpx_bt_entry * volatile * bd_type;
+
+/* Function alloc_bt is used for allocating bounds table
+   for the destination pointers if we don't have one.
+   We generate a bounds store for some pointer belonging
+   to that table and kernel allocates the table for us.  */
+static inline void __attribute__ ((bnd_legacy))
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}
 
-  if (n == 0)
-    return ret;
+/* get_bt returns address of bounds table that should
+   exist at BD[BD_INDEX].  If there is no address or the address is not valid,
+   we try to allocate a valid table.
+   If we succeed in getting bt, its address will be returned.
+   If we can't get a valid bt, NULL will be returned.  */
+__attribute__ ((bnd_legacy)) static inline struct mpx_bt_entry *
+get_bt (unsigned bd_index, bd_type bd)
+{
+  struct mpx_bt_entry *bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+                            & MPX_L2_ADDR_MASK);
+  if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    {
+      mpx_pointer ptr;
+      ptr.l1index = bd_index;
+      /* If we don't have BT, allocate it.  */
+      alloc_bt (ptr.pointer);
+      bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+            & MPX_L2_ADDR_MASK);
+      if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    return NULL;
+    }
+  return bt;
+}
 
-  __bnd_chk_ptr_bounds (dst, n);
-  __bnd_chk_ptr_bounds (src, n);
+/* Function copy_if_possible moves elements from *FROM to *TO.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   it copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible (int elems, int elems_to_copy, struct mpx_bt_entry *from,
+                  struct mpx_bt_entry *to)
+{
+  if (elems < elems_to_copy)
+    memmove (to, from, elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (to, from, elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
+
+/* Function copy_if_possible_from_end moves elements ending at *SRC_END
+   to the place where they will end at *DST_END.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   function copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible_from_end (int elems, int elems_to_copy, struct mpx_bt_entry
+                           *src_end, struct mpx_bt_entry *dst_end)
+{
+  if (elems < elems_to_copy)
+    memmove (dst_end - elems, src_end - elems,
+             elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (dst_end - elems_to_copy,
+           src_end - elems_to_copy,
+           elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
 
-  /* Different alignment means that even if
-     pointers exist in memory, we don't how
-     pointers are aligned and therefore cann't
-     copy bounds anyway.  */
-  if (offset_src != offset_dst)
-    memmove (dst, src, n);
+/* move_bounds function copies bounds for N bytes from bt of SRC to bt of DST.
+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table of SRC
+   2) In loop we copy whole bound tables till the second-last one
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of bound tables in SRC doesn't exist,
+   we skip it because there are no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void
+move_bounds (void *dst, const void *src, size_t n)
+{
+  bd_type bd = (bd_type)get_bd ();
+  if (!(bd))
+    return;
+
+  /* We get indexes for all tables and number of elements for BT.  */
+  unsigned long bt_num_of_elems = (1UL << NUM_L2_BITS);
+  mpx_pointer addr_src, addr_dst, addr_src_end, addr_dst_end;
+  addr_src.pointer = (char *) src;
+  addr_dst.pointer = (char *) dst;
+  addr_src_end.pointer = (char *) src + n - 1;
+  addr_dst_end.pointer = (char *) dst + n - 1;
+  unsigned dst_bd_index = addr_dst.l1index;
+  unsigned src_bd_index = addr_src.l1index;
+  unsigned dst_bt_index = addr_dst.l2entry;
+  unsigned src_bt_index = addr_src.l2entry;
+
+  unsigned dst_bd_index_end = addr_dst_end.l1index;
+  unsigned src_bd_index_end = addr_src_end.l1index;
+  unsigned dst_bt_index_end = addr_dst_end.l2entry;
+  unsigned src_bt_index_end = addr_src_end.l2entry;
+
+  int elems_to_copy = src_bt_index_end - src_bt_index + 1 + (src_bd_index_end
+                      - src_bd_index) * bt_num_of_elems;
+  struct mpx_bt_entry *bt_src, *bt_dst;
+  uintptr_t bt_valid;
+  /* size1 and size2 will be used to find out what portions
+     can be used to copy data.  */
+  int size1_elem, size2_elem, size1_bytes, size2_bytes;
+
+  /* Copy from the beginning.  */
+  if (((char *) src - (char *) dst) > 0)
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+      /* We can copy the whole preliminary piece of data.  */
+      if (src_bt_index > dst_bt_index)
+        {
+          size1_elem = src_bt_index - dst_bt_index;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (bt_num_of_elems - src_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+            }
+          elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      /* We have to copy preliminary data in two parts.  */
+      else
+        {
+          size2_elem = dst_bt_index - src_bt_index;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (bt_num_of_elems - dst_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+              elems_to_copy -= bt_num_of_elems - dst_bt_index;
+
+              dst_bd_index++;
+
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (size2_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[0])))
+                return;
+              elems_to_copy -= size2_elem;
+            }
+          else
+            elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      src_bd_index++;
+
+      /* For each bounds table check if it's valid and move it.  */
+      for (; src_bd_index < src_bd_index_end; src_bd_index++)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index++;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* The last bounds table may be only partially covered,
+         so we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (size1_elem, elems_to_copy, &(bt_src[0]),
+                  &(bt_dst[size2_elem])))
+                return;
+
+              elems_to_copy -= size1_elem;
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+
+            }
+        }
+    }
+  /* Copy from the end.  */
   else
     {
-      if (s < d)
-	{
-	  d += n;
-	  s += n;
-	  offset_src = (offset_src + n) & (sizeof (void *) -1);
-	  while (n-- && offset_src--)
-	    *--d = *--s;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *--d1 = *--s1;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *--d = *--s;
-	}
+      /* Copy the part of SRC that lies in its last bounds table.  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+
+      if (src_bt_index_end <= dst_bt_index_end)
+      /* We can copy the whole preliminary piece of data.  */
+        {
+          size2_elem = dst_bt_index_end - src_bt_index_end;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible_from_end (src_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+            }
+          elems_to_copy -= src_bt_index_end + 1;
+        }
+      /* We have to copy preliminary data in two parts.  */
       else
-	{
-	  offset_src = sizeof (void *) - offset_src;
-	  while (n-- && offset_src--)
-	    *d++ = *s++;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *d1++ = *s1++;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *d++ = *s++;
-	}
+        {
+          size1_elem = src_bt_index_end - dst_bt_index_end;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (dst_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+              elems_to_copy -= dst_bt_index_end + 1;
+
+              dst_bd_index_end--;
+
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (size1_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[bt_num_of_elems])))
+                return;
+
+              elems_to_copy -= size1_elem;
+            }
+          else
+            elems_to_copy -= src_bt_index_end + 1;
+        }
+      src_bd_index_end--;
+      /* For each bounds table we check if there are valid pointers inside.
+         If there are some, we copy table in pre-counted portions.  */
+      for (; src_bd_index_end > src_bd_index; src_bd_index_end--)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index_end--;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* The first bounds table may be only partially covered,
+         so we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+          {
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            if (copy_if_possible_from_end (size2_elem, elems_to_copy,
+                &(bt_src[bt_num_of_elems]), &(bt_dst[size2_elem])))
+              return;
+
+            elems_to_copy -= size2_elem;
+            dst_bd_index_end--;
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            memmove (&(bt_dst[dst_bt_index]), &(bt_src[src_bt_index]),
+                     elems_to_copy * sizeof (struct mpx_bt_entry));
+          }
+        }
     }
-  return ret;
+  return;
+}
+
+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  /* Not necessary to copy bounds if size is less then size of pointer
+     or SRC=DST.  */
+  if ((n >= sizeof (void *)) || (src != dst))
+    move_bounds (dst, src, n);
+
+return dst;
 }
 
 


* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-12-08 10:54                 ` Aleksandra Tsvetkova
@ 2015-12-11 14:35                   ` Ilya Enkovich
  2016-01-20 13:20                     ` Matthias Klose
  0 siblings, 1 reply; 14+ messages in thread
From: Ilya Enkovich @ 2015-12-11 14:35 UTC (permalink / raw)
  To: Aleksandra Tsvetkova; +Cc: gcc-patches

On 08 Dec 13:53, Aleksandra Tsvetkova wrote:
> Wrong version of patch was attached.
> 
> On Tue, Dec 8, 2015 at 1:46 PM, Aleksandra Tsvetkova <astsvetk@gmail.com> wrote:
> > gcc/testsuite/ChangeLog
> > 2015-10-27  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>
> >
> >     * gcc.target/i386/mpx/memmove-1.c: New test for __mpx_wrapper_memmove.
> >     * gcc.target/i386/mpx/memmove-2.c: New test covering fail on spec.

memmove-2.c has Windows-style end of lines.

> +  /* Not necessary to copy bounds if size is less then size of pointer
> +     or SRC=DST.  */
> +  if ((n >= sizeof (void *)) || (src != dst))
> +    move_bounds (dst, src, n);

Condition is still incorrect.
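
For reference, the comment in the quoted hunk says bounds need not be
copied when the size is smaller than a pointer or when SRC equals DST.
Negating that statement gives a conjunction rather than a disjunction.
A minimal sketch of the predicate, assuming that reading of the comment
(the helper name need_move_bounds is made up for illustration and is
not part of libmpx):

```c
#include <stddef.h>

/* Hypothetical helper, not part of libmpx: returns nonzero when the
   bounds for the copied range have to be moved as well.  Skipping is
   allowed when N is smaller than a pointer OR SRC == DST, so moving
   is needed only when both of those conditions are false.  */
static int
need_move_bounds (const void *dst, const void *src, size_t n)
{
  return n >= sizeof (void *) && src != dst;
}
```

With the "||" form from the quoted hunk, a one-byte move between
different addresses, or a large move onto itself, would still call
move_bounds.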

I fixed it, bootstrapped, regtested and applied to trunk.  Here is the committed version.

Thanks,
Ilya
--
libmpx/

2015-12-11  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>

	* mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info
	option.
	* mpxwrap/Makefile.am (libmpxwrappers_la_LDFLAGS): Likewise and
	fix include path.
	* Makefile.in: Regenerate.
	* mpxrt/Makefile.in: Regenerate.
	* mpxwrap/Makefile.in: Regenerate.
	* mpxrt/libtool-version: New version.
	* mpxwrap/libtool-version: Likewise.
	* mpxrt/libmpx.map: Add new version and a new symbol.
	* mpxrt/mpxrt.h: New file.
	* mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
	(REG_IP_IDX): Moved to mpxrt.h.
	(REX_PREFIX): Moved to mpxrt.h.
	(XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
	(MPX_L1_SIZE): Moved to mpxrt.h.
	* mpxwrap/mpx_wrappers.c (mpx_pointer): New type.
	(mpx_bt_entry): New type.
	(alloc_bt): New function.
	(get_bt): New function.
	(copy_if_possible): New function.
	(copy_if_possible_from_end): New function.
	(move_bounds): New function.
	(__mpx_wrapper_memmove): Use move_bounds to copy bounds.

gcc/testsuite/

2015-12-11  Tsvetkova Alexandra  <aleksandra.tsvetkova@intel.com>

	* gcc.target/i386/mpx/memmove-1.c: New test.
	* gcc.target/i386/mpx/memmove-2.c: New test.
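
As a rough illustration of what move_bounds has to work out before it
copies anything: in the version posted earlier in this thread it counts
the bounds-table entries covered by the source range and picks a copy
direction from the relative order of SRC and DST.  The sketch below
shows the same arithmetic on plain integers.  It assumes the x86_64
layout from mpxrt.h, and both helper names are hypothetical, not part
of libmpx.

```c
#include <stdint.h>

/* x86_64 value from mpxrt.h in this patch: the low 3 address bits are
   ignored, so every 8-byte slot owns one bounds-table entry.  */
#define NUM_IGN_BITS 3

/* Hypothetical helper, not part of libmpx: number of bounds-table
   entries covered by the N bytes starting at ADDR.  N must be nonzero.
   For addresses inside the range the bounds directory maps, this
   matches a count done with separate bd and bt indexes.  */
static uint64_t
bt_entries_spanned (uint64_t addr, uint64_t n)
{
  uint64_t first_slot = addr >> NUM_IGN_BITS;
  uint64_t last_slot = (addr + n - 1) >> NUM_IGN_BITS;
  return last_slot - first_slot + 1;
}

/* Hypothetical helper: nonzero when entries should be copied starting
   from the lowest one.  That is safe for overlapping ranges when SRC
   lies above DST; otherwise copying starts from the highest entry.  */
static int
copy_from_beginning (uint64_t dst, uint64_t src)
{
  return src > dst;
}
```

Note that an unaligned range can touch one more entry than its size
divided by 8 suggests, for example 2 bytes starting at address 7.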


diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove-1.c b/gcc/testsuite/gcc.target/i386/mpx/memmove-1.c
new file mode 100755
index 0000000..0efd030
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove-1.c
@@ -0,0 +1,117 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <string.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt_index are bt indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt_index_end is bt index of the last element of src.
+   pointers_inside determines if array being copied includes pointers.
+   src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+      int bd_index_end, int src_bt_index_end, int pointers_inside,
+      int src_align, int dst_align)
+{
+  const int n =
+    src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+    {
+      return 0;
+    }
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+    src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+    dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += dst_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+    }
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+    {
+      for (int i = 0; i < n; i++)
+        {
+          if (dst[i] != arr + i)
+            abort ();
+          if (__bnd_get_ptr_lbound (dst[i]) != arr + i)
+            abort ();
+          if (__bnd_get_ptr_ubound (dst[i]) != arr + 2 * i)
+            abort ();
+        }
+    }
+  free (arr);
+  return 0;
+}
+
+/* Call testall to test common cases of memmove for MPX.  */
+void
+testall ()
+{
+  int align[3];
+  align[0] = 0;
+  align[1] = 1;
+  align[2] = 7;
+  for (int pointers_inside = 0; pointers_inside < 2; pointers_inside++)
+    for (int src_bigger_dst = 0; src_bigger_dst < 2; src_bigger_dst++)
+      for (int src_align = 0; src_align < 3; src_align ++)
+        for (int dst_align = 0; dst_align < 3; dst_align ++)
+          for (int pages = 0; pages < 4; pages++)
+            {
+              test (src_bigger_dst, 1, 2, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, 2, pages, 2, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 1, pages, 1, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 2, 3, pages, 12, pointers_inside,
+                    align[src_align], align[dst_align]);
+              test (src_bigger_dst, 1, bt_num_of_elems - 2, pages, 2,
+                    pointers_inside, align[src_align], align[dst_align]);
+            }
+}
+
+int
+mpx_test (int argc, const char **argv)
+{
+  testall ();
+  return 0;
+}
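
A note on the posix_memalign call in the test above: the requested
alignment, 1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS), is the amount
of address space covered by one bounds table, so the array starts at
bt index 0 and element k of the array lands at bt index k (modulo the
table size).  The sketch below checks that arithmetic on plain
integers.  It assumes the x86_64 values, and the helper names are
hypothetical.

```c
#include <stdint.h>

/* x86_64 values used by the test above.  */
#define MPX_NUM_L2_BITS 17
#define MPX_NUM_IGN_BITS 3

/* Hypothetical helper: bytes of address space covered by one
   bounds table.  */
static uint64_t
bt_span_bytes (void)
{
  return 1ULL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS);
}

/* Hypothetical helper: index of ADDR inside its bounds table.  */
static uint64_t
bt_index_of (uint64_t addr)
{
  return (addr >> MPX_NUM_IGN_BITS) & ((1ULL << MPX_NUM_L2_BITS) - 1);
}
```

A byte offset of up to 7, as used for the unaligned cases, leaves the
bt index of an element's first byte unchanged.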
diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c b/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c
new file mode 100755
index 0000000..e1d78fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include <stdint.h>
+#include <string.h>
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+
+/* Function to test MPX wrapper of memmove function.
+   Check case with no BT allocated for data.  */
+
+int
+mpx_test (int argc, const char **argv)
+{
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+           1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+           2 * bt_num_of_elems * sizeof (void *));
+  void **src = arr, **dst = arr, **ptr = arr;
+  src += 10;
+  dst += 1;
+  ptr += bt_num_of_elems + 100;
+  ptr[0] = __bnd_set_ptr_bounds (arr + 1, sizeof (void *) + 1);
+  memmove (dst, src, 5 * sizeof (void *));
+  return 0;
+}
diff --git a/libmpx/Makefile.in b/libmpx/Makefile.in
index ff36a7f..d644af3 100644
--- a/libmpx/Makefile.in
+++ b/libmpx/Makefile.in
@@ -228,7 +228,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
diff --git a/libmpx/mpxrt/Makefile.am b/libmpx/mpxrt/Makefile.am
old mode 100644
new mode 100755
index a00a808..3280b62
--- a/libmpx/mpxrt/Makefile.am
+++ b/libmpx/mpxrt/Makefile.am
@@ -13,7 +13,8 @@ libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 
 libmpx_la_CFLAGS = -fPIC
 libmpx_la_DEPENDENCIES = libmpx.map
-libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 mpxrt.lo: mpxrt-utils.h
 mpxrt-utils.lo: mpxrt-utils.h
diff --git a/libmpx/mpxrt/Makefile.in b/libmpx/mpxrt/Makefile.in
index 646f3a9..1fdb454 100644
--- a/libmpx/mpxrt/Makefile.in
+++ b/libmpx/mpxrt/Makefile.in
@@ -222,7 +222,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -257,7 +256,9 @@ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_SOURCES = mpxrt.c mpxrt-utils.c
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_CFLAGS = -fPIC
 @LIBMPX_SUPPORTED_TRUE@libmpx_la_DEPENDENCIES = libmpx.map
-@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx)
+@LIBMPX_SUPPORTED_TRUE@libmpx_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpx.map $(link_libmpx) \
+@LIBMPX_SUPPORTED_TRUE@                    -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
diff --git a/libmpx/mpxrt/libmpx.map b/libmpx/mpxrt/libmpx.map
index 90093b7..1f0fc2c 100644
--- a/libmpx/mpxrt/libmpx.map
+++ b/libmpx/mpxrt/libmpx.map
@@ -3,3 +3,8 @@ LIBMPX_1.0
   local:
 	*;
 };
+LIBMPX_2.0
+{
+  global:
+    get_bd;
+} LIBMPX_1.0;
diff --git a/libmpx/mpxrt/libtool-version b/libmpx/mpxrt/libtool-version
index 5aa6ed7..7d99255 100644
--- a/libmpx/mpxrt/libtool-version
+++ b/libmpx/mpxrt/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
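
For readers unfamiliar with the CURRENT:REVISION:AGE triple: on
GNU/Linux targets libtool conventionally turns it into the file suffix
(CURRENT - AGE).AGE.REVISION, so the bump from 1:0:0 to 2:0:0 changes
the library's major version.  A small sketch of that mapping follows.
The function name is hypothetical, and the mapping is the usual Linux
convention rather than something stated in this patch.

```c
#include <stdio.h>

/* Hypothetical helper: convert a libtool "CURRENT:REVISION:AGE" string
   into the "MAJOR.AGE.REVISION" suffix used on GNU/Linux.  Returns 0 on
   success and -1 if INFO is malformed or AGE exceeds CURRENT.  */
static int
libtool_to_linux_suffix (const char *info, char *out, size_t outlen)
{
  unsigned current, revision, age;

  if (sscanf (info, "%u:%u:%u", &current, &revision, &age) != 3
      || age > current)
    return -1;
  snprintf (out, outlen, "%u.%u.%u", current - age, age, revision);
  return 0;
}
```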
diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
old mode 100644
new mode 100755
index c29c5d9..bcdd3a6
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -51,34 +51,11 @@
 #include <sys/prctl.h>
 #include <cpuid.h>
 #include "mpxrt-utils.h"
-
-#ifdef __i386__
-
-/* i386 directory size is 4MB */
-#define NUM_L1_BITS    20
-
-#define REG_IP_IDX      REG_EIP
-#define REX_PREFIX
-
-#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
-
-#else /* __i386__ */
-
-/* x86_64 directory size is 2GB */
-#define NUM_L1_BITS   28
-
-#define REG_IP_IDX    REG_RIP
-#define REX_PREFIX    "0x48, "
-
-#define XSAVE_OFFSET_IN_FPMEM    0
-
-#endif /* !__i386__ */
+#include "mpxrt.h"
 
 #define MPX_ENABLE_BIT_NO 0
 #define BNDPRESERVE_BIT_NO 1
 
-const size_t MPX_L1_SIZE = (1UL << NUM_L1_BITS) * sizeof (void *);
-
 struct xsave_hdr_struct
 {
   uint64_t xstate_bv;
@@ -508,3 +485,10 @@ mpxrt_cleanup (void)
   __mpxrt_utils_free ();
   process_specific_finish ();
 }
+
+/* Get address of bounds directory.  */
+void *
+get_bd ()
+{
+  return l1base;
+}
diff --git a/libmpx/mpxrt/mpxrt.h b/libmpx/mpxrt/mpxrt.h
new file mode 100755
index 0000000..e825d7d
--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h                  -*-C++-*-
+ *
+ *************************************************************************
+ *
+ *  @copyright
+ *  Copyright (C) 2015, Intel Corporation
+ *  All rights reserved.
+ *
+ *  @copyright
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ *    * Redistributions of source code must retain the above copyright
+ *      notice, this list of conditions and the following disclaimer.
+ *    * Redistributions in binary form must reproduce the above copyright
+ *      notice, this list of conditions and the following disclaimer in
+ *      the documentation and/or other materials provided with the
+ *      distribution.
+ *    * Neither the name of Intel Corporation nor the names of its
+ *      contributors may be used to endorse or promote products derived
+ *      from this software without specific prior written permission.
+ *
+ *  @copyright
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
+ *  HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ *  OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ *  AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
+ *  WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ *  POSSIBILITY OF SUCH DAMAGE.
+ *
+ **************************************************************************/
+#ifdef __i386__
+
+/* i386 directory size is 4MB.  */
+#define NUM_L1_BITS 20
+#define NUM_L2_BITS 10
+#define NUM_IGN_BITS 2
+#define MPX_L1_ADDR_MASK  0xfffff000UL
+#define MPX_L2_ADDR_MASK  0xfffffffcUL
+#define MPX_L2_VALID_MASK 0x00000001UL
+
+#define REG_IP_IDX      REG_EIP
+#define REX_PREFIX
+
+#define XSAVE_OFFSET_IN_FPMEM    sizeof (struct _libc_fpstate)
+
+#else /* __i386__ */
+
+/* x86_64 directory size is 2GB.  */
+#define NUM_L1_BITS 28
+#define NUM_L2_BITS 17
+#define NUM_IGN_BITS 3
+#define MPX_L1_ADDR_MASK  0xfffffffffffff000ULL
+#define MPX_L2_ADDR_MASK  0xfffffffffffffff8ULL
+#define MPX_L2_VALID_MASK 0x0000000000000001ULL
+
+#define REG_IP_IDX    REG_RIP
+#define REX_PREFIX    "0x48, "
+
+#define XSAVE_OFFSET_IN_FPMEM 0
+
+#endif /* !__i386__ */
+
+#define MPX_L1_SIZE ((1UL << NUM_L1_BITS) * sizeof (void *))
+
+/* Get address of bounds directory.  */
+void *
+get_bd ();
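
To illustrate how the masks in mpxrt.h are meant to be used: a
bounds-directory entry holds the address of a bounds table plus a
valid bit in bit 0, and the wrapper code treats an entry as usable
only when both the address part and the valid bit are nonzero.  The
sketch below decodes an entry on plain integers.  It assumes the
x86_64 values and 8-byte directory entries, and the helper names are
hypothetical.

```c
#include <stdint.h>

/* x86_64 values copied from mpxrt.h above.  */
#define NUM_L1_BITS 28
#define MPX_L2_ADDR_MASK  0xfffffffffffffff8ULL
#define MPX_L2_VALID_MASK 0x0000000000000001ULL

/* Hypothetical helper: return the bounds-table address stored in a
   bounds-directory entry, or 0 if the entry is not marked valid.  */
static uint64_t
bd_entry_to_bt (uint64_t bd_entry)
{
  if (!(bd_entry & MPX_L2_VALID_MASK))
    return 0;
  return bd_entry & MPX_L2_ADDR_MASK;
}

/* Hypothetical helper: size in bytes of the bounds directory, assuming
   one 8-byte entry per L1 index as on x86_64.  */
static uint64_t
bd_size_bytes (void)
{
  return (1ULL << NUM_L1_BITS) * 8;
}
```

The second helper reproduces the "x86_64 directory size is 2GB" figure
from the header comment.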
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
old mode 100644
new mode 100755
index 72abccf..f24cdc8
--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -6,7 +7,8 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
 
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 
diff --git a/libmpx/mpxwrap/Makefile.in b/libmpx/mpxwrap/Makefile.in
index 1612ebf..df1a334 100644
--- a/libmpx/mpxwrap/Makefile.in
+++ b/libmpx/mpxwrap/Makefile.in
@@ -221,7 +221,6 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libmpx = @link_libmpx@
-link_mpx = @link_mpx@
 localedir = @localedir@
 localstatedir = @localstatedir@
 mandir = @mandir@
@@ -247,6 +246,7 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -254,7 +254,9 @@ libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
 			   -fno-chkp-check-write -fno-chkp-use-wrappers -fPIC
 
 libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
-libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map \
+               -version-info `grep -v '^\#' $(srcdir)/libtool-version`
+
 toolexeclib_LTLIBRARIES = libmpxwrappers.la
 libmpxwrappers_la_SOURCES = mpx_wrappers.c
 
diff --git a/libmpx/mpxwrap/libtool-version b/libmpx/mpxwrap/libtool-version
old mode 100644
new mode 100755
index bfe84c8..fab30fb
--- a/libmpx/mpxwrap/libtool-version
+++ b/libmpx/mpxwrap/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
diff --git a/libmpx/mpxwrap/mpx_wrappers.c b/libmpx/mpxwrap/mpx_wrappers.c
old mode 100644
new mode 100755
index 58670aa..ffa7e7e
--- a/libmpx/mpxwrap/mpx_wrappers.c
+++ b/libmpx/mpxwrap/mpx_wrappers.c
@@ -26,6 +26,8 @@
 #include "stdlib.h"
 #include "string.h"
 #include <sys/mman.h>
+#include <stdint.h>
+#include "mpxrt/mpxrt.h"
 
 void *
 __mpx_wrapper_malloc (size_t size)
@@ -88,75 +90,406 @@ __mpx_wrapper_bzero (void *dst, size_t len)
   __mpx_wrapper_memset (dst, 0, len);
 }
 
-void *
-__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+/* The mpx_pointer type is used for getting bits
+   for bt_index (index in bounds table) and
+   bd_index (index in bounds directory).  */
+typedef union
+{
+  struct
+  {
+    unsigned long ignored:NUM_IGN_BITS;
+    unsigned long l2entry:NUM_L2_BITS;
+    unsigned long l1index:NUM_L1_BITS;
+  };
+  void *pointer;
+} mpx_pointer;
+
+/* The mpx_bt_entry struct represents a cell in bounds table.
+   lb is the lower bound, ub is the upper bound,
+   p is the stored pointer.  */
+struct mpx_bt_entry
 {
-  const char *s = (const char*)src;
-  char *d = (char*)dst;
-  void *ret = dst;
-  size_t offset_src = ((size_t) s) & (sizeof (void *) - 1);
-  size_t offset_dst = ((size_t) d) & (sizeof (void *) - 1);
+  void *lb;
+  void *ub;
+  void *p;
+  void *reserved;
+};
+
+/* A special type for bd is needed because bt addresses can be modified.  */
+typedef struct mpx_bt_entry * volatile * bd_type;
+
+/* Function alloc_bt is used for allocating bounds table
+   for the destination pointers if we don't have one.
+   We generate a bounds store for some pointer belonging
+   to that table and kernel allocates the table for us.  */
+static inline void __attribute__ ((bnd_legacy))
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}
 
-  if (n == 0)
-    return ret;
+/* get_bt returns address of bounds table that should
+   exist at BD[BD_INDEX].  If there is no address or the address is not valid,
+   we try to allocate a valid table.
+   If we succeed in getting bt, its address will be returned.
+   If we can't get a valid bt, NULL will be returned.  */
+__attribute__ ((bnd_legacy)) static inline struct mpx_bt_entry *
+get_bt (unsigned bd_index, bd_type bd)
+{
+  struct mpx_bt_entry *bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+                            & MPX_L2_ADDR_MASK);
+  if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    {
+      mpx_pointer ptr;
+      ptr.l1index = bd_index;
+      /* If we don't have BT, allocate it.  */
+      alloc_bt (ptr.pointer);
+      bt = (struct mpx_bt_entry *) ((uintptr_t) bd[bd_index]
+            & MPX_L2_ADDR_MASK);
+      if (!(bt) || !((uintptr_t) bd[bd_index] & MPX_L2_VALID_MASK))
+    return NULL;
+    }
+  return bt;
+}
 
-  __bnd_chk_ptr_bounds (dst, n);
-  __bnd_chk_ptr_bounds (src, n);
+/* Function copy_if_possible moves elements from *FROM to *TO.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   it copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible (int elems, int elems_to_copy, struct mpx_bt_entry *from,
+                  struct mpx_bt_entry *to)
+{
+  if (elems < elems_to_copy)
+    memmove (to, from, elems * sizeof (struct mpx_bt_entry));
+  else
+    {
+      memmove (to, from, elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
 
-  /* Different alignment means that even if
-     pointers exist in memory, we don't how
-     pointers are aligned and therefore cann't
-     copy bounds anyway.  */
-  if (offset_src != offset_dst)
-    memmove (dst, src, n);
+/* Function copy_if_possible_from_end moves elements ending at *SRC_END
+   to the place where they will end at *DST_END.
+   If ELEMS is less than ELEMS_TO_COPY (elements we can copy),
+   function copies ELEMS elements and returns 0.
+   Otherwise, it copies ELEMS_TO_COPY elements and returns 1.  */
+__attribute__ ((bnd_legacy)) static inline int
+copy_if_possible_from_end (int elems, int elems_to_copy, struct mpx_bt_entry
+                           *src_end, struct mpx_bt_entry *dst_end)
+{
+  if (elems < elems_to_copy)
+    memmove (dst_end - elems, src_end - elems,
+             elems * sizeof (struct mpx_bt_entry));
   else
     {
-      if (s < d)
-	{
-	  d += n;
-	  s += n;
-	  offset_src = (offset_src + n) & (sizeof (void *) -1);
-	  while (n-- && offset_src--)
-	    *--d = *--s;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *--d1 = *--s1;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *--d = *--s;
-	}
+      memmove (dst_end - elems_to_copy,
+           src_end - elems_to_copy,
+           elems_to_copy * sizeof (struct mpx_bt_entry));
+      return 1;
+    }
+  return 0;
+}
+
+/* move_bounds function copies bounds for N bytes from bt of SRC to bt of DST.
+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table of SRC
+   2) In a loop we copy whole bounds tables up to the second-to-last one
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of the bounds tables in SRC doesn't exist,
+   we skip it because there are no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void
+move_bounds (void *dst, const void *src, size_t n)
+{
+  bd_type bd = (bd_type)get_bd ();
+  if (!(bd))
+    return;
+
+  /* We get indexes for all tables and number of elements for BT.  */
+  unsigned long bt_num_of_elems = (1UL << NUM_L2_BITS);
+  mpx_pointer addr_src, addr_dst, addr_src_end, addr_dst_end;
+  addr_src.pointer = (char *) src;
+  addr_dst.pointer = (char *) dst;
+  addr_src_end.pointer = (char *) src + n - 1;
+  addr_dst_end.pointer = (char *) dst + n - 1;
+  unsigned dst_bd_index = addr_dst.l1index;
+  unsigned src_bd_index = addr_src.l1index;
+  unsigned dst_bt_index = addr_dst.l2entry;
+  unsigned src_bt_index = addr_src.l2entry;
+
+  unsigned dst_bd_index_end = addr_dst_end.l1index;
+  unsigned src_bd_index_end = addr_src_end.l1index;
+  unsigned dst_bt_index_end = addr_dst_end.l2entry;
+  unsigned src_bt_index_end = addr_src_end.l2entry;
+
+  int elems_to_copy = src_bt_index_end - src_bt_index + 1 + (src_bd_index_end
+                      - src_bd_index) * bt_num_of_elems;
+  struct mpx_bt_entry *bt_src, *bt_dst;
+  uintptr_t bt_valid;
+  /* size1 and size2 will be used to find out what portions
+     can be used to copy data.  */
+  int size1_elem, size2_elem, size1_bytes, size2_bytes;
+
+  /* Copy from the beginning.  */
+  if (((char *) src - (char *) dst) > 0)
+    {
+      /* Copy everything till the end of the first bounds table (src)  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+      /* We can copy the whole preliminary piece of data.  */
+      if (src_bt_index > dst_bt_index)
+        {
+          size1_elem = src_bt_index - dst_bt_index;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (bt_num_of_elems - src_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+            }
+          elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      /* We have to copy preliminary data in two parts.  */
+      else
+        {
+          size2_elem = dst_bt_index - src_bt_index;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (bt_num_of_elems - dst_bt_index,
+                  elems_to_copy, &(bt_src[src_bt_index]),
+                  &(bt_dst[dst_bt_index])))
+                return;
+              elems_to_copy -= bt_num_of_elems - dst_bt_index;
+
+              dst_bd_index++;
+
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible (size2_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[0])))
+                return;
+              elems_to_copy -= size2_elem;
+            }
+          else
+            elems_to_copy -= bt_num_of_elems - src_bt_index;
+        }
+      src_bd_index++;
+
+      /* For each bounds table check if it's valid and move it.  */
+      for (; src_bd_index < src_bd_index_end; src_bd_index++)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index++;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index] & MPX_L2_VALID_MASK;
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible (size1_elem, elems_to_copy, &(bt_src[0]),
+                  &(bt_dst[size2_elem])))
+                return;
+
+              elems_to_copy -= size1_elem;
+              dst_bd_index++;
+              bt_dst = get_bt (dst_bd_index, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]),
+                       elems_to_copy * sizeof (struct mpx_bt_entry));
+
+            }
+        }
+    }
+  /* Copy from the end.  */
+  else
+    {
+      /* Copy the tail that lies in the last bounds table (src).  */
+      bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                & MPX_L2_ADDR_MASK);
+      bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+
+      if (src_bt_index_end <= dst_bt_index_end)
+      /* We can copy the whole preliminary piece of data.  */
+        {
+          size2_elem = dst_bt_index_end - src_bt_index_end;
+          size1_elem = bt_num_of_elems - size2_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+
+              if (copy_if_possible_from_end (src_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+            }
+          elems_to_copy -= src_bt_index_end + 1;
+        }
+      /* We have to copy preliminary data in two parts.  */
       else
-	{
-	  offset_src = sizeof (void *) - offset_src;
-	  while (n-- && offset_src--)
-	    *d++ = *s++;
-	  n++;
-	  if (!n)
-	    return ret;
-	  void **d1 = (void **)d;
-	  void **s1 = (void **)s;
-	  /* This loop will also copy bounds.  */
-	  while (n >= sizeof (void *))
-	    {
-	      n -= sizeof (void *);
-	      *d1++ = *s1++;
-	    }
-	  s = (char *)s1;
-	  d = (char *)d1;
-	  while (n--)
-	    *d++ = *s++;
-	}
+        {
+          size1_elem = src_bt_index_end - dst_bt_index_end;
+          size2_elem = bt_num_of_elems - size1_elem;
+          size1_bytes = size1_elem * sizeof (struct mpx_bt_entry);
+          size2_bytes = size2_elem * sizeof (struct mpx_bt_entry);
+
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (dst_bt_index_end + 1,
+                  elems_to_copy, &(bt_src[src_bt_index_end + 1]),
+                  &(bt_dst[dst_bt_index_end + 1])))
+                return;
+              elems_to_copy -= dst_bt_index_end + 1;
+
+              dst_bd_index_end--;
+
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              if (copy_if_possible_from_end (size1_elem, elems_to_copy,
+                  &(bt_src[size1_elem]), &(bt_dst[bt_num_of_elems])))
+                return;
+
+              elems_to_copy -= size1_elem;
+            }
+          else
+            elems_to_copy -= src_bt_index_end + 1;
+        }
+      src_bd_index_end--;
+      /* For each bounds table we check if there are valid pointers inside.
+         If there are some, we copy table in pre-counted portions.  */
+      for (; src_bd_index_end > src_bd_index; src_bd_index_end--)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (!bt_src || !bt_valid)
+            dst_bd_index_end--;
+          else
+            {
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[0]), &(bt_src[size1_elem]), size2_bytes);
+              dst_bd_index_end--;
+              bt_dst = get_bt (dst_bd_index_end, bd);
+              if (!bt_dst)
+                return;
+              memmove (&(bt_dst[size2_elem]), &(bt_src[0]), size1_bytes);
+            }
+          elems_to_copy -= bt_num_of_elems;
+        }
+
+      /* Now we have the last page, which may not be full;
+         we copy it separately.  */
+      if (elems_to_copy > 0)
+        {
+          bt_src = (struct mpx_bt_entry *) ((uintptr_t) bd[src_bd_index_end]
+                    & MPX_L2_ADDR_MASK);
+          bt_valid = (uintptr_t) bd[src_bd_index_end] & MPX_L2_VALID_MASK;
+          /* Check we have bounds to copy. */
+          if (bt_src && bt_valid)
+          {
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            if (copy_if_possible_from_end (size2_elem, elems_to_copy,
+                &(bt_src[bt_num_of_elems]), &(bt_dst[size2_elem])))
+              return;
+
+            elems_to_copy -= size2_elem;
+            dst_bd_index_end--;
+            bt_dst = get_bt (dst_bd_index_end, bd);
+            if (!bt_dst)
+              return;
+            memmove (&(bt_dst[dst_bt_index]), &(bt_src[src_bt_index]),
+                     elems_to_copy * sizeof (struct mpx_bt_entry));
+          }
+        }
     }
-  return ret;
+  return;
+}
+
+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+    return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  /* Not necessary to copy bounds if size is less than size of pointer
+     or SRC==DST.  */
+  if ((n >= sizeof (void *)) && (src != dst))
+    move_bounds (dst, src, n);
+
+  return dst;
 }
 
 

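For readers following the index arithmetic in move_bounds, here is a
minimal sketch of how an address splits into a bounds directory index
and a bounds table index, and how elems_to_copy follows from that.  It
assumes the x86_64 constants from the patch and that bit-fields are
allocated starting at the least significant bit, which is what GCC does
on x86_64.  The helper names bt_index, bd_index and entries_spanned are
hypothetical and do not appear in the patch; the patch itself uses the
mpx_pointer union instead of shifts and masks.

```c
#include <stdint.h>

/* Values copied from the x86_64 branch of the patch.  */
#define NUM_L1_BITS 28
#define NUM_L2_BITS 17
#define NUM_IGN_BITS 3

/* Hypothetical helper: index of the entry inside a bounds table,
   i.e. what the patch reads as mpx_pointer.l2entry.  */
static uint64_t
bt_index (uint64_t addr)
{
  return (addr >> NUM_IGN_BITS) & ((UINT64_C (1) << NUM_L2_BITS) - 1);
}

/* Hypothetical helper: index of the entry inside the bounds directory,
   i.e. what the patch reads as mpx_pointer.l1index.  */
static uint64_t
bd_index (uint64_t addr)
{
  return (addr >> (NUM_IGN_BITS + NUM_L2_BITS))
	 & ((UINT64_C (1) << NUM_L1_BITS) - 1);
}

/* Hypothetical helper: number of bounds table entries covered by the
   N bytes starting at SRC (N must be nonzero).  This mirrors the
   elems_to_copy expression in move_bounds.  */
static int64_t
entries_spanned (uint64_t src, uint64_t n)
{
  uint64_t end = src + n - 1;
  int64_t per_table = INT64_C (1) << NUM_L2_BITS;

  return (int64_t) bt_index (end) - (int64_t) bt_index (src) + 1
	 + ((int64_t) bd_index (end) - (int64_t) bd_index (src)) * per_table;
}
```

With these definitions, address 0x100000 maps to directory index 1 and
table index 0, and a 16-byte region starting at 0xFFFF8 spans 2 entries
even though it crosses from one bounds table into the next.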

* Re: [PATCH] New version of libmpx with new memmove wrapper
  2015-12-11 14:35                   ` Ilya Enkovich
@ 2016-01-20 13:20                     ` Matthias Klose
  2016-01-20 14:37                       ` Ilya Enkovich
  0 siblings, 1 reply; 14+ messages in thread
From: Matthias Klose @ 2016-01-20 13:20 UTC (permalink / raw)
  To: Ilya Enkovich, Aleksandra Tsvetkova; +Cc: gcc-patches

On 11.12.2015 15:34, Ilya Enkovich wrote:
> I fixed it, bootstrapped, regtested and applied to trunk.  Here is committed version.

this left libmpx/libtool-version, which now is unused and outdated. Ok to remove?

Matthias

* Re: [PATCH] New version of libmpx with new memmove wrapper
  2016-01-20 13:20                     ` Matthias Klose
@ 2016-01-20 14:37                       ` Ilya Enkovich
  0 siblings, 0 replies; 14+ messages in thread
From: Ilya Enkovich @ 2016-01-20 14:37 UTC (permalink / raw)
  To: Matthias Klose; +Cc: Aleksandra Tsvetkova, gcc-patches

2016-01-20 16:20 GMT+03:00 Matthias Klose <doko@ubuntu.com>:
> On 11.12.2015 15:34, Ilya Enkovich wrote:
>>
>> I fixed it, bootstrapped, regtested and applied to trunk.  Here is
>> committed version.
>
>
> this left libmpx/libtool-version, which now is unused and outdated. Ok to
> remove?

OK if bootstrap passes.

Thanks,
Ilya

>
> Matthias
>

Thread overview: 14+ messages
2015-11-05 10:38 [PATCH] New version of libmpx with new memmove wrapper Aleksandra Tsvetkova
2015-11-12 14:57 ` Ilya Enkovich
2015-11-23 19:51   ` Aleksandra Tsvetkova
2015-11-24 12:35     ` Ilya Enkovich
2015-11-25 15:43       ` Aleksandra Tsvetkova
2015-11-25 15:54         ` Aleksandra Tsvetkova
2015-11-26 10:49         ` Ilya Enkovich
2015-12-06 19:41           ` Aleksandra Tsvetkova
2015-12-07 16:00             ` Ilya Enkovich
2015-12-08 10:46               ` Aleksandra Tsvetkova
2015-12-08 10:54                 ` Aleksandra Tsvetkova
2015-12-11 14:35                   ` Ilya Enkovich
2016-01-20 13:20                     ` Matthias Klose
2016-01-20 14:37                       ` Ilya Enkovich
