* [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object
@ 2016-12-13 11:43 Thomas Preudhomme
2016-12-13 11:46 ` Thomas Preudhomme
2016-12-14 8:08 ` Richard Biener
0 siblings, 2 replies; 3+ messages in thread
From: Thomas Preudhomme @ 2016-12-13 11:43 UTC (permalink / raw)
To: Richard Biener, Jakub Jelinek, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1625 bytes --]
Hi,
Fix in r242869 for PR77673 (bswap loads after end of object) applies cleanly on
GCC 6 and with trivial fix for GCC 5 (gimple * in GCC 6 -> gimple in GCC 5). The
backports also bootstrap fine on x86_64-linux-gnu and testsuite shows no regression.
ChangeLog entries are as follow:
*** gcc/ChangeLog ***
2016-12-12 Thomas Preud'homme <thomas.preudhomme@arm.com>
Backport from mainline
2016-11-25 Thomas Preud'homme <thomas.preudhomme@arm.com>
PR tree-optimization/77673
* tree-ssa-math-opts.c (struct symbolic_number): Add new src field.
(init_symbolic_number): Initialize src field from src parameter.
(perform_symbolic_merge): Select most dominated statement as the
source statement. Set src field of resulting n structure from the
input src with the lowest address.
(find_bswap_or_nop): Rename source_stmt into ins_stmt.
(bswap_replace): Rename src_stmt into ins_stmt. Initially get source
of load from src field rather than insertion statement. Cancel
optimization if statement analyzed is not dominated by the insertion
statement.
(pass_optimize_bswap::execute): Rename src_stmt to ins_stmt. Compute
dominance information.
*** gcc/testsuite/ChangeLog ***
2016-12-12 Thomas Preud'homme <thomas.preudhomme@arm.com>
Backport from mainline
2016-11-25 Thomas Preud'homme <thomas.preudhomme@arm.com>
PR tree-optimization/77673
* gcc.dg/pr77673.c: New test.
Is this ok for gcc-5-branch and gcc-6-branch?
Best regards,
Thomas
[-- Attachment #2: 2_combine_profile_multilib.patch --]
[-- Type: text/x-patch, Size: 9988 bytes --]
diff --git a/gcc/config.gcc b/gcc/config.gcc
index bfd1127d6e8e647ca8c3a57dd2d58b586dffe4a5..d7508e89de0e1d7a3027da0b04e4fa36c1e95fd0 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3710,34 +3710,18 @@ case "${target}" in
# Add extra multilibs
if test "x$with_multilib_list" != x; then
arm_multilibs=`echo $with_multilib_list | sed -e 's/,/ /g'`
- case ${arm_multilibs} in
- aprofile)
- # Note that arm/t-aprofile is a
- # stand-alone make file fragment to be
- # used only with itself. We do not
- # specifically use the
- # TM_MULTILIB_OPTION framework because
- # this shorthand is more
- # pragmatic.
- tmake_profile_file="arm/t-aprofile"
- ;;
- rmprofile)
- # Note that arm/t-rmprofile is a
- # stand-alone make file fragment to be
- # used only with itself. We do not
- # specifically use the
- # TM_MULTILIB_OPTION framework because
- # this shorthand is more
- # pragmatic.
- tmake_profile_file="arm/t-rmprofile"
- ;;
- default)
- ;;
- *)
- echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2
- exit 1
- ;;
- esac
+ if test "x${arm_multilibs}" != xdefault ; then
+ for arm_multilib in ${arm_multilibs}; do
+ case ${arm_multilib} in
+ aprofile|rmprofile)
+ tmake_profile_file="arm/t-multilib"
+ ;;
+ *)
+ echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2
+ exit 1
+ ;;
+ esac
+ done
if test "x${tmake_profile_file}" != x ; then
# arm/t-aprofile and arm/t-rmprofile are only
diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile
index 90305e1206e3964e08a673e385d3198747bdffa1..2cbd8e3c8e8bcd4bed6368bfea83ece953c8dbb4 100644
--- a/gcc/config/arm/t-aprofile
+++ b/gcc/config/arm/t-aprofile
@@ -24,30 +24,13 @@
# have their default values during the configure step. We enforce
# this during the top-level configury.
-MULTILIB_OPTIONS =
-MULTILIB_DIRNAMES =
-MULTILIB_EXCEPTIONS =
-MULTILIB_MATCHES =
-MULTILIB_REUSE =
+# Arch and FPU variants to build libraries with
-# We have the following hierachy:
-# ISA: A32 (.) or T32 (thumb)
-# Architecture: ARMv7-A (v7-a), ARMv7VE (v7ve), or ARMv8-A (v8-a).
-# FPU: VFPv3-D16 (fpv3), NEONv1 (simdv1), VFPv4-D16 (fpv4),
-# NEON-VFPV4 (simdvfpv4), NEON for ARMv8 (simdv8), or None (.).
-# Float-abi: Soft (.), softfp (softfp), or hard (hardfp).
+MULTI_ARCH_OPTS_A = march=armv7-a/march=armv7ve/march=armv8-a
+MULTI_ARCH_DIRS_A = v7-a v7ve v8-a
-MULTILIB_OPTIONS += mthumb
-MULTILIB_DIRNAMES += thumb
-
-MULTILIB_OPTIONS += march=armv7-a/march=armv7ve/march=armv8-a
-MULTILIB_DIRNAMES += v7-a v7ve v8-a
-
-MULTILIB_OPTIONS += mfpu=vfpv3-d16/mfpu=neon/mfpu=vfpv4-d16/mfpu=neon-vfpv4/mfpu=neon-fp-armv8
-MULTILIB_DIRNAMES += fpv3 simdv1 fpv4 simdvfpv4 simdv8
-
-MULTILIB_OPTIONS += mfloat-abi=softfp/mfloat-abi=hard
-MULTILIB_DIRNAMES += softfp hard
+MULTI_FPU_OPTS_A = mfpu=vfpv3-d16/mfpu=neon/mfpu=vfpv4-d16/mfpu=neon-vfpv4/mfpu=neon-fp-armv8
+MULTI_FPU_DIRS_A = fpv3 simdv1 fpv4 simdvfpv4 simdv8
# Option combinations to build library with
@@ -71,7 +54,11 @@ MULTILIB_REQUIRED += *march=armv8-a
MULTILIB_REQUIRED += *march=armv8-a/mfpu=neon-fp-armv8/mfloat-abi=*
+# Matches
+
# CPU Matches
+MULTILIB_MATCHES += march?armv7-a=mcpu?marvell-pj4
+MULTILIB_MATCHES += march?armv7-a=mcpu?generic-armv7-a
MULTILIB_MATCHES += march?armv7-a=mcpu?cortex-a8
MULTILIB_MATCHES += march?armv7-a=mcpu?cortex-a9
MULTILIB_MATCHES += march?armv7-a=mcpu?cortex-a5
diff --git a/gcc/config/arm/t-multilib b/gcc/config/arm/t-multilib
new file mode 100644
index 0000000000000000000000000000000000000000..642e731765534904b752d7e15fb6b68a3db2708e
--- /dev/null
+++ b/gcc/config/arm/t-multilib
@@ -0,0 +1,69 @@
+# Copyright (C) 2016 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+
+# This is a target makefile fragment that attempts to get
+# multilibs built for the range of CPU's, FPU's and ABI's that
+# are relevant for the ARM architecture. It should not be used in
+# conjunction with another make file fragment and assumes --with-arch,
+# --with-cpu, --with-fpu, --with-float, --with-mode have their default
+# values during the configure step. We enforce this during the
+# top-level configury.
+
+MULTILIB_OPTIONS =
+MULTILIB_DIRNAMES =
+MULTILIB_EXCEPTIONS =
+MULTILIB_MATCHES =
+MULTILIB_REUSE =
+
+comma := ,
+tm_multilib_list := $(subst $(comma), ,$(TM_MULTILIB_CONFIG))
+
+HAS_APROFILE := $(filter aprofile,$(tm_multilib_list))
+HAS_RMPROFILE := $(filter rmprofile,$(tm_multilib_list))
+
+ifneq (,$(HAS_APROFILE))
+include $(srcdir)/config/arm/t-aprofile
+endif
+ifneq (,$(HAS_RMPROFILE))
+include $(srcdir)/config/arm/t-rmprofile
+endif
+SEP := $(and $(HAS_APROFILE),$(HAS_RMPROFILE),/)
+
+
+# We have the following hierachy:
+# ISA: A32 (.) or T16/T32 (thumb)
+# Architecture: ARMv6-M (v6-m), ARMv7-M (v7-m), ARMv7E-M (v7e-m),
+# ARMv7 (v7-ar), ARMv7-A (v7-a), ARMv7VE (v7ve),
+# ARMv8-M Baseline (v8-m.base), ARMv8-M Mainline (v8-m.main)
+# or ARMv8-A (v8-a).
+# FPU: VFPv3-D16 (fpv3), NEONv1 (simdv1), FPV4-SP-D16 (fpv4-sp),
+# VFPv4-D16 (fpv4), NEON-VFPV4 (simdvfpv4), FPV5-SP-D16 (fpv5-sp),
+# VFPv5-D16 (fpv5), NEON for ARMv8 (simdv8), or None (.).
+# Float-abi: Soft (.), softfp (softfp), or hard (hard).
+
+MULTILIB_OPTIONS += mthumb
+MULTILIB_DIRNAMES += thumb
+
+MULTILIB_OPTIONS += $(MULTI_ARCH_OPTS_A)$(SEP)$(MULTI_ARCH_OPTS_RM)
+MULTILIB_DIRNAMES += $(MULTI_ARCH_DIRS_A) $(MULTI_ARCH_DIRS_RM)
+
+MULTILIB_OPTIONS += $(MULTI_FPU_OPTS_A)$(SEP)$(MULTI_FPU_OPTS_RM)
+MULTILIB_DIRNAMES += $(MULTI_FPU_DIRS_A) $(MULTI_FPU_DIRS_RM)
+
+MULTILIB_OPTIONS += mfloat-abi=softfp/mfloat-abi=hard
+MULTILIB_DIRNAMES += softfp hard
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index c195a6590c2f8e1753a9f2498583f5be89df7d1e..ea6054dfd58be63426fc77e3a00a25fde96a8583 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -24,33 +24,14 @@
# values during the configure step. We enforce this during the
# top-level configury.
-MULTILIB_OPTIONS =
-MULTILIB_DIRNAMES =
-MULTILIB_EXCEPTIONS =
-MULTILIB_MATCHES =
-MULTILIB_REUSE =
-# We have the following hierachy:
-# ISA: A32 (.) or T16/T32 (thumb).
-# Architecture: ARMv6S-M (v6-m), ARMv7-M (v7-m), ARMv7E-M (v7e-m),
-# ARMv8-M Baseline (v8-m.base) or ARMv8-M Mainline (v8-m.main).
-# FPU: VFPv3-D16 (fpv3), FPV4-SP-D16 (fpv4-sp), FPV5-SP-D16 (fpv5-sp),
-# VFPv5-D16 (fpv5), or None (.).
-# Float-abi: Soft (.), softfp (softfp), or hard (hardfp).
+# Arch and FPU variants to build libraries with
-# Options to build libraries with
+MULTI_ARCH_OPTS_RM = march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7/march=armv8-m.base/march=armv8-m.main
+MULTI_ARCH_DIRS_RM = v6-m v7-m v7e-m v7-ar v8-m.base v8-m.main
-MULTILIB_OPTIONS += mthumb
-MULTILIB_DIRNAMES += thumb
-
-MULTILIB_OPTIONS += march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7/march=armv8-m.base/march=armv8-m.main
-MULTILIB_DIRNAMES += v6-m v7-m v7e-m v7-ar v8-m.base v8-m.main
-
-MULTILIB_OPTIONS += mfpu=vfpv3-d16/mfpu=fpv4-sp-d16/mfpu=fpv5-sp-d16/mfpu=fpv5-d16
-MULTILIB_DIRNAMES += fpv3 fpv4-sp fpv5-sp fpv5
-
-MULTILIB_OPTIONS += mfloat-abi=softfp/mfloat-abi=hard
-MULTILIB_DIRNAMES += softfp hard
+MULTI_FPU_OPTS_RM = mfpu=vfpv3-d16/mfpu=fpv4-sp-d16/mfpu=fpv5-sp-d16/mfpu=fpv5-d16
+MULTI_FPU_DIRS_RM = fpv3 fpv4-sp fpv5-sp fpv5
# Option combinations to build library with
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 0b94bc1931a226e58d06a7ed5a726454142c006a..d91c82b569c017819cc2390085ed159e02929819 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1114,14 +1114,18 @@ for each target is given below.
@table @code
@item arm*-*-*
-@var{list} is one of@code{default}, @code{aprofile} or @code{rmprofile}.
-Specifying @code{default} is equivalent to omitting this option, ie. only the
-default runtime library will be enabled. Specifying @code{aprofile} or
-@code{rmprofile} builds multilibs for a combination of ISA, architecture,
-FPU available and floating-point ABI.
+@var{list} is a comma separated list of @code{aprofile} and @code{rmprofile}
+to build multilibs for A or R and M architecture profiles respectively. Note
+that, due to some limitation of the current multilib framework, using the
+combined @code{aprofile,rmprofile} multilibs selects in some cases a less
+optimal multilib than when using the multilib profile for the architecture
+targetted. The special value @code{default} is also accepted and is equivalent
+to omitting the option, ie. only the default run-time library will be enabled.
The table below gives the combination of ISAs, architectures, FPUs and
floating-point ABIs for which multilibs are built for each accepted value.
+The union of these options is considered when specifying both @code{aprofile}
+and @code{rmprofile}.
@multitable @columnfractions .15 .28 .30
@item Option @tab aprofile @tab rmprofile
[-- Attachment #3: fix_pr77673_gcc5.patch --]
[-- Type: text/x-patch, Size: 7027 bytes --]
diff --git a/gcc/testsuite/gcc.dg/pr77673.c b/gcc/testsuite/gcc.dg/pr77673.c
new file mode 100644
index 0000000000000000000000000000000000000000..e03bcb9284d1e5a0e496cfa547fdbab630bec04f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr77673.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target fpic } } */
+/* { dg-require-effective-target bswap32 } */
+/* { dg-options "-O2 -fPIC -fdump-tree-bswap" } */
+/* { dg-additional-options "-march=z900" { target s390*-*-* } } */
+
+void mach_parse_compressed(unsigned char* ptr, unsigned long int* val)
+{
+ if (ptr[0] < 0xC0U) {
+ *val = ptr[0] + ptr[1];
+ return;
+ }
+
+ *val = ((unsigned long int)(ptr[0]) << 24)
+ | ((unsigned long int)(ptr[1]) << 16)
+ | ((unsigned long int)(ptr[2]) << 8)
+ | ptr[3];
+}
+
+/* { dg-final { scan-tree-dump-not "load_dst_\\d+ =.* if \\(" "bswap" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 61b65bb824b1e90ab13e3cb3ac1b4fbbc34a3b70..3882652dcfc0e04642196243a034f1f7c1405936 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1645,6 +1645,7 @@ struct symbolic_number {
tree base_addr;
tree offset;
HOST_WIDE_INT bytepos;
+ tree src;
tree alias_set;
tree vuse;
unsigned HOST_WIDE_INT range;
@@ -1746,6 +1747,7 @@ init_symbolic_number (struct symbolic_number *n, tree src)
int size;
n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
+ n->src = src;
/* Set up the symbolic number N by setting each byte to a value between 1 and
the byte size of rhs1. The highest order byte is set to n->size and the
@@ -1859,6 +1861,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
uint64_t inc;
HOST_WIDE_INT start_sub, end_sub, end1, end2, end;
struct symbolic_number *toinc_n_ptr, *n_end;
+ basic_block bb1, bb2;
if (!n1->base_addr || !n2->base_addr
|| !operand_equal_p (n1->base_addr, n2->base_addr, 0))
@@ -1872,15 +1875,20 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
{
n_start = n1;
start_sub = n2->bytepos - n1->bytepos;
- source_stmt = source_stmt1;
}
else
{
n_start = n2;
start_sub = n1->bytepos - n2->bytepos;
- source_stmt = source_stmt2;
}
+ bb1 = gimple_bb (source_stmt1);
+ bb2 = gimple_bb (source_stmt2);
+ if (dominated_by_p (CDI_DOMINATORS, bb1, bb2))
+ source_stmt = source_stmt1;
+ else
+ source_stmt = source_stmt2;
+
/* Find the highest address at which a load is performed and
compute related info. */
end1 = n1->bytepos + (n1->range - 1);
@@ -1937,6 +1945,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
n->vuse = n_start->vuse;
n->base_addr = n_start->base_addr;
n->offset = n_start->offset;
+ n->src = n_start->src;
n->bytepos = n_start->bytepos;
n->type = n_start->type;
size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
@@ -2147,7 +2156,7 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap)
uint64_t cmpxchg = CMPXCHG;
uint64_t cmpnop = CMPNOP;
- gimple source_stmt;
+ gimple ins_stmt;
int limit;
/* The last parameter determines the depth search limit. It usually
@@ -2157,9 +2166,9 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap)
in libgcc, and for initial shift/and operation of the src operand. */
limit = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (gimple_expr_type (stmt)));
limit += 1 + (int) ceil_log2 ((unsigned HOST_WIDE_INT) limit);
- source_stmt = find_bswap_or_nop_1 (stmt, n, limit);
+ ins_stmt = find_bswap_or_nop_1 (stmt, n, limit);
- if (!source_stmt)
+ if (!ins_stmt)
return NULL;
/* Find real size of result (highest non-zero byte). */
@@ -2201,7 +2210,7 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap)
return NULL;
n->range *= BITS_PER_UNIT;
- return source_stmt;
+ return ins_stmt;
}
namespace {
@@ -2250,7 +2259,7 @@ public:
changing of basic block. */
static bool
-bswap_replace (gimple cur_stmt, gimple src_stmt, tree fndecl, tree bswap_type,
+bswap_replace (gimple cur_stmt, gimple ins_stmt, tree fndecl, tree bswap_type,
tree load_type, struct symbolic_number *n, bool bswap)
{
gimple_stmt_iterator gsi;
@@ -2258,18 +2267,24 @@ bswap_replace (gimple cur_stmt, gimple src_stmt, tree fndecl, tree bswap_type,
gimple bswap_stmt;
gsi = gsi_for_stmt (cur_stmt);
- src = gimple_assign_rhs1 (src_stmt);
+ src = n->src;
tgt = gimple_assign_lhs (cur_stmt);
/* Need to load the value from memory first. */
if (n->base_addr)
{
- gimple_stmt_iterator gsi_ins = gsi_for_stmt (src_stmt);
+ gimple_stmt_iterator gsi_ins = gsi_for_stmt (ins_stmt);
tree addr_expr, addr_tmp, val_expr, val_tmp;
tree load_offset_ptr, aligned_load_type;
gimple addr_stmt, load_stmt;
unsigned align;
HOST_WIDE_INT load_offset = 0;
+ basic_block ins_bb, cur_bb;
+
+ ins_bb = gimple_bb (ins_stmt);
+ cur_bb = gimple_bb (cur_stmt);
+ if (!dominated_by_p (CDI_DOMINATORS, cur_bb, ins_bb))
+ return false;
align = get_object_alignment (src);
/* If the new access is smaller than the original one, we need
@@ -2301,7 +2316,7 @@ bswap_replace (gimple cur_stmt, gimple src_stmt, tree fndecl, tree bswap_type,
/* Move cur_stmt just before one of the load of the original
to ensure it has the same VUSE. See PR61517 for what could
go wrong. */
- if (gimple_bb (cur_stmt) != gimple_bb (src_stmt))
+ if (gimple_bb (cur_stmt) != gimple_bb (ins_stmt))
reset_flow_sensitive_info (gimple_assign_lhs (cur_stmt));
gsi_move_before (&gsi, &gsi_ins);
gsi = gsi_for_stmt (cur_stmt);
@@ -2474,6 +2489,7 @@ pass_optimize_bswap::execute (function *fun)
memset (&nop_stats, 0, sizeof (nop_stats));
memset (&bswap_stats, 0, sizeof (bswap_stats));
+ calculate_dominance_info (CDI_DOMINATORS);
FOR_EACH_BB_FN (bb, fun)
{
@@ -2485,7 +2501,7 @@ pass_optimize_bswap::execute (function *fun)
variant wouldn't be detected. */
for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
{
- gimple src_stmt, cur_stmt = gsi_stmt (gsi);
+ gimple ins_stmt, cur_stmt = gsi_stmt (gsi);
tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
enum tree_code code;
struct symbolic_number n;
@@ -2518,9 +2534,9 @@ pass_optimize_bswap::execute (function *fun)
continue;
}
- src_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
+ ins_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
- if (!src_stmt)
+ if (!ins_stmt)
continue;
switch (n.range)
@@ -2554,7 +2570,7 @@ pass_optimize_bswap::execute (function *fun)
if (bswap && !fndecl && n.range != 16)
continue;
- if (bswap_replace (cur_stmt, src_stmt, fndecl, bswap_type, load_type,
+ if (bswap_replace (cur_stmt, ins_stmt, fndecl, bswap_type, load_type,
&n, bswap))
changed = true;
}
[-- Attachment #4: fix_pr77673_gcc6.patch --]
[-- Type: text/x-patch, Size: 6979 bytes --]
diff --git a/gcc/testsuite/gcc.dg/pr77673.c b/gcc/testsuite/gcc.dg/pr77673.c
new file mode 100644
index 0000000000000000000000000000000000000000..e03bcb9284d1e5a0e496cfa547fdbab630bec04f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr77673.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target fpic } } */
+/* { dg-require-effective-target bswap32 } */
+/* { dg-options "-O2 -fPIC -fdump-tree-bswap" } */
+/* { dg-additional-options "-march=z900" { target s390*-*-* } } */
+
+void mach_parse_compressed(unsigned char* ptr, unsigned long int* val)
+{
+ if (ptr[0] < 0xC0U) {
+ *val = ptr[0] + ptr[1];
+ return;
+ }
+
+ *val = ((unsigned long int)(ptr[0]) << 24)
+ | ((unsigned long int)(ptr[1]) << 16)
+ | ((unsigned long int)(ptr[2]) << 8)
+ | ptr[3];
+}
+
+/* { dg-final { scan-tree-dump-not "load_dst_\\d+ =.* if \\(" "bswap" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 735b7c67c31df0c8317544346396fb8b15315879..ac15e8179a3257d1e190086c8089bc85ed8552bf 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1951,6 +1951,7 @@ struct symbolic_number {
tree base_addr;
tree offset;
HOST_WIDE_INT bytepos;
+ tree src;
tree alias_set;
tree vuse;
unsigned HOST_WIDE_INT range;
@@ -2052,6 +2053,7 @@ init_symbolic_number (struct symbolic_number *n, tree src)
int size;
n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
+ n->src = src;
/* Set up the symbolic number N by setting each byte to a value between 1 and
the byte size of rhs1. The highest order byte is set to n->size and the
@@ -2167,6 +2169,7 @@ perform_symbolic_merge (gimple *source_stmt1, struct symbolic_number *n1,
uint64_t inc;
HOST_WIDE_INT start_sub, end_sub, end1, end2, end;
struct symbolic_number *toinc_n_ptr, *n_end;
+ basic_block bb1, bb2;
if (!n1->base_addr || !n2->base_addr
|| !operand_equal_p (n1->base_addr, n2->base_addr, 0))
@@ -2180,15 +2183,20 @@ perform_symbolic_merge (gimple *source_stmt1, struct symbolic_number *n1,
{
n_start = n1;
start_sub = n2->bytepos - n1->bytepos;
- source_stmt = source_stmt1;
}
else
{
n_start = n2;
start_sub = n1->bytepos - n2->bytepos;
- source_stmt = source_stmt2;
}
+ bb1 = gimple_bb (source_stmt1);
+ bb2 = gimple_bb (source_stmt2);
+ if (dominated_by_p (CDI_DOMINATORS, bb1, bb2))
+ source_stmt = source_stmt1;
+ else
+ source_stmt = source_stmt2;
+
/* Find the highest address at which a load is performed and
compute related info. */
end1 = n1->bytepos + (n1->range - 1);
@@ -2245,6 +2253,7 @@ perform_symbolic_merge (gimple *source_stmt1, struct symbolic_number *n1,
n->vuse = n_start->vuse;
n->base_addr = n_start->base_addr;
n->offset = n_start->offset;
+ n->src = n_start->src;
n->bytepos = n_start->bytepos;
n->type = n_start->type;
size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
@@ -2455,7 +2464,7 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
uint64_t cmpxchg = CMPXCHG;
uint64_t cmpnop = CMPNOP;
- gimple *source_stmt;
+ gimple *ins_stmt;
int limit;
/* The last parameter determines the depth search limit. It usually
@@ -2465,9 +2474,9 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
in libgcc, and for initial shift/and operation of the src operand. */
limit = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (gimple_expr_type (stmt)));
limit += 1 + (int) ceil_log2 ((unsigned HOST_WIDE_INT) limit);
- source_stmt = find_bswap_or_nop_1 (stmt, n, limit);
+ ins_stmt = find_bswap_or_nop_1 (stmt, n, limit);
- if (!source_stmt)
+ if (!ins_stmt)
return NULL;
/* Find real size of result (highest non-zero byte). */
@@ -2509,7 +2518,7 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
return NULL;
n->range *= BITS_PER_UNIT;
- return source_stmt;
+ return ins_stmt;
}
namespace {
@@ -2558,7 +2567,7 @@ public:
changing of basic block. */
static bool
-bswap_replace (gimple *cur_stmt, gimple *src_stmt, tree fndecl,
+bswap_replace (gimple *cur_stmt, gimple *ins_stmt, tree fndecl,
tree bswap_type, tree load_type, struct symbolic_number *n,
bool bswap)
{
@@ -2567,18 +2576,24 @@ bswap_replace (gimple *cur_stmt, gimple *src_stmt, tree fndecl,
gimple *bswap_stmt;
gsi = gsi_for_stmt (cur_stmt);
- src = gimple_assign_rhs1 (src_stmt);
+ src = n->src;
tgt = gimple_assign_lhs (cur_stmt);
/* Need to load the value from memory first. */
if (n->base_addr)
{
- gimple_stmt_iterator gsi_ins = gsi_for_stmt (src_stmt);
+ gimple_stmt_iterator gsi_ins = gsi_for_stmt (ins_stmt);
tree addr_expr, addr_tmp, val_expr, val_tmp;
tree load_offset_ptr, aligned_load_type;
gimple *addr_stmt, *load_stmt;
unsigned align;
HOST_WIDE_INT load_offset = 0;
+ basic_block ins_bb, cur_bb;
+
+ ins_bb = gimple_bb (ins_stmt);
+ cur_bb = gimple_bb (cur_stmt);
+ if (!dominated_by_p (CDI_DOMINATORS, cur_bb, ins_bb))
+ return false;
align = get_object_alignment (src);
/* If the new access is smaller than the original one, we need
@@ -2610,7 +2625,7 @@ bswap_replace (gimple *cur_stmt, gimple *src_stmt, tree fndecl,
/* Move cur_stmt just before one of the load of the original
to ensure it has the same VUSE. See PR61517 for what could
go wrong. */
- if (gimple_bb (cur_stmt) != gimple_bb (src_stmt))
+ if (gimple_bb (cur_stmt) != gimple_bb (ins_stmt))
reset_flow_sensitive_info (gimple_assign_lhs (cur_stmt));
gsi_move_before (&gsi, &gsi_ins);
gsi = gsi_for_stmt (cur_stmt);
@@ -2783,6 +2798,7 @@ pass_optimize_bswap::execute (function *fun)
memset (&nop_stats, 0, sizeof (nop_stats));
memset (&bswap_stats, 0, sizeof (bswap_stats));
+ calculate_dominance_info (CDI_DOMINATORS);
FOR_EACH_BB_FN (bb, fun)
{
@@ -2794,7 +2810,7 @@ pass_optimize_bswap::execute (function *fun)
variant wouldn't be detected. */
for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
{
- gimple *src_stmt, *cur_stmt = gsi_stmt (gsi);
+ gimple *ins_stmt, *cur_stmt = gsi_stmt (gsi);
tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
enum tree_code code;
struct symbolic_number n;
@@ -2827,9 +2843,9 @@ pass_optimize_bswap::execute (function *fun)
continue;
}
- src_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
+ ins_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
- if (!src_stmt)
+ if (!ins_stmt)
continue;
switch (n.range)
@@ -2863,7 +2879,7 @@ pass_optimize_bswap::execute (function *fun)
if (bswap && !fndecl && n.range != 16)
continue;
- if (bswap_replace (cur_stmt, src_stmt, fndecl, bswap_type, load_type,
+ if (bswap_replace (cur_stmt, ins_stmt, fndecl, bswap_type, load_type,
&n, bswap))
changed = true;
}
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object
2016-12-13 11:43 [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object Thomas Preudhomme
@ 2016-12-13 11:46 ` Thomas Preudhomme
2016-12-14 8:08 ` Richard Biener
1 sibling, 0 replies; 3+ messages in thread
From: Thomas Preudhomme @ 2016-12-13 11:46 UTC (permalink / raw)
To: Richard Biener, Jakub Jelinek, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1911 bytes --]
Sorry, ignore the first attachment (2_combine_profile_multilib.patch). i always
miss that Thunderbird selects the first file in the in-review folder upfront.
Best regards,
Thomas
On 13/12/16 11:43, Thomas Preudhomme wrote:
> Hi,
>
> Fix in r242869 for PR77673 (bswap loads after end of object) applies cleanly on
> GCC 6 and with trivial fix for GCC 5 (gimple * in GCC 6 -> gimple in GCC 5). The
> backports also bootstrap fine on x86_64-linux-gnu and testsuite shows no
> regression.
>
> ChangeLog entries are as follow:
>
>
> *** gcc/ChangeLog ***
>
> 2016-12-12 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> Backport from mainline
> 2016-11-25 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> PR tree-optimization/77673
> * tree-ssa-math-opts.c (struct symbolic_number): Add new src field.
> (init_symbolic_number): Initialize src field from src parameter.
> (perform_symbolic_merge): Select most dominated statement as the
> source statement. Set src field of resulting n structure from the
> input src with the lowest address.
> (find_bswap_or_nop): Rename source_stmt into ins_stmt.
> (bswap_replace): Rename src_stmt into ins_stmt. Initially get source
> of load from src field rather than insertion statement. Cancel
> optimization if statement analyzed is not dominated by the insertion
> statement.
> (pass_optimize_bswap::execute): Rename src_stmt to ins_stmt. Compute
> dominance information.
>
>
> *** gcc/testsuite/ChangeLog ***
>
> 2016-12-12 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> Backport from mainline
> 2016-11-25 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> PR tree-optimization/77673
> * gcc.dg/pr77673.c: New test.
>
>
> Is this ok for gcc-5-branch and gcc-6-branch?
>
> Best regards,
>
> Thomas
[-- Attachment #2: fix_pr77673_gcc5.patch --]
[-- Type: text/x-patch, Size: 7027 bytes --]
diff --git a/gcc/testsuite/gcc.dg/pr77673.c b/gcc/testsuite/gcc.dg/pr77673.c
new file mode 100644
index 0000000000000000000000000000000000000000..e03bcb9284d1e5a0e496cfa547fdbab630bec04f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr77673.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target fpic } } */
+/* { dg-require-effective-target bswap32 } */
+/* { dg-options "-O2 -fPIC -fdump-tree-bswap" } */
+/* { dg-additional-options "-march=z900" { target s390*-*-* } } */
+
+void mach_parse_compressed(unsigned char* ptr, unsigned long int* val)
+{
+ if (ptr[0] < 0xC0U) {
+ *val = ptr[0] + ptr[1];
+ return;
+ }
+
+ *val = ((unsigned long int)(ptr[0]) << 24)
+ | ((unsigned long int)(ptr[1]) << 16)
+ | ((unsigned long int)(ptr[2]) << 8)
+ | ptr[3];
+}
+
+/* { dg-final { scan-tree-dump-not "load_dst_\\d+ =.* if \\(" "bswap" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 61b65bb824b1e90ab13e3cb3ac1b4fbbc34a3b70..3882652dcfc0e04642196243a034f1f7c1405936 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1645,6 +1645,7 @@ struct symbolic_number {
tree base_addr;
tree offset;
HOST_WIDE_INT bytepos;
+ tree src;
tree alias_set;
tree vuse;
unsigned HOST_WIDE_INT range;
@@ -1746,6 +1747,7 @@ init_symbolic_number (struct symbolic_number *n, tree src)
int size;
n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
+ n->src = src;
/* Set up the symbolic number N by setting each byte to a value between 1 and
the byte size of rhs1. The highest order byte is set to n->size and the
@@ -1859,6 +1861,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
uint64_t inc;
HOST_WIDE_INT start_sub, end_sub, end1, end2, end;
struct symbolic_number *toinc_n_ptr, *n_end;
+ basic_block bb1, bb2;
if (!n1->base_addr || !n2->base_addr
|| !operand_equal_p (n1->base_addr, n2->base_addr, 0))
@@ -1872,15 +1875,20 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
{
n_start = n1;
start_sub = n2->bytepos - n1->bytepos;
- source_stmt = source_stmt1;
}
else
{
n_start = n2;
start_sub = n1->bytepos - n2->bytepos;
- source_stmt = source_stmt2;
}
+ bb1 = gimple_bb (source_stmt1);
+ bb2 = gimple_bb (source_stmt2);
+ if (dominated_by_p (CDI_DOMINATORS, bb1, bb2))
+ source_stmt = source_stmt1;
+ else
+ source_stmt = source_stmt2;
+
/* Find the highest address at which a load is performed and
compute related info. */
end1 = n1->bytepos + (n1->range - 1);
@@ -1937,6 +1945,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
n->vuse = n_start->vuse;
n->base_addr = n_start->base_addr;
n->offset = n_start->offset;
+ n->src = n_start->src;
n->bytepos = n_start->bytepos;
n->type = n_start->type;
size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
@@ -2147,7 +2156,7 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap)
uint64_t cmpxchg = CMPXCHG;
uint64_t cmpnop = CMPNOP;
- gimple source_stmt;
+ gimple ins_stmt;
int limit;
/* The last parameter determines the depth search limit. It usually
@@ -2157,9 +2166,9 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap)
in libgcc, and for initial shift/and operation of the src operand. */
limit = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (gimple_expr_type (stmt)));
limit += 1 + (int) ceil_log2 ((unsigned HOST_WIDE_INT) limit);
- source_stmt = find_bswap_or_nop_1 (stmt, n, limit);
+ ins_stmt = find_bswap_or_nop_1 (stmt, n, limit);
- if (!source_stmt)
+ if (!ins_stmt)
return NULL;
/* Find real size of result (highest non-zero byte). */
@@ -2201,7 +2210,7 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap)
return NULL;
n->range *= BITS_PER_UNIT;
- return source_stmt;
+ return ins_stmt;
}
namespace {
@@ -2250,7 +2259,7 @@ public:
changing of basic block. */
static bool
-bswap_replace (gimple cur_stmt, gimple src_stmt, tree fndecl, tree bswap_type,
+bswap_replace (gimple cur_stmt, gimple ins_stmt, tree fndecl, tree bswap_type,
tree load_type, struct symbolic_number *n, bool bswap)
{
gimple_stmt_iterator gsi;
@@ -2258,18 +2267,24 @@ bswap_replace (gimple cur_stmt, gimple src_stmt, tree fndecl, tree bswap_type,
gimple bswap_stmt;
gsi = gsi_for_stmt (cur_stmt);
- src = gimple_assign_rhs1 (src_stmt);
+ src = n->src;
tgt = gimple_assign_lhs (cur_stmt);
/* Need to load the value from memory first. */
if (n->base_addr)
{
- gimple_stmt_iterator gsi_ins = gsi_for_stmt (src_stmt);
+ gimple_stmt_iterator gsi_ins = gsi_for_stmt (ins_stmt);
tree addr_expr, addr_tmp, val_expr, val_tmp;
tree load_offset_ptr, aligned_load_type;
gimple addr_stmt, load_stmt;
unsigned align;
HOST_WIDE_INT load_offset = 0;
+ basic_block ins_bb, cur_bb;
+
+ ins_bb = gimple_bb (ins_stmt);
+ cur_bb = gimple_bb (cur_stmt);
+ if (!dominated_by_p (CDI_DOMINATORS, cur_bb, ins_bb))
+ return false;
align = get_object_alignment (src);
/* If the new access is smaller than the original one, we need
@@ -2301,7 +2316,7 @@ bswap_replace (gimple cur_stmt, gimple src_stmt, tree fndecl, tree bswap_type,
/* Move cur_stmt just before one of the load of the original
to ensure it has the same VUSE. See PR61517 for what could
go wrong. */
- if (gimple_bb (cur_stmt) != gimple_bb (src_stmt))
+ if (gimple_bb (cur_stmt) != gimple_bb (ins_stmt))
reset_flow_sensitive_info (gimple_assign_lhs (cur_stmt));
gsi_move_before (&gsi, &gsi_ins);
gsi = gsi_for_stmt (cur_stmt);
@@ -2474,6 +2489,7 @@ pass_optimize_bswap::execute (function *fun)
memset (&nop_stats, 0, sizeof (nop_stats));
memset (&bswap_stats, 0, sizeof (bswap_stats));
+ calculate_dominance_info (CDI_DOMINATORS);
FOR_EACH_BB_FN (bb, fun)
{
@@ -2485,7 +2501,7 @@ pass_optimize_bswap::execute (function *fun)
variant wouldn't be detected. */
for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
{
- gimple src_stmt, cur_stmt = gsi_stmt (gsi);
+ gimple ins_stmt, cur_stmt = gsi_stmt (gsi);
tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
enum tree_code code;
struct symbolic_number n;
@@ -2518,9 +2534,9 @@ pass_optimize_bswap::execute (function *fun)
continue;
}
- src_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
+ ins_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
- if (!src_stmt)
+ if (!ins_stmt)
continue;
switch (n.range)
@@ -2554,7 +2570,7 @@ pass_optimize_bswap::execute (function *fun)
if (bswap && !fndecl && n.range != 16)
continue;
- if (bswap_replace (cur_stmt, src_stmt, fndecl, bswap_type, load_type,
+ if (bswap_replace (cur_stmt, ins_stmt, fndecl, bswap_type, load_type,
&n, bswap))
changed = true;
}
[-- Attachment #3: fix_pr77673_gcc6.patch --]
[-- Type: text/x-patch, Size: 6979 bytes --]
diff --git a/gcc/testsuite/gcc.dg/pr77673.c b/gcc/testsuite/gcc.dg/pr77673.c
new file mode 100644
index 0000000000000000000000000000000000000000..e03bcb9284d1e5a0e496cfa547fdbab630bec04f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr77673.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target fpic } } */
+/* { dg-require-effective-target bswap32 } */
+/* { dg-options "-O2 -fPIC -fdump-tree-bswap" } */
+/* { dg-additional-options "-march=z900" { target s390*-*-* } } */
+
+void mach_parse_compressed(unsigned char* ptr, unsigned long int* val)
+{
+ if (ptr[0] < 0xC0U) {
+ *val = ptr[0] + ptr[1];
+ return;
+ }
+
+ *val = ((unsigned long int)(ptr[0]) << 24)
+ | ((unsigned long int)(ptr[1]) << 16)
+ | ((unsigned long int)(ptr[2]) << 8)
+ | ptr[3];
+}
+
+/* { dg-final { scan-tree-dump-not "load_dst_\\d+ =.* if \\(" "bswap" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 735b7c67c31df0c8317544346396fb8b15315879..ac15e8179a3257d1e190086c8089bc85ed8552bf 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1951,6 +1951,7 @@ struct symbolic_number {
tree base_addr;
tree offset;
HOST_WIDE_INT bytepos;
+ tree src;
tree alias_set;
tree vuse;
unsigned HOST_WIDE_INT range;
@@ -2052,6 +2053,7 @@ init_symbolic_number (struct symbolic_number *n, tree src)
int size;
n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
+ n->src = src;
/* Set up the symbolic number N by setting each byte to a value between 1 and
the byte size of rhs1. The highest order byte is set to n->size and the
@@ -2167,6 +2169,7 @@ perform_symbolic_merge (gimple *source_stmt1, struct symbolic_number *n1,
uint64_t inc;
HOST_WIDE_INT start_sub, end_sub, end1, end2, end;
struct symbolic_number *toinc_n_ptr, *n_end;
+ basic_block bb1, bb2;
if (!n1->base_addr || !n2->base_addr
|| !operand_equal_p (n1->base_addr, n2->base_addr, 0))
@@ -2180,15 +2183,20 @@ perform_symbolic_merge (gimple *source_stmt1, struct symbolic_number *n1,
{
n_start = n1;
start_sub = n2->bytepos - n1->bytepos;
- source_stmt = source_stmt1;
}
else
{
n_start = n2;
start_sub = n1->bytepos - n2->bytepos;
- source_stmt = source_stmt2;
}
+ bb1 = gimple_bb (source_stmt1);
+ bb2 = gimple_bb (source_stmt2);
+ if (dominated_by_p (CDI_DOMINATORS, bb1, bb2))
+ source_stmt = source_stmt1;
+ else
+ source_stmt = source_stmt2;
+
/* Find the highest address at which a load is performed and
compute related info. */
end1 = n1->bytepos + (n1->range - 1);
@@ -2245,6 +2253,7 @@ perform_symbolic_merge (gimple *source_stmt1, struct symbolic_number *n1,
n->vuse = n_start->vuse;
n->base_addr = n_start->base_addr;
n->offset = n_start->offset;
+ n->src = n_start->src;
n->bytepos = n_start->bytepos;
n->type = n_start->type;
size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
@@ -2455,7 +2464,7 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
uint64_t cmpxchg = CMPXCHG;
uint64_t cmpnop = CMPNOP;
- gimple *source_stmt;
+ gimple *ins_stmt;
int limit;
/* The last parameter determines the depth search limit. It usually
@@ -2465,9 +2474,9 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
in libgcc, and for initial shift/and operation of the src operand. */
limit = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (gimple_expr_type (stmt)));
limit += 1 + (int) ceil_log2 ((unsigned HOST_WIDE_INT) limit);
- source_stmt = find_bswap_or_nop_1 (stmt, n, limit);
+ ins_stmt = find_bswap_or_nop_1 (stmt, n, limit);
- if (!source_stmt)
+ if (!ins_stmt)
return NULL;
/* Find real size of result (highest non-zero byte). */
@@ -2509,7 +2518,7 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
return NULL;
n->range *= BITS_PER_UNIT;
- return source_stmt;
+ return ins_stmt;
}
namespace {
@@ -2558,7 +2567,7 @@ public:
changing of basic block. */
static bool
-bswap_replace (gimple *cur_stmt, gimple *src_stmt, tree fndecl,
+bswap_replace (gimple *cur_stmt, gimple *ins_stmt, tree fndecl,
tree bswap_type, tree load_type, struct symbolic_number *n,
bool bswap)
{
@@ -2567,18 +2576,24 @@ bswap_replace (gimple *cur_stmt, gimple *src_stmt, tree fndecl,
gimple *bswap_stmt;
gsi = gsi_for_stmt (cur_stmt);
- src = gimple_assign_rhs1 (src_stmt);
+ src = n->src;
tgt = gimple_assign_lhs (cur_stmt);
/* Need to load the value from memory first. */
if (n->base_addr)
{
- gimple_stmt_iterator gsi_ins = gsi_for_stmt (src_stmt);
+ gimple_stmt_iterator gsi_ins = gsi_for_stmt (ins_stmt);
tree addr_expr, addr_tmp, val_expr, val_tmp;
tree load_offset_ptr, aligned_load_type;
gimple *addr_stmt, *load_stmt;
unsigned align;
HOST_WIDE_INT load_offset = 0;
+ basic_block ins_bb, cur_bb;
+
+ ins_bb = gimple_bb (ins_stmt);
+ cur_bb = gimple_bb (cur_stmt);
+ if (!dominated_by_p (CDI_DOMINATORS, cur_bb, ins_bb))
+ return false;
align = get_object_alignment (src);
/* If the new access is smaller than the original one, we need
@@ -2610,7 +2625,7 @@ bswap_replace (gimple *cur_stmt, gimple *src_stmt, tree fndecl,
/* Move cur_stmt just before one of the load of the original
to ensure it has the same VUSE. See PR61517 for what could
go wrong. */
- if (gimple_bb (cur_stmt) != gimple_bb (src_stmt))
+ if (gimple_bb (cur_stmt) != gimple_bb (ins_stmt))
reset_flow_sensitive_info (gimple_assign_lhs (cur_stmt));
gsi_move_before (&gsi, &gsi_ins);
gsi = gsi_for_stmt (cur_stmt);
@@ -2783,6 +2798,7 @@ pass_optimize_bswap::execute (function *fun)
memset (&nop_stats, 0, sizeof (nop_stats));
memset (&bswap_stats, 0, sizeof (bswap_stats));
+ calculate_dominance_info (CDI_DOMINATORS);
FOR_EACH_BB_FN (bb, fun)
{
@@ -2794,7 +2810,7 @@ pass_optimize_bswap::execute (function *fun)
variant wouldn't be detected. */
for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
{
- gimple *src_stmt, *cur_stmt = gsi_stmt (gsi);
+ gimple *ins_stmt, *cur_stmt = gsi_stmt (gsi);
tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
enum tree_code code;
struct symbolic_number n;
@@ -2827,9 +2843,9 @@ pass_optimize_bswap::execute (function *fun)
continue;
}
- src_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
+ ins_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
- if (!src_stmt)
+ if (!ins_stmt)
continue;
switch (n.range)
@@ -2863,7 +2879,7 @@ pass_optimize_bswap::execute (function *fun)
if (bswap && !fndecl && n.range != 16)
continue;
- if (bswap_replace (cur_stmt, src_stmt, fndecl, bswap_type, load_type,
+ if (bswap_replace (cur_stmt, ins_stmt, fndecl, bswap_type, load_type,
&n, bswap))
changed = true;
}
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object
2016-12-13 11:43 [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object Thomas Preudhomme
2016-12-13 11:46 ` Thomas Preudhomme
@ 2016-12-14 8:08 ` Richard Biener
1 sibling, 0 replies; 3+ messages in thread
From: Richard Biener @ 2016-12-14 8:08 UTC (permalink / raw)
To: Thomas Preudhomme; +Cc: Jakub Jelinek, gcc-patches
On Tue, 13 Dec 2016, Thomas Preudhomme wrote:
> Hi,
>
> Fix in r242869 for PR77673 (bswap loads after end of object) applies cleanly
> on GCC 6 and with trivial fix for GCC 5 (gimple * in GCC 6 -> gimple in GCC
> 5). The backports also bootstrap fine on x86_64-linux-gnu and testsuite shows
> no regression.
Ok.
Richard.
> ChangeLog entries are as follow:
>
>
> *** gcc/ChangeLog ***
>
> 2016-12-12 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> Backport from mainline
> 2016-11-25 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> PR tree-optimization/77673
> * tree-ssa-math-opts.c (struct symbolic_number): Add new src field.
> (init_symbolic_number): Initialize src field from src parameter.
> (perform_symbolic_merge): Select most dominated statement as the
> source statement. Set src field of resulting n structure from the
> input src with the lowest address.
> (find_bswap_or_nop): Rename source_stmt into ins_stmt.
> (bswap_replace): Rename src_stmt into ins_stmt. Initially get source
> of load from src field rather than insertion statement. Cancel
> optimization if statement analyzed is not dominated by the insertion
> statement.
> (pass_optimize_bswap::execute): Rename src_stmt to ins_stmt. Compute
> dominance information.
>
>
> *** gcc/testsuite/ChangeLog ***
>
> 2016-12-12 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> Backport from mainline
> 2016-11-25 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> PR tree-optimization/77673
> * gcc.dg/pr77673.c: New test.
>
>
> Is this ok for gcc-5-branch and gcc-6-branch?
>
> Best regards,
>
> Thomas
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-12-14 8:07 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-13 11:43 [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object Thomas Preudhomme
2016-12-13 11:46 ` Thomas Preudhomme
2016-12-14 8:08 ` Richard Biener
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).