* [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads @ 2015-10-29 19:22 Alan Lawrence 2015-10-29 19:20 ` [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF Alan Lawrence ` (5 more replies) 0 siblings, 6 replies; 26+ messages in thread From: Alan Lawrence @ 2015-10-29 19:22 UTC (permalink / raw) To: gcc-patches This is a revision of previous series at https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01485.html , and follows on from the first two patches of that series, which have been pushed already. A few things have happened since. The previous patch 3, making SRA generate ARRAY_REFS, is removed. As Martin comments, it seems unfair to make SRA work around this limitation of DOM, so instead the first two patches extend DOM to understand this equivalence. Richi suggested that we might want a normalization pass to convert all ARRAY_REFs to MEM_REFS, and while that would have advantages, this simpler approach is enough to persuade DOM to optimize the MEM_REFS produced by SRA with array accesses, even when the vectorizer intervenes. Moreover we can apply more normalizations in DOM than would be permissible elsewhere, even if these two patches are only a small step towards that. A number of other changes followed, with most problems coming from Ada, specifically on ARM. I intend to follow up with changes to the heuristics and gimplification code, moving heuristics from gimplify_init_constructor to SRA, but this series as it is fixes PR/63679 (with the --param), and it is not certain I will be able to tackle said heuristics in time for gcc 6. I've tested each patch (in sequence supplied) with Bootstrap + check-gcc,g++,ada,fortran on x86_64 and ARM; Bootstrap + check-gcc,g++,fortran on AArch64. And also tested that the ssa-dom-cse-2.c scan-tree-dump test is fixed on hppa, ppc64le, sparc, alpha, s390. Are these OK for trunk? Cheers, Alan ^ permalink raw reply [flat|nested] 26+ messages in thread
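The kind of code the series targets can be sketched in plain C. This is an illustrative reduction in the spirit of the PR, not the exact testcase from PR/63679 or the testsuite; the function name is made up:

```c
/* Illustrative reduction of the PR/63679 pattern: on targets such as ARM,
   the initializer of A is emitted as a constant-pool entry and copied in
   with a block move, which SRA/DOM previously failed to see through, so
   the function was not folded down to a constant return at -O3.  */
int
sum_constant_array (void)
{
  int a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  int sum = 0;
  for (int i = 0; i < 8; i++)
    sum += a[i];
  return sum;  /* with the series applied, this can fold to "return 36" */
}
```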
* [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF 2015-10-29 19:22 [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads Alan Lawrence @ 2015-10-29 19:20 ` Alan Lawrence 2015-10-30 5:39 ` Jeff Law 2015-10-29 19:21 ` [PATCH 2/6] tree-ssa-dom.c: Normalize data types in MEM_REFs Alan Lawrence ` (4 subsequent siblings) 5 siblings, 1 reply; 26+ messages in thread From: Alan Lawrence @ 2015-10-29 19:20 UTC (permalink / raw) To: gcc-patches This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs (with pointer type to the array element type). gcc/ChangeLog: * tree-ssa-dom.c (dom_normalize_single_rhs): New. (dom_normalize_gimple_stmt): New. (lookup_avail_expr): Call dom_normalize_gimple_stmt. --- gcc/tree-ssa-dom.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 67 insertions(+), 1 deletion(-) diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index 38cceff..fca8a80 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -1897,6 +1897,72 @@ vuse_eq (ao_ref *, tree vuse1, unsigned int cnt, void *data) return NULL; } +/* Return the normal form of EXPR, the gimple_assign_rhs1 of a GIMPLE_SINGLE_RHS + assignment, or NULL_TREE if EXPR is already in normal form. + The normal form is such that if two expressions normalize to trees that are + operand_equal_p, then dominator optimizations can replace one by the value + produced by the other. + It is not necessary that expressions with equal canonical forms are + equivalent in all other situations, e.g. aliasing. */ + +static tree +dom_normalize_single_rhs (tree expr) +{ + switch (TREE_CODE (expr)) + { + case ARRAY_REF: + { + tree index = TREE_OPERAND (expr, 1); + tree low_bound = array_ref_low_bound (expr); + if (!TREE_CONSTANT (index) || !TREE_CONSTANT (low_bound)) + return NULL_TREE; + + if (! 
integer_zerop (low_bound)) + index = fold_build2 (MINUS_EXPR, TREE_TYPE (index), index, low_bound); + + tree esz = array_ref_element_size (expr); + gcc_assert (TREE_CONSTANT (esz)); + tree offset = fold_build2 (MULT_EXPR, sizetype, index, esz); + + offset = fold_convert (build_pointer_type (TREE_TYPE (expr)), offset); + + tree base = TREE_OPERAND (expr, 0); + base = build_fold_addr_expr (base); + + return fold_build2 (MEM_REF, TREE_TYPE (expr), base, offset); + } + default: + return NULL_TREE; + } +} + +/* Return the normal form of STMT, or STMT if it is already normalized. */ + +static gimple * +dom_normalize_gimple_stmt (gimple *stmt) +{ + if (gimple_code (stmt) == GIMPLE_ASSIGN) + { + enum tree_code rhs_code = gimple_assign_rhs_code (stmt); + if (get_gimple_rhs_class (rhs_code) == GIMPLE_SINGLE_RHS) + if (tree nrhs = dom_normalize_single_rhs (gimple_assign_rhs1 (stmt))) + { + tree lhs = gimple_assign_lhs (stmt); + /* Exchanged statements, which are never part of the CFG, + can have invalid LHS. */ + gimple *defstmt = 0; + if (TREE_CODE (lhs) == SSA_NAME) + defstmt = SSA_NAME_DEF_STMT (lhs); + gassign *new_stmt = gimple_build_assign (lhs, nrhs); + if (TREE_CODE (lhs) == SSA_NAME) + SSA_NAME_DEF_STMT (lhs) = defstmt; + gimple_set_vuse (new_stmt, gimple_vuse (stmt)); + return new_stmt; + } + } + return stmt; +} + /* Search for an existing instance of STMT in the AVAIL_EXPRS_STACK table. If found, return its LHS. Otherwise insert STMT in the table and return NULL_TREE. @@ -1918,7 +1984,7 @@ lookup_avail_expr (gimple *stmt, bool insert, else lhs = gimple_get_lhs (stmt); - class expr_hash_elt element (stmt, lhs); + class expr_hash_elt element (dom_normalize_gimple_stmt (stmt), lhs); if (dump_file && (dump_flags & TDF_DETAILS)) { -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF 2015-10-29 19:20 ` [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF Alan Lawrence @ 2015-10-30 5:39 ` Jeff Law 2015-10-30 9:21 ` Richard Biener 2015-11-03 10:27 ` Alan Lawrence 0 siblings, 2 replies; 26+ messages in thread From: Jeff Law @ 2015-10-30 5:39 UTC (permalink / raw) To: Alan Lawrence, gcc-patches On 10/29/2015 01:18 PM, Alan Lawrence wrote: > This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs (with > pointer type to the array element type). > > gcc/ChangeLog: > > * tree-ssa-dom.c (dom_normalize_single_rhs): New. > (dom_normalize_gimple_stmt): New. > (lookup_avail_expr): Call dom_normalize_gimple_stmt. Has this been tested? Do you have tests where it can be shown to make a difference independent of the changes to tree-sra.c? The implementation looks fine, I just want to have some basic tests in the testsuite that show the benefit of this normalization. Similarly for patch #2. Interestingly enough we had code that made that kind of transformation during gimplification eons ago. Presumably it was ripped out at some point because of the differences in aliasing properties. Jeff ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF 2015-10-30 5:39 ` Jeff Law @ 2015-10-30 9:21 ` Richard Biener 2015-11-03 10:27 ` Alan Lawrence 1 sibling, 0 replies; 26+ messages in thread From: Richard Biener @ 2015-10-30 9:21 UTC (permalink / raw) To: Jeff Law; +Cc: Alan Lawrence, GCC Patches On Fri, Oct 30, 2015 at 6:35 AM, Jeff Law <law@redhat.com> wrote: > On 10/29/2015 01:18 PM, Alan Lawrence wrote: >> >> This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs >> (with >> pointer type to the array element type). >> >> gcc/ChangeLog: >> >> * tree-ssa-dom.c (dom_normalize_single_rhs): New. >> (dom_normalize_gimple_stmt): New. >> (lookup_avail_expr): Call dom_normalize_gimple_stmt. > > Has this been tested? Do you have tests where it can be shown to make a > difference independent of the changes to tree-sra.c? > > The implementation looks fine, I just want to have some basic tests in the > testsuite that show the benefit of this normalization. Err, I think the implementation is extremely wasteful ... > Similarly for patch #2. Interestingly enough we had code that made that kind > of transformation during gimplification eons ago. Presumably it was ripped > out at some point because of the differences in aliasing properties. Please have a look at how SCCVN handles variants of memory references. You might even want to re-use its copy_reference_ops_from_ref implementation (and reference hashing). Note that SCCVN needs to be able to reconstruct a tree expression from its representation (for PRE) which DOM need not, so DOM might use a simpler form. The basic idea is to accumulate constant offset handled_components into an "offset component". Thus a[2].i -> (&a)->offset(8 + offsetof(i)) MEM_REF[&a, 8].i -> (&a)->offset(8 + offsetof(i)) for DOM you can probably do the copy_reference_ops_from_ref work on-the-fly for hashing and comparing. 
The main point will be to move away from the DOM way of hashing/comparing for memory references. Richard. > > > Jeff ^ permalink raw reply [flat|nested] 26+ messages in thread
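The offset-accumulation idea sketched above can be modelled in plain C. This is a toy model, not the SCCVN/copy_reference_ops_from_ref machinery; the struct and helper names are invented for illustration:

```c
#include <stddef.h>

/* Toy model of canonicalizing a reference into a (base, byte-offset) pair:
   every constant-offset component -- an ARRAY_REF index, a COMPONENT_REF
   field, a MEM_REF offset -- is folded into one accumulated byte offset,
   so that a[2].j and MEM_REF[&a, 2*sizeof(elt)].j compare and hash equal.  */

struct elt { int i; int j; };

struct canon_ref
{
  const void *base;      /* address of the underlying object */
  ptrdiff_t byte_offset; /* accumulated constant offset in bytes */
};

/* a[IDX].field: an ARRAY_REF followed by a COMPONENT_REF.  */
static struct canon_ref
canon_array_field (const struct elt *a, ptrdiff_t idx, size_t field_off)
{
  struct canon_ref r = { a, idx * (ptrdiff_t) sizeof (struct elt)
                              + (ptrdiff_t) field_off };
  return r;
}

/* MEM_REF[&a, BYTE_OFF].field: a raw byte offset followed by a
   COMPONENT_REF.  */
static struct canon_ref
canon_memref_field (const struct elt *a, ptrdiff_t byte_off, size_t field_off)
{
  struct canon_ref r = { a, byte_off + (ptrdiff_t) field_off };
  return r;
}

/* Two canonical references name the same memory iff base and offset match.  */
static int
canon_ref_equal (struct canon_ref x, struct canon_ref y)
{
  return x.base == y.base && x.byte_offset == y.byte_offset;
}
```

Two syntactically different references then normalize to the same pair, which is the property the hashing/comparison would exploit.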
* Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF 2015-10-30 5:39 ` Jeff Law 2015-10-30 9:21 ` Richard Biener @ 2015-11-03 10:27 ` Alan Lawrence 2015-11-03 11:13 ` Alan Lawrence 1 sibling, 1 reply; 26+ messages in thread From: Alan Lawrence @ 2015-11-03 10:27 UTC (permalink / raw) To: gcc-patches; +Cc: law On 30/10/15 05:35, Jeff Law wrote: > On 10/29/2015 01:18 PM, Alan Lawrence wrote: >> This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs (with >> pointer type to the array element type). >> >> gcc/ChangeLog: >> >> * tree-ssa-dom.c (dom_normalize_single_rhs): New. >> (dom_normalize_gimple_stmt): New. >> (lookup_avail_expr): Call dom_normalize_gimple_stmt. > Has this been tested? Do you have tests where it can be shown to make a > difference independent of the changes to tree-sra.c? > > The implementation looks fine, I just want to have some basic tests in the > testsuite that show the benefit of this normalization. I'll look at the implementation and Richard's comments soon, but as to tests - well I had tried before and not had much luck but OK you make me try harder ;). I'll put these in with the appropriate patches, but ssa-dom-cse-{5,6}.c test the ARRAY_REF -> MEM_REF part (one of these has to disable SRA from optimizing the whole test away, the other...ends up waiting until dom2 for the whole loop to have been unrolled, so I'd not be surprised if this proved a bit fragile, as in not spotting when some other phase starts doing the optimization instead). ssa-dom-cse-7.c tests the normalization-of-MEM_REFs part; but AFAICT, the *only* place making exprs like the problematic MEM[(int[8] *)&a ...], is SRA. 
That is, ssa-dom-cse-7.c passes (and the patch series solves PR/63679) if instead of my patch 2 (normalization of MEM_REFs) we have this: diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 4327990..2889a96 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -1697,7 +1697,7 @@ build_ref_for_offset (location_t loc, tree base, HOST_WIDE_INT offset, } else { - off = build_int_cst (reference_alias_ptr_type (base), + off = build_int_cst (build_pointer_type (exp_type), base_offset + offset / BITS_PER_UNIT); base = build_fold_addr_expr (unshare_expr (base)); } ...I'll test that fully but I have to wonder what the right path is here! Cheers, Alan --- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c | 17 +++++++++++++++++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c | 16 ++++++++++++++++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c | 18 ++++++++++++++++++ 3 files changed, 51 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c new file mode 100644 index 0000000..cfbb85f --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c @@ -0,0 +1,17 @@ +/* Test normalization of ARRAY_REF expressions to MEM_REFs in dom. */ +/* { dg-do compile } */ +/* { dg-options "-O1 -fdump-tree-dom2" } */ + +#define N 8 + +int +main (int argc, char **argv) +{ + int a[N]; + for (int i = 0; i < N; i++) + a[i] = 2*i + 1; + int *p = &a[0]; + return *(++p); +} + +/* { dg-final { scan-tree-dump-times "return 3;" 1 "dom2"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c new file mode 100644 index 0000000..c387fa3 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c @@ -0,0 +1,16 @@ +/* Test normalization of ARRAY_REF expressions to MEM_REFs in dom. 
*/ +/* { dg-do compile } */ +/* { dg-options "-O1 -fno-tree-sra -fno-tree-fre -fdump-tree-dom1" } */ + +int +main (int argc, char **argv) +{ + union { + int a[8]; + int b[2]; + } u = { .a = { 1, 42, 3, 4, 5, 6, 7, 8 } }; + + return u.b[1]; +} + +/* { dg-final { scan-tree-dump-times "return 42;" 1 "dom1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c new file mode 100644 index 0000000..3f4ca17 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c @@ -0,0 +1,18 @@ +/* Test normalization of MEM_REF expressions in dom. */ +/* { dg-do compile } */ +/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */ + +typedef struct { + int a[8]; +} foo; + +foo f; + +int test () +{ + foo g = { { 1, 2, 3, 4, 5, 6, 7, 8 } }; + f=g; + return f.a[2]; +} + +/* { dg-final { scan-tree-dump-times "return 3;" 1 "optimized" } } */ -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF 2015-11-03 10:27 ` Alan Lawrence @ 2015-11-03 11:13 ` Alan Lawrence 2015-11-03 11:49 ` Richard Biener 0 siblings, 1 reply; 26+ messages in thread From: Alan Lawrence @ 2015-11-03 11:13 UTC (permalink / raw) To: gcc-patches; +Cc: law On 3 November 2015 at 10:27, Alan Lawrence <alan.lawrence@arm.com> wrote: > That is, ssa-dom-cse-7.c passes (and the patch series solves PR/63679) if > instead of my patch 2 (normalization of MEM_REFs) we have this: > > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 4327990..2889a96 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -1697,7 +1697,7 @@ build_ref_for_offset (location_t loc, tree base, HOST_WIDE_INT offset, > } > else > { > - off = build_int_cst (reference_alias_ptr_type (base), > + off = build_int_cst (build_pointer_type (exp_type), > base_offset + offset / BITS_PER_UNIT); > base = build_fold_addr_expr (unshare_expr (base)); > } > > ...I'll test that fully but I have to wonder what the right path is here! So with also changing the other reference_alias_ptr_type in the first case of build_ref_for_offset, it breaks Ada ACATS (on x86): c52101a "CHECK THAT ARRAY SUBTYPE CONVERSION IS APPLIED AFTER AN ARRAY VALUE IS DETERMINED" cc70003 cxac004 (stream access, stream functions) ....I'll not dig any further unless you think that change to SRA is the right avenue to investigate! Cheers, Alan ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF 2015-11-03 11:13 ` Alan Lawrence @ 2015-11-03 11:49 ` Richard Biener 0 siblings, 0 replies; 26+ messages in thread From: Richard Biener @ 2015-11-03 11:49 UTC (permalink / raw) To: Alan Lawrence; +Cc: gcc-patches, Jeff Law On Tue, Nov 3, 2015 at 12:13 PM, Alan Lawrence <alan.lawrence@arm.com> wrote: > On 3 November 2015 at 10:27, Alan Lawrence <alan.lawrence@arm.com> wrote: >> That is, ssa-dom-cse-7.c passes (and the patch series solves PR/63679) if >> instead of my patch 2 (normalization of MEM_REFs) we have this: >> >> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c >> index 4327990..2889a96 100644 >> --- a/gcc/tree-sra.c >> +++ b/gcc/tree-sra.c >> @@ -1697,7 +1697,7 @@ build_ref_for_offset (location_t loc, tree base, HOST_WIDE_INT offset, >> } >> else >> { >> - off = build_int_cst (reference_alias_ptr_type (base), >> + off = build_int_cst (build_pointer_type (exp_type), >> base_offset + offset / BITS_PER_UNIT); >> base = build_fold_addr_expr (unshare_expr (base)); >> } >> >> ...I'll test that fully but I have to wonder what the right path is here! > > So with also changing the other reference_alias_ptr_type in the first > case of build_ref_for_offset, it breaks Ada ACATS (on x86): > > c52101a "CHECK THAT ARRAY SUBTYPE CONVERSION IS APPLIED AFTER AN ARRAY > VALUE IS DETERMINED" > cc70003 > cxac004 (stream access, stream functions) > > ....I'll not dig any further unless you think that change to SRA is > the right avenue to investigate! Nope, that change looks wrong to me. Richard. > Cheers, Alan ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 2/6] tree-ssa-dom.c: Normalize data types in MEM_REFs. 2015-10-29 19:22 [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads Alan Lawrence 2015-10-29 19:20 ` [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF Alan Lawrence @ 2015-10-29 19:21 ` Alan Lawrence 2015-10-29 19:22 ` [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices Alan Lawrence ` (3 subsequent siblings) 5 siblings, 0 replies; 26+ messages in thread From: Alan Lawrence @ 2015-10-29 19:21 UTC (permalink / raw) To: gcc-patches This makes dom2 identify e.g. MEM[(int[8] *)...] with MEM[(int *)...]. These are not generally equivalent as they have different aliasing behaviour but they have the same value as far as dom is concerned and so this helps find more equivalences. There is some question over the best policy here, but using the result type of the MEM_REF seemed the most normalizing. Of course this will not catch e.g. reading a V2SI vector from a block of memory which was written as two separate SImode writes; but fixing that will require a much heavier-weight approach. gcc/ChangeLog: * tree-ssa-dom.c (dom_normalize_single_rhs): Add case for MEM_REFs. --- gcc/tree-ssa-dom.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index fca8a80..d74fb70 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -1931,6 +1931,28 @@ dom_normalize_single_rhs (tree expr) return fold_build2 (MEM_REF, TREE_TYPE (expr), base, offset); } + case MEM_REF: + { + /* Make the target of the pointer equal to the result type of the MEM_REF. + (This is not generally equivalent as it has different aliasing + properties but has the same value and so finds more equivalences.) 
*/ + + tree off = TREE_OPERAND (expr, 1); + + gcc_assert (POINTER_TYPE_P (TREE_TYPE (off))); + tree ptr_target_type = TREE_TYPE (TREE_TYPE (off)); + + if (ptr_target_type == TREE_TYPE (expr)) + return NULL_TREE; + + tree nref = copy_node (expr); + TREE_OPERAND (nref, 1) = copy_node (off); + + TREE_TYPE (TREE_OPERAND (nref, 1)) = + build_pointer_type (TREE_TYPE (expr)); + + return nref; + } default: return NULL_TREE; } -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
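Value-wise, the equivalence this patch exposes can be seen in a plain-C version of the ssa-dom-cse-7.c shape. This is only an illustration of the source pattern, not of the GIMPLE itself; the names are invented:

```c
/* Plain-C version of the ssa-dom-cse-7.c shape: G's constant initializer,
   the block copy into F, and the read back out all denote the same values,
   even though the intermediate GIMPLE MEM_REFs carry different pointer
   types (int[8]* vs int*) and hence different alias sets.  */
typedef struct { int a[8]; } foo_t;

static foo_t f;

int
value_after_copy (void)
{
  foo_t g = { { 1, 2, 3, 4, 5, 6, 7, 8 } };
  f = g;           /* block copy; scalarized into per-element MEM_REFs */
  return f.a[2];   /* same value as g.a[2], so DOM can fold it to 3 */
}
```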
* [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices 2015-10-29 19:22 [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads Alan Lawrence 2015-10-29 19:20 ` [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF Alan Lawrence 2015-10-29 19:21 ` [PATCH 2/6] tree-ssa-dom.c: Normalize data types in MEM_REFs Alan Lawrence @ 2015-10-29 19:22 ` Alan Lawrence 2015-10-30 10:54 ` Richard Biener 2015-10-29 19:22 ` [PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions Alan Lawrence ` (2 subsequent siblings) 5 siblings, 1 reply; 26+ messages in thread From: Alan Lawrence @ 2015-10-29 19:22 UTC (permalink / raw) To: gcc-patches The code I added to completely_scalarize for arrays isn't right in some cases of negative array indices (e.g. arrays with indices from -1 to 1 in the Ada testsuite). On ARM, this prevents a failure bootstrapping Ada with the next patch, as well as a few ACATS tests (e.g. c64106a). Some discussion here: https://gcc.gnu.org/ml/gcc/2015-10/msg00096.html . gcc/ChangeLog: * tree-sra.c (completely_scalarize): Deal properly with negative array indices. --- gcc/tree-sra.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index e15df1f..358db79 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -1010,18 +1010,25 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) if (maxidx) { gcc_assert (TREE_CODE (maxidx) == INTEGER_CST); - /* MINIDX and MAXIDX are inclusive. Try to avoid overflow. */ - unsigned HOST_WIDE_INT lenp1 = tree_to_shwi (maxidx) - - tree_to_shwi (minidx); + /* MINIDX and MAXIDX are inclusive, and must be interpreted in the + TYPE_DOMAIN (e.g. signed int, whereas min/max may be size_int). + Try also to avoid overflow. 
*/ + minidx = build_int_cst (TYPE_DOMAIN (decl_type), + tree_to_shwi (minidx)); + maxidx = build_int_cst (TYPE_DOMAIN (decl_type), + tree_to_shwi (maxidx)); + HOST_WIDE_INT min = tree_to_shwi (minidx); + unsigned HOST_WIDE_INT lenlessone = tree_to_shwi (maxidx) - min; unsigned HOST_WIDE_INT idx = 0; do { - tree nref = build4 (ARRAY_REF, elemtype, ref, size_int (idx), + tree nref = build4 (ARRAY_REF, elemtype, + ref, size_int (idx + min), NULL_TREE, NULL_TREE); int el_off = offset + idx * el_size; scalarize_elem (base, el_off, el_size, nref, elemtype); } - while (++idx <= lenp1); + while (++idx <= lenlessone); } } break; -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
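The signedness problem being fixed can be illustrated outside the compiler. This is a toy model using fixed-width integers, not GCC's wide-int classes; the helper names are made up:

```c
#include <stdint.h>

/* Toy model of the bug: TYPE_MIN_VALUE/TYPE_MAX_VALUE may arrive as
   unsigned size-type constants, so a lower bound of -1 in a signed index
   domain reads back as 0xffffffff unless it is first sign-extended to the
   precision of the domain.  */

/* Sign-extend the low PRECISION bits of VAL.  */
static int64_t
sext (uint64_t val, unsigned precision)
{
  uint64_t sign_bit = (uint64_t) 1 << (precision - 1);
  uint64_t mask = (sign_bit << 1) - 1;
  return (int64_t) (((val & mask) ^ sign_bit) - sign_bit);
}

/* Number of elements in the inclusive domain [MIN, MAX].  */
static uint64_t
domain_length (int64_t min, int64_t max)
{
  return (uint64_t) (max - min) + 1;
}
```

For a domain of -1 .. 1 the correct element count is 3; treating the raw unsigned bound as the minimum instead yields a huge bogus count, which is the failure mode seen in the Ada tests.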
* Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices 2015-10-29 19:22 ` [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices Alan Lawrence @ 2015-10-30 10:54 ` Richard Biener 2015-10-30 10:55 ` Eric Botcazou 0 siblings, 1 reply; 26+ messages in thread From: Richard Biener @ 2015-10-30 10:54 UTC (permalink / raw) To: Alan Lawrence; +Cc: GCC Patches On Thu, Oct 29, 2015 at 8:18 PM, Alan Lawrence <alan.lawrence@arm.com> wrote: > The code I added to completely_scalarize for arrays isn't right in some cases > of negative array indices (e.g. arrays with indices from -1 to 1 in the Ada > testsuite). On ARM, this prevents a failure bootstrapping Ada with the next > patch, as well as a few ACATS tests (e.g. c64106a). > > Some discussion here: https://gcc.gnu.org/ml/gcc/2015-10/msg00096.html . > > gcc/ChangeLog: > > * tree-sra.c (completely_scalarize): Deal properly with negative array > indices. > --- > gcc/tree-sra.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index e15df1f..358db79 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -1010,18 +1010,25 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) > if (maxidx) > { > gcc_assert (TREE_CODE (maxidx) == INTEGER_CST); > - /* MINIDX and MAXIDX are inclusive. Try to avoid overflow. */ > - unsigned HOST_WIDE_INT lenp1 = tree_to_shwi (maxidx) > - - tree_to_shwi (minidx); > + /* MINIDX and MAXIDX are inclusive, and must be interpreted in the > + TYPE_DOMAIN (e.g. signed int, whereas min/max may be size_int). > + Try also to avoid overflow. 
*/ > + minidx = build_int_cst (TYPE_DOMAIN (decl_type), > + tree_to_shwi (minidx)); > + maxidx = build_int_cst (TYPE_DOMAIN (decl_type), > + tree_to_shwi (maxidx)); I think you want to use wide-ints here and wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN (...)), TYPE_SIGN (TYPE_DOMAIN (..))); wide_int maxidx = ... you can then simply iterate minidx with ++ and do the final compare against maxidx with while (++idx <= maxidx). For the array ref index we want to use TYPE_DOMAIN as type as well, not size_int. Thus wide_int_to_tree (TYPE_DOMAIN (...)..idx). Richard. > + HOST_WIDE_INT min = tree_to_shwi (minidx); > + unsigned HOST_WIDE_INT lenlessone = tree_to_shwi (maxidx) - min; > unsigned HOST_WIDE_INT idx = 0; > do > { > - tree nref = build4 (ARRAY_REF, elemtype, ref, size_int (idx), > + tree nref = build4 (ARRAY_REF, elemtype, > + ref, size_int (idx + min), > NULL_TREE, NULL_TREE); > int el_off = offset + idx * el_size; > scalarize_elem (base, el_off, el_size, nref, elemtype); > } > - while (++idx <= lenp1); > + while (++idx <= lenlessone); > } > } > break; > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices 2015-10-30 10:54 ` Richard Biener @ 2015-10-30 10:55 ` Eric Botcazou 2015-11-05 13:22 ` Alan Lawrence 0 siblings, 1 reply; 26+ messages in thread From: Eric Botcazou @ 2015-10-30 10:55 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches, Alan Lawrence > I think you want to use wide-ints here and > > wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN > (...)), TYPE_SIGN (TYPE_DOMAIN (..))); > wide_int maxidx = ... > > you can then simply iterate minidx with ++ and do the final compare > against maxidx > with while (++idx <= maxidx). For the array ref index we want to use > TYPE_DOMAIN > as type as well, not size_int. Thus wide_int_to_tree (TYPE_DOMAIN > (...)..idx). Yes, you generally cannot use HOST_WIDE_INT to avoid overflow because this will break for 64-bit HOST_WIDE_INT and 32-bit sizetype in corner cases. But using offset_int should be OK, see for example get_ref_base_and_extent. -- Eric Botcazou ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices 2015-10-30 10:55 ` Eric Botcazou @ 2015-11-05 13:22 ` Alan Lawrence 2015-11-06 12:30 ` Richard Biener 0 siblings, 1 reply; 26+ messages in thread From: Alan Lawrence @ 2015-11-05 13:22 UTC (permalink / raw) To: gcc-patches; +Cc: ebotcazou, richard.guenther On 30/10/15 10:54, Eric Botcazou wrote: > On 30/10/15 10:44, Richard Biener wrote: >> >> I think you want to use wide-ints here and >> >> wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN >> (...)), TYPE_SIGN (TYPE_DOMAIN (..))); >> wide_int maxidx = ... >> >> you can then simply iterate minidx with ++ and do the final compare >> against maxidx >> with while (++idx <= maxidx). For the array ref index we want to use >> TYPE_DOMAIN >> as type as well, not size_int. Thus wide_int_to_tree (TYPE_DOMAIN (...)..idx). [...] > But using offset_int should be OK, see for example get_ref_base_and_extent. > Here's a patch using offset_int. (Not as easy to construct as wide_int::from, the sign-extend is what appeared to be done elsewhere that constructs offset_ints). Tested by bootstrap+check-{gcc,g++,ada,fortran} with the rest of the patchset (which causes an array[-1..1] to be completely scalarized, among others), on x86_64 and ARM. I don't have a test without all that (such would have to be in Ada, and trigger SRA of such an array but not involving the constant pool); is it OK without? gcc/ChangeLog: * tree-sra.c (completely_scalarize): Properly handle negative array indices using offset_int. --- gcc/tree-sra.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index e15df1f..6168a7e 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -1010,18 +1010,25 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) if (maxidx) { gcc_assert (TREE_CODE (maxidx) == INTEGER_CST); - /* MINIDX and MAXIDX are inclusive. Try to avoid overflow. 
*/ - unsigned HOST_WIDE_INT lenp1 = tree_to_shwi (maxidx) - - tree_to_shwi (minidx); - unsigned HOST_WIDE_INT idx = 0; - do + tree domain = TYPE_DOMAIN (decl_type); + /* MINIDX and MAXIDX are inclusive, and must be interpreted in + DOMAIN (e.g. signed int, whereas min/max may be size_int). */ + offset_int idx = wi::to_offset (minidx); + offset_int max = wi::to_offset (maxidx); + if (!TYPE_UNSIGNED (domain)) { - tree nref = build4 (ARRAY_REF, elemtype, ref, size_int (idx), + idx = wi::sext (idx, TYPE_PRECISION (domain)); + max = wi::sext (max, TYPE_PRECISION (domain)); + } + for (int el_off = offset; wi::les_p (idx, max); ++idx) + { + tree nref = build4 (ARRAY_REF, elemtype, + ref, + wide_int_to_tree (domain, idx), NULL_TREE, NULL_TREE); - int el_off = offset + idx * el_size; scalarize_elem (base, el_off, el_size, nref, elemtype); + el_off += el_size; } - while (++idx <= lenp1); } } break; -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices 2015-11-05 13:22 ` Alan Lawrence @ 2015-11-06 12:30 ` Richard Biener 0 siblings, 0 replies; 26+ messages in thread From: Richard Biener @ 2015-11-06 12:30 UTC (permalink / raw) To: Alan Lawrence; +Cc: GCC Patches, Eric Botcazou On Thu, Nov 5, 2015 at 2:22 PM, Alan Lawrence <alan.lawrence@arm.com> wrote: > On 30/10/15 10:54, Eric Botcazou wrote: >> On 30/10/15 10:44, Richard Biener wrote: >>> >>> I think you want to use wide-ints here and >>> >>> wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN >>> (...)), TYPE_SIGN (TYPE_DOMAIN (..))); >>> wide_int maxidx = ... >>> >>> you can then simply iterate minidx with ++ and do the final compare >>> against maxidx >>> with while (++idx <= maxidx). For the array ref index we want to use >>> TYPE_DOMAIN >>> as type as well, not size_int. Thus wide_int_to_tree (TYPE_DOMAIN (...)..idx). > [...] >> But using offset_int should be OK, see for example get_ref_base_and_extent. >> > > Here's a patch using offset_int. (Not as easy to construct as wide_int::from, > the sign-extend is what appeared to be done elsewhere that constructs > offset_ints). > > Tested by bootstrap+check-{gcc,g++,ada,fortran} with the rest of the patchset > (which causes an array[-1..1] to be completely scalarized, among others), on > x86_64 and ARM. > > I don't have a test without all that (such would have to be in Ada, and trigger > SRA of such an array but not involving the constant pool); is it OK without? Ok. Richard. > gcc/ChangeLog: > > * tree-sra.c (completely_scalarize): Properly handle negative array > indices using offset_int. 
> --- > gcc/tree-sra.c | 23 +++++++++++++++-------- > 1 file changed, 15 insertions(+), 8 deletions(-) > > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index e15df1f..6168a7e 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -1010,18 +1010,25 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) > if (maxidx) > { > gcc_assert (TREE_CODE (maxidx) == INTEGER_CST); > - /* MINIDX and MAXIDX are inclusive. Try to avoid overflow. */ > - unsigned HOST_WIDE_INT lenp1 = tree_to_shwi (maxidx) > - - tree_to_shwi (minidx); > - unsigned HOST_WIDE_INT idx = 0; > - do > + tree domain = TYPE_DOMAIN (decl_type); > + /* MINIDX and MAXIDX are inclusive, and must be interpreted in > + DOMAIN (e.g. signed int, whereas min/max may be size_int). */ > + offset_int idx = wi::to_offset (minidx); > + offset_int max = wi::to_offset (maxidx); > + if (!TYPE_UNSIGNED (domain)) > { > - tree nref = build4 (ARRAY_REF, elemtype, ref, size_int (idx), > + idx = wi::sext (idx, TYPE_PRECISION (domain)); > + max = wi::sext (max, TYPE_PRECISION (domain)); > + } > + for (int el_off = offset; wi::les_p (idx, max); ++idx) > + { > + tree nref = build4 (ARRAY_REF, elemtype, > + ref, > + wide_int_to_tree (domain, idx), > NULL_TREE, NULL_TREE); > - int el_off = offset + idx * el_size; > scalarize_elem (base, el_off, el_size, nref, elemtype); > + el_off += el_size; > } > - while (++idx <= lenp1); > } > } > break; > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions. 2015-10-29 19:22 [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads Alan Lawrence ` (2 preceding siblings ...) 2015-10-29 19:22 ` [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices Alan Lawrence @ 2015-10-29 19:22 ` Alan Lawrence 2015-10-30 5:35 ` Jeff Law 2015-10-29 19:22 ` [PATCH 3/6] Share code from fold_array_ctor_reference with fold Alan Lawrence 2015-10-29 19:32 ` [PATCH 6/6] Make SRA replace constant-pool loads Alan Lawrence 5 siblings, 1 reply; 26+ messages in thread From: Alan Lawrence @ 2015-10-29 19:22 UTC (permalink / raw) To: gcc-patches gcc/ChangeLog: * tree-sra.c (scalarizable_type_p): Comment variable-length arrays. (completely_scalarize): Comment zero-length arrays. (get_access_replacement): Correct comment re. precondition. --- gcc/tree-sra.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index f851758..e15df1f 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -956,6 +956,7 @@ scalarizable_type_p (tree type) ; else if ((tree_to_shwi (TYPE_SIZE (type)) <= 0) || !tree_fits_shwi_p (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))) + /* Variable-length array, do not allow scalarization. */ return false; tree elem = TREE_TYPE (type); @@ -1005,6 +1006,7 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) tree minidx = TYPE_MIN_VALUE (TYPE_DOMAIN (decl_type)); gcc_assert (TREE_CODE (minidx) == INTEGER_CST); tree maxidx = TYPE_MAX_VALUE (TYPE_DOMAIN (decl_type)); + /* Skip (some) zero-length arrays; others have MAXIDX == MINIDX - 1. */ if (maxidx) { gcc_assert (TREE_CODE (maxidx) == INTEGER_CST); @@ -2146,7 +2148,7 @@ create_access_replacement (struct access *access) return repl; } -/* Return ACCESS scalar replacement, create it if it does not exist yet. */ +/* Return ACCESS scalar replacement, which must exist. 
*/ static inline tree get_access_replacement (struct access *access) -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
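[Editorial sketch] The new comment in completely_scalarize records that a zero-length array shows up in one of two shapes. That condition can be sketched with hypothetical simplified types (plain longs in place of GCC's trees):

```cpp
#include <cassert>
#include <optional>

// Hypothetical mirror of the condition the comment documents: a
// zero-length array domain either has no TYPE_MAX_VALUE at all (MAXIDX
// is NULL, the case the code skips), or has MAXIDX == MINIDX - 1.
bool zero_length_array_p (long minidx, std::optional<long> maxidx)
{
  return !maxidx.has_value () || *maxidx == minidx - 1;
}
```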
* Re: [PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions. 2015-10-29 19:22 ` [PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions Alan Lawrence @ 2015-10-30 5:35 ` Jeff Law 0 siblings, 0 replies; 26+ messages in thread From: Jeff Law @ 2015-10-30 5:35 UTC (permalink / raw) To: Alan Lawrence, gcc-patches On 10/29/2015 01:18 PM, Alan Lawrence wrote: > gcc/ChangeLog: > > * tree-sra.c (scalarizable_type_p): Comment variable-length arrays. > (completely_scalarize): Comment zero-length arrays. > (get_access_replacement): Correct comment re. precondition. Trivially OK :-) jeff ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 3/6] Share code from fold_array_ctor_reference with fold. 2015-10-29 19:22 [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads Alan Lawrence ` (3 preceding siblings ...) 2015-10-29 19:22 ` [PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions Alan Lawrence @ 2015-10-29 19:22 ` Alan Lawrence 2015-10-30 9:25 ` Richard Biener 2015-11-03 3:20 ` Jeff Law 2015-10-29 19:32 ` [PATCH 6/6] Make SRA replace constant-pool loads Alan Lawrence 5 siblings, 2 replies; 26+ messages in thread From: Alan Lawrence @ 2015-10-29 19:22 UTC (permalink / raw) To: gcc-patches This is in response to https://gcc.gnu.org/ml/gcc/2015-10/msg00097.html, where Richi points out that CONSTRUCTOR elements are not necessarily ordered. I wasn't sure of a good naming convention for the new get_ctor_element_at_index, other suggestions welcome. gcc/ChangeLog: * gimple-fold.c (fold_array_ctor_reference): Move searching code to: * fold-const.c (get_ctor_element_at_index): New. (fold): Remove binary-search through CONSTRUCTOR, call previous. * fold-const.h (get_ctor_element_at_index): New. --- gcc/fold-const.c | 93 ++++++++++++++++++++++++++++++++++++++++--------------- gcc/fold-const.h | 1 + gcc/gimple-fold.c | 47 ++-------------------------- 3 files changed, 72 insertions(+), 69 deletions(-) diff --git a/gcc/fold-const.c b/gcc/fold-const.c index de45a2c..5d27b91 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -12018,6 +12018,72 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type, } /* switch (code) */ } +/* Gets the element ACCESS_INDEX from CTOR, which must be a CONSTRUCTOR. */ + +tree +get_ctor_element_at_index (tree ctor, offset_int access_index) +{ + tree index_type = NULL_TREE; + offset_int low_bound = 0; + + if (TREE_CODE (TREE_TYPE (ctor)) == ARRAY_TYPE) + { + tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); + if (domain_type && TYPE_MIN_VALUE (domain_type)) + { + /* Static constructors for variably sized objects makes no sense. 
*/ + gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); + index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); + low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); + } + } + + if (index_type) + access_index = wi::ext (access_index, TYPE_PRECISION (index_type), + TYPE_SIGN (index_type)); + + offset_int index = low_bound - 1; + if (index_type) + index = wi::ext (index, TYPE_PRECISION (index_type), + TYPE_SIGN (index_type)); + + offset_int max_index; + unsigned HOST_WIDE_INT cnt; + tree cfield, cval; + + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) + { + /* Array constructor might explicitely set index, or specify range + * or leave index NULL meaning that it is next index after previous + * one. */ + if (cfield) + { + if (TREE_CODE (cfield) == INTEGER_CST) + max_index = index = wi::to_offset (cfield); + else + { + gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); + index = wi::to_offset (TREE_OPERAND (cfield, 0)); + max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); + } + } + else + { + index += 1; + if (index_type) + index = wi::ext (index, TYPE_PRECISION (index_type), + TYPE_SIGN (index_type)); + max_index = index; + } + + /* Do we have match? */ + if (wi::cmpu (access_index, index) >= 0 + && wi::cmpu (access_index, max_index) <= 0) + return cval; + } + return NULL_TREE; +} + /* Perform constant folding and related simplification of EXPR. The related simplifications include x*1 => x, x*0 => 0, etc., and application of the associative law. @@ -12094,31 +12160,8 @@ fold (tree expr) && TREE_CODE (op0) == CONSTRUCTOR && ! type_contains_placeholder_p (TREE_TYPE (op0))) { - vec<constructor_elt, va_gc> *elts = CONSTRUCTOR_ELTS (op0); - unsigned HOST_WIDE_INT end = vec_safe_length (elts); - unsigned HOST_WIDE_INT begin = 0; - - /* Find a matching index by means of a binary search. 
*/ - while (begin != end) - { - unsigned HOST_WIDE_INT middle = (begin + end) / 2; - tree index = (*elts)[middle].index; - - if (TREE_CODE (index) == INTEGER_CST - && tree_int_cst_lt (index, op1)) - begin = middle + 1; - else if (TREE_CODE (index) == INTEGER_CST - && tree_int_cst_lt (op1, index)) - end = middle; - else if (TREE_CODE (index) == RANGE_EXPR - && tree_int_cst_lt (TREE_OPERAND (index, 1), op1)) - begin = middle + 1; - else if (TREE_CODE (index) == RANGE_EXPR - && tree_int_cst_lt (op1, TREE_OPERAND (index, 0))) - end = middle; - else - return (*elts)[middle].value; - } + if (tree val = get_ctor_element_at_index (op0, wi::to_offset (op1))) + return val; } return t; diff --git a/gcc/fold-const.h b/gcc/fold-const.h index ee74dc8..1fa6ee0 100644 --- a/gcc/fold-const.h +++ b/gcc/fold-const.h @@ -73,6 +73,7 @@ extern tree fold_build_call_array_loc (location_t, tree, tree, int, tree *); #define fold_build_call_array_initializer(T1,T2,N,T4)\ fold_build_call_array_initializer_loc (UNKNOWN_LOCATION, T1, T2, N, T4) extern tree fold_build_call_array_initializer_loc (location_t, tree, tree, int, tree *); +extern tree get_ctor_element_at_index (tree, offset_int); extern bool fold_convertible_p (const_tree, const_tree); #define fold_convert(T1,T2)\ fold_convert_loc (UNKNOWN_LOCATION, T1, T2) diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index 85ff018..823d952 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -5308,13 +5308,10 @@ fold_array_ctor_reference (tree type, tree ctor, unsigned HOST_WIDE_INT size, tree from_decl) { - unsigned HOST_WIDE_INT cnt; - tree cfield, cval; offset_int low_bound; offset_int elt_size; - offset_int index, max_index; offset_int access_index; - tree domain_type = NULL_TREE, index_type = NULL_TREE; + tree domain_type = NULL_TREE; HOST_WIDE_INT inner_offset; /* Compute low bound and elt size. 
*/ @@ -5324,7 +5321,6 @@ fold_array_ctor_reference (tree type, tree ctor, { /* Static constructors for variably sized objects makes no sense. */ gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); - index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); } else @@ -5346,9 +5342,6 @@ fold_array_ctor_reference (tree type, tree ctor, access_index = wi::udiv_trunc (offset_int (offset / BITS_PER_UNIT), elt_size); access_index += low_bound; - if (index_type) - access_index = wi::ext (access_index, TYPE_PRECISION (index_type), - TYPE_SIGN (index_type)); /* And offset within the access. */ inner_offset = offset % (elt_size.to_uhwi () * BITS_PER_UNIT); @@ -5357,43 +5350,9 @@ fold_array_ctor_reference (tree type, tree ctor, care to fold accesses spanning multiple array indexes. */ if (inner_offset + size > elt_size.to_uhwi () * BITS_PER_UNIT) return NULL_TREE; + if (tree val = get_ctor_element_at_index (ctor, access_index)) + return fold_ctor_reference (type, val, inner_offset, size, from_decl); - index = low_bound - 1; - if (index_type) - index = wi::ext (index, TYPE_PRECISION (index_type), - TYPE_SIGN (index_type)); - - FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) - { - /* Array constructor might explicitely set index, or specify range - or leave index NULL meaning that it is next index after previous - one. */ - if (cfield) - { - if (TREE_CODE (cfield) == INTEGER_CST) - max_index = index = wi::to_offset (cfield); - else - { - gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); - index = wi::to_offset (TREE_OPERAND (cfield, 0)); - max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); - } - } - else - { - index += 1; - if (index_type) - index = wi::ext (index, TYPE_PRECISION (index_type), - TYPE_SIGN (index_type)); - max_index = index; - } - - /* Do we have match? 
*/ - if (wi::cmpu (access_index, index) >= 0 - && wi::cmpu (access_index, max_index) <= 0) - return fold_ctor_reference (type, cval, inner_offset, size, - from_decl); - } /* When memory is not explicitely mentioned in constructor, it is 0 (or out of range). */ return build_zero_cst (type); -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold. 2015-10-29 19:22 ` [PATCH 3/6] Share code from fold_array_ctor_reference with fold Alan Lawrence @ 2015-10-30 9:25 ` Richard Biener 2015-11-03 3:20 ` Jeff Law 1 sibling, 0 replies; 26+ messages in thread From: Richard Biener @ 2015-10-30 9:25 UTC (permalink / raw) To: Alan Lawrence; +Cc: GCC Patches On Thu, Oct 29, 2015 at 8:18 PM, Alan Lawrence <alan.lawrence@arm.com> wrote: > This is in response to https://gcc.gnu.org/ml/gcc/2015-10/msg00097.html, where > Richi points out that CONSTRUCTOR elements are not necessarily ordered. > > I wasn't sure of a good naming convention for the new get_ctor_element_at_index, > other suggestions welcome. get_array_ctor_element_at_index (ok it also handles vectors). Ok with that change. Richard. > gcc/ChangeLog: > > * gimple-fold.c (fold_array_ctor_reference): Move searching code to: > * fold-const.c (get_ctor_element_at_index): New. > (fold): Remove binary-search through CONSTRUCTOR, call previous. > > * fold-const.h (get_ctor_element_at_index): New. > --- > gcc/fold-const.c | 93 ++++++++++++++++++++++++++++++++++++++++--------------- > gcc/fold-const.h | 1 + > gcc/gimple-fold.c | 47 ++-------------------------- > 3 files changed, 72 insertions(+), 69 deletions(-) > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c > index de45a2c..5d27b91 100644 > --- a/gcc/fold-const.c > +++ b/gcc/fold-const.c > @@ -12018,6 +12018,72 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type, > } /* switch (code) */ > } > > +/* Gets the element ACCESS_INDEX from CTOR, which must be a CONSTRUCTOR. 
*/ > + > +tree > +get_ctor_element_at_index (tree ctor, offset_int access_index) > +{ > + tree index_type = NULL_TREE; > + offset_int low_bound = 0; > + > + if (TREE_CODE (TREE_TYPE (ctor)) == ARRAY_TYPE) > + { > + tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); > + if (domain_type && TYPE_MIN_VALUE (domain_type)) > + { > + /* Static constructors for variably sized objects makes no sense. */ > + gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); > + index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); > + low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); > + } > + } > + > + if (index_type) > + access_index = wi::ext (access_index, TYPE_PRECISION (index_type), > + TYPE_SIGN (index_type)); > + > + offset_int index = low_bound - 1; > + if (index_type) > + index = wi::ext (index, TYPE_PRECISION (index_type), > + TYPE_SIGN (index_type)); > + > + offset_int max_index; > + unsigned HOST_WIDE_INT cnt; > + tree cfield, cval; > + > + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) > + { > + /* Array constructor might explicitely set index, or specify range > + * or leave index NULL meaning that it is next index after previous > + * one. */ > + if (cfield) > + { > + if (TREE_CODE (cfield) == INTEGER_CST) > + max_index = index = wi::to_offset (cfield); > + else > + { > + gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); > + index = wi::to_offset (TREE_OPERAND (cfield, 0)); > + max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); > + } > + } > + else > + { > + index += 1; > + if (index_type) > + index = wi::ext (index, TYPE_PRECISION (index_type), > + TYPE_SIGN (index_type)); > + max_index = index; > + } > + > + /* Do we have match? */ > + if (wi::cmpu (access_index, index) >= 0 > + && wi::cmpu (access_index, max_index) <= 0) > + return cval; > + } > + return NULL_TREE; > +} > + > /* Perform constant folding and related simplification of EXPR. 
> The related simplifications include x*1 => x, x*0 => 0, etc., > and application of the associative law. > @@ -12094,31 +12160,8 @@ fold (tree expr) > && TREE_CODE (op0) == CONSTRUCTOR > && ! type_contains_placeholder_p (TREE_TYPE (op0))) > { > - vec<constructor_elt, va_gc> *elts = CONSTRUCTOR_ELTS (op0); > - unsigned HOST_WIDE_INT end = vec_safe_length (elts); > - unsigned HOST_WIDE_INT begin = 0; > - > - /* Find a matching index by means of a binary search. */ > - while (begin != end) > - { > - unsigned HOST_WIDE_INT middle = (begin + end) / 2; > - tree index = (*elts)[middle].index; > - > - if (TREE_CODE (index) == INTEGER_CST > - && tree_int_cst_lt (index, op1)) > - begin = middle + 1; > - else if (TREE_CODE (index) == INTEGER_CST > - && tree_int_cst_lt (op1, index)) > - end = middle; > - else if (TREE_CODE (index) == RANGE_EXPR > - && tree_int_cst_lt (TREE_OPERAND (index, 1), op1)) > - begin = middle + 1; > - else if (TREE_CODE (index) == RANGE_EXPR > - && tree_int_cst_lt (op1, TREE_OPERAND (index, 0))) > - end = middle; > - else > - return (*elts)[middle].value; > - } > + if (tree val = get_ctor_element_at_index (op0, wi::to_offset (op1))) > + return val; > } > > return t; > diff --git a/gcc/fold-const.h b/gcc/fold-const.h > index ee74dc8..1fa6ee0 100644 > --- a/gcc/fold-const.h > +++ b/gcc/fold-const.h > @@ -73,6 +73,7 @@ extern tree fold_build_call_array_loc (location_t, tree, tree, int, tree *); > #define fold_build_call_array_initializer(T1,T2,N,T4)\ > fold_build_call_array_initializer_loc (UNKNOWN_LOCATION, T1, T2, N, T4) > extern tree fold_build_call_array_initializer_loc (location_t, tree, tree, int, tree *); > +extern tree get_ctor_element_at_index (tree, offset_int); > extern bool fold_convertible_p (const_tree, const_tree); > #define fold_convert(T1,T2)\ > fold_convert_loc (UNKNOWN_LOCATION, T1, T2) > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c > index 85ff018..823d952 100644 > --- a/gcc/gimple-fold.c > +++ b/gcc/gimple-fold.c > @@ -5308,13 
+5308,10 @@ fold_array_ctor_reference (tree type, tree ctor, > unsigned HOST_WIDE_INT size, > tree from_decl) > { > - unsigned HOST_WIDE_INT cnt; > - tree cfield, cval; > offset_int low_bound; > offset_int elt_size; > - offset_int index, max_index; > offset_int access_index; > - tree domain_type = NULL_TREE, index_type = NULL_TREE; > + tree domain_type = NULL_TREE; > HOST_WIDE_INT inner_offset; > > /* Compute low bound and elt size. */ > @@ -5324,7 +5321,6 @@ fold_array_ctor_reference (tree type, tree ctor, > { > /* Static constructors for variably sized objects makes no sense. */ > gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); > - index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); > low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); > } > else > @@ -5346,9 +5342,6 @@ fold_array_ctor_reference (tree type, tree ctor, > access_index = wi::udiv_trunc (offset_int (offset / BITS_PER_UNIT), > elt_size); > access_index += low_bound; > - if (index_type) > - access_index = wi::ext (access_index, TYPE_PRECISION (index_type), > - TYPE_SIGN (index_type)); > > /* And offset within the access. */ > inner_offset = offset % (elt_size.to_uhwi () * BITS_PER_UNIT); > @@ -5357,43 +5350,9 @@ fold_array_ctor_reference (tree type, tree ctor, > care to fold accesses spanning multiple array indexes. */ > if (inner_offset + size > elt_size.to_uhwi () * BITS_PER_UNIT) > return NULL_TREE; > + if (tree val = get_ctor_element_at_index (ctor, access_index)) > + return fold_ctor_reference (type, val, inner_offset, size, from_decl); > > - index = low_bound - 1; > - if (index_type) > - index = wi::ext (index, TYPE_PRECISION (index_type), > - TYPE_SIGN (index_type)); > - > - FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) > - { > - /* Array constructor might explicitely set index, or specify range > - or leave index NULL meaning that it is next index after previous > - one. 
*/ > - if (cfield) > - { > - if (TREE_CODE (cfield) == INTEGER_CST) > - max_index = index = wi::to_offset (cfield); > - else > - { > - gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); > - index = wi::to_offset (TREE_OPERAND (cfield, 0)); > - max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); > - } > - } > - else > - { > - index += 1; > - if (index_type) > - index = wi::ext (index, TYPE_PRECISION (index_type), > - TYPE_SIGN (index_type)); > - max_index = index; > - } > - > - /* Do we have match? */ > - if (wi::cmpu (access_index, index) >= 0 > - && wi::cmpu (access_index, max_index) <= 0) > - return fold_ctor_reference (type, cval, inner_offset, size, > - from_decl); > - } > /* When memory is not explicitely mentioned in constructor, > it is 0 (or out of range). */ > return build_zero_cst (type); > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold. 2015-10-29 19:22 ` [PATCH 3/6] Share code from fold_array_ctor_reference with fold Alan Lawrence 2015-10-30 9:25 ` Richard Biener @ 2015-11-03 3:20 ` Jeff Law 2015-11-04 14:35 ` Alan Lawrence 1 sibling, 1 reply; 26+ messages in thread From: Jeff Law @ 2015-11-03 3:20 UTC (permalink / raw) To: Alan Lawrence, gcc-patches On 10/29/2015 01:18 PM, Alan Lawrence wrote: > This is in response to https://gcc.gnu.org/ml/gcc/2015-10/msg00097.html, where > Richi points out that CONSTRUCTOR elements are not necessarily ordered. > > I wasn't sure of a good naming convention for the new get_ctor_element_at_index, > other suggestions welcome. > > gcc/ChangeLog: > > * gimple-fold.c (fold_array_ctor_reference): Move searching code to: > * fold-const.c (get_ctor_element_at_index): New. > (fold): Remove binary-search through CONSTRUCTOR, call previous. > > * fold-const.h (get_ctor_element_at_index): New. Code is fine. Some formatting nits though. > --- > gcc/fold-const.c | 93 ++++++++++++++++++++++++++++++++++++++++--------------- > gcc/fold-const.h | 1 + > gcc/gimple-fold.c | 47 ++-------------------------- > 3 files changed, 72 insertions(+), 69 deletions(-) > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c > index de45a2c..5d27b91 100644 > --- a/gcc/fold-const.c > +++ b/gcc/fold-const.c > + > + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) > + { > + /* Array constructor might explicitely set index, or specify range > + * or leave index NULL meaning that it is next index after previous > + * one. */ s/explicitely/explicitly/ And remove the '*' from the 2nd and 3rd lines of the comment. It looks like get_ctor_element_at_index has numerous formatting problems. In particular you didn't indent the braces across the board properly. Also check for tabs vs spaces issues please. Can you please fix the formatting and comments and repost for a final check? 
Thanks, Jeff ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold. 2015-11-03 3:20 ` Jeff Law @ 2015-11-04 14:35 ` Alan Lawrence 2015-11-06 21:19 ` Jeff Law 0 siblings, 1 reply; 26+ messages in thread From: Alan Lawrence @ 2015-11-04 14:35 UTC (permalink / raw) To: law; +Cc: gcc-patches > s/explicitely/explicitly/ And remove the '*' from the 2nd and 3rd lines > of the comment. > > It looks like get_ctor_element_at_index has numerous formatting > problems. In particular you didn't indent the braces across the board > properly. Also check for tabs vs spaces issues please. Yes, you are quite right, I'm not quite sure how those crept in. (Well, the 'explicitely' I am sure was a copy-paste error but the others!). Apologies... I already committed the offending code as r229605, but here's a patch to clean it up - does it look ok now? Thanks, Alan --- gcc/fold-const.c | 58 ++++++++++++++++++++++++++++---------------------------- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/gcc/fold-const.c b/gcc/fold-const.c index ee9b349..e977b49 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -11855,16 +11855,16 @@ get_array_ctor_element_at_index (tree ctor, offset_int access_index) offset_int low_bound = 0; if (TREE_CODE (TREE_TYPE (ctor)) == ARRAY_TYPE) - { - tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); - if (domain_type && TYPE_MIN_VALUE (domain_type)) { - /* Static constructors for variably sized objects makes no sense. */ - gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); - index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); - low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); + tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); + if (domain_type && TYPE_MIN_VALUE (domain_type)) + { + /* Static constructors for variably sized objects makes no sense. 
*/ + gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); + index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); + low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); + } } - } if (index_type) access_index = wi::ext (access_index, TYPE_PRECISION (index_type), @@ -11880,29 +11880,29 @@ get_array_ctor_element_at_index (tree ctor, offset_int access_index) tree cfield, cval; FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) - { - /* Array constructor might explicitely set index, or specify range - * or leave index NULL meaning that it is next index after previous - * one. */ - if (cfield) { - if (TREE_CODE (cfield) == INTEGER_CST) - max_index = index = wi::to_offset (cfield); + /* Array constructor might explicitly set index, or specify a range, + or leave index NULL meaning that it is next index after previous + one. */ + if (cfield) + { + if (TREE_CODE (cfield) == INTEGER_CST) + max_index = index = wi::to_offset (cfield); + else + { + gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); + index = wi::to_offset (TREE_OPERAND (cfield, 0)); + max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); + } + } else - { - gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); - index = wi::to_offset (TREE_OPERAND (cfield, 0)); - max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); - } - } - else - { - index += 1; - if (index_type) - index = wi::ext (index, TYPE_PRECISION (index_type), - TYPE_SIGN (index_type)); - max_index = index; - } + { + index += 1; + if (index_type) + index = wi::ext (index, TYPE_PRECISION (index_type), + TYPE_SIGN (index_type)); + max_index = index; + } /* Do we have match? */ if (wi::cmpu (access_index, index) >= 0 -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold. 2015-11-04 14:35 ` Alan Lawrence @ 2015-11-06 21:19 ` Jeff Law 0 siblings, 0 replies; 26+ messages in thread From: Jeff Law @ 2015-11-06 21:19 UTC (permalink / raw) To: Alan Lawrence; +Cc: gcc-patches On 11/04/2015 07:35 AM, Alan Lawrence wrote: >> s/explicitely/explicitly/ And remove the '*' from the 2nd and 3rd lines >> of the comment. >> >> It looks like get_ctor_element_at_index has numerous formatting >> problems. In particular you didn't indent the braces across the board >> properly. Also check for tabs vs spaces issues please. > > Yes, you are quite right, I'm not quite sure how those crept in. (Well, the > 'explicitely' I am sure was a copy-paste error but the others!). Apologies... Yea, the explicitely was in the original. I'd actually gone back to the original thinking the bogus formatting might have been in there too. But it looked reasonable. > > I already committed the offending code as r229605, but here's a patch to clean > it up - does it look ok now? Yes. jeff ^ permalink raw reply [flat|nested] 26+ messages in thread
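[Editorial sketch] Patch 6/6, which follows, introduces subst_constant_pool_initial: it walks a chain of references over a constant-pool variable, substitutes the DECL_INITIAL, and folds each step, failing (and disqualifying the candidate) as soon as a step does not fold to a constant. A toy model of that fold-or-fail recursion, with hypothetical simplified types in place of GCC's trees:

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy model of a constant initializer: either a scalar constant leaf,
// or a CONSTRUCTOR-like node with sub-elements.
struct cst_tree
{
  std::optional<int> leaf;     // set if this node is a scalar constant
  std::vector<cst_tree> elts;  // non-empty if this node is an aggregate
};

// Walk a chain of array-style references (outermost first) through the
// initializer, folding each step; return nullopt as soon as a step
// cannot be resolved, mirroring how the patch disqualifies candidates
// it cannot constant-propagate out of the pool.
std::optional<int>
fold_refs (const cst_tree &init, const std::vector<size_t> &ref_chain)
{
  const cst_tree *node = &init;
  for (size_t i : ref_chain)
    {
      if (i >= node->elts.size ()) // cannot fold this step
        return std::nullopt;
      node = &node->elts[i];
    }
  return node->leaf;             // nullopt if not a scalar constant
}
```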
* [PATCH 6/6] Make SRA replace constant-pool loads 2015-10-29 19:22 [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads Alan Lawrence ` (4 preceding siblings ...) 2015-10-29 19:22 ` [PATCH 3/6] Share code from fold_array_ctor_reference with fold Alan Lawrence @ 2015-10-29 19:32 ` Alan Lawrence 2015-10-31 23:48 ` Bernhard Reutner-Fischer 2015-11-03 14:01 ` Richard Biener 5 siblings, 2 replies; 26+ messages in thread From: Alan Lawrence @ 2015-10-29 19:32 UTC (permalink / raw) To: gcc-patches This has changed quite a bit since the previous revision (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01484.html), mostly due to Ada and specifically Ada on ARM. I didn't find a good alternative to scanning for constant-pool accesses "as we go" through the function, and although I didn't find any of these accesses being disqualified after we'd found them in C, some awkward constant-pool expressions in Ada forced me to disqualify them for a new reason (inability to constant-propagate). Hence, I introduced a new bitmap of disqualified_constants, rather than an additional pass through the function. Next, the approach of using a replacement_expr (with the constant value) in place of the old replacement_decl also failed (the decl could be used in an expression where substituting in the expr produced ill-formed gimple). Hence, constant-pool entries now use replacement_decls just like everything else, for which I generate initializers in initialize_constant_pool_replacements. However, I was able to avoid generating the replacement constants early, and instead do so late in initialize_constant_pool_replacements; the main trick here was to use fold_array_ctor_reference, kindly pointed out by Richie :), to fold the MEM_REFs built in analyze_access_subtree. Finally, I found completely_scalarize was still failing to fold all the constant expressions, because Ada was putting VIEW_CONVERT_EXPRs in the constant pool, and fold-const.c did not deal with ARRAY_REFs of these.
However, requiring completely_scalarize to be able to fold everything seemed fragile; instead I decoupled it from fold-const by allowing completely_scalarize to fail. With these changes, ssa-dom-cse-2.c is fixed on all platforms with appropriate --param. There are still a few remaining test failures on AArch64: gcc.dg/guality/pr54970.c -O1 line 15 a[0] == 1 gcc.dg/guality/pr54970.c -O1 line 20 a[0] == 1 gcc.dg/guality/pr54970.c -O1 line 25 a[0] == 1 (also at -O2, -O2 -flto with/without plugin, -O3 -g and -Os). ...which I'm working on. I also tested with the hack at https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01483.html . This revealed one new failure in advsimd-intrinsics/vldX.c on AArch64 (fixed by https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02777.html), and fixed a bunch of guality failures on ARM: gcc.dg/guality/pr54970.c -O1 line 31 a[0] == 4 gcc.dg/guality/pr54970.c -O1 line 36 a[0] == 4 gcc.dg/guality/pr54970.c -O1 line 45 a[0] == 4 gcc.dg/guality/pr54970.c -O1 line 45 p[-2] == 4 gcc.dg/guality/pr54970.c -O1 line 45 q[-1] == 4 (also at -O2, -O2 -flto with/without plugin, -O3 -g and -Os) --Alan gcc/ChangeLog: * tree-sra.c (disqualified_constants): New. (sra_initialize): Initialize it. (sra_deinitialize): Deallocate it. (disqualify_candidate): Set bit in disqualified_constants. (subst_constant_pool_initial): New. (create_access): Scan for constant-pool entries as we go. (scalarizable_type_p): Disallow types containing placeholders. (completely_scalarize): Return bool to allow failure. (scalarize_elem): Likewise; check we can generate constant replacements. (maybe_add_sra_candidate): Allow constant-pool entries. (analyze_access_subtree): Checking-assert that we can fold any refs built for constant-pool entries. (analyze_all_variable_accesses): Deal with completely_scalarize failing. (initialize_constant_pool_replacements): New. (sra_modify_function_body): Call previous. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Add sra-max-scalarization-size param, remove xfail.
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c | 7 +- gcc/tree-sra.c | 221 ++++++++++++++++++++++++-- 2 files changed, 208 insertions(+), 20 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c index 9eccdc9..b13d583 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */ +/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized --param sra-max-scalarization-size-Ospeed=32" } */ int foo () @@ -17,7 +17,4 @@ foo () /* After late unrolling the above loop completely DOM should be able to optimize this to return 28. */ -/* See PR63679 and PR64159, if the target forces the initializer to memory then - DOM is not able to perform this optimization. */ - -/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail aarch64*-*-* alpha*-*-* hppa*-*-* powerpc*-*-* sparc*-*-* s390*-*-* } } } */ +/* { dg-final { scan-tree-dump "return 28;" "optimized" } } */ diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 358db79..1920265 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -90,6 +90,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-iterator.h" #include "gimplify-me.h" #include "gimple-walk.h" +#include "gimple-fold.h" #include "tree-cfg.h" #include "flags.h" #include "insn-config.h" @@ -336,6 +337,10 @@ candidate (unsigned uid) those which cannot be (because they are and need be used as a whole). */ static bitmap should_scalarize_away_bitmap, cannot_scalarize_away_bitmap; +/* Bitmap of candidates in the constant pool, which cannot be scalarized + because this would produce non-constant expressions (e.g. Ada). */ +static bitmap disqualified_constants; + /* Obstack for creation of fancy names. 
*/ static struct obstack name_obstack; @@ -658,6 +663,7 @@ sra_initialize (void) (vec_safe_length (cfun->local_decls) / 2); should_scalarize_away_bitmap = BITMAP_ALLOC (NULL); cannot_scalarize_away_bitmap = BITMAP_ALLOC (NULL); + disqualified_constants = BITMAP_ALLOC (NULL); gcc_obstack_init (&name_obstack); base_access_vec = new hash_map<tree, auto_vec<access_p> >; memset (&sra_stats, 0, sizeof (sra_stats)); @@ -676,6 +682,7 @@ sra_deinitialize (void) candidates = NULL; BITMAP_FREE (should_scalarize_away_bitmap); BITMAP_FREE (cannot_scalarize_away_bitmap); + BITMAP_FREE (disqualified_constants); access_pool.release (); assign_link_pool.release (); obstack_free (&name_obstack, NULL); @@ -690,6 +697,8 @@ disqualify_candidate (tree decl, const char *reason) { if (bitmap_clear_bit (candidate_bitmap, DECL_UID (decl))) candidates->remove_elt_with_hash (decl, DECL_UID (decl)); + if (TREE_CODE (decl) == VAR_DECL && DECL_IN_CONSTANT_POOL (decl)) + bitmap_set_bit (disqualified_constants, DECL_UID (decl)); if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -841,8 +850,76 @@ create_access_1 (tree base, HOST_WIDE_INT offset, HOST_WIDE_INT size) return access; } +/* Given EXPR a chain of COMPONENT_REFs and/or ARRAY_REFs built around a + constant-pool declaration VAR, builds a tree like EXPR but with VAR replaced + by its DECL_INITIAL, and tries to fold it to a constant. If folding succeeds + then return the folded tree, else NULL_TREE. 
*/ + +static tree +subst_constant_pool_initial (tree expr, tree var) +{ + if (TREE_CODE (expr) == VAR_DECL) + { + gcc_assert (DECL_IN_CONSTANT_POOL (expr)); + gcc_assert (expr == var); + return DECL_INITIAL (expr); + } + if (TREE_CODE (expr) == COMPONENT_REF) + { + tree field = TREE_OPERAND (expr, 1); + gcc_assert (TREE_CODE (field) == FIELD_DECL); + gcc_assert (TREE_OPERAND (expr, 2) == NULL_TREE); + tree record = subst_constant_pool_initial (TREE_OPERAND (expr, 0), var); + if (!record) + return NULL_TREE; + tree ref = build3 (COMPONENT_REF, TREE_TYPE (expr), + record, field, NULL_TREE); + tree res = fold (ref); + if (res != ref) + return res; + } + else if (TREE_CODE (expr) == ARRAY_REF) + { + tree array = subst_constant_pool_initial (TREE_OPERAND (expr, 0), var); + if (!array) + return NULL_TREE; + tree ref = build4 (ARRAY_REF, TREE_TYPE (expr), + array, + TREE_OPERAND (expr, 1), + TREE_OPERAND (expr, 2), + TREE_OPERAND (expr, 3)); + /* Ada (at least) builds ARRAY_REFs around strings. */ + if (tree folded = fold_read_from_constant_string (ref)) + return folded; + tree res = fold (ref); + if (res != ref) + return res; + } + else if (TREE_CODE (expr) == MEM_REF) + { + gcc_assert (TREE_CODE (TREE_OPERAND (expr, 0)) == ADDR_EXPR); + tree type = TREE_TYPE (expr); + tree sz = TYPE_SIZE (TREE_TYPE (expr)); + tree val = TREE_OPERAND (TREE_OPERAND (expr, 0), 0); + val = subst_constant_pool_initial (val, var); + if (TREE_CODE (val) != CONSTRUCTOR + || !tree_fits_uhwi_p (TREE_OPERAND (expr, 1)) + || !tree_fits_uhwi_p (sz)) + return NULL_TREE; + unsigned HOST_WIDE_INT offset = + tree_to_uhwi (TREE_OPERAND (expr, 1)) * BITS_PER_UNIT; + return fold_ctor_reference (type, val, offset, tree_to_uhwi (sz), var); + } + + /* Unknown expression, or could not constant-fold. */ + return NULL_TREE; +} + +static bool maybe_add_sra_candidate (tree var); + /* Create and insert access for EXPR. Return created access, or NULL if it is - not possible. */ + not possible. 
Also scan for uses of constant pool as we go along and add + to candidates. */ static struct access * create_access (tree expr, gimple *stmt, bool write) @@ -865,6 +942,32 @@ create_access (tree expr, gimple *stmt, bool write) else ptr = false; + /* For constant-pool entries, check we can substitute the constant value. */ + if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base) + && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA)) + { + gcc_assert (!write); + if (bitmap_bit_p (disqualified_constants, DECL_UID (base))) + return NULL; + tree repl = subst_constant_pool_initial (expr, base); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Trying to scalarize constant-pool "); + print_generic_expr (dump_file, base, 0); + fprintf (dump_file, "\n In expression: "); + print_generic_expr (dump_file, expr, 0); + fprintf (dump_file, "\n Constant replacement: "); + print_generic_expr (dump_file, repl, 0); + fprintf (dump_file, "\n"); + } + /* If we can't resolve the expression to a constant, we risk creating + a malformed expression (e.g. Ada testsuite, ACATS chapter C3). */ + if (!repl || !TREE_CONSTANT (repl)) + disqualify_candidate (base, "Cannot constant-propagate out of pool."); + else + maybe_add_sra_candidate (base); + } + if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) return NULL; @@ -923,6 +1026,8 @@ static bool scalarizable_type_p (tree type) { gcc_assert (!is_gimple_reg_type (type)); + if (type_contains_placeholder_p (type)) + return false; switch (TREE_CODE (type)) { @@ -970,15 +1075,19 @@ scalarizable_type_p (tree type) } } -static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree); +static bool scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree); /* Create total_scalarization accesses for all scalar fields of a member of type DECL_TYPE conforming to scalarizable_type_p. 
BASE must be the top-most VAR_DECL representing the variable; within that, OFFSET locates the member and REF must be the memory reference expression for - the member. */ + the member. -static void + Return TRUE if successful; may fail in the case of constant-pool entry BASE + whose DECL_INITIAL contains values to which we cannot fold ARRAY_REF or + COMPONENT_REF expressions. */ + +static bool completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) { switch (TREE_CODE (decl_type)) @@ -991,10 +1100,11 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) tree ft = TREE_TYPE (fld); tree nref = build3 (COMPONENT_REF, ft, ref, fld, NULL_TREE); - scalarize_elem (base, pos, tree_to_uhwi (DECL_SIZE (fld)), nref, - ft); + if (!scalarize_elem (base, pos, tree_to_uhwi (DECL_SIZE (fld)), + nref, ft)) + return false; } - break; + return true; case ARRAY_TYPE: { tree elemtype = TREE_TYPE (decl_type); @@ -1026,12 +1136,13 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) ref, size_int (idx + min), NULL_TREE, NULL_TREE); int el_off = offset + idx * el_size; - scalarize_elem (base, el_off, el_size, nref, elemtype); + if (!scalarize_elem (base, el_off, el_size, nref, elemtype)) + return false; } while (++idx <= lenlessone); } } - break; + return true; default: gcc_unreachable (); } @@ -1040,22 +1151,32 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) /* Create total_scalarization accesses for a member of type TYPE, which must satisfy either is_gimple_reg_type or scalarizable_type_p. BASE must be the top-most VAR_DECL representing the variable; within that, POS and SIZE locate - the member and REF must be the reference expression for it. */ + the member and REF must be the reference expression for it. 
-static void + Return TRUE if successful; may fail in the case of constant-pool entry BASE + whose DECL_INITIAL contains values to which we cannot fold ARRAY_REF or + COMPONENT_REF expressions. */ + +static bool scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size, tree ref, tree type) { if (is_gimple_reg_type (type)) { + if (TREE_CODE (base) == VAR_DECL + && DECL_IN_CONSTANT_POOL (base) + && subst_constant_pool_initial (ref, base) == NULL_TREE) + return false; struct access *access = create_access_1 (base, pos, size); access->expr = ref; access->type = type; access->grp_total_scalarization = 1; /* Accesses for intraprocedural SRA can have their stmt NULL. */ + + return true; } - else - completely_scalarize (base, type, pos, ref); + + return completely_scalarize (base, type, pos, ref); } /* Create a total_scalarization access for VAR as a whole. VAR must be of a @@ -1843,7 +1964,12 @@ maybe_add_sra_candidate (tree var) reject (var, "not aggregate"); return false; } - if (needs_to_live_in_memory (var)) + /* Allow constant-pool entries (that "need to live in memory") + unless we are doing IPA SRA. 
*/ + if (needs_to_live_in_memory (var) + && (sra_mode == SRA_MODE_EARLY_IPA + || TREE_CODE (var) != VAR_DECL + || !DECL_IN_CONSTANT_POOL (var))) { reject (var, "needs to live in memory"); return false; @@ -2343,6 +2469,10 @@ analyze_access_subtree (struct access *root, struct access *parent, (unsigned) root->offset, (unsigned) root->size); fprintf (dump_file, " to an integer.\n"); } + gcc_checking_assert (TREE_CODE (root->base) != VAR_DECL + || !DECL_IN_CONSTANT_POOL (root->base) + || subst_constant_pool_initial (root->expr, + root->base)); } root->grp_to_be_replaced = 1; @@ -2604,8 +2734,17 @@ analyze_all_variable_accesses (void) if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var))) <= max_scalarization_size) { + vec<access_p> &accesses = base_access_vec->get_or_insert (var); + unsigned orig_count = accesses.length (); + if (!completely_scalarize (var, TREE_TYPE (var), 0, var)) + { + /* Failed. Discard any accesses created during attempt. */ + for (unsigned i = orig_count; i < accesses.length (); i++) + access_pool.remove (accesses[i]); + accesses.truncate (orig_count); + continue; + } create_total_scalarization_access (var); - completely_scalarize (var, TREE_TYPE (var), 0, var); if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, "Will attempt to totally scalarize "); @@ -3480,6 +3619,56 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) } } +static void +initialize_constant_pool_replacements (void) +{ + gimple_seq seq = NULL; + gimple_stmt_iterator gsi = gsi_start (seq); + bitmap_iterator bi; + unsigned i; + + EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) + if (bitmap_bit_p (should_scalarize_away_bitmap, i) + && !bitmap_bit_p (cannot_scalarize_away_bitmap, i)) + { + tree var = candidate (i); + if (TREE_CODE (var) != VAR_DECL || !DECL_IN_CONSTANT_POOL (var)) + continue; + vec<access_p> *access_vec = get_base_access_vector (var); + if (!access_vec) + continue; + for (unsigned i = 0; i < access_vec->length (); i++) + { + struct access 
*access = (*access_vec)[i]; + tree val = subst_constant_pool_initial (access->expr, access->base); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Expression "); + print_generic_expr (dump_file, access->expr, 0); + fprintf (dump_file, " replaced with "); + print_generic_expr (dump_file, access->replacement_decl, 0); + fprintf (dump_file, " initialized to "); + print_generic_expr (dump_file, val, 0); + fprintf (dump_file, "\n"); + if (!access->replacement_decl) + dump_access (dump_file, access, true); + } + if (!access->replacement_decl) + continue; + gcc_assert (val); + gassign *stmt = gimple_build_assign ( + get_access_replacement (access), val); + gsi_insert_after (&gsi, stmt, GSI_NEW_STMT); + update_stmt (stmt); + } + } + + seq = gsi_seq (gsi); + if (seq) + gsi_insert_seq_on_edge_immediate ( + single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)), seq); +} + /* Traverse the function body and all modifications as decided in analyze_all_variable_accesses. Return true iff the CFG has been changed. */ @@ -3490,6 +3679,8 @@ sra_modify_function_body (void) bool cfg_changed = false; basic_block bb; + initialize_constant_pool_replacements (); + FOR_EACH_BB_FN (bb, cfun) { gimple_stmt_iterator gsi = gsi_start_bb (bb); -- 1.9.1 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 6/6] Make SRA replace constant-pool loads 2015-10-29 19:32 ` [PATCH 6/6] Make SRA replace constant-pool loads Alan Lawrence @ 2015-10-31 23:48 ` Bernhard Reutner-Fischer 2015-11-03 14:01 ` Richard Biener 1 sibling, 0 replies; 26+ messages in thread From: Bernhard Reutner-Fischer @ 2015-10-31 23:48 UTC (permalink / raw) To: Alan Lawrence, gcc-patches On October 29, 2015 8:18:22 PM GMT+01:00, Alan Lawrence <alan.lawrence@arm.com> wrote: >+ >+static tree >+subst_constant_pool_initial (tree expr, tree var) >+{ >+ if (TREE_CODE (expr) == VAR_DECL) Just a nit, but I thought we had VAR_DECL_P or VAR_P for the TREE_CODE (NODE) == VAR_DECL predicate? Thanks, ^ permalink raw reply [flat|nested] 26+ messages in thread

* Re: [PATCH 6/6] Make SRA replace constant-pool loads 2015-10-29 19:32 ` [PATCH 6/6] Make SRA replace constant-pool loads Alan Lawrence 2015-10-31 23:48 ` Bernhard Reutner-Fischer @ 2015-11-03 14:01 ` Richard Biener 2015-11-05 18:55 ` Alan Lawrence 1 sibling, 1 reply; 26+ messages in thread From: Richard Biener @ 2015-11-03 14:01 UTC (permalink / raw) To: Alan Lawrence; +Cc: GCC Patches On Thu, Oct 29, 2015 at 8:18 PM, Alan Lawrence <alan.lawrence@arm.com> wrote: > This has changed quite a bit since the previous revision > (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01484.html), mostly due to Ada > and specifically Ada on ARM. > > I didn't find a good alternative to scanning for constant-pool accesses "as we > go" through the function, and although I didn't find any of these accesses > being disqualified after we'd found them in C, some awkward constant-pool > expressions in Ada forced me to disqualify them for a new reason (inability to > constant-propagate). Hence, I introduced a new bitmap of > disqualified_constants, rather than an additional pass through the function. > > Next, the approach of using a replacement_expr (with the constant value) in > place of the old replacement_decl also failed (the decl could be used in an > expression where substituting in the expr produced ill-formed gimple). Hence, > constant-pool entries now use replacement_decls just like everything else, for > which I generate initializers in initialize_constant_pool_replacements. > > However, I was able to avoid generating the replacement constants early, and to > do so late in initialize_constant_pool_replacements; the main trick here was to > use fold_array_ctor_reference, kindly pointed out by Richie :), to fold the > MEM_REFs built in analyze_access_subtree. > > Finally, I found completely_scalarize was still failing to fold all the > constant expressions, because Ada was putting VIEW_CONVERT_EXPRs in the > constant pool, and fold-const.c did not deal with ARRAY_REFs of these.
However, > requiring completely_scalarize to be able to fold everything, seemed fragile, > instead I decoupled from fold-const by allowing to fail. > > With these changes, ssa-dom-cse-2.c is fixed on all platforms with appropriate > --param. Hum. I still wonder why we need all this complication ... I would expect that if we simply decided to completely scalarize a constant pool aggregate load then we can always do that. SRA should then simply emit the _same_ IL as it does for other complete scalarization element accesses. The only difference is that the result should be simplifiable to omit the element load from the constant pool. 1) The FRE pass following SRA should do that for you. 2) You should be able to use fold_ctor_reference directly (in place of all your code in case offset and size are readily available - don't remember exactly how complete scalarization "walks" elements). Alternatively use fold_const_aggregate_ref. 3) You can simplify the stmt SRA generated by simply calling fold_stmt on it, that will do a bit more (wasted) work compared to 2) but may be easier. I wouldn't bother with the case where we for some reason do not simplify the constant pool load. That is, I'd like to see the patch greatly simplified to just consider constant pool complete scalarization as if it were a regular variable. Richard. > > There are still a few remaining test failures, on AArch64: > gcc.dg/guality/pr54970.c -O1 line 15 a[0] == 1 > gcc.dg/guality/pr54970.c -O1 line 20 a[0] == 1 > gcc.dg/guality/pr54970.c -O1 line 25 a[0] == 1 > (also at -O2, -O2 -flto with/without plugin, -O3 -g and -Os). > ...which I'm working on. I also tested with the hack at https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01483.html . 
This revealed one new failure in advsimd-intrinsics/vldX.c on AArch64 (fixed by https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02777.html), and fixed a bunch of guality failures on ARM: > gcc.dg/guality/pr54970.c -O1 line 31 a[0] == 4 > gcc.dg/guality/pr54970.c -O1 line 36 a[0] == 4 > gcc.dg/guality/pr54970.c -O1 line 45 a[0] == 4 > gcc.dg/guality/pr54970.c -O1 line 45 p[-2] == 4 > gcc.dg/guality/pr54970.c -O1 line 45 q[-1] == 4 > (also at -O2, -O2 -flto with/without plugin, -O3 -g and -Os) > > --Alan > > gcc/ChangeLog: > > * tree-sra.c (disqualified_constants): New. > (sra_initialize): Initialize it. > (sra_deinitialize): Deallocate it. > (disqualify_candidate): Set bit in disqualified_constants. > (subst_constant_pool_initial): New. > (create_access): Scan for constant-pool entries as we go. > (scalarizable_type_p): Disallow types containing placeholders. > (completely_scalarize): Return bool to allow failure. > (scalarize_elem): Likewise; check we can generate constant replacements. > (maybe_add_sra_candidate): Allow constant-pool entries. > (analyze_access_subtree): Checking-assert that we can fold any refs > built for constant-pool entries. > (analyze_all_variable_accesses): Deal with completely_scalarize failing. > (initialize_constant_pool_replacements): New. > (sra_modify_function_body): Call previous. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Add sra-max-scalarization-size param, > remove xfail. 
> --- > gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c | 7 +- > gcc/tree-sra.c | 221 ++++++++++++++++++++++++-- > 2 files changed, 208 insertions(+), 20 deletions(-) > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c > index 9eccdc9..b13d583 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */ > +/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized --param sra-max-scalarization-size-Ospeed=32" } */ > > int > foo () > @@ -17,7 +17,4 @@ foo () > /* After late unrolling the above loop completely DOM should be > able to optimize this to return 28. */ > > -/* See PR63679 and PR64159, if the target forces the initializer to memory then > - DOM is not able to perform this optimization. */ > - > -/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail aarch64*-*-* alpha*-*-* hppa*-*-* powerpc*-*-* sparc*-*-* s390*-*-* } } } */ > +/* { dg-final { scan-tree-dump "return 28;" "optimized" } } */ > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 358db79..1920265 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -90,6 +90,7 @@ along with GCC; see the file COPYING3. If not see > #include "gimple-iterator.h" > #include "gimplify-me.h" > #include "gimple-walk.h" > +#include "gimple-fold.h" > #include "tree-cfg.h" > #include "flags.h" > #include "insn-config.h" > @@ -336,6 +337,10 @@ candidate (unsigned uid) > those which cannot be (because they are and need be used as a whole). */ > static bitmap should_scalarize_away_bitmap, cannot_scalarize_away_bitmap; > > +/* Bitmap of candidates in the constant pool, which cannot be scalarized > + because this would produce non-constant expressions (e.g. Ada). */ > +static bitmap disqualified_constants; > + > /* Obstack for creation of fancy names. 
*/ > static struct obstack name_obstack; > > @@ -658,6 +663,7 @@ sra_initialize (void) > (vec_safe_length (cfun->local_decls) / 2); > should_scalarize_away_bitmap = BITMAP_ALLOC (NULL); > cannot_scalarize_away_bitmap = BITMAP_ALLOC (NULL); > + disqualified_constants = BITMAP_ALLOC (NULL); > gcc_obstack_init (&name_obstack); > base_access_vec = new hash_map<tree, auto_vec<access_p> >; > memset (&sra_stats, 0, sizeof (sra_stats)); > @@ -676,6 +682,7 @@ sra_deinitialize (void) > candidates = NULL; > BITMAP_FREE (should_scalarize_away_bitmap); > BITMAP_FREE (cannot_scalarize_away_bitmap); > + BITMAP_FREE (disqualified_constants); > access_pool.release (); > assign_link_pool.release (); > obstack_free (&name_obstack, NULL); > @@ -690,6 +697,8 @@ disqualify_candidate (tree decl, const char *reason) > { > if (bitmap_clear_bit (candidate_bitmap, DECL_UID (decl))) > candidates->remove_elt_with_hash (decl, DECL_UID (decl)); > + if (TREE_CODE (decl) == VAR_DECL && DECL_IN_CONSTANT_POOL (decl)) > + bitmap_set_bit (disqualified_constants, DECL_UID (decl)); > > if (dump_file && (dump_flags & TDF_DETAILS)) > { > @@ -841,8 +850,76 @@ create_access_1 (tree base, HOST_WIDE_INT offset, HOST_WIDE_INT size) > return access; > } > > +/* Given EXPR a chain of COMPONENT_REFs and/or ARRAY_REFs built around a > + constant-pool declaration VAR, builds a tree like EXPR but with VAR replaced > + by its DECL_INITIAL, and tries to fold it to a constant. If folding succeeds > + then return the folded tree, else NULL_TREE. 
*/ > + > +static tree > +subst_constant_pool_initial (tree expr, tree var) > +{ > + if (TREE_CODE (expr) == VAR_DECL) > + { > + gcc_assert (DECL_IN_CONSTANT_POOL (expr)); > + gcc_assert (expr == var); > + return DECL_INITIAL (expr); > + } > + if (TREE_CODE (expr) == COMPONENT_REF) > + { > + tree field = TREE_OPERAND (expr, 1); > + gcc_assert (TREE_CODE (field) == FIELD_DECL); > + gcc_assert (TREE_OPERAND (expr, 2) == NULL_TREE); > + tree record = subst_constant_pool_initial (TREE_OPERAND (expr, 0), var); > + if (!record) > + return NULL_TREE; > + tree ref = build3 (COMPONENT_REF, TREE_TYPE (expr), > + record, field, NULL_TREE); > + tree res = fold (ref); > + if (res != ref) > + return res; > + } > + else if (TREE_CODE (expr) == ARRAY_REF) > + { > + tree array = subst_constant_pool_initial (TREE_OPERAND (expr, 0), var); > + if (!array) > + return NULL_TREE; > + tree ref = build4 (ARRAY_REF, TREE_TYPE (expr), > + array, > + TREE_OPERAND (expr, 1), > + TREE_OPERAND (expr, 2), > + TREE_OPERAND (expr, 3)); > + /* Ada (at least) builds ARRAY_REFs around strings. */ > + if (tree folded = fold_read_from_constant_string (ref)) > + return folded; > + tree res = fold (ref); > + if (res != ref) > + return res; > + } > + else if (TREE_CODE (expr) == MEM_REF) > + { > + gcc_assert (TREE_CODE (TREE_OPERAND (expr, 0)) == ADDR_EXPR); > + tree type = TREE_TYPE (expr); > + tree sz = TYPE_SIZE (TREE_TYPE (expr)); > + tree val = TREE_OPERAND (TREE_OPERAND (expr, 0), 0); > + val = subst_constant_pool_initial (val, var); > + if (TREE_CODE (val) != CONSTRUCTOR > + || !tree_fits_uhwi_p (TREE_OPERAND (expr, 1)) > + || !tree_fits_uhwi_p (sz)) > + return NULL_TREE; > + unsigned HOST_WIDE_INT offset = > + tree_to_uhwi (TREE_OPERAND (expr, 1)) * BITS_PER_UNIT; > + return fold_ctor_reference (type, val, offset, tree_to_uhwi (sz), var); > + } > + > + /* Unknown expression, or could not constant-fold. 
*/ > + return NULL_TREE; > +} > + > +static bool maybe_add_sra_candidate (tree var); > + > /* Create and insert access for EXPR. Return created access, or NULL if it is > - not possible. */ > + not possible. Also scan for uses of constant pool as we go along and add > + to candidates. */ > > static struct access * > create_access (tree expr, gimple *stmt, bool write) > @@ -865,6 +942,32 @@ create_access (tree expr, gimple *stmt, bool write) > else > ptr = false; > > + /* For constant-pool entries, check we can substitute the constant value. */ > + if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base) > + && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA)) > + { > + gcc_assert (!write); > + if (bitmap_bit_p (disqualified_constants, DECL_UID (base))) > + return NULL; > + tree repl = subst_constant_pool_initial (expr, base); > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + fprintf (dump_file, "Trying to scalarize constant-pool "); > + print_generic_expr (dump_file, base, 0); > + fprintf (dump_file, "\n In expression: "); > + print_generic_expr (dump_file, expr, 0); > + fprintf (dump_file, "\n Constant replacement: "); > + print_generic_expr (dump_file, repl, 0); > + fprintf (dump_file, "\n"); > + } > + /* If we can't resolve the expression to a constant, we risk creating > + a malformed expression (e.g. Ada testsuite, ACATS chapter C3). 
*/ > + if (!repl || !TREE_CONSTANT (repl)) > + disqualify_candidate (base, "Cannot constant-propagate out of pool."); > + else > + maybe_add_sra_candidate (base); > + } > + > if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) > return NULL; > > @@ -923,6 +1026,8 @@ static bool > scalarizable_type_p (tree type) > { > gcc_assert (!is_gimple_reg_type (type)); > + if (type_contains_placeholder_p (type)) > + return false; > > switch (TREE_CODE (type)) > { > @@ -970,15 +1075,19 @@ scalarizable_type_p (tree type) > } > } > > -static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree); > +static bool scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree); > > /* Create total_scalarization accesses for all scalar fields of a member > of type DECL_TYPE conforming to scalarizable_type_p. BASE > must be the top-most VAR_DECL representing the variable; within that, > OFFSET locates the member and REF must be the memory reference expression for > - the member. */ > + the member. > > -static void > + Return TRUE if successful; may fail in the case of constant-pool entry BASE > + whose DECL_INITIAL contains values to which we cannot fold ARRAY_REF or > + COMPONENT_REF expressions. 
*/ > + > +static bool > completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) > { > switch (TREE_CODE (decl_type)) > @@ -991,10 +1100,11 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) > tree ft = TREE_TYPE (fld); > tree nref = build3 (COMPONENT_REF, ft, ref, fld, NULL_TREE); > > - scalarize_elem (base, pos, tree_to_uhwi (DECL_SIZE (fld)), nref, > - ft); > + if (!scalarize_elem (base, pos, tree_to_uhwi (DECL_SIZE (fld)), > + nref, ft)) > + return false; > } > - break; > + return true; > case ARRAY_TYPE: > { > tree elemtype = TREE_TYPE (decl_type); > @@ -1026,12 +1136,13 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) > ref, size_int (idx + min), > NULL_TREE, NULL_TREE); > int el_off = offset + idx * el_size; > - scalarize_elem (base, el_off, el_size, nref, elemtype); > + if (!scalarize_elem (base, el_off, el_size, nref, elemtype)) > + return false; > } > while (++idx <= lenlessone); > } > } > - break; > + return true; > default: > gcc_unreachable (); > } > @@ -1040,22 +1151,32 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) > /* Create total_scalarization accesses for a member of type TYPE, which must > satisfy either is_gimple_reg_type or scalarizable_type_p. BASE must be the > top-most VAR_DECL representing the variable; within that, POS and SIZE locate > - the member and REF must be the reference expression for it. */ > + the member and REF must be the reference expression for it. > > -static void > + Return TRUE if successful; may fail in the case of constant-pool entry BASE > + whose DECL_INITIAL contains values to which we cannot fold ARRAY_REF or > + COMPONENT_REF expressions. 
*/ > + > +static bool > scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size, > tree ref, tree type) > { > if (is_gimple_reg_type (type)) > { > + if (TREE_CODE (base) == VAR_DECL > + && DECL_IN_CONSTANT_POOL (base) > + && subst_constant_pool_initial (ref, base) == NULL_TREE) > + return false; > struct access *access = create_access_1 (base, pos, size); > access->expr = ref; > access->type = type; > access->grp_total_scalarization = 1; > /* Accesses for intraprocedural SRA can have their stmt NULL. */ > + > + return true; > } > - else > - completely_scalarize (base, type, pos, ref); > + > + return completely_scalarize (base, type, pos, ref); > } > > /* Create a total_scalarization access for VAR as a whole. VAR must be of a > @@ -1843,7 +1964,12 @@ maybe_add_sra_candidate (tree var) > reject (var, "not aggregate"); > return false; > } > - if (needs_to_live_in_memory (var)) > + /* Allow constant-pool entries (that "need to live in memory") > + unless we are doing IPA SRA. */ > + if (needs_to_live_in_memory (var) > + && (sra_mode == SRA_MODE_EARLY_IPA > + || TREE_CODE (var) != VAR_DECL > + || !DECL_IN_CONSTANT_POOL (var))) > { > reject (var, "needs to live in memory"); > return false; > @@ -2343,6 +2469,10 @@ analyze_access_subtree (struct access *root, struct access *parent, > (unsigned) root->offset, (unsigned) root->size); > fprintf (dump_file, " to an integer.\n"); > } > + gcc_checking_assert (TREE_CODE (root->base) != VAR_DECL > + || !DECL_IN_CONSTANT_POOL (root->base) > + || subst_constant_pool_initial (root->expr, > + root->base)); > } > > root->grp_to_be_replaced = 1; > @@ -2604,8 +2734,17 @@ analyze_all_variable_accesses (void) > if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var))) > <= max_scalarization_size) > { > + vec<access_p> &accesses = base_access_vec->get_or_insert (var); > + unsigned orig_count = accesses.length (); > + if (!completely_scalarize (var, TREE_TYPE (var), 0, var)) > + { > + /* Failed. Discard any accesses created during attempt. 
*/ > + for (unsigned i = orig_count; i < accesses.length (); i++) > + access_pool.remove (accesses[i]); > + accesses.truncate (orig_count); > + continue; > + } > create_total_scalarization_access (var); > - completely_scalarize (var, TREE_TYPE (var), 0, var); > if (dump_file && (dump_flags & TDF_DETAILS)) > { > fprintf (dump_file, "Will attempt to totally scalarize "); > @@ -3480,6 +3619,56 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) > } > } > > +static void > +initialize_constant_pool_replacements (void) > +{ > + gimple_seq seq = NULL; > + gimple_stmt_iterator gsi = gsi_start (seq); > + bitmap_iterator bi; > + unsigned i; > + > + EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) > + if (bitmap_bit_p (should_scalarize_away_bitmap, i) > + && !bitmap_bit_p (cannot_scalarize_away_bitmap, i)) > + { > + tree var = candidate (i); > + if (TREE_CODE (var) != VAR_DECL || !DECL_IN_CONSTANT_POOL (var)) > + continue; > + vec<access_p> *access_vec = get_base_access_vector (var); > + if (!access_vec) > + continue; > + for (unsigned i = 0; i < access_vec->length (); i++) > + { > + struct access *access = (*access_vec)[i]; > + tree val = subst_constant_pool_initial (access->expr, access->base); > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + fprintf (dump_file, "Expression "); > + print_generic_expr (dump_file, access->expr, 0); > + fprintf (dump_file, " replaced with "); > + print_generic_expr (dump_file, access->replacement_decl, 0); > + fprintf (dump_file, " initialized to "); > + print_generic_expr (dump_file, val, 0); > + fprintf (dump_file, "\n"); > + if (!access->replacement_decl) > + dump_access (dump_file, access, true); > + } > + if (!access->replacement_decl) > + continue; > + gcc_assert (val); > + gassign *stmt = gimple_build_assign ( > + get_access_replacement (access), val); > + gsi_insert_after (&gsi, stmt, GSI_NEW_STMT); > + update_stmt (stmt); > + } > + } > + > + seq = gsi_seq (gsi); > + if (seq) > + 
gsi_insert_seq_on_edge_immediate ( > + single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)), seq); > +} > + > /* Traverse the function body and all modifications as decided in > analyze_all_variable_accesses. Return true iff the CFG has been > changed. */ > @@ -3490,6 +3679,8 @@ sra_modify_function_body (void) > bool cfg_changed = false; > basic_block bb; > > + initialize_constant_pool_replacements (); > + > FOR_EACH_BB_FN (bb, cfun) > { > gimple_stmt_iterator gsi = gsi_start_bb (bb); > -- > 1.9.1 > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 6/6] Make SRA replace constant-pool loads
  2015-11-03 14:01       ` Richard Biener
@ 2015-11-05 18:55         ` Alan Lawrence
  2015-11-06 10:25           ` Eric Botcazou
  2015-11-06 16:29           ` Richard Biener
  0 siblings, 2 replies; 26+ messages in thread
From: Alan Lawrence @ 2015-11-05 18:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On 3 November 2015 at 14:01, Richard Biener <richard.guenther@gmail.com> wrote:
>
> Hum. I still wonder why we need all this complication ...

Well, certainly I'd love to make it simpler, and if the complication
is because I've gone about trying to deal with especially Ada in the
wrong way...

> I would expect that if we simply decided to completely scalarize a
> constant pool aggregate load then we can always do that. SRA should
> then simply emit the _same_ IL as it does for other complete
> scalarization element accesses. The only difference is that the
> result should be simplifiable to omit the element load from the
> constant pool.
>
> 1) The FRE pass following SRA should do that for you.
...
> That is, I'd like to see the patch greatly simplified to just
> consider constant pool complete scalarization as if it were a
> regular variable.

Hmm, can you clarify: do you mean I should *not* replace constant-pool
values with their DECL_INITIAL? The attempt to substitute in the
initial value is what leads to most of the problems. For example, in
gnat/opt31.adb, create_access finds this expression accessing *.LC0:

MEM[(interfaces__unsigned_8[(sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0:<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 >= <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 ? (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 : (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 + 4294967295] *)&*.LC0][1 ...]{lb: 1 sz: 1}

This is an ARRAY_RANGE_REF of a MEM_REF of an ADDR_EXPR of *.LC0. So
far I haven't extended subst_constant_pool_initial to handle
ARRAY_RANGE_REFs, as it can't even handle this MEM_REF:

MEM[(interfaces__unsigned_8[(sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0:<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 >= <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 ? (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 : (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 + 4294967295] *)&*.LC0]

because the type here has size:

MIN_EXPR <_GLOBAL.SZ2.ada_opt31 (<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0, <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0), 17179869176>

Inside the MEM_REF of the ADDR_EXPR is *.LC0, whose DECL_INITIAL is a
4-element array (fine). Sadly, while the MEM_REF is
type_contains_placeholder_p, the type of the outer ARRAY_RANGE_REF is
not...

One possibility is that this whole construct, ARRAY_RANGE_REF that it
is, should mark *.LC0 in cannot_scalarize_away_bitmap. However,
disqualified_constants is also necessary if I want a meaningful
assertion that we do not re-add constant-pool entries as candidates
after we've discovered them - or should I try to rely on there being
only one expression that accesses each constant-pool entry? (I don't
think that's guaranteed, as tree_output_constant_def does hashing; I
admit I haven't really tried to break that assumption.)

> 2) You should be able to use fold_ctor_reference directly (in place
> of all your code in case offset and size are readily available -
> don't remember exactly how complete scalarization "walks" elements).
> Alternatively use fold_const_aggregate_ref.

That should work for completely_scalarize, yes, i.e. if I can remove
the other route in create_access.

> 3) You can simplify the stmt SRA generated by simply calling
> fold_stmt on it, that will do a bit more (wasted) work compared to
> 2) but may be easier.
>
> I wouldn't bother with the case where we for some reason do not
> simplify the constant pool load.

Hmmm. As above, I'm not quite sure what you mean by "the constant pool
load" - if that means substituting the DECL_INITIAL in place of the
VAR_DECL that is DECL_IN_CONSTANT_POOL, then I didn't find a general
tree-substitution routine, hence writing subst_constant_pool_initial.
It might be possible to make that simpler/more generic (e.g. just call
copy_node, recurse on each operand, return the original if nothing
changed) and then fold at the end.

Thanks,
Alan

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH 6/6] Make SRA replace constant-pool loads
  2015-11-05 18:55         ` Alan Lawrence
@ 2015-11-06 10:25           ` Eric Botcazou
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Botcazou @ 2015-11-06 10:25 UTC (permalink / raw)
  To: Alan Lawrence; +Cc: gcc-patches, Richard Biener

> Hmm, can you clarify, do you mean I should *not* replace constant pool
> values with their DECL_INITIAL? The attempt to substitute in the
> initial value is what leads to most of the problems. For example, in
> gnat/opt31.adb, create_access finds this expression accessing *.LC0:
>
> MEM[(interfaces__unsigned_8[(sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0:<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 >= <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 ? (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 : (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 + 4294967295] *)&*.LC0][1 ...]{lb: 1 sz: 1}
>
> this is an ARRAY_RANGE_REF of a MEM_REF of an ADDR_EXPR of *.LC0. So
> far I haven't extended subst_constant_pool_initial to handle
> ARRAY_RANGE_REFs, as it can't even handle this MEM_REF:
>
> MEM[(interfaces__unsigned_8[(sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0:<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 >= <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 ? (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 : (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 + 4294967295] *)&*.LC0]
>
> because the type here has size:
>
> MIN_EXPR <_GLOBAL.SZ2.ada_opt31 (<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0, <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0), 17179869176>
>
> inside the MEM_REF of the ADDR_EXPR is *.LC0, whose DECL_INITIAL is a
> 4-element array (fine). Sadly while the MEM_REF
> type_contains_placeholder_p, the type of the outer ARRAY_RANGE_REF
> does not....

FWIW, you are allowed to punt on this kind of complex expression,
which appears only in Ada. New optimizations are sort of allowed to
work on the C family of languages first, and be extended (or not) to
the rest of the languages afterwards.

> One possibility is that this whole construct, ARRAY_RANGE_REF that it
> is, should mark *.LC0 in cannot_scalarize_away_bitmap.

ARRAY_RANGE_REF is only used in Ada, so you can do that for now
(unless this introduces regressions in the gnat.dg testsuite, but I
doubt it).

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH 6/6] Make SRA replace constant-pool loads
  2015-11-05 18:55         ` Alan Lawrence
  2015-11-06 10:25           ` Eric Botcazou
@ 2015-11-06 16:29           ` Richard Biener
  0 siblings, 0 replies; 26+ messages in thread
From: Richard Biener @ 2015-11-06 16:29 UTC (permalink / raw)
  To: Alan Lawrence; +Cc: GCC Patches

On November 5, 2015 7:55:12 PM GMT+01:00, Alan Lawrence <alan.lawrence@arm.com> wrote:
> On 3 November 2015 at 14:01, Richard Biener <richard.guenther@gmail.com> wrote:
>>
>> Hum. I still wonder why we need all this complication ...
>
> Well, certainly I'd love to make it simpler, and if the complication
> is because I've gone about trying to deal with especially Ada in the
> wrong way...
>
>> I would expect that if we simply decided to completely scalarize a
>> constant pool aggregate load then we can always do that. SRA should
>> then simply emit the _same_ IL as it does for other complete
>> scalarization element accesses. The only difference is that the
>> result should be simplifiable to omit the element load from the
>> constant pool.
>>
>> 1) The FRE pass following SRA should do that for you.
> ...
>> That is, I'd like to see the patch greatly simplified to just
>> consider constant pool complete scalarization as if it were a
>> regular variable.
>
> Hmm, can you clarify, do you mean I should *not* replace constant pool
> values with their DECL_INITIAL? The attempt to substitute in the
> initial value is what leads to most of the problems. For example, in
> gnat/opt31.adb, create_access finds this expression accessing *.LC0:
>
> MEM[(interfaces__unsigned_8[(sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0:<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 >= <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 ? (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 : (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 + 4294967295] *)&*.LC0][1 ...]{lb: 1 sz: 1}
>
> this is an ARRAY_RANGE_REF of a MEM_REF of an ADDR_EXPR of *.LC0. So
> far I haven't extended subst_constant_pool_initial to handle
> ARRAY_RANGE_REFs, as it can't even handle this MEM_REF:
>
> MEM[(interfaces__unsigned_8[(sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0:<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 >= <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 ? (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0 : (sizetype) <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0 + 4294967295] *)&*.LC0]
>
> because the type here has size:
>
> MIN_EXPR <_GLOBAL.SZ2.ada_opt31 (<PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->UB0, <PLACEHOLDER_EXPR struct opt31__messages_t___XUP>.P_BOUNDS->LB0), 17179869176>
>
> inside the MEM_REF of the ADDR_EXPR is *.LC0, whose DECL_INITIAL is a
> 4-element array (fine). Sadly while the MEM_REF
> type_contains_placeholder_p, the type of the outer ARRAY_RANGE_REF
> does not....
>
> One possibility is that this whole construct, ARRAY_RANGE_REF that it
> is, should mark *.LC0 in cannot_scalarize_away_bitmap.
>
> However, disqualified_constants is also necessary if I want a
> meaningful assertion that we do not re-add constant pool entries as
> candidates after we've discovered them - or should I try to rely on
> there only being one expression that accesses each constant pool
> entry? (I don't think that's guaranteed as tree_output_constant_def
> does hashing, I admit I haven't really tried to break that assumption)

I see.

>> 2) You should be able to use fold_ctor_reference directly (in place
>> of all your code in case offset and size are readily available -
>> don't remember exactly how complete scalarization "walks" elements).
>> Alternatively use fold_const_aggregate_ref.
>
> That should work for completely_scalarize, yes, i.e. if I can remove
> the other route in create_access.
>
>> 3) You can simplify the stmt SRA generated by simply calling
>> fold_stmt on it, that will do a bit more (wasted) work compared to
>> 2) but may be easier.
>>
>> I wouldn't bother with the case where we for some reason do not
>> simplify the constant pool load.
>
> Hmmm. As above, I'm not quite sure what you mean by "the constant pool
> load" - if that means, substituting the DECL_INITIAL in place of the
> VAR_DECL that is DECL_IN_CONSTANT_POOL, then I didn't find a general
> tree substitution routine, hence writing subst_constant_pool_initial.
> It might be possible to make that simpler/more generic (e.g. just call
> copy_node, recurse on each operand, return the original if nothing
> changed) and then fold at the end.

I tried to suggest creating piecewise loads from the constant pool as
opposed to the aggregate one. But I suppose the difficulty is to
determine 'pieces', not their values (that followup passes and folding
can handle quite efficiently).

Like Eric, I suggest you simply disqualify the candidate based on its
type.

Richard.

> Thanks,
> Alan

^ permalink raw reply	[flat|nested] 26+ messages in thread
end of thread, other threads:[~2015-11-06 21:19 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-29 19:22 [PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads Alan Lawrence
2015-10-29 19:20 ` [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF Alan Lawrence
2015-10-30  5:39   ` Jeff Law
2015-10-30  9:21   ` Richard Biener
2015-11-03 10:27     ` Alan Lawrence
2015-11-03 11:13       ` Alan Lawrence
2015-11-03 11:49         ` Richard Biener
2015-10-29 19:21 ` [PATCH 2/6] tree-ssa-dom.c: Normalize data types in MEM_REFs Alan Lawrence
2015-10-29 19:22 ` [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices Alan Lawrence
2015-10-30 10:54   ` Richard Biener
2015-10-30 10:55   ` Eric Botcazou
2015-11-05 13:22     ` Alan Lawrence
2015-11-06 12:30       ` Richard Biener
2015-10-29 19:22 ` [PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions Alan Lawrence
2015-10-30  5:35   ` Jeff Law
2015-10-29 19:22 ` [PATCH 3/6] Share code from fold_array_ctor_reference with fold Alan Lawrence
2015-10-30  9:25   ` Richard Biener
2015-11-03  3:20   ` Jeff Law
2015-11-04 14:35     ` Alan Lawrence
2015-11-06 21:19       ` Jeff Law
2015-10-29 19:32 ` [PATCH 6/6] Make SRA replace constant-pool loads Alan Lawrence
2015-10-31 23:48   ` Bernhard Reutner-Fischer
2015-11-03 14:01   ` Richard Biener
2015-11-05 18:55     ` Alan Lawrence
2015-11-06 10:25       ` Eric Botcazou
2015-11-06 16:29       ` Richard Biener