From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 9039 invoked by alias); 30 Oct 2009 09:37:30 -0000 Received: (qmail 9030 invoked by uid 22791); 30 Oct 2009 09:37:28 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL,BAYES_00,SARE_MSGID_LONG40,SPF_PASS X-Spam-Check-By: sourceware.org Received: from mail-px0-f192.google.com (HELO mail-px0-f192.google.com) (209.85.216.192) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 30 Oct 2009 09:37:23 +0000 Received: by pxi30 with SMTP id 30so1751484pxi.14 for ; Fri, 30 Oct 2009 02:37:18 -0700 (PDT) MIME-Version: 1.0 Received: by 10.140.202.12 with SMTP id z12mr66479rvf.112.1256895438451; Fri, 30 Oct 2009 02:37:18 -0700 (PDT) In-Reply-To: <200910291829.n9TITTdO026501@d12av02.megacenter.de.ibm.com> References: <84fc9c000910280312t642f85abn1646c3ad446001a7@mail.gmail.com> <200910291829.n9TITTdO026501@d12av02.megacenter.de.ibm.com> Date: Fri, 30 Oct 2009 09:51:00 -0000 Message-ID: <84fc9c000910300237r41d98720p82423e204223704c@mail.gmail.com> Subject: Re: Fix PR tree-optimization/41857 (Re: m32c support for named addr spaces branch) From: Richard Guenther To: Ulrich Weigand Cc: DJ Delorie , gcc-patches@gcc.gnu.org, Zdenek Dvorak Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2009-10/txt/msg01787.txt.bz2 On Thu, Oct 29, 2009 at 7:29 PM, Ulrich Weigand wrote: >> On Tue, Oct 27, 2009 at 7:22 PM, Ulrich Weigand wr= ote: >> > I now seem to be running into a related problem with __ea on SPU, in a >> > loop where the loop optimizer decides to use a DImode IV to implement >> > a 64-bit __ea pointer. >> > >> > This causes create_mem_ref to be called with an affine combination >> > that contains only a single value, of 64-bit integral type. >> > >> > Because this is not a pointer type, addr_to_parts interprets this >> > value as "index", not "base". =A0(I'm not quite sure I understand >> > how this is intended to work anyway: it seems to me that loop IVs >> > use (nearly?) always integral types. =A0Why does move_pointer_to_base >> > check for pointer type then?) > > I've now opened PR 41857 for this particular problem. > > The core problem in this case is that pointers may be converted to > sizetype by loop IV optimizations. =A0This is particularly aggravated > by the fact that create_mem_ref has a hard time determining which > of the variables in the affine expression it gets as input > represents the base of the reference. > > The heuristics used to determine this looks for variables of pointer > type. =A0However, due to problems related to undefined overflow, the > IV optimizations have switched to always using integer types for > induction variables, even those that represent memory addresses. > This makes the create_mem_ref heuristics mostly useless in many > cases ... > > However, rewrite_use_address (the call site of create_mem_ref) does > actually have the information; it knows whether the IV candidate > is based on a memory object. =A0The information is simply not passed > to create_mem_ref. =A0The patch below changes this by adding a new > BASE_HINT argument to create_mem_ref, which is set to the IV that > is to be used as base of the reference, if any. > > create_mem_ref and its subroutines then use this variable (casted > to the appropriate pointer type) as base. > > As a side effect, the patch also fixes a FIXME in > most_expensive_mult_to_index. > > While not a complete fix to the general sizetype problem, this > patch fixes the ICE-on-valid bug on the testcase in the PR, and > it seems a good idea to properly identify the base anyway; this > is needed (or at least beneficial) on some platforms. > > Tested on s390x-ibm-linux and spu-elf. > OK for mainline? This looks good to me, though I'd like Zdenek to have a 2nd look. Thanks, Richard. > Bye, > Ulrich > > > ChangeLog: > > gcc/ > =A0 =A0 =A0 =A0PR tree-optimization/41857 > =A0 =A0 =A0 =A0* tree-flow.h (rewrite_use_address): Add BASE_HINT argumen= t. > =A0 =A0 =A0 =A0* tree-ssa-loop-ivopts.c (rewrite_use_address): Pass base = hint > =A0 =A0 =A0 =A0to create_mem_ref. > =A0 =A0 =A0 =A0* tree-ssa-address.c (move_hint_to_base): New function. > =A0 =A0 =A0 =A0(most_expensive_mult_to_index): Add TYPE argument. =A0Use = mode and > =A0 =A0 =A0 =A0address space associated with TYPE. > =A0 =A0 =A0 =A0(addr_to_parts): Add TYPE and BASE_HINT arguments. =A0Pass= TYPE to > =A0 =A0 =A0 =A0most_expensive_mult_to_index. =A0Call move_hint_to_base. > =A0 =A0 =A0 =A0(create_mem_ref): Add BASE_HINT argument. =A0Pass BASE_HIN= T and > =A0 =A0 =A0 =A0TYPE to addr_to_parts. > > gcc/testsuite/ > =A0 =A0 =A0 =A0PR tree-optimization/41857 > =A0 =A0 =A0 =A0* gcc.target/spu/ea/pr41857.c: New file. > > > Index: gcc/tree-ssa-loop-ivopts.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > *** gcc/tree-ssa-loop-ivopts.c =A0(revision 153683) > --- gcc/tree-ssa-loop-ivopts.c =A0(working copy) > *************** rewrite_use_address (struct ivopts_data > *** 5510,5515 **** > --- 5510,5516 ---- > =A0{ > =A0 =A0aff_tree aff; > =A0 =A0gimple_stmt_iterator bsi =3D gsi_for_stmt (use->stmt); > + =A0 tree base_hint =3D NULL_TREE; > =A0 =A0tree ref; > =A0 =A0bool ok; > > *************** rewrite_use_address (struct ivopts_data > *** 5517,5523 **** > =A0 =A0gcc_assert (ok); > =A0 =A0unshare_aff_combination (&aff); > > ! =A0 ref =3D create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff, data->s= peed); > =A0 =A0copy_ref_info (ref, *use->op_p); > =A0 =A0*use->op_p =3D ref; > =A0} > --- 5518,5539 ---- > =A0 =A0gcc_assert (ok); > =A0 =A0unshare_aff_combination (&aff); > > ! =A0 /* To avoid undefined overflow problems, all IV candidates use unsi= gned > ! =A0 =A0 =A0integer types. =A0The drawback is that this makes it impossi= ble for > ! =A0 =A0 =A0create_mem_ref to distinguish an IV that is based on a memor= y object > ! =A0 =A0 =A0from one that represents simply an offset. > ! > ! =A0 =A0 =A0To work around this problem, we pass a hint to create_mem_re= f that > ! =A0 =A0 =A0indicates which variable (if any) in aff is an IV based on a= memory > ! =A0 =A0 =A0object. =A0Note that we only consider the candidate. =A0If t= his is not > ! =A0 =A0 =A0based on an object, the base of the reference is in some sub= expression > ! =A0 =A0 =A0of the use -- but these will use pointer types, so they are = recognized > ! =A0 =A0 =A0by the create_mem_ref heuristics anyway. =A0*/ > ! =A0 if (cand->iv->base_object) > ! =A0 =A0 base_hint =3D var_at_stmt (data->current_loop, cand, use->stmt); > ! > ! =A0 ref =3D create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff, base_hi= nt, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 data->speed); > =A0 =A0copy_ref_info (ref, *use->op_p); > =A0 =A0*use->op_p =3D ref; > =A0} > Index: gcc/tree-ssa-address.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > *** gcc/tree-ssa-address.c =A0 =A0 =A0(revision 153683) > --- gcc/tree-ssa-address.c =A0 =A0 =A0(working copy) > *************** move_fixed_address_to_symbol (struct mem > *** 392,397 **** > --- 392,424 ---- > =A0 =A0aff_combination_remove_elt (addr, i); > =A0} > > + /* If ADDR contains an instance of BASE_HINT, move it to PARTS->base. = =A0*/ > + > + static void > + move_hint_to_base (tree type, struct mem_address *parts, tree base_hint, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0aff_tree *addr) > + { > + =A0 unsigned i; > + =A0 tree val =3D NULL_TREE; > + > + =A0 for (i =3D 0; i < addr->n; i++) > + =A0 =A0 { > + =A0 =A0 =A0 if (!double_int_one_p (addr->elts[i].coef)) > + =A0 =A0 =A0 continue; > + > + =A0 =A0 =A0 val =3D addr->elts[i].val; > + =A0 =A0 =A0 if (operand_equal_p (val, base_hint, 0)) > + =A0 =A0 =A0 break; > + =A0 =A0 } > + > + =A0 if (i =3D=3D addr->n) > + =A0 =A0 return; > + > + =A0 /* Cast value to appropriate pointer type. =A0*/ > + =A0 parts->base =3D fold_convert (build_pointer_type (type), val); > + =A0 aff_combination_remove_elt (addr, i); > + } > + > =A0/* If ADDR contains an address of a dereferenced pointer, move it to > =A0 =A0 PARTS->base. =A0*/ > > *************** add_to_parts (struct mem_address *parts, > *** 453,461 **** > =A0 =A0 element(s) to PARTS. =A0*/ > > =A0static void > ! most_expensive_mult_to_index (struct mem_address *parts, aff_tree *addr, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 bool speed) > =A0{ > =A0 =A0HOST_WIDE_INT coef; > =A0 =A0double_int best_mult, amult, amult_neg; > =A0 =A0unsigned best_mult_cost =3D 0, acost; > --- 480,490 ---- > =A0 =A0 element(s) to PARTS. =A0*/ > > =A0static void > ! most_expensive_mult_to_index (tree type, struct mem_address *parts, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 aff_tree *addr,= bool speed) > =A0{ > + =A0 addr_space_t as =3D TYPE_ADDR_SPACE (type); > + =A0 enum machine_mode address_mode =3D targetm.addr_space.address_mode = (as); > =A0 =A0HOST_WIDE_INT coef; > =A0 =A0double_int best_mult, amult, amult_neg; > =A0 =A0unsigned best_mult_cost =3D 0, acost; > *************** most_expensive_mult_to_index (struct mem > *** 469,483 **** > =A0 =A0 =A0 =A0if (!double_int_fits_in_shwi_p (addr->elts[i].coef)) > =A0 =A0 =A0 =A0continue; > > - =A0 =A0 =A0 /* FIXME: Should use the correct memory mode rather than Pm= ode. =A0*/ > - > =A0 =A0 =A0 =A0coef =3D double_int_to_shwi (addr->elts[i].coef); > =A0 =A0 =A0 =A0if (coef =3D=3D 1 > ! =A0 =A0 =A0 =A0 || !multiplier_allowed_in_address_p (coef, Pmode, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0ADDR_SPACE_GENERIC)) > =A0 =A0 =A0 =A0continue; > > ! =A0 =A0 =A0 acost =3D multiply_by_cost (coef, Pmode, speed); > > =A0 =A0 =A0 =A0if (acost > best_mult_cost) > =A0 =A0 =A0 =A0{ > --- 498,509 ---- > =A0 =A0 =A0 =A0if (!double_int_fits_in_shwi_p (addr->elts[i].coef)) > =A0 =A0 =A0 =A0continue; > > =A0 =A0 =A0 =A0coef =3D double_int_to_shwi (addr->elts[i].coef); > =A0 =A0 =A0 =A0if (coef =3D=3D 1 > ! =A0 =A0 =A0 =A0 || !multiplier_allowed_in_address_p (coef, TYPE_MODE (t= ype), as)) > =A0 =A0 =A0 =A0continue; > > ! =A0 =A0 =A0 acost =3D multiply_by_cost (coef, address_mode, speed); > > =A0 =A0 =A0 =A0if (acost > best_mult_cost) > =A0 =A0 =A0 =A0{ > *************** most_expensive_mult_to_index (struct mem > *** 520,527 **** > =A0 =A0parts->step =3D double_int_to_tree (sizetype, best_mult); > =A0} > > ! /* Splits address ADDR into PARTS. > ! > =A0 =A0 TODO -- be more clever about the distribution of the elements of = ADDR > =A0 =A0 to PARTS. =A0Some architectures do not support anything but single > =A0 =A0 register in address, possibly with a small integer offset; while > --- 546,555 ---- > =A0 =A0parts->step =3D double_int_to_tree (sizetype, best_mult); > =A0} > > ! /* Splits address ADDR for a memory access of type TYPE into PARTS. > ! =A0 =A0If BASE_HINT is non-NULL, it specifies an SSA name to be used > ! =A0 =A0preferentially as base of the reference. > ! > =A0 =A0 TODO -- be more clever about the distribution of the elements of = ADDR > =A0 =A0 to PARTS. =A0Some architectures do not support anything but single > =A0 =A0 register in address, possibly with a small integer offset; while > *************** most_expensive_mult_to_index (struct mem > *** 530,536 **** > =A0 =A0 addressing modes is useless. =A0*/ > > =A0static void > ! addr_to_parts (aff_tree *addr, struct mem_address *parts, bool speed) > =A0{ > =A0 =A0tree part; > =A0 =A0unsigned i; > --- 558,565 ---- > =A0 =A0 addressing modes is useless. =A0*/ > > =A0static void > ! addr_to_parts (tree type, aff_tree *addr, tree base_hint, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0struct mem_address *parts, bool speed) > =A0{ > =A0 =A0tree part; > =A0 =A0unsigned i; > *************** addr_to_parts (aff_tree *addr, struct me > *** 550,561 **** > > =A0 =A0/* First move the most expensive feasible multiplication > =A0 =A0 =A0 to index. =A0*/ > ! =A0 most_expensive_mult_to_index (parts, addr, speed); > > =A0 =A0/* Try to find a base of the reference. =A0Since at the moment > =A0 =A0 =A0 there is no reliable way how to distinguish between pointer a= nd its > =A0 =A0 =A0 offset, this is just a guess. =A0*/ > ! =A0 if (!parts->symbol) > =A0 =A0 =A0move_pointer_to_base (parts, addr); > > =A0 =A0/* Then try to process the remaining elements. =A0*/ > --- 579,592 ---- > > =A0 =A0/* First move the most expensive feasible multiplication > =A0 =A0 =A0 to index. =A0*/ > ! =A0 most_expensive_mult_to_index (type, parts, addr, speed); > > =A0 =A0/* Try to find a base of the reference. =A0Since at the moment > =A0 =A0 =A0 there is no reliable way how to distinguish between pointer a= nd its > =A0 =A0 =A0 offset, this is just a guess. =A0*/ > ! =A0 if (!parts->symbol && base_hint) > ! =A0 =A0 move_hint_to_base (type, parts, base_hint, addr); > ! =A0 if (!parts->symbol && !parts->base) > =A0 =A0 =A0move_pointer_to_base (parts, addr); > > =A0 =A0/* Then try to process the remaining elements. =A0*/ > *************** gimplify_mem_ref_parts (gimple_stmt_iter > *** 592,604 **** > > =A0tree > =A0create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 bool speed) > =A0{ > =A0 =A0tree mem_ref, tmp; > =A0 =A0tree atype; > =A0 =A0struct mem_address parts; > > ! =A0 addr_to_parts (addr, &parts, speed); > =A0 =A0gimplify_mem_ref_parts (gsi, &parts); > =A0 =A0mem_ref =3D create_mem_ref_raw (type, &parts); > =A0 =A0if (mem_ref) > --- 623,635 ---- > > =A0tree > =A0create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 tree base_hint, bool speed) > =A0{ > =A0 =A0tree mem_ref, tmp; > =A0 =A0tree atype; > =A0 =A0struct mem_address parts; > > ! =A0 addr_to_parts (type, addr, base_hint, &parts, speed); > =A0 =A0gimplify_mem_ref_parts (gsi, &parts); > =A0 =A0mem_ref =3D create_mem_ref_raw (type, &parts); > =A0 =A0if (mem_ref) > Index: gcc/tree-flow.h > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > *** gcc/tree-flow.h =A0 =A0 (revision 153683) > --- gcc/tree-flow.h =A0 =A0 (working copy) > *************** struct mem_address > *** 921,927 **** > > =A0struct affine_tree_combination; > =A0tree create_mem_ref (gimple_stmt_iterator *, tree, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct affine_tree_combination *= , bool); > =A0rtx addr_for_mem_ref (struct mem_address *, addr_space_t, bool); > =A0void get_address_description (tree, struct mem_address *); > =A0tree maybe_fold_tmr (tree); > --- 921,927 ---- > > =A0struct affine_tree_combination; > =A0tree create_mem_ref (gimple_stmt_iterator *, tree, > ! =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct affine_tree_combination *= , tree, bool); > =A0rtx addr_for_mem_ref (struct mem_address *, addr_space_t, bool); > =A0void get_address_description (tree, struct mem_address *); > =A0tree maybe_fold_tmr (tree); > *** /dev/null =A0 2009-10-07 18:17:03.469375344 +0200 > --- gcc/testsuite/gcc.target/spu/ea/pr41857.c =A0 2009-10-29 14:11:18.000= 000000 +0100 > *************** > *** 0 **** > --- 1,29 ---- > + /* Copyright (C) 2009 Free Software Foundation, Inc. > + > + =A0 =A0This file is free software; you can redistribute it and/or modif= y it under > + =A0 =A0the terms of the GNU General Public License as published by the = Free > + =A0 =A0Software Foundation; either version 3 of the License, or (at you= r option) > + =A0 =A0any later version. > + > + =A0 =A0This file is distributed in the hope that it will be useful, but= WITHOUT > + =A0 =A0ANY WARRANTY; without even the implied warranty of MERCHANTABILI= TY or > + =A0 =A0FITNESS FOR A PARTICULAR PURPOSE. =A0See the GNU General Public = License > + =A0 =A0for more details. > + > + =A0 =A0You should have received a copy of the GNU General Public License > + =A0 =A0along with this file; see the file COPYING3. =A0If not see > + =A0 =A0. =A0*/ > + > + /* { dg-do compile } */ > + > + __ea char *strchr_ea (__ea const char *s, int c); > + __ea char *foo (__ea char *s) > + { > + =A0 __ea char *ret =3D s; > + =A0 int i; > + > + =A0 for (i =3D 0; i < 3; i++) > + =A0 =A0 ret =3D strchr_ea (ret, s[i]); > + > + =A0 return ret; > + } > > > -- > =A0Dr. Ulrich Weigand > =A0GNU Toolchain for Linux on System z and Cell BE > =A0Ulrich.Weigand@de.ibm.com >