From: Joern Rennecke
Message-Id: <200404261252.i3QCqeV17596@linsvr1.uk.superh.com>
Subject: Re: Exploiting dual mode operation
To: NAMOLARU@il.ibm.com (Mircea Namolaru)
Date: Mon, 26 Apr 2004 16:57:00 -0000
Cc: joern.rennecke@superh.com (Joern Rennecke), gcc@gcc.gnu.org,
    LEEHOD@il.ibm.com (Leehod Baruch)
In-Reply-To: from "Mircea Namolaru" at Apr 26, 2004 12:30:16

> 1. Trying to solve the sign extension removal problem using the live
> highpart information has some limitations. For instance in the
> following case (which appears during computation of array addresses):
>
>     i = sign extension i1;
>     ....
>     index = 64-bit shift of i  // the target and the source are 64 bits
>
> In some architectures we may get the same result without using an
> explicit sign extension.
> As we understand it, your algorithm will find that the highpart of
> "i" is live and the sign extension will not be discarded.

It depends.  If you have a static right shift such that the highpart
of the value being shifted does not actually influence the result, the
highpart attribute can be 'ignore', and the sign extension can be
eliminated.
But sign extension / shift combinations can actually be handled
generally, and much more simply, in the combiner.  You only have to
make sure that your machine description contains the matching patterns,
and it will just work - no patches to the machine independent code
required.

> Another example is:
>
> int i, s;
> for (i = 0; i < N; i++)
>   {
>     s1 = s + i;
>     s = sign extend s1;
>   }
> return s;
>
> The sign extension is required for the return only, so the sign
> extension can be removed from the loop and placed before the return.
> The highpart of "s" is live, but this information alone will not help
> to improve the code.

Yes, this is not covered by the highpart liveness optimization.
The SHmedia instruction set has (among others) an addition instruction
that does a 32->64 bit sign extension of the result, so again this can
be handled by the combiner.
I have come across some code that uses short or unsigned short basic
induction variables, though.  I've written some patches for the loop
optimizer to pre-condition the loop so that it stops at the end or at
the signed overflow, whichever is earlier, and then use an outer loop
to handle the sign extend.  If vector addition is available, that is
also used to do a zero-extending increment of a value that has been
pre-conditioned to be zero-extended.

> 2. To exploit the dual mode operation, for instructions that use the
> result of explicit sign extensions we need to find out whether it is
> possible to get the same result via an instruction that doesn't
> require explicit sign extensions.
> Basically we need to find out whether:
>
>     s1 = sign extend s
>     t1 = sign extend t
>     result = inst s1, t1
>
> can be replaced by an instruction inst1:
>
>     result = inst1 s, t.
>
> But this seems similar to what combine does, so the information
> from the description file should suffice.

Yes.  Just write a testcase, start gdb on cc1, set a breakpoint at
combine_instructions, start cc1 on your testcase, look for the patterns
you want combined, then set a breakpoint at try_combine with a condition
that the uids of i2 and i3 are those of your two patterns.  Then step
through it to see if there is any snag that prevents the patterns from
being combined, or if not, look at what pattern it generates.  Then add
a matching pattern to your machine description.

> 3. One possible way for implementation is to use reaching definitions
> to propagate the sign extensions forward right before the uses. This
> will create opportunities for combine and gcse to do the rest of the
> work afterward.

Do you mean putting the sign extended values into new pseudo registers?
That seems to have about as much potential for harm as good, since it
can leave you with extra register-register copies, and you might lose
strength reduction unless you change the loop optimizer to grok these
new copies too.

> Another possible way is to extend gcse (but there are some issues that
> we still need to clarify).

gcse works by computing the values in separate pseudos, thus creating
new register-register copies as discussed above.

>
> Maybe there is a way to use your code (or part of it) ?

Would you like a unidiff of all our patches against
gcc 3.4.0 20040414 (prerelease) ?
It's 736615 bytes raw, or 216022 bytes gzipped & uuencoded.
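
P.S. On the quoted loop example: a minimal C sketch, not taken from any
of the patches, of why the sign extension can be moved out of the loop
and done once before the return.  It assumes the start value already
fits in 32 bits and models the 64-bit register with uint64_t so that
wraparound is well defined.  All function names here are hypothetical
and chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of x to 64 bits; the high part of x is
   ignored.  Uses xor/subtract to avoid implementation-defined casts. */
static int64_t sign_extend32(uint64_t x)
{
    uint64_t low = x & UINT64_C(0xFFFFFFFF);
    return (int64_t)(low ^ UINT64_C(0x80000000)) - INT64_C(0x80000000);
}

/* The loop as quoted: sign-extend after every addition. */
static int64_t sum_extend_each_iteration(int32_t start, uint32_t n)
{
    int64_t s = start;
    for (uint32_t i = 0; i < n; i++) {
        uint64_t s1 = (uint64_t)s + i;  /* 64-bit add, wraps modulo 2^64 */
        s = sign_extend32(s1);
    }
    return s;
}

/* Hoisted form: plain 64-bit adds in the loop, one extension at the end. */
static int64_t sum_extend_once(int32_t start, uint32_t n)
{
    uint64_t acc = (uint64_t)(int64_t)start;
    for (uint32_t i = 0; i < n; i++)
        acc += i;                       /* high part may hold garbage */
    return sign_extend32(acc);
}

/* Hypothetical self-check: both forms must agree; returns the result. */
static int64_t check_hoisting(int32_t start, uint32_t n)
{
    int64_t a = sum_extend_each_iteration(start, n);
    int64_t b = sum_extend_once(start, n);
    assert(a == b);
    return a;
}
```

The two forms agree because the low 32 bits of a sum depend only on the
low 32 bits of the operands, and the sign extension reads only those low
bits.  Calling check_hoisting with a start value near INT32_MAX exercises
the case where the 32-bit sum wraps in the middle of the loop.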