From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 5563 invoked by alias); 10 Nov 2014 12:49:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 5549 invoked by uid 89); 10 Nov 2014 12:49:04 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-wi0-f178.google.com Received: from mail-wi0-f178.google.com (HELO mail-wi0-f178.google.com) (209.85.212.178) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Mon, 10 Nov 2014 12:49:02 +0000 Received: by mail-wi0-f178.google.com with SMTP id bs8so10258861wib.17 for ; Mon, 10 Nov 2014 04:48:59 -0800 (PST) MIME-Version: 1.0 X-Received: by 10.180.149.242 with SMTP id ud18mr30377547wib.58.1415623739669; Mon, 10 Nov 2014 04:48:59 -0800 (PST) Received: by 10.194.20.74 with HTTP; Mon, 10 Nov 2014 04:48:59 -0800 (PST) In-Reply-To: <545FF8EE.1090900@linaro.org> References: <53FEDF34.4090605@linaro.org> <5407DF57.2040902@linaro.org> <540912E1.30505@linaro.org> <545FF8EE.1090900@linaro.org> Date: Mon, 10 Nov 2014 12:56:00 -0000 Message-ID: Subject: Re: [RFC] Elimination of zext/sext - type promotion pass From: Richard Biener To: Kugan Cc: Uros Bizjak , "gcc-patches@gcc.gnu.org" , Jakub Jelinek Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2014-11/txt/msg00789.txt.bz2 On Mon, Nov 10, 2014 at 12:29 AM, Kugan wrote: > >> Well - the best way would be to expose the target specifics to GIMPLE >> at some point in the optimization pipeline. My guess would be that it's >> appropriate after loop optimizations (but maybe before induction variable >> optimization). >> >> That is, have a pass that applies register promotion to all SSA names >> in the function, inserting appropriate truncations and extensions. That >> way you'd never see (set (subreg...) on RTL. The VRP and DOM >> passes running after that pass would then be able to aggressively >> optimize redundant truncations and extensions. >> >> Effects on debug information are to be considered. You can change >> the type of SSA names in-place but you don't want to do that for >> user DECLs (and we can't have the SSA name type and its DECL >> type differ - and not sure if we might want to lift that restriction). > > > Thanks for the comments. Here is a prototype patch that implements a > type promotion pass. This pass records SSA variables that will have > values in higher bits (than the original type precision) if promoted and > uses this information in inserting appropriate truncations and > extensions. This pass also classifies some of the stmts that sets ssa's > to be unsafe to promote. Here is a gimple difference for the type > promotion as compared to previous dump for a testcase. Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR) it has no convenient way to sign-extend other than truncating to a signed (non-promoted) type and then extending to the promoted type. Thus I think such pass should be accompanied with a new tree code, SEXT_EXPR. Otherwise we end up with "spurious" un-promoted signed types which later optimizations may be confused about. Not sure if that is the actual issue though. Instead op "prmt" and "prmtn" I'd spell out promote and tree-type-prmtn should be gimple-ssa-type-promote.c. In the end all targets with non-trivial PROMOTE_MODE should run the pass as a lowering step so it should be enabled even at -O0 (and not disablable). I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the end I'd like to run it before IVOPTs ... which means moving IVOPTs later, after VRP which should be the pass optimizing away some of the extensions). In get_promoted_type I don't understand why you preserve qualifiers. Also even for targets without PROMOTE_MODE it may be beneficial to expose truncations required by expanding bit-precision arithmetic earlier (that is, if !PROMOTE_MODE at least promote to GET_MODE_PRECISION (TYPE_MODE (type))). A testcase for that is for example struct { long i : 33; long j : 33; } a; return a.i + a.j; where bitfields of type > int do not promote so you get a 33 bit add which we expand to a 64bit add plus a sign-extension (and nothing optimizes that later usually). insert_next_bb sounds like you want to use insert_on_edge somewhere. in assign_rhs_promotable_p you handle comparisons special but the ternary COND_EXPR and VEC_COND_EXPR can have comparisons embedded in their first operand. The comment confuses me though - with proper sign- or zero-extensions inserted you should be able to promote them anyway? You seem to miss that a GIMPLE_ASSIGN can have 3 operands in promote_cst_in_stmt as well. In promote_assign_stmt_use I consider a default: case that ends up doing nothing dangerous ;) Please either use gcc_unreachable () or do the safe thing (fix = true;?). You seem to be working with a lattice of some kind - fixing up stmt uses the way you do - walking over immediate uses - is not very cache friendly. Why not use a lattice for this - record promoted vars to be used for old SSA names and walk over all stmts instead, replacing SSA uses on them? Btw, you don't need to call update_stmt if you SET_USE and not replace an SSA name with a constant. You seem to "fix" with a single stmt but I don't see where you insert zero- or sign-extensions for ssa_overflows_p cases? Note that at least for SSA names with !SSA_NAME_VAR (thus anonymous vars) you want to do a cheaper promotion by not allocating a new SSA name but simply "fixing" its type by assigning to its TREE_TYPE. For SSA names with SSA_NAME_VAR there is of course debug-info to consider and thus doing what you do is better (but probably still will wreck debuginfo?). GIMPLE_NOPs are not only used for parameters but also uninitialized uses - for non-parameters you should simply adjust their type. No need to fixup their value. The pass needs more comments. It looks like you are not promoting all variables but only those where compensation code (zero-/sign-extensions) is not necessary? Thanks for trying to work on this. Richard. > crc2 (short unsigned int crc, unsigned char data) > { > unsigned char carry; > unsigned char x16; > unsigned char i; > - unsigned char ivtmp_5; > + unsigned int _2; > + unsigned char _3; > + unsigned int _4; > + unsigned int _5; > unsigned char _9; > - unsigned char _10; > - unsigned char ivtmp_18; > + unsigned int _10; > + unsigned int _11; > + unsigned int _12; > + unsigned int _13; > + unsigned int _15; > + unsigned int _16; > + unsigned int _18; > + unsigned int _21; > + unsigned int _22; > + unsigned int _24; > + short unsigned int _26; > + unsigned char _27; > + unsigned int _28; > + unsigned int _29; > + unsigned int _30; > > : > + _12 = (unsigned int) data_8(D); > + _2 = (unsigned int) crc_7(D); > > : > - # crc_28 = PHI > - # data_29 = PHI > - # ivtmp_18 = PHI > - _9 = (unsigned char) crc_28; > - _10 = _9 ^ data_29; > - x16_11 = _10 & 1; > - data_12 = data_29 >> 1; > - if (x16_11 == 1) > + # _30 = PHI <_28(5), _2(2)> > + # _16 = PHI <_29(5), _12(2)> > + # _4 = PHI <_18(5), 8(2)> > + _9 = (unsigned char) _30; > + _5 = (unsigned int) _9; > + _22 = _5 ^ _16; > + _10 = _22 & 1; > + _29 = _16 >> 1; > + _27 = (unsigned char) _10; > + if (_27 == 1) > goto ; > else > goto ; > > : > - crc_13 = crc_28 ^ 16386; > - crc_24 = crc_13 >> 1; > - crc_15 = crc_24 | 32768; > + _11 = _30 ^ 16386; > + _13 = _11 >> 1; > + _24 = _13 | 32768; > > : > - # crc_2 = PHI > - ivtmp_5 = ivtmp_18 - 1; > - if (ivtmp_5 != 0) > + # _28 = PHI <_24(4), _15(7)> > + _18 = _4 - 1; > + _3 = (unsigned char) _18; > + if (_3 != 0) > goto ; > else > goto ; > > : > - # crc_19 = PHI > - return crc_19; > + # _21 = PHI <_28(5)> > + _26 = (short unsigned int) _21; > + return _26; > > : > - crc_21 = crc_28 >> 1; > + _15 = _30 >> 1; > goto ; > > } > > > I experimented with few simple test-cases and results so far are mixed. > It also seems that subsequent passes are not always optimising as > expected. I haven't looked in detail but will look into it based on the > feedback. > > Please also note that this pass still doest handle debug instructions > and there are couple regression failures for ARM. > > Thanks, > Kugan > >