From: "Paul Edwards"
To: "Ulrich Weigand"
Cc: "Joseph S. Myers" ,
Subject: Re: i370 port
Date: Thu, 17 Sep 2009 13:00:00 -0000
Message-ID: <89EAA39DC259473589AC0E907390A6E5@Paullaptop>
In-Reply-To: <200909151350.n8FDownl009821@d12av02.megacenter.de.ibm.com>
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm

Hi Ulrich.

Good news is that I have now gotten GCC 3.4.6 to recompile itself
with full optimization on.  The compilation time on the (emulated)
mainframe is only 2.5 hours as only a single pass is required.
GCC 3.4.6 requires 49 MB to recompile c-common!  I assume with
GCC 3.4.6 it is doing global optimization or something.  It was
only 20 MB under 3.2.3.

Anyway, I'm still continuing the cleanup, but now have a strong
fallback position.
Basically I won't introduce any machine definition change that
causes the self-compile to fail.

>> ;(define_insn ""
>> ;  [(set (match_operand:SI 0 "register_operand" "=d")
>> ;        (mult:SI (match_operand:SI 1 "register_operand" "0")
>> ;                 (match_operand:SI 2 "immediate_operand" "K")))]
>> ;  ""
>> ;  "*
>> ;{
>> ;  check_label_emit ();
>> ;  mvs_check_page (0, 4, 0);
>> ;  return \"MH %0,%H2\";
>> ;}"
>> ;   [(set_attr "length" "4")]
>> ;)
>
> The combination of predicates and constraints on this insn is broken.
>
> Before reload, the predicate "immediate_operand" explicitly allows
> *any* SImode immediate value.  However, during reload, the "K"
> constraint accepts only a subset of values.

Is there a way to give a predicate that just says "look at the
constraint"?  It seems a bit overkill to add a new predicate
for this one instruction.

> As there is no other alternative,

No other alternative for this same pattern, right?  There was an
alternative - the pattern that I explicitly asked it to use, since
I'd already done the K check in advance.

> and the insn supports neither memory nor register
> operands, this is impossible for reload to fix up.

Hmmm.  I was wondering whether I could put a memory operand there,
if that means it can fix it up regardless.  But that would give it
the idea that it can put a fullword there, when a halfword operand
is required, right?

> In addition, I don't quite understand how this pattern works in
> the first place; MH requires a memory operand, but this pattern
> seems to output an immediate value as operand.  Is there some
> magic going on in your assembler?

%H2 is ...

;; Special formats used for outputting 370 instructions.
;;
;; %H -- Print a signed 16-bit constant.

in the i370.md documentation which can be seen here:

http://gcc.gnu.org/viewcvs/trunk/gcc/config/i370/i370.md?revision=71850&view=markup&pathrev=77215

(there's not a lot of technical changes since then, mainly because
no-one knew how to make them).
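
To make the predicate/constraint mismatch above concrete, here is a
minimal sketch in plain C of the range test involved.  It assumes the
"K" constraint corresponds to the signed 16-bit range that %H prints.
The name fits_signed_halfword is hypothetical and is not taken from
i370.c, and a real GCC predicate would take an rtx and a machine mode
rather than a long, so treat this as an illustration of the check
only, not as the port's code.

```c
/* Hypothetical helper, not from i370.c: returns 1 when VALUE fits in
   a signed 16-bit halfword, which is the range assumed here for the
   "K" constraint, and 0 otherwise.  */
static int
fits_signed_halfword (long value)
{
  return value >= -32768L && value <= 32767L;
}
```

Under that assumption, constants such as 5 or 10000 pass the test,
while 100000 does not.  "immediate_operand" would accept all three
before reload, which is why a narrower predicate is needed if the
operand is to stay an immediate.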

> If you indeed want to output immediate values here, you should

As opposed to wanting what?  All I want is the MH instruction
to be available for use, so that when someone writes x = x * 5,
it doesn't have to organize a register pair.

> probably define a new *predicate* that constrains the set of
> allowed values even before reload.

Ok, that should be straightforward if that's the best solution.

> In the s390 port, we're instead modelling the MH instruction
> with a memory operand (this still allows the generic parts of
> GCC to push immediate operands into memory, if they are in
> range for an HImode operand):
>
> (define_insn "*mulsi3_sign"
>   [(set (match_operand:SI 0 "register_operand" "=d")
>         (mult:SI (sign_extend:SI (match_operand:HI 2 "memory_operand" "R"))
>                  (match_operand:SI 1 "register_operand" "0")))]
>   ""
>   "mh\t%0,%2"
>   [(set_attr "op_type" "RX")
>    (set_attr "type" "imul")])

I tried a lot of variations to try to get this to fit into the
i370 scheme, but didn't have any luck.

e.g. I managed to make this:

(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=d")
        (mult:SI (sign_extend:SI (match_operand:HI 2 "memory_operand" "g"))
                 (match_operand:SI 1 "register_operand" "0")))]
  ""
  "*
{
  check_label_emit ();
  mvs_check_page (0, 4, 0);
  return \"MH^I%0,%2\";
}"
   [(set_attr "length" "4")]
)

produce:

C:\devel\gccnew\gcc>gccmvs -DUSE_MEMMGR -Os -S -ansi -pedantic-errors -DHAVE_CONFIG_H -DIN_GCC -DPUREISO -I ../../pdos/pdpclib -I . -I config/i370 -I ../include cfgloopanal.c
cfgloopanal.c: In function `average_num_loop_insns':
cfgloopanal.c:1379: error: unrecognizable insn:
(insn 68 67 71 7 (set (reg:SI 45)
        (mult:SI (reg:SI 44 [ .frequency ])
            (const_int 10000 [0x2710]))) -1 (insn_list 67 (nil))
    (expr_list:REG_DEAD (reg:SI 44 [ .frequency ])
        (nil)))

> This also seems broken.  A MULT:DI must have two DImode operands,
> it cannot have one DImode and one SImode operand.
> Also, it is in
> fact incorrect that it takes the full DImode first operand; rather,
> it only uses the low 32-bit of its first operand as input.

Ok.

> In the s390 port we're modelling the real behavior of the instruction
> using two explicit SIGN_EXTEND codes:
>
> (define_insn "mulsidi3"
>   [(set (match_operand:DI 0 "register_operand" "=d,d")
>         (mult:DI (sign_extend:DI
>                    (match_operand:SI 1 "register_operand" "%0,0"))
>                  (sign_extend:DI
>                    (match_operand:SI 2 "nonimmediate_operand" "d,R"))))]

Ok.  That certainly looks better.

> Well, the point of optimization is that the RTXes do not stay the
> way they were originally expanded ...  The optimizers will attempt
> to perform various generic optimization on the code, and if the
> back-end claims to support a pattern that implements any of those
> optimized forms, it will get used.  In this case, even though you
> expanded a DImode multiply, common code may notice that it can
> be optimized to a SImode multiply instead.
>
> Generally speaking, your RTX patterns *must* be fully correct and
> represent the actual behavior of the machine in all cases.  If there
> are corner cases formally allowed by the RTX pattern, but the
> behavior of the machine differs, this may cause breakage.  Even if
> your expanders avoid those corner cases when using your patterns,
> this will not be true for the optimizers.

Ok.  It seems the proper way to go, but given that I don't know
how to integrate that into the existing code, it's probably better
for me to go with the new predicate, which I can very likely get
to work.

BFN.  Paul.