From: Steve Ellcey
Date: Tue, 31 Aug 2004 23:20:00 -0000
To: gcc@gcc.gnu.org
Subject: IA64 floating point division question
Reply-To: sje@cup.hp.com
Message-Id: <200408312246.PAA26407@hpsje.cup.hp.com>

I have been experimenting with the IA64 floating point division code
sequence. Currently the code sequence for floating point division is
expanded late and thus isn't scheduled very well. The reason for this
(as I understood it) was that we access some registers using multiple
modes by creating operands for existing registers with different modes.
So I tried to address this by using more temporary registers during the
code sequence and accessing each of them only in a single mode.

I got that to work for divsf3_internal_thr (which is a
define_insn_and_split) and things looked good, but the splitting of the
division into multiple instructions was still happening late in code
generation and I wasn't getting any improvement in my scheduling. I saw
that I had "&& reload_completed" in the define_insn_and_split, so I
tried removing that, but then I got:

y.c: In function `foo':
y.c:9: error: unrecognizable insn:
(insn 36 19 37 0 (parallel [
            (set (reg:SF 351)
                (div:SF (const_int 1 [0x1])
                    (reg:SF 350 [ b ])))
            (set (scratch:BI)
                (unspec:BI [
                        (reg:SF 349 [ a ])
                        (reg:SF 350 [ b ])
                    ] 14))
            (use (const_int 1 [0x1]))
        ]) -1 (nil)
    (expr_list:REG_UNUSED (scratch:BI)
        (expr_list:REG_UNUSED (scratch:BI)
            (nil))))
y.c:9: internal compiler error: in extract_insn, at recog.c:2037

This instruction was recognized and expanded when I had
"&& reload_completed" in the define_insn_and_split, so I don't
understand why it is not recognized now. Is removing
"&& reload_completed" what I need to do to allow this instruction to be
split up earlier? I am sure there is something basic I don't understand
about the machine description setup but I don't know what it is. Is it
related to the predication? Any help?

Steve Ellcey
sje@cup.hp.com

Here is my new divsf3_internal_thr instruction (without the
"&& reload_completed"):

(define_insn_and_split "divsf3_internal_thr"
  [(set (match_operand:SF 0 "fr_register_operand" "=&f")
        (div:SF (match_operand:SF 1 "fr_register_operand" "f")
                (match_operand:SF 2 "fr_register_operand" "f")))
   (clobber (match_scratch:XF 3 "=&f"))
   (clobber (match_scratch:XF 4 "=&f"))
   (clobber (match_scratch:XF 5 "=&f"))
   (clobber (match_scratch:SF 6 "=&f"))
   (clobber (match_scratch:XF 7 "=f"))
   (clobber (match_scratch:BI 8 "=c"))]
  ""
  "#"
  ""
  [(parallel [(set (match_dup 0) (div:SF (const_int 1) (match_dup 2)))
              (set (match_dup 8) (unspec:BI [(match_dup 1) (match_dup 2)]
                                            UNSPEC_FR_RECIP_APPROX))
              (use (const_int 1))])
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 3)
                     (minus:XF (match_dup 9)
                               (mult:XF (float_extend:XF (match_dup 2))
                                        (float_extend:XF (match_dup 0)))))
                (use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 4)
                     (plus:XF (mult:XF (match_dup 3) (match_dup 3))
                              (match_dup 3)))
                (use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 5)
                     (plus:XF (mult:XF (match_dup 4)
                                       (float_extend:XF (match_dup 0)))
                              (float_extend:XF (match_dup 0))))
                (use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 6)
                     (float_truncate:SF
                       (mult:XF (float_extend:XF (match_dup 1))
                                (match_dup 5))))
                (use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 7)
                     (minus:XF (float_extend:XF (match_dup 1))
                               (mult:XF (float_extend:XF (match_dup 2))
                                        (float_extend:XF (match_dup 6)))))
                (use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 0)
                     (float_truncate:SF
                       (plus:XF (mult:XF (match_dup 7) (match_dup 5))
                                (float_extend:XF (match_dup 6)))))
                (use (const_int 1))]))
  ]
  {
    operands[9] = CONST1_RTX (XFmode);
  }
  [(set_attr "predicable" "no")])
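
For readers who want to follow the arithmetic in the split pattern
above, here is a rough C sketch of the same steps. It is only an
illustration, not what the compiler emits: recip_approx and
div_sf_sketch are hypothetical names, recip_approx is a made-up stand-in
for the reciprocal approximation (UNSPEC_FR_RECIP_APPROX) that simply
cuts 1/b down to 8 fraction bits, long double stands in for XFmode, and
the predicate in operand 8 (with the special cases it guards) is
ignored. It also assumes float is IEEE-754 binary32 and uses separate
multiplies and adds where the hardware would use fused operations.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the reciprocal approximation: 1/b with only
   the sign, exponent and top 8 fraction bits kept (assumes binary32). */
float recip_approx(float b)
{
    float y = 1.0f / b;
    uint32_t bits;
    memcpy(&bits, &y, sizeof bits);
    bits &= 0xFFFF8000u;
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* Mirrors the order of the sets in the split pattern. Comments name the
   operand number each value corresponds to. */
float div_sf_sketch(float a, float b)
{
    float y0 = recip_approx(b);                      /* operand 0: y0 ~ 1/b   */
    long double e  = 1.0L - (long double)b * y0;     /* operand 3: 1 - b*y0   */
    long double e1 = e * e + e;                      /* operand 4: e*e + e    */
    long double y1 = e1 * y0 + y0;                   /* operand 5: refined 1/b */
    float q = (float)((long double)a * y1);          /* operand 6: a*y1 in SF */
    long double r = (long double)a - (long double)b * q; /* operand 7: a - b*q */
    return (float)(r * y1 + q);                      /* operand 0: r*y1 + q   */
}
```

Each intermediate value lives in its own variable and is only ever used
in one type, which is the same idea as giving each temporary register a
single mode in the pattern.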