IA64 floating point division question

public inbox for gcc@gcc.gnu.org

* IA64 floating point division question
@ 2004-08-31 23:20 Steve Ellcey
  2004-08-31 23:58 ` Zack Weinberg
  2004-09-01 12:28 ` Joern Rennecke
  0 siblings, 2 replies; 6+ messages in thread
From: Steve Ellcey @ 2004-08-31 23:20 UTC
  To: gcc


I have been experimenting with the IA64 floating point division code
sequence.  Currently the code sequence for floating point division is
expanded late and thus isn't scheduled very well.  The reason for this
(as I understood it) was that we access some registers using multiple
modes by creating operands for existing registers with different modes.

So I tried to address this by using more temporary registers during the
code sequence and accessing each of them only in a single mode.  I got
that to work for divsf3_internal_thr (which is a define_insn_and_split)
and things looked good but the splitting of the division into multiple
instructions was still happening late in code generation and I wasn't
getting any improvement in my scheduling.  I saw that I had "&&
reload_completed" in the define_insn_and_split so I tried removing that
but then I got:

y.c: In function `foo':
y.c:9: error: unrecognizable insn:
(insn 36 19 37 0 (parallel [
            (set (reg:SF 351)
                (div:SF (const_int 1 [0x1])
                    (reg:SF 350 [ b ])))
            (set (scratch:BI)
                (unspec:BI [
                        (reg:SF 349 [ a ])
                        (reg:SF 350 [ b ])
                    ] 14))
            (use (const_int 1 [0x1]))
        ]) -1 (nil)
    (expr_list:REG_UNUSED (scratch:BI)
        (expr_list:REG_UNUSED (scratch:BI)
            (nil))))
y.c:9: internal compiler error: in extract_insn, at recog.c:2037


This instruction was recognized and expanded when I had "&&
reload_completed" in the define_insn_and_split so I don't understand why
it is not recognized now.  Is removing "&& reload_completed" what I need
to do to allow this instruction to be split up earlier?  I am sure there
is something basic I don't understand about the machine description
setup but I don't know what it is.  Is it related to the predication?

Any help?

Steve Ellcey
sje@cup.hp.com


Here is my new divsf3_internal_thr instruction (without the "&&
reload_completed"):

(define_insn_and_split "divsf3_internal_thr"
  [(set (match_operand:SF 0 "fr_register_operand" "=&f")
	(div:SF (match_operand:SF 1 "fr_register_operand" "f")
		(match_operand:SF 2 "fr_register_operand" "f")))
   (clobber (match_scratch:XF 3 "=&f"))
   (clobber (match_scratch:XF 4 "=&f"))
   (clobber (match_scratch:XF 5 "=&f"))
   (clobber (match_scratch:SF 6 "=&f"))
   (clobber (match_scratch:XF 7 "=f"))
   (clobber (match_scratch:BI 8 "=c"))]
  ""
  "#"
  ""
  [(parallel [(set (match_dup 0) (div:SF (const_int 1) (match_dup 2)))
	      (set (match_dup 8) (unspec:BI [(match_dup 1) (match_dup 2)]
					    UNSPEC_FR_RECIP_APPROX))
	      (use (const_int 1))])
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 3)
		     (minus:XF (match_dup 9)
			       (mult:XF (float_extend:XF (match_dup 2))
                                        (float_extend:XF (match_dup 0)))))
		(use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 4)
		     (plus:XF (mult:XF (match_dup 3) (match_dup 3))
			      (match_dup 3)))
		(use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 5)
		     (plus:XF (mult:XF (match_dup 4)
                                       (float_extend:XF (match_dup 0)))
			      (float_extend:XF (match_dup 0))))
		(use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 6)
		     (float_truncate:SF
		       (mult:XF (float_extend:XF (match_dup 1))
                                (match_dup 5))))
		(use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 7)
		     (minus:XF (float_extend:XF (match_dup 1))
			       (mult:XF (float_extend:XF (match_dup 2))
                                        (float_extend:XF (match_dup 6)))))
		(use (const_int 1))]))
   (cond_exec (ne (match_dup 8) (const_int 0))
     (parallel [(set (match_dup 0)
                     (float_truncate:SF
                       (plus:XF (mult:XF (match_dup 7) (match_dup 5))
                                (float_extend:XF (match_dup 6)))))
		(use (const_int 1))]))
  ]
{
  operands[9] = CONST1_RTX (XFmode);
}
  [(set_attr "predicable" "no")])


* Re: IA64 floating point division question
  2004-08-31 23:20 IA64 floating point division question Steve Ellcey
@ 2004-08-31 23:58 ` Zack Weinberg
  2004-09-01 12:25   ` Joern Rennecke
  2004-09-01 12:28 ` Joern Rennecke
  1 sibling, 1 reply; 6+ messages in thread
From: Zack Weinberg @ 2004-08-31 23:58 UTC
  To: sje; +Cc: gcc

Steve Ellcey <sje@cup.hp.com> writes:

> y.c: In function `foo':
> y.c:9: error: unrecognizable insn:
> (insn 36 19 37 0 (parallel [
>             (set (reg:SF 351)
>                 (div:SF (const_int 1 [0x1])
>                     (reg:SF 350 [ b ])))
>             (set (scratch:BI)
>                 (unspec:BI [
>                         (reg:SF 349 [ a ])
>                         (reg:SF 350 [ b ])
>                     ] 14))
>             (use (const_int 1 [0x1]))
>         ]) -1 (nil)
>     (expr_list:REG_UNUSED (scratch:BI)
>         (expr_list:REG_UNUSED (scratch:BI)
>             (nil))))
> y.c:9: internal compiler error: in extract_insn, at recog.c:2037

I'm fairly sure that your problem is with the "*recip_approx"
instruction, which is what is supposed to match this pattern.  The
catch is, it *only* allows XFmode operands, whereas you're trying to
feed it SFmode.

I'm not sure how to fix this.  When I tried to do something similar, I
got completely stuck because I couldn't make GCC refer to the *same*
pseudo register in two different modes.

zw


* Re: IA64 floating point division question
  2004-08-31 23:58 ` Zack Weinberg
@ 2004-09-01 12:25   ` Joern Rennecke
  0 siblings, 0 replies; 6+ messages in thread
From: Joern Rennecke @ 2004-09-01 12:25 UTC
  To: Zack Weinberg; +Cc: sje, gcc

> I'm not sure how to fix this.  When I tried to do something similar, I
> got completely stuck because I couldn't make GCC refer to the *same*
> pseudo register in two different modes.

You need to use SUBREGs, TRUNCATE or FLOAT_/SIGN_/ZERO_EXTEND for that.
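
For instance, the split pattern earlier in this thread already reads
SFmode operands inside XFmode arithmetic by wrapping them in
float_extend.  A small fragment in the same style, with made-up pseudo
register numbers chosen only for illustration:

(set (reg:XF 400)
     (mult:XF (float_extend:XF (reg:SF 350))
              (float_extend:XF (reg:SF 351))))

Each pseudo keeps a single mode here; only the wrapping operation
changes the mode in which its value is consumed.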


* Re: IA64 floating point division question
  2004-08-31 23:20 IA64 floating point division question Steve Ellcey
  2004-08-31 23:58 ` Zack Weinberg
@ 2004-09-01 12:28 ` Joern Rennecke
  2004-09-01 17:51   ` Steve Ellcey
  1 sibling, 1 reply; 6+ messages in thread
From: Joern Rennecke @ 2004-09-01 12:28 UTC
  To: sje; +Cc: gcc

>             (set (scratch:BI)
>                 (unspec:BI [
>                         (reg:SF 349 [ a ])
>                         (reg:SF 350 [ b ])
>                     ] 14))

'scratch' will probably fail to match your operand predicate.
match_scratch is no good if you want your pattern to be split before
reload.  Instead, you have to use match_operand, and make your expander
allocate a pseudo register and stick it into the scratch operand.
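
A minimal sketch of that suggestion, not the committed ia64.md code.  It
assumes the clobbers in divsf3_internal_thr keep their modes but become
match_operand with register predicates, so that the generated
gen_divsf3_internal_thr takes them as arguments 3 through 8.  The empty
expander condition and the temporary names are placeholders.

(define_expand "divsf3"
  [(set (match_operand:SF 0 "fr_register_operand" "")
        (div:SF (match_operand:SF 1 "fr_register_operand" "")
                (match_operand:SF 2 "fr_register_operand" "")))]
  ""
{
  /* One fresh pseudo per temporary, each used in a single mode.  */
  rtx tmp3 = gen_reg_rtx (XFmode);
  rtx tmp4 = gen_reg_rtx (XFmode);
  rtx tmp5 = gen_reg_rtx (XFmode);
  rtx tmp6 = gen_reg_rtx (SFmode);
  rtx tmp7 = gen_reg_rtx (XFmode);
  rtx pred = gen_reg_rtx (BImode);

  emit_insn (gen_divsf3_internal_thr (operands[0], operands[1], operands[2],
                                      tmp3, tmp4, tmp5, tmp6, tmp7, pred));
  DONE;
})

;; Assumed replacement for the match_scratch clobbers in
;; divsf3_internal_thr (same operand numbers, same modes):
   (clobber (match_operand:XF 3 "fr_register_operand" "=&f"))
   (clobber (match_operand:XF 4 "fr_register_operand" "=&f"))
   (clobber (match_operand:XF 5 "fr_register_operand" "=&f"))
   (clobber (match_operand:SF 6 "fr_register_operand" "=&f"))
   (clobber (match_operand:XF 7 "fr_register_operand" "=f"))
   (clobber (match_operand:BI 8 "register_operand" "=c"))

With real pseudos in place of (scratch:BI), the first insn produced by
the split carries a register in the predicate slot, which is what a
pattern such as *recip_approx_sf expects for its operand 1.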


* Re: IA64 floating point division question
  2004-09-01 12:28 ` Joern Rennecke
@ 2004-09-01 17:51   ` Steve Ellcey
  0 siblings, 0 replies; 6+ messages in thread
From: Steve Ellcey @ 2004-09-01 17:51 UTC
  To: joern.rennecke; +Cc: gcc

> >             (set (scratch:BI)
> >                 (unspec:BI [
> >                         (reg:SF 349 [ a ])
> >                         (reg:SF 350 [ b ])
> >                     ] 14))
> 
> 'scratch' will probably fail to match your operand predicate.
> match_scratch is no good if you want your pattern to be split before
> reload.  Instead, you have to use match_operand, and make your expander
> allocate a pseudo register and stick it into the scratch operand.

Ah, that's the piece of the puzzle I was missing.

Thank you,

Steve Ellcey
sje@cup.hp.com


[parent not found: <200408312358.QAA26610@hpsje.cup.hp.com>]

* Re: IA64 floating point division question
       [not found] <200408312358.QAA26610@hpsje.cup.hp.com>
@ 2004-09-01  0:10 ` Zack Weinberg
  0 siblings, 0 replies; 6+ messages in thread
From: Zack Weinberg @ 2004-09-01  0:10 UTC
  To: Steve Ellcey; +Cc: gcc

Steve Ellcey <sje@cup.hp.com> writes:

>> I'm fairly sure that your problem is with the "*recip_approx"
>> instruction, which is what is supposed to match this pattern.  The
>> catch is, it *only* allows XFmode operands, whereas you're trying to
>> feed it SFmode.
>
> No, I created some new versions of *recip_approx that "generate" SFmode
> or DFmode (as well as some new _alts instructions) so that I can refer
> to each register in only one mode. My new *recip_approx_sf instruction
> is:
>
> (define_insn "*recip_approx_sf"
>   [(set (match_operand:SF 0 "fr_register_operand" "=f")
>         (div:SF (const_int 1)
>                 (match_operand:SF 3 "fr_register_operand" "f")))
>    (set (match_operand:BI 1 "register_operand" "=c")
>         (unspec:BI [(match_operand:SF 2 "fr_register_operand" "f")
>                     (match_dup 3)] UNSPEC_FR_RECIP_APPROX))
>    (use (match_operand:SI 4 "const_int_operand" ""))]
>   ""
>   "frcpa.s%4 %0, %1 = %2, %3"
>   [(set_attr "itanium_class" "fmisc")
>    (set_attr "predicable" "no")])

Odd; that certainly ought to match the pattern you showed.  I have no
idea why it wouldn't.

zw

