muldi3, divdi3 and remdi3 patterns

public inbox for gcc@gcc.gnu.org

* muldi3, divdi3 and remdi3 patterns
@ 1999-08-07 13:56 Mark Klein
  1999-08-08 22:07 ` Jeffrey A Law
  1999-08-31 23:20 ` Mark Klein
  0 siblings, 2 replies; 18+ messages in thread
From: Mark Klein @ 1999-08-07 13:56 UTC (permalink / raw)
  To: gcc

I've got a puzzle that I've been poking at off and on for the past 
number of weeks, and I guess it's time to ask for help. 

In adding muldi3, divdi3 and remdi3 instruction patterns, I find 
that the compiler is choosing to do remsi3 as a widened X-Y*(X/Y)
instead of simply using the remsi3 pattern. It starts down this
path in expand_divmod when expand_mult_highpart recognizes it can
use the muldi3 pattern. 

Any ideas?

TIA,


M.
--- snip ---

My source:

#define HASH_TABLE_SIZE 31

extern int string_hash_fun(void);

int
internString()
{
  return (string_hash_fun() % HASH_TABLE_SIZE);   
}

--- the rtl ---
;; Function internString

(note 2 0 3 "" NOTE_INSN_DELETED)

(note 3 2 4 "" NOTE_INSN_FUNCTION_BEG)

(note 4 3 6 "" NOTE_INSN_DELETED)

(note 6 4 8 "" NOTE_INSN_DELETED)

(call_insn 8 6 10 (parallel[ 
            (set (reg:SI 28 %r28)
                (call (mem:SI (symbol_ref/v:SI ("@string_hash_fun")) 0)
                    (const_int 16 [0x10])))
            (clobber (reg:SI 2 %r2))
            (use (const_int 0 [0x0]))
        ] ) -1 (nil)
    (nil)
    (nil))

(insn 10 8 14 (set (reg:SI 95)
        (reg:SI 28 %r28)) -1 (nil)
    (nil))

(insn 14 10 17 (set (reg:SI 99)
        (ashiftrt:SI (reg:SI 95)
            (const_int 31 [0x1f]))) -1 (nil)
    (nil))

(insn 17 14 12 (clobber (reg:DI 98)) -1 (nil)
    (insn_list:REG_LIBCALL 19 (nil)))

(insn 12 17 16 (set (subreg:SI (reg:DI 98) 1)
        (reg:SI 95)) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:SI 95)
        (nil)))

(insn 16 12 19 (set (subreg:SI (reg:DI 98) 0)
        (reg:SI 99)) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:SI 95)
        (nil)))

(insn 19 16 20 (set (reg:DI 98)
        (reg:DI 98)) -1 (nil)
    (insn_list:REG_RETVAL 17 (expr_list:REG_EQUAL (sign_extend:DI (reg:SI 95))
            (nil))))

(insn 20 19 21 (set (reg:DI 101)
        (high:DI (const_int -2078209981 [0x84210843]))) -1 (nil)
    (nil))

(insn 21 20 22 (set (reg:DI 100)
        (lo_sum:DI (reg:DI 101)
            (const_int -2078209981 [0x84210843]))) -1 (nil)
    (nil))

(insn 22 21 23 (set (reg:DI 25 %r25)
        (reg:DI 98)) -1 (nil)
    (nil))

(insn 23 22 24 (set (reg:DI 23 %r23)
        (reg:DI 100)) -1 (nil)
    (nil))

(insn 24 23 25 (parallel[ 
            (set (reg:DI 28 %r28)
                (mult:DI (reg:DI 25 %r25)
                    (reg:DI 23 %r23)))
            (clobber (reg:SI 102))
            (clobber (reg:DI 25 %r25))
            (clobber (reg:DI 23 %r23))
            (clobber (reg:SI 31 %r31))
        ] ) -1 (nil)
    (nil))

(insn 25 24 26 (set (reg:DI 97)
        (reg:DI 28 %r28)) -1 (nil)
    (expr_list:REG_EQUAL (mult:DI (reg:DI 98)
            (reg:DI 100))
        (nil)))

(insn 26 25 28 (set (reg:SI 103)
        (plus:SI (reg:SI 95)
            (subreg:SI (reg:DI 97) 0))) -1 (nil)
    (nil))

(insn 28 26 30 (set (reg:SI 104)
        (ashiftrt:SI (reg:SI 103)
            (const_int 4 [0x4]))) -1 (nil)
    (nil))

(insn 30 28 31 (set (reg:SI 105)
        (ashiftrt:SI (reg:SI 95)
            (const_int 31 [0x1f]))) -1 (nil)
    (nil))

(insn 31 30 33 (set (reg:SI 94)
        (minus:SI (reg:SI 104)
            (reg:SI 105))) -1 (nil)
    (expr_list:REG_EQUAL (div:SI (reg:SI 95)
            (const_int 31 [0x1f]))
        (nil)))

(insn 33 31 35 (set (reg:SI 106)
        (reg:SI 94)) -1 (nil)
    (nil))

(insn 35 33 36 (set (reg:SI 107)
        (ashift:SI (reg:SI 106)
            (const_int 5 [0x5]))) -1 (nil)
    (expr_list:REG_EQUAL (mult:SI (reg:SI 94)
            (const_int 32 [0x20]))
        (nil)))

(insn 36 35 37 (set (reg:SI 107)
        (minus:SI (reg:SI 107)
            (reg:SI 94))) -1 (nil)
    (expr_list:REG_EQUAL (mult:SI (reg:SI 94)
            (const_int 31 [0x1f]))
        (nil)))

(insn 37 36 39 (set (reg:SI 94)
        (minus:SI (reg:SI 95)
            (reg:SI 107))) -1 (nil)
    (nil))

(insn 39 37 40 (set (reg/i:SI 28 %r28)
        (reg:SI 94)) -1 (nil)
    (nil))

(insn 40 39 41 (use (reg/i:SI 28 %r28)) -1 (nil)
    (nil))

(jump_insn 41 40 42 (set (pc)
        (label_ref 46)) -1 (nil)
    (nil))

(barrier 42 41 44)

(note 44 42 46 "" NOTE_INSN_FUNCTION_END)

(code_label 46 44 0 2 "" [num uses: 0])

--- snip ---


--
Mark Klein                                 DIS International, Ltd.
http://www.dis.com                         415-892-8400
PGP Public Key Available			
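
For readers following the RTL dump, the sequence can be sketched in C.
This is an illustrative reconstruction, not code from the thread. The
function names div31_by_mult, rem31_by_mult and check_rem31 are made up.
The magic constant 0x84210843 and the shift count of 4 are read off
insns 20 through 28. The sketch assumes that >> on a negative signed
value is an arithmetic shift, which holds for GCC but is
implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: x / 31 computed the way the RTL does it.
   Assumes >> on negative signed values is an arithmetic shift. */
static int32_t div31_by_mult(int32_t x)
{
    /* insns 20-25: widen x to 64 bits and multiply by the magic
       constant 0x84210843, taken as the signed value -2078209981. */
    int64_t product = (int64_t)x * INT64_C(-2078209981);

    /* insn 26: add x to the high 32 bits of the product. */
    int32_t high = (int32_t)(product >> 32);
    int32_t sum = x + high;

    /* insns 28-31: shift right by 4, then subtract the sign of x
       (0 or -1) so the quotient rounds toward zero. */
    return (sum >> 4) - (x >> 31);
}

/* Hypothetical helper: x % 31 as X - Y*(X/Y). */
static int32_t rem31_by_mult(int32_t x)
{
    int32_t q = div31_by_mult(x);

    /* insns 35-37 form 31*q as (q << 5) - q in SImode; the 64-bit
       multiply here is a substitute that avoids signed overflow. */
    return (int32_t)(x - (int64_t)q * 31);
}

/* Spot-check the sketch against the built-in % operator. */
static void check_rem31(void)
{
    const int32_t samples[] = { 0, 1, 30, 31, 32, -1, -31, -32,
                                INT32_MAX, INT32_MIN };
    unsigned i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(rem31_by_mult(samples[i]) == samples[i] % 31);
}
```

Calling check_rem31() exercises both helpers on boundary values. The
point of the sketch is only that the remainder costs one widening
multiply plus a handful of shifts and subtracts, which is what the
expander is weighing against a direct remainder.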


* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-07 13:56 muldi3, divdi3 and remdi3 patterns Mark Klein
@ 1999-08-08 22:07 ` Jeffrey A Law
  1999-08-09  7:52   ` Mark Klein
  1999-08-31 23:20   ` Jeffrey A Law
  1999-08-31 23:20 ` Mark Klein
  1 sibling, 2 replies; 18+ messages in thread
From: Jeffrey A Law @ 1999-08-08 22:07 UTC (permalink / raw)
  To: Mark Klein; +Cc: gcc

  In message <4.1.19990807132315.00ccc670@garfield.dis.com>, you write:
  > I've got a puzzle that I've been poking at off and on for the past 
  > number of weeks, and I guess it's time to ask for help. 
  > 
  > In adding muldi3, divdi3 and remdi3 instruction patterns, I find 
  > that the compiler is choosing to do remsi3 as a widened X-Y*(X/Y)
  > instead of simply using the remsi3 pattern. It starts down this
  > path in expand_divmod when expand_mult_highpart recognizes it can
  > use the muldi3 pattern. 
Yes.  This is the division by using widening multiplications.  This is
precisely what I would expect the compiler to do.
jeff


* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-08 22:07 ` Jeffrey A Law
@ 1999-08-09  7:52   ` Mark Klein
  1999-08-10  0:32     ` Jeffrey A Law
  1999-08-31 23:20     ` Mark Klein
  1999-08-31 23:20   ` Jeffrey A Law
  1 sibling, 2 replies; 18+ messages in thread
From: Mark Klein @ 1999-08-09  7:52 UTC (permalink / raw)
  To: law; +Cc: gcc

At 11:04 PM 8/8/99 -0600, Jeffrey A Law wrote:

>  > In adding muldi3, divdi3 and remdi3 instruction patterns, I find 
>  > that the compiler is choosing to do remsi3 as a widened X-Y*(X/Y)
>  > instead of simply using the remsi3 pattern. It starts down this
>  > path in expand_divmod when expand_mult_highpart recognizes it can
>  > use the muldi3 pattern. 

>Yes.  This is the division by using widening multiplications.  This is
>precisely what I would expect the compiler to do.

So even though $$remoI is faster than the combination of $$mulo2I and
the shifting, on PA we'll get the less efficient implementation? I
don't quite understand how the current costs have been allocated .. is
there a way to influence this such that I'll use the modulo millicode
instead of the multiply/divide?

Regards,


M.
--
Mark Klein                                 DIS International, Ltd.
http://www.dis.com                         415-892-8400
PGP Public Key Available			


* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-09  7:52   ` Mark Klein
@ 1999-08-10  0:32     ` Jeffrey A Law
  1999-08-10  8:58       ` Mark Klein
  1999-08-31 23:20       ` Jeffrey A Law
  1999-08-31 23:20     ` Mark Klein
  1 sibling, 2 replies; 18+ messages in thread
From: Jeffrey A Law @ 1999-08-10  0:32 UTC (permalink / raw)
  To: Mark Klein; +Cc: gcc

  In message <4.1.19990809074734.00cdecc0@garfield.dis.com>, you write:
  > So even though $$remoI is faster than the combination of $$mulo2I and
Why would you even consider calling mul2I?  Use 3 xmpyu instructions to do a
standard cross-product multiply.  Calling the mul2I routines is totally
braindead.


  > I don't quite understand how the current costs have been allocated ..
See RTX_COST, CONST_COSTS and friends.


* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-10  0:32     ` Jeffrey A Law
@ 1999-08-10  8:58       ` Mark Klein
  1999-08-10 22:09         ` Jeffrey A Law
  1999-08-31 23:20         ` Mark Klein
  1999-08-31 23:20       ` Jeffrey A Law
  1 sibling, 2 replies; 18+ messages in thread
From: Mark Klein @ 1999-08-10  8:58 UTC (permalink / raw)
  To: law; +Cc: gcc

At 01:29 AM 8/10/99 -0600, Jeffrey A Law wrote:

>
>  In message < 4.1.19990809074734.00cdecc0@garfield.dis.com >you write:
>  > So even though $$remoI is faster than the combination of $$mulo2I and
>Why would you even consider calling mul2I?  Use 3 xmpyu instructions to do a
>standard cross-product multiply.  Calling the mul2I routines is totally
>braindead.

PA 1.0 is the primary reason - many of the HP3000's are still PA 1.0
boxes without xmpyu. Regardless, it's the $$rem.. millicode that I'm
trying to add.


--
Mark Klein                                    DIS International, Ltd.
http://www.dis.com                            415-892-8400
PGP Public Key Available
--


* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-10  8:58       ` Mark Klein
@ 1999-08-10 22:09         ` Jeffrey A Law
  1999-08-11  7:55           ` Mark Klein
  1999-08-31 23:20           ` Jeffrey A Law
  1999-08-31 23:20         ` Mark Klein
  1 sibling, 2 replies; 18+ messages in thread
From: Jeffrey A Law @ 1999-08-10 22:09 UTC (permalink / raw)
  To: Mark Klein; +Cc: gcc

  In message <4.1.19990810085233.00c17300@garfield.dis.com>, you write:
  > PA 1.0 is the primary reason - many of the HP3000's are still PA 1.0
  > boxes without xmpyu. Regardless, it's the $$rem.. millicode that I'm
  > trying to add.
Then you would want to tune the costs of the multiply, divide & remainders.

What's out there is already aware of PA1.0 vs PA1.1 processors, but may not
be aware of the relative cost of operations for DImode operations.

jeff
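
To make the cost-tuning advice concrete, here is a toy model of the
kind of comparison involved. It is a sketch under stated assumptions,
not GCC's code: the type and function names (op_costs, rem_strategy,
choose_rem_strategy, demo_cost_choice) are hypothetical, and apart from
the 60 that appears later in this thread the numbers are invented for
illustration.

```c
#include <assert.h>

/* Hypothetical, simplified model; none of these names come from GCC. */
enum rem_strategy { USE_REM_PATTERN, USE_MULT_SEQUENCE };

struct op_costs {
    int rem;         /* direct remainder, e.g. a millicode call */
    int widen_mult;  /* the widening (DImode) multiply */
    int fixups;      /* surrounding shifts, adds and subtracts */
};

/* Pick the multiply-based sequence only when its total estimated
   cost is strictly lower than the direct remainder. */
static enum rem_strategy choose_rem_strategy(const struct op_costs *c)
{
    return (c->widen_mult + c->fixups < c->rem) ? USE_MULT_SEQUENCE
                                                 : USE_REM_PATTERN;
}

/* Invented numbers: a cheap multiply wins, an expensive one loses. */
static void demo_cost_choice(void)
{
    struct op_costs cheap_mult = { 60, 20, 6 };
    struct op_costs slow_mult  = { 60, 80, 6 };

    assert(choose_rem_strategy(&cheap_mult) == USE_MULT_SEQUENCE);
    assert(choose_rem_strategy(&slow_mult) == USE_REM_PATTERN);
}
```

In this model, raising the estimated multiply cost for a target where
the multiply is itself a millicode call, or lowering the remainder
cost, is what flips the choice back to the remainder pattern.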


* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-10 22:09         ` Jeffrey A Law
@ 1999-08-11  7:55           ` Mark Klein
  1999-08-12  1:58             ` Jeffrey A Law
  1999-08-31 23:20             ` Mark Klein
  1999-08-31 23:20           ` Jeffrey A Law
  1 sibling, 2 replies; 18+ messages in thread
From: Mark Klein @ 1999-08-11  7:55 UTC (permalink / raw)
  To: law; +Cc: gcc

At 11:05 PM 8/10/99 -0600, Jeffrey A Law wrote:

>Then you would want to tune the costs of the multiply, divide & remainders.

Let me rephrase my earlier question .. 

case UDIV:
case MOD:
case UMOD:
	return COSTS_N_INSNS (60);

Did you base the costs on anything concrete such as number of cycles 
or instructions per operation, or are these arbitrary values that 
simply indicate the relative costs? What does the "60" represent?

TIA,


--
Mark Klein                                 DIS International, Ltd.
http://www.dis.com                         415-892-8400
PGP Public Key Available			


* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-11  7:55           ` Mark Klein
@ 1999-08-12  1:58             ` Jeffrey A Law
  1999-08-12 10:34               ` Joern Rennecke
  1999-08-31 23:20               ` Jeffrey A Law
  1999-08-31 23:20             ` Mark Klein
  1 sibling, 2 replies; 18+ messages in thread
From: Jeffrey A Law @ 1999-08-12  1:58 UTC (permalink / raw)
  To: Mark Klein; +Cc: gcc

  In message <4.1.19990811073640.00ce36f0@garfield.dis.com>, you write:
  > case UDIV:
  > case MOD:
  > case UMOD:
  > 	return COSTS_N_INSNS (60);
  > 
  > Did you base the costs on anything concrete such as number of cycles 
  > or instructions per operation, or are these arbitrary values that 
  > simply indicate the relative costs? What does the "60" represent?
I don't remember.  It's been many many years since I even looked at those
cost macros.

Basically they're supposed to provide rough estimates of the number of
instructions/cycles necessary to implement each operation (on average).
Whether or not the PA mul/div stuff follows that convention I'm not sure
(again, it's been a very very long time).

jeff



* Re: muldi3, divdi3 and remdi3 patterns
  1999-08-12  1:58             ` Jeffrey A Law
@ 1999-08-12 10:34               ` Joern Rennecke
  1999-08-31 23:20                 ` Joern Rennecke
  1999-08-31 23:20               ` Jeffrey A Law
  1 sibling, 1 reply; 18+ messages in thread
From: Joern Rennecke @ 1999-08-12 10:34 UTC (permalink / raw)
  To: law; +Cc: mklein, gcc

There is also the issue that the distributive law doesn't apply to
COSTS_N_INSNS and +, i.e. COSTS_N_INSNS (n) + COSTS_N_INSNS (m) <
COSTS_N_INSNS (n+m).
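
A small illustration of that inequality. The macro definition below is
an assumption: it is the form recalled from GCC's rtl.h of this period
and is not quoted anywhere in the thread, so check your own tree before
relying on it. The helper names split_cost, combined_cost and
demo_cost_gap are made up.

```c
#include <assert.h>

/* Assumed definition, as recalled from GCC sources of this era. */
#define COSTS_N_INSNS(N) ((N) * 4 - 2)

/* Cost of two operations estimated separately and then summed. */
static int split_cost(int n, int m)
{
    return COSTS_N_INSNS(n) + COSTS_N_INSNS(m);
}

/* Cost of one operation estimated at n + m instructions. */
static int combined_cost(int n, int m)
{
    return COSTS_N_INSNS(n + m);
}

/* Under the assumed definition the gap is the constant 2. */
static void demo_cost_gap(void)
{
    assert(split_cost(2, 3) < combined_cost(2, 3));
    assert(combined_cost(2, 3) - split_cost(2, 3) == 2);
}
```

With this definition, a sequence whose cost is built up by summing
several COSTS_N_INSNS terms comes out slightly cheaper than a single
operation rated at the same total instruction count, which can matter
when the two are compared directly.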

