public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* RE: Mined out of comp.std.c...
       [not found] <616BE6A276E3714788D2AC35C40CD18D5DD739@whale.softwire.co.uk>
@ 2002-04-25  3:16 ` Rupert Wood
  2002-04-25 11:04   ` Peter Barada
  0 siblings, 1 reply; 11+ messages in thread
From: Rupert Wood @ 2002-04-25  3:16 UTC (permalink / raw)
  To: nathan, 'Richard Henderson', 'Tim Hollebeek'
  Cc: 'Zack Weinberg', gcc, 'Jan Hubicka'

> OK, last thought on the matter before I go slit my wrists in 
> shame.

Except that didn't work. I think I'll give up and go home - I can't seem
to manage to redeem myself.

Problem is I put 'setnc' as a quick hack and didn't notice it broke my
non-overflow testcase. What I really needed was 'set carry xor overflow'
but there's no primitive for that and I can't construct something out of
set[n]o and sbb or adc.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Mined out of comp.std.c...
  2002-04-25  3:16 ` Mined out of comp.std.c Rupert Wood
@ 2002-04-25 11:04   ` Peter Barada
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Barada @ 2002-04-25 11:04 UTC (permalink / raw)
  To: me; +Cc: nathan, rth, tim, zack, gcc, jh


>> OK, last thought on the matter before I go slit my wrists in 
>> shame.
>
>Except that didn't work. I think I'll give up and go home - I can't seem
>to manage to redeem myself.
>
>Problem is I put 'setnc' as a quick hack and didn't notice it broke my
>non-overflow testcase. What I really needed was 'set carry xor overflow'
>but there's no primitive for that and I can't construct something out of
>set[n]o and sbb or adc.

Go grab superopt-2.5.tar.gz from the gnu tree, config it for intel
and run it to search for a sequence that evaluates 'maxs':

[pbarada: ~/src/superopt-2.5] > ./superopt -fmaxs -max-cost 5
Searching for { r = (signed_word) v0 > (signed_word) v1 ? v0 : v1; }
Superoptimizing at cost 1 2 3 4 5 failure.


Which indicates that the superoptimizer can not find a 5 instruction
sequence to do this.  Obviously if you crank up max-costs a bit and
let it run over the weekend it *might* find a sequence worth
investigating.  If your max function is *unsigned*, superopt finds a
batch of matches quite quickly:

[pbarada: ~/src/superopt-2.5] > ./superopt -fmaxu -max-cost 5
Searching for { r = (unsigned_word) v0 > (unsigned_word) v1 ? v0 : v1; }
Superoptimizing at cost 1 2 3 4 5
1:	r1:=sub_co(r1,r0)
	r0:=add_co(r0,r1)
	r2:=sub_cio(r2,r2)
	r2:=and_rc(r2,r1)
	r0:=sub_co(r0,r2)
2:	r1:=sub_co(r1,r0)
	r0:=add_co(r0,r1)
	r2:=sub_cio(r2,r2)
	r1:=and_rc(r1,r2)
	r0:=sub_co(r0,r1)
...


Have Fun!

-- 
Peter Barada                                   Peter.Barada@motorola.com
Wizard                                         781-852-2768 (direct)
WaveMark Solutions(wholly owned by Motorola)   781-270-0193 (fax)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: Mined out of comp.std.c...
       [not found] <616BE6A276E3714788D2AC35C40CD18D5DD723@whale.softwire.co.uk>
@ 2002-04-25  2:56 ` Rupert Wood
  0 siblings, 0 replies; 11+ messages in thread
From: Rupert Wood @ 2002-04-25  2:56 UTC (permalink / raw)
  To: nathan, 'Richard Henderson', 'Tim Hollebeek'
  Cc: 'Zack Weinberg', gcc, 'Jan Hubicka'

OK, last thought on the matter before I go slit my wrists in shame. How
about the following: Alan's fixed version in six x86 instructions
hopefully overflow proof.

    xorl    %edx, %edx
    subl    %ecx, %eax
    setnc   %dl
    dec     %edx
    andl    %edx, %eax
    addl    %ecx, %eax

This is 12 bytes. I've no idea how quick it'd be.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: Mined out of comp.std.c...
       [not found] <616BE6A276E3714788D2AC35C40CD18D5DD71B@whale.softwire.co.uk>
@ 2002-04-25  2:12 ` Rupert Wood
  0 siblings, 0 replies; 11+ messages in thread
From: Rupert Wood @ 2002-04-25  2:12 UTC (permalink / raw)
  To: nathan, 'Richard Henderson'
  Cc: 'Zack Weinberg', gcc, 'Jan Hubicka'

I wrote:

> Also, consider a == b. Then d = 0 which we 'and' by. It'll only ever
> return 0.
> 
> As far as I can see this routine is only ever likely to work for d
> negative, i.e. we know that a > b beforehand.

... which is exactly what Alan's correction was about. Sorry. I'll shut
up until I wake up.

(Tim - yes, didn't consider overflow. Been too long since I've done
anything numerical.)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Mined out of comp.std.c...
  2002-04-25  1:46 ` Rupert Wood
@ 2002-04-25  2:06   ` Tim Hollebeek
  0 siblings, 0 replies; 11+ messages in thread
From: Tim Hollebeek @ 2002-04-25  2:06 UTC (permalink / raw)
  To: Rupert Wood; +Cc: 'Zack Weinberg', 'Jan Hubicka', gcc

On Thu, Apr 25, 2002 at 09:38:58AM +0100, Rupert Wood wrote:
> Zack Weinberg wrote:
> 
> > static int int_max (int a, int b) {
> >     int d = (b - a);
> >     return (b - d) & (d >> (CHAR_BIT*sizeof(int) - 1));
> > }
> 
> Maybe I need more coffee, but isn't (b-d) == a?

Consider overflow.

-Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: Mined out of comp.std.c...
       [not found] <616BE6A276E3714788D2AC35C40CD18D5DD717@whale.softwire.co.uk>
@ 2002-04-25  1:59 ` Rupert Wood
  0 siblings, 0 replies; 11+ messages in thread
From: Rupert Wood @ 2002-04-25  1:59 UTC (permalink / raw)
  To: nathan, 'Richard Henderson'
  Cc: 'Zack Weinberg', gcc, 'Jan Hubicka'

Nathan Sidwell wrote:

> this optimization is relying on sign (b - a) indicating the
> truthfulness of a > b. That is not correct, as b - a might overflow.

Also, consider a == b. Then d = 0 which we 'and' by. It'll only ever
return 0.

As far as I can see this routine is only ever likely to work for d
negative, i.e. we know that a > b beforehand.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Mined out of comp.std.c...
  2002-04-25  1:12 ` Richard Henderson
@ 2002-04-25  1:53   ` Nathan Sidwell
  0 siblings, 0 replies; 11+ messages in thread
From: Nathan Sidwell @ 2002-04-25  1:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Zack Weinberg, gcc, Jan Hubicka

Richard Henderson wrote:
> 
> On Wed, Apr 24, 2002 at 11:51:17PM -0700, Zack Weinberg wrote:
> > More interesting to me is the observation that - since we know the
> > arithmetic properties of the target - we could transform
> > ((a > b) ? a : b) into the above, if it were a win.  What I'm
> > wondering is if it is.

this optimization is relying on sign (b - a) indicating the truthfulness
of a > b. That is not correct, as b - a might overflow. For instance
a = 7fffffff (most pos), b=80000000 (most neg).
a is most definitely > b, but b - a will be 1 (or some overflow trap)

nathan

-- 
Dr Nathan Sidwell :: Computer Science Department :: Bristol University
           The voices in my head told me to say this
nathan@acm.org  http://www.cs.bris.ac.uk/~nathan/  nathan@cs.bris.ac.uk

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: Mined out of comp.std.c...
       [not found] <616BE6A276E3714788D2AC35C40CD18D5DD6EF@whale.softwire.co.uk>
@ 2002-04-25  1:46 ` Rupert Wood
  2002-04-25  2:06   ` Tim Hollebeek
  0 siblings, 1 reply; 11+ messages in thread
From: Rupert Wood @ 2002-04-25  1:46 UTC (permalink / raw)
  To: 'Zack Weinberg'; +Cc: 'Jan Hubicka', gcc

Zack Weinberg wrote:

> static int int_max (int a, int b) {
>     int d = (b - a);
>     return (b - d) & (d >> (CHAR_BIT*sizeof(int) - 1));
> }

Maybe I need more coffee, but isn't (b-d) == a?

>    8:   89 c2                   mov    %eax,%edx
>    a:   29 ca                   sub    %ecx,%edx
>    c:   29 d0                   sub    %edx,%eax
>    e:   c1 fa 1f                sar    $0x1f,%edx
>   11:   21 d0                   and    %edx,%eax

so rather

        subl    %eax, %ecx
        sarl    $31, %ecx
        andl    %ecx, %eax

Rup.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Mined out of comp.std.c...
  2002-04-25  0:08 Zack Weinberg
  2002-04-25  0:41 ` Alan Modra
@ 2002-04-25  1:12 ` Richard Henderson
  2002-04-25  1:53   ` Nathan Sidwell
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2002-04-25  1:12 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc, Jan Hubicka

On Wed, Apr 24, 2002 at 11:51:17PM -0700, Zack Weinberg wrote:
> More interesting to me is the observation that - since we know the
> arithmetic properties of the target - we could transform 
> ((a > b) ? a : b) into the above, if it were a win.  What I'm
> wondering is if it is.

It is almost certainly on all 3-address isa's that don't implement
conditional move (sparc, mips3, ppc).  Because it's 4 insns for
the sub+sub+shift+and, and 3 for cmp+jmp+move.


r~

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Mined out of comp.std.c...
  2002-04-25  0:08 Zack Weinberg
@ 2002-04-25  0:41 ` Alan Modra
  2002-04-25  1:12 ` Richard Henderson
  1 sibling, 0 replies; 11+ messages in thread
From: Alan Modra @ 2002-04-25  0:41 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc, Jan Hubicka

On Wed, Apr 24, 2002 at 11:51:17PM -0700, Zack Weinberg wrote:
> ... an interesting microoptimization.
> 
> The original post on comp.std.c is about whether
> 
> static int int_max (int a, int b) {
>     int d = (b - a);
>     return (b - d) & (d >> (CHAR_BIT*sizeof(int) - 1));
> }

This might work better :)

static int int_max (int a, int b) {
    int d = (b - a);
    return b - (d & (d >> (CHAR_BIT*sizeof(int) - 1)));
}

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Mined out of comp.std.c...
@ 2002-04-25  0:08 Zack Weinberg
  2002-04-25  0:41 ` Alan Modra
  2002-04-25  1:12 ` Richard Henderson
  0 siblings, 2 replies; 11+ messages in thread
From: Zack Weinberg @ 2002-04-25  0:08 UTC (permalink / raw)
  To: gcc; +Cc: Jan Hubicka

... an interesting microoptimization.

The original post on comp.std.c is about whether

static int int_max (int a, int b) {
    int d = (b - a);
    return (b - d) & (d >> (CHAR_BIT*sizeof(int) - 1));
}

is or is not portable within the standard.  If you have twos-
complement arithmetic and >> does arithmetic right shift, it's
equivalent to the usual ((a > b) ? a : b).

More interesting to me is the observation that - since we know the
arithmetic properties of the target - we could transform 
((a > b) ? a : b) into the above, if it were a win.  What I'm
wondering is if it is.  On an i386, it's definitely not a space win:

   8:   89 c2                   mov    %eax,%edx
   a:   29 ca                   sub    %ecx,%edx
   c:   29 d0                   sub    %edx,%eax
   e:   c1 fa 1f                sar    $0x1f,%edx
  11:   21 d0                   and    %edx,%eax

vs.

  28:   39 d0                   cmp    %ecx,%eax
  2a:   7d 02                   jge    2e <int_max_2+0xe>
  2c:   89 d0                   mov    %ecx,%eax

or

  28:   39 c2                   cmp    %eax,%ecx
  2a:   0f 4d c2                cmovge %ecx,%eax

(input parameters in eax, ecx in all three cases).  It's unlikely that
the clever bit shifts are faster than the conditional move, especially
in real life where we would be able to schedule other ops between the
cmp and the cmovge.  However, shifts might well be faster than jumps.
Anyone know for sure?

zw

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2002-04-25 17:59 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <616BE6A276E3714788D2AC35C40CD18D5DD739@whale.softwire.co.uk>
2002-04-25  3:16 ` Mined out of comp.std.c Rupert Wood
2002-04-25 11:04   ` Peter Barada
     [not found] <616BE6A276E3714788D2AC35C40CD18D5DD723@whale.softwire.co.uk>
2002-04-25  2:56 ` Rupert Wood
     [not found] <616BE6A276E3714788D2AC35C40CD18D5DD71B@whale.softwire.co.uk>
2002-04-25  2:12 ` Rupert Wood
     [not found] <616BE6A276E3714788D2AC35C40CD18D5DD717@whale.softwire.co.uk>
2002-04-25  1:59 ` Rupert Wood
     [not found] <616BE6A276E3714788D2AC35C40CD18D5DD6EF@whale.softwire.co.uk>
2002-04-25  1:46 ` Rupert Wood
2002-04-25  2:06   ` Tim Hollebeek
2002-04-25  0:08 Zack Weinberg
2002-04-25  0:41 ` Alan Modra
2002-04-25  1:12 ` Richard Henderson
2002-04-25  1:53   ` Nathan Sidwell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).