* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
@ 2004-03-09 17:35 ` pinskia at gcc dot gnu dot org
2004-03-09 18:01 ` [Bug optimization/14504] New: " Falk Hueffner
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-09 17:35 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-09 17:35 -------
Confirmed.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Keywords| |pessimizes-code
Last reconfirmed|0000-00-00 00:00:00 |2004-03-09 17:35:06
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Bug optimization/14504] New: Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
2004-03-09 17:35 ` [Bug optimization/14504] " pinskia at gcc dot gnu dot org
@ 2004-03-09 18:01 ` Falk Hueffner
2004-03-11 0:09 ` [Bug optimization/14504] " kazu at cs dot umass dot edu
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Falk Hueffner @ 2004-03-09 18:01 UTC (permalink / raw)
To: gcc-bugs
"stl at caltech dot edu" <gcc-bugzilla@gcc.gnu.org> writes:
> unsigned long cond_mask(bool flag, unsigned long mask, unsigned long target) {
> return flag ? target | mask : target & ~mask;
> }
>
> unsigned long cond_mask(bool flag, unsigned long mask, unsigned long target) {
> return (mask | target ^ 0xFFFFFFFFUL + flag) ^ 0xFFFFFFFFUL + flag;
> }
This is not universally a win. For example, on an Alpha EV5, the first:
andnot a2,a1,t1
or a2,a1,v0
cmoveq a0,t1,v0
takes 2 cycles, and the second:
lda t1,-1(a0)
xor a2,t1,v0
or a1,v0,t0
xor t0,t1,v0
takes 3 cycles and 1 more insn. I suspect on i386 it'd be similar if
you enable conditional moves.
--
Falk
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
2004-03-09 17:35 ` [Bug optimization/14504] " pinskia at gcc dot gnu dot org
2004-03-09 18:01 ` [Bug optimization/14504] New: " Falk Hueffner
@ 2004-03-11 0:09 ` kazu at cs dot umass dot edu
2004-03-11 21:33 ` stl at caltech dot edu
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-03-11 0:09 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2004-03-11 00:09 -------
I've got a patch.
/* There are four cases to handle modulo whether FLAG appears
positively and ordering of operands TARGET and MASK in the
expression.
With my patch, gcc -O2 -fomit-frame-pointer -mregparm=3 generates
the following assembly code. */
typedef _Bool bool;
unsigned long
cond_mask_0 (bool flag, unsigned long mask, unsigned long target)
{
return flag ? target | mask : target & ~mask;
#if 0
movzbl %al, %eax
decl %eax
xorl %eax, %ecx
orl %edx, %ecx
xorl %ecx, %eax
#endif
}
unsigned long
cond_mask_1 (bool flag, unsigned long mask, unsigned long target)
{
return flag ? target | ~mask : target & mask;
#if 0
movzbl %al, %eax
negl %eax
xorl %eax, %ecx
andl %edx, %ecx
xorl %ecx, %eax
#endif
}
unsigned long
cond_mask_2 (bool flag, unsigned long mask, unsigned long target)
{
return flag ? target & mask : target | ~mask;
#if 0
movzbl %al, %eax
decl %eax
xorl %eax, %ecx
andl %edx, %ecx
xorl %ecx, %eax
#endif
}
unsigned long
cond_mask_3 (bool flag, unsigned long mask, unsigned long target)
{
return flag ? target & ~mask : target | mask;
#if 0
movzbl %al, %eax
negl %eax
xorl %eax, %ecx
orl %edx, %ecx
xorl %ecx, %eax
#endif
}
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |kazu at cs dot umass dot edu
AssignedTo|unassigned at gcc dot gnu |kazu at cs dot umass dot edu
|dot org |
Status|NEW |ASSIGNED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
` (2 preceding siblings ...)
2004-03-11 0:09 ` [Bug optimization/14504] " kazu at cs dot umass dot edu
@ 2004-03-11 21:33 ` stl at caltech dot edu
2004-03-11 21:35 ` kazu at cs dot umass dot edu
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: stl at caltech dot edu @ 2004-03-11 21:33 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From stl at caltech dot edu 2004-03-11 21:33 -------
Neat. Is this patch for 3.3.x, 3.4.0, mainline, or some combination?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
` (3 preceding siblings ...)
2004-03-11 21:33 ` stl at caltech dot edu
@ 2004-03-11 21:35 ` kazu at cs dot umass dot edu
2004-03-11 21:41 ` stl at caltech dot edu
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-03-11 21:35 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2004-03-11 21:35 -------
Mainline as this is not a regression or bug fix.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
` (4 preceding siblings ...)
2004-03-11 21:35 ` kazu at cs dot umass dot edu
@ 2004-03-11 21:41 ` stl at caltech dot edu
2004-03-11 22:10 ` falk at debian dot org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: stl at caltech dot edu @ 2004-03-11 21:41 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From stl at caltech dot edu 2004-03-11 21:41 -------
Ok. (I thought that 3.3.4 was open for non-invasive enhancements; now I see
that it's open for non-invasive non-regression bugfixes.)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
` (5 preceding siblings ...)
2004-03-11 21:41 ` stl at caltech dot edu
@ 2004-03-11 22:10 ` falk at debian dot org
2004-03-11 22:21 ` stl at caltech dot edu
2004-03-11 22:29 ` kazu at cs dot umass dot edu
8 siblings, 0 replies; 10+ messages in thread
From: falk at debian dot org @ 2004-03-11 22:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From falk at debian dot org 2004-03-11 22:10 -------
My comment posted to gcc-bugs didn't make it into bugzilla, so I'll repeat it.
This is not universally a win. For example, on an Alpha EV5, the first:
andnot a2,a1,t1
or a2,a1,v0
cmoveq a0,t1,v0
takes 2 cycles, and the second:
lda t1,-1(a0)
xor a2,t1,v0
or a1,v0,t0
xor t0,t1,v0
takes 4 cycles and 1 more insn. I suspect on i386 it'd be similar if
you enable conditional moves.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
` (6 preceding siblings ...)
2004-03-11 22:10 ` falk at debian dot org
@ 2004-03-11 22:21 ` stl at caltech dot edu
2004-03-11 22:29 ` kazu at cs dot umass dot edu
8 siblings, 0 replies; 10+ messages in thread
From: stl at caltech dot edu @ 2004-03-11 22:21 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From stl at caltech dot edu 2004-03-11 22:21 -------
Well, the Right Thing to do is to perform the transformation when it's a win,
and also perform the /reverse transformation/ when it's a win, and not
transform anything otherwise.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug optimization/14504] Missed Bit Twiddling Optimization
2004-03-09 17:17 [Bug optimization/14504] New: Missed Bit Twiddling Optimization stl at caltech dot edu
` (7 preceding siblings ...)
2004-03-11 22:21 ` stl at caltech dot edu
@ 2004-03-11 22:29 ` kazu at cs dot umass dot edu
8 siblings, 0 replies; 10+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-03-11 22:29 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2004-03-11 22:29 -------
Err, i386 is a two-address machine with only a few registers,
so cmove doesn't work as well in this case.
typedef _Bool bool;
unsigned long
cond_mask (bool flag, unsigned long mask, unsigned long target)
{
unsigned long l = target | mask;
unsigned long r = target & ~mask;
return flag ? l : r;
#if 0
pushl %ebx
movl %eax, %ebx ; %bl = flag
movl %ecx, %eax
orl %edx, %eax ; %eax = target | mask
notl %edx ; ~mask
andl %ecx, %edx ; %edx = target & ~mask
testb %bl, %bl
popl %ebx
cmove %edx, %eax
#endif
}
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14504
^ permalink raw reply [flat|nested] 10+ messages in thread