* [Bug rtl-optimization/17935] [4.0 Regression] Two consecutive movzbl are generated
2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
@ 2004-10-11 17:24 ` pinskia at gcc dot gnu dot org
2004-10-12 19:20 ` [Bug rtl-optimization/17935] " pinskia at gcc dot gnu dot org
` (5 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-11 17:24 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-10-11 17:24 -------
Confirmed, this is a 4.0 regression again.
3.4.0 gives:
bar:
        movl    4(%esp), %eax
        xorl    %edx, %edx
        testb   $1, (%eax)
        jne     .L2
        movl    8(%esp), %ecx
        testb   $1, (%ecx)
        jne     .L2
        movb    $1, %dl
        .p2align 2,,3
.L2:
        movzbl  %dl, %eax
        ret
--
What                 |Removed                              |Added
----------------------------------------------------------------------------
Severity             |enhancement                          |minor
Status               |UNCONFIRMED                          |NEW
Ever Confirmed       |                                     |1
Known to fail        |                                     |4.0.0
Known to work        |                                     |3.4.0
Last reconfirmed date|0000-00-00 00:00:00                  |2004-10-11 17:24:44
Summary              |Two consecutive movzbl are generated |[4.0 Regression] Two consecutive movzbl are generated
Target Milestone     |---                                  |4.0.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935
* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
From: pinskia at gcc dot gnu dot org @ 2004-10-12 19:20 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-10-12 19:20 -------
Note that this is not really a regression. The main issue I see here is that ifcvt converts
if (a) 1 else 0; to a, and then we copy the return part (the movzbl and ret) onto the other
branch but don't combine them.
Maybe we should rerun combine after reload.
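
As a source-level illustration of the if-conversion described above, the two functions below compute the same value. This is only a sketch: the real transformation happens on RTL inside GCC, and the function names are made up for the example.

```c
#include <stdbool.h>

/* Branchy form, as written in the source: if (a) 1 else 0. */
int select_branchy(bool a)
{
    if (a)
        return 1;
    else
        return 0;
}

/* If-converted form: the condition itself is the result.
   Widening the bool to int is the zero-extension (a movzbl on x86). */
int select_ifcvt(bool a)
{
    return a;
}
```

Once the return sequence is duplicated onto both paths, each copy carries its own zero-extension, which is where the redundant movzbl comes from.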
--
What                 |Removed                                               |Added
----------------------------------------------------------------------------
Known to work        |3.4.0                                                 |
Summary              |[4.0 Regression] Two consecutive movzbl are generated |Two consecutive movzbl are generated
Target Milestone     |4.0.0                                                 |---
* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
From: kazu at cs dot umass dot edu @ 2004-12-12 17:57 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2004-12-12 17:57 -------
With today's mainline, I get
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L2
        movl    8(%esp), %eax
        testb   $1, (%eax)
        jne     .L2
        movl    $1, %eax
        movzbl  %al, %eax
        ret
        .p2align 2,,3
.L2:
        xorl    %eax, %eax
        movzbl  %al, %eax
        ret
We still have a similar problem: a movzbl that follows a movl (or xorl) of a
constant into the same register is redundant.
It turns out that the fix for PR 14843 can fix this at tree level.
Here is the resulting assembly with my patch for PR 14843.
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L2
        movl    8(%esp), %eax
        testb   $1, (%eax)
        jne     .L2
        movl    $1, %eax
        ret
        .p2align 2,,3
.L2:
        xorl    %eax, %eax
        ret
My patch fixes this because it removes the casts before expansion.
Without my patch:
bar (p, q)
{
  int iftmp.0;

<bb 0>:
  if (p->f0 != 0) goto <L3>; else goto <L0>;

<L0>:;
  if (q->f0 != 0) goto <L3>; else goto <L7>;

<L7>:;
  iftmp.0 = 1;
  goto <bb 4> (<L8>);

<L3>:;
  iftmp.0 = 0;

<L8>:;
  return (int) (_Bool) iftmp.0;   <-- Notice these casts
}
With my patch:
bar (p, q)
{
  int D.1122;

<bb 0>:
  if (p->f0 != 0) goto <L3>; else goto <L0>;

<L0>:;
  if (q->f0 != 0) goto <L3>; else goto <L7>;

<L7>:;
  D.1122 = 1;
  goto <bb 4> (<L8>);

<L3>:;
  D.1122 = 0;

<L8>:;
  return D.1122;   <-- Casts are gone!
}
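
To see why dropping the casts is safe here, the following sketch mirrors the two return statements from the dumps in plain C. The function names are hypothetical. The two forms agree only because the temporary is known to hold 0 or 1, which is the case in both dumps above.

```c
/* Mirrors "return (int) (_Bool) iftmp.0;" from the dump without the patch. */
int with_casts(int iftmp)
{
    return (int)(_Bool)iftmp;
}

/* Mirrors "return D.1122;" from the dump with the patch. */
int without_casts(int tmp)
{
    return tmp;
}
```

For any other input the cast through _Bool would normalize the value to 1, so the simplification depends on knowing the value range.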
* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
From: pinskia at gcc dot gnu dot org @ 2004-12-12 18:06 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-12 18:06 -------
It seems that someone should be calling fold (and fold should be changing the types back to
those of the original tree):
(int) (_Bool) ((int) p->f0 == 0 && (int) q->f0 == 0)
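
The original test case is not quoted in this part of the thread. A C function consistent with the expression above might look like the sketch below. The struct name, the layout, and the _Bool return type are assumptions; only the field name f0 and the function name bar appear in the thread, and the 1-bit width is inferred from the "testb $1" in the assembly.

```c
/* Hypothetical layout: f0 is assumed to be a 1-bit field at bit 0. */
struct flags {
    unsigned int f0 : 1;
};

/* Returns 1 only when both flags are clear. */
_Bool bar(struct flags *p, struct flags *q)
{
    return p->f0 == 0 && q->f0 == 0;
}
```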
* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
From: pinskia at gcc dot gnu dot org @ 2005-05-12 17:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-05-12 17:52 -------
With the patch for PR 21520, we will get better code.
Right now on the mainline we get:
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L7
        movl    8(%esp), %eax
        movb    (%eax), %al
        andl    $1, %eax
        xorl    $1, %eax
        movzbl  %al, %eax
        ret
        .p2align 2,,3
.L7:
        xorl    %eax, %eax
        movzbl  %al, %eax
        ret
but after that patch we should be able to remove the extra movzbl in the second branch.
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |21520
* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
From: dberlin at gcc dot gnu dot org @ 2005-05-12 23:34 UTC (permalink / raw)
To: gcc-bugs
--
Bug 17935 depends on bug 21520, which changed state.
Bug 21520 Summary: missing PRE opportunity with operand after operand
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21520
What |Old Value |New Value
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
From: uros at kss-loka dot si @ 2005-05-13 10:07 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From uros at kss-loka dot si 2005-05-13 10:07 -------
I think there is another optimization opportunity regarding a movzbl that follows an andl.
Consider this part:
        movb    (%eax), %al     EAX = x...xbbbbbbbb
        andl    $1, %eax        EAX = 0...00000000b
        movzbl  %al, %eax       (not needed)
        ret
If the constant operand of andl is less than 2^N, where N is the bit width of the
register we would like to extend (N = 8 for %al), then the andl already clears every
bit that the zero-extension would clear, so it inherently performs the zero-extension.
So if the xorl in comment #5 is moved before the andl, we could apply the above
simplification to get:
        movb    (%eax), %al
        xorl    $1, %eax
        andl    $1, %eax
        ret
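
The argument above can be checked with a small C model of the two instruction sequences. This is a sketch with made-up helper names, treating %eax as a 32-bit unsigned value whose upper 24 bits are arbitrary after the movb.

```c
#include <stdint.h>

/* movzbl %al, %eax: keep the low 8 bits, clear the upper 24. */
uint32_t movzbl(uint32_t eax)
{
    return (uint32_t)(uint8_t)eax;
}

/* andl $1; xorl $1; movzbl (the sequence from comment #5). */
uint32_t with_movzbl(uint32_t eax)
{
    eax &= 1;
    eax ^= 1;
    return movzbl(eax);
}

/* xorl $1; andl $1 (the proposed sequence, no movzbl). */
uint32_t without_movzbl(uint32_t eax)
{
    eax ^= 1;
    eax &= 1;
    return eax;
}

/* Returns 1 when a movzbl after "andl mask" would not change the value. */
int and_zero_extends(uint32_t x, uint32_t mask)
{
    uint32_t r = x & mask;
    return movzbl(r) == r;
}
```

With a mask below 2^8 the check holds for every input, while a mask such as 0x100 can leave a bit set that movzbl would clear.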
* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
From: pinskia at gcc dot gnu.org @ 2021-09-06 8:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
Target Milestone|--- |4.4.0
--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
GCC 4.4.0 produces the following (later versions of GCC produce similar code):
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        je      .L2
        xorl    %eax, %eax
        ret
.L2:
        movl    8(%esp), %eax
        movb    (%eax), %al
        xorl    $1, %eax
        andl    $1, %eax
        ret
GCC 4.5.0 and 4.6.0 produce:
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L3
        movl    8(%esp), %eax
        testb   $1, (%eax)
        sete    %al
        ret
.L3:
        xorl    %eax, %eax
        ret
Which is what we want.
GCC 4.7 through 8.5.0 changed the testb to movb, xorl, and andl:
bar:
.LFB0:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L3
        movl    8(%esp), %eax
        movb    (%eax), %al
        xorl    $1, %eax
        andl    $1, %eax
        ret
.L3:
        xorl    %eax, %eax
        ret
GCC 9+ changes the xorl to a notl:
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L3
        movl    8(%esp), %eax
        movb    (%eax), %al
        notl    %eax
        andl    $1, %eax
        ret
.L3:
        xorl    %eax, %eax
        ret
So this was fixed back in GCC 4.4.0.
I am not going to look for what fixed it either.
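
For reference, the three fall-through sequences listed above (testb/sete, xorl/andl, and notl/andl) leave the same value in %al. A minimal C model, with helper names invented for the illustration and b standing for the byte loaded from (%eax):

```c
#include <stdint.h>

/* GCC 4.5/4.6: testb $1, (%eax); sete %al */
uint8_t seq_sete(uint8_t b)
{
    return (b & 1) == 0;
}

/* GCC 4.4 and 4.7 through 8.5.0: xorl $1, %eax; andl $1, %eax */
uint8_t seq_xor_and(uint8_t b)
{
    return (uint8_t)((b ^ 1) & 1);
}

/* GCC 9+: notl %eax; andl $1, %eax */
uint8_t seq_not_and(uint8_t b)
{
    return (uint8_t)(~b & 1);
}
```

All three return 1 exactly when bit 0 of b is clear, so the later variants differ in instruction choice rather than in the result.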