public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated
@ 2004-10-11 16:38 kazu at cs dot umass dot edu
  2004-10-11 17:24 ` [Bug rtl-optimization/17935] [4.0 Regression] " pinskia at gcc dot gnu dot org
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-10-11 16:38 UTC (permalink / raw)
  To: gcc-bugs

Consider:

struct flags {
  unsigned f0 : 1;
};

_Bool
bar (struct flags *p, struct flags *q)
{
  return (!p->f0 && !q->f0);
}

With "cc1 -O2 -fomit-frame-pointer -march=i386", I get

bar:
	movl	4(%esp), %eax
	testb	$1, (%eax)
	jne	.L9
	movl	8(%esp), %eax
	testb	$1, (%eax)
	sete	%al
	movzbl	%al, %eax
	movzbl	%al, %eax
	ret
	.p2align 2,,3
.L9:
	xorl	%eax, %eax
	movzbl	%al, %eax
	ret

Note the two consecutive movzbl.  We don't need the second one.

Also note the xorl followed by movzbl.  We don't need the movzbl.
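
A minimal C sketch of why both instructions are redundant, assuming that "movzbl %al, %eax" can be modeled as masking a 32-bit value down to its low byte. The helper names (zext8 and friends) are hypothetical and only illustrate the arithmetic; they are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "movzbl %al, %eax": keep the low byte of EAX
   and clear bits 8..31. */
static uint32_t zext8(uint32_t eax)
{
    return eax & 0xffu;
}

/* Returns 1 if a second movzbl would leave EAX unchanged. */
static int second_movzbl_is_noop(uint32_t eax)
{
    uint32_t once = zext8(eax);
    return zext8(once) == once;
}

/* Model of the .L9 path: xorl %eax, %eax followed by movzbl %al, %eax. */
static uint32_t xor_then_movzbl(void)
{
    uint32_t eax = 0;   /* xorl %eax, %eax clears the whole register */
    return zext8(eax);  /* movzbl %al, %eax cannot change a zero */
}
```

Zero-extension is idempotent, and a register that is already zero has nothing left to clear, so under this model neither extra movzbl changes the result.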

-- 
           Summary: Two consecutive movzbl are generated
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: kazu at cs dot umass dot edu
                CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] [4.0 Regression] Two consecutive movzbl are generated
  2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
@ 2004-10-11 17:24 ` pinskia at gcc dot gnu dot org
  2004-10-12 19:20 ` [Bug rtl-optimization/17935] " pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-11 17:24 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-10-11 17:24 -------
Confirmed; this is a 4.0 regression again.
3.4.0 gives:
bar:
        movl    4(%esp), %eax
        xorl    %edx, %edx
        testb   $1, (%eax)
        jne     .L2
        movl    8(%esp), %ecx
        testb   $1, (%ecx)
        jne     .L2
        movb    $1, %dl
        .p2align 2,,3
.L2:
        movzbl  %dl, %eax
        ret

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |minor
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
      Known to fail|                            |4.0.0
      Known to work|                            |3.4.0
   Last reconfirmed|0000-00-00 00:00:00         |2004-10-11 17:24:44
               date|                            |
            Summary|Two consecutive movzbl are  |[4.0 Regression] Two
                   |generated                   |consecutive movzbl are
                   |                            |generated
   Target Milestone|---                         |4.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
  2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
  2004-10-11 17:24 ` [Bug rtl-optimization/17935] [4.0 Regression] " pinskia at gcc dot gnu dot org
@ 2004-10-12 19:20 ` pinskia at gcc dot gnu dot org
  2004-12-12 17:57 ` kazu at cs dot umass dot edu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-12 19:20 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-10-12 19:20 -------
Note this is not really a regression. The main issue I see here is that ifcvt converts
if (a) 1 else 0; to a, and then we copy the return part (the movzbl and ret) onto the other
branch but we don't combine them. Maybe we should rerun combine after reload.
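
To illustrate the if-conversion mentioned here: when a is already known to be 0 or 1, "if (a) 1 else 0" is just a. The sketch below shows only that source-level equivalence; the function names are hypothetical and it does not model GCC's ifcvt pass itself.

```c
#include <assert.h>
#include <stdbool.h>

/* The branchy form: if (a) 1 else 0. */
static int branchy(bool a)
{
    if (a)
        return 1;
    else
        return 0;
}

/* The if-converted form: a bool converts to int as exactly 0 or 1. */
static int converted(bool a)
{
    return a;
}
```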

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|3.4.0                       |
            Summary|[4.0 Regression] Two        |Two consecutive movzbl are
                   |consecutive movzbl are      |generated
                   |generated                   |
   Target Milestone|4.0.0                       |---


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
  2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
  2004-10-11 17:24 ` [Bug rtl-optimization/17935] [4.0 Regression] " pinskia at gcc dot gnu dot org
  2004-10-12 19:20 ` [Bug rtl-optimization/17935] " pinskia at gcc dot gnu dot org
@ 2004-12-12 17:57 ` kazu at cs dot umass dot edu
  2004-12-12 18:06 ` pinskia at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-12-12 17:57 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From kazu at cs dot umass dot edu  2004-12-12 17:57 -------
With today's mainline, I get

bar:
	movl	4(%esp), %eax
	testb	$1, (%eax)
	jne	.L2
	movl	8(%esp), %eax
	testb	$1, (%eax)
	jne	.L2
	movl	$1, %eax
	movzbl	%al, %eax
	ret
	.p2align 2,,3
.L2:
	xorl	%eax, %eax
	movzbl	%al, %eax
	ret

We still have a similar problem: the movzbl that follows movl (or xorl)
is redundant.

It turns out that the fix for PR 14843 can fix this at tree level.
Here is the resulting assembly with my patch for PR 14843.

bar:
	movl	4(%esp), %eax
	testb	$1, (%eax)
	jne	.L2
	movl	8(%esp), %eax
	testb	$1, (%eax)
	jne	.L2
	movl	$1, %eax
	ret
	.p2align 2,,3
.L2:
	xorl	%eax, %eax
	ret

My patch fixes this because it removes the casts before expansion.

Without my patch:

bar (p, q)
{
  int iftmp.0;

<bb 0>:
  if (p->f0 != 0) goto <L3>; else goto <L0>;

<L0>:;
  if (q->f0 != 0) goto <L3>; else goto <L7>;

<L7>:;
  iftmp.0 = 1;
  goto <bb 4> (<L8>);

<L3>:;
  iftmp.0 = 0;

<L8>:;
  return (int) (_Bool) iftmp.0;  <-- Notice these casts

}

With my patch:

bar (p, q)
{
  int D.1122;

<bb 0>:
  if (p->f0 != 0) goto <L3>; else goto <L0>;

<L0>:;
  if (q->f0 != 0) goto <L3>; else goto <L7>;

<L7>:;
  D.1122 = 1;
  goto <bb 4> (<L8>);

<L3>:;
  D.1122 = 0;

<L8>:;
  return D.1122;  <-- Casts are gone!

}
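
As a rough illustration of why dropping the casts is safe in this function: the temporary only ever holds 0 or 1, and for those values (int) (_Bool) x is x. The function names below are hypothetical; the sketch mirrors the two return statements in the dumps and is not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors "return (int) (_Bool) iftmp.0;" from the dump without the patch. */
static int with_casts(int iftmp)
{
    return (int)(bool)iftmp;
}

/* Mirrors "return D.1122;" from the dump with the patch. */
static int without_casts(int tmp)
{
    return tmp;
}
```

The two agree only when the value is known to be 0 or 1, which is the case here because every path assigns one of those constants. For a value such as 2 the casts would matter.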


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
  2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
                   ` (2 preceding siblings ...)
  2004-12-12 17:57 ` kazu at cs dot umass dot edu
@ 2004-12-12 18:06 ` pinskia at gcc dot gnu dot org
  2005-05-12 17:52 ` pinskia at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-12 18:06 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-12-12 18:06 -------
It seems like someone should be calling fold here (and fold should be changing the types back to those
of the original tree):
(int) (_Bool) ((int) p->f0 == 0 && (int) q->f0 == 0)

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
  2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
                   ` (3 preceding siblings ...)
  2004-12-12 18:06 ` pinskia at gcc dot gnu dot org
@ 2005-05-12 17:52 ` pinskia at gcc dot gnu dot org
  2005-05-12 23:34 ` dberlin at gcc dot gnu dot org
  2005-05-13 10:07 ` uros at kss-loka dot si
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-05-12 17:52 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-05-12 17:52 -------
With the patch for PR 21520, we will get better code.
Right now on the mainline we get:
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L7
        movl    8(%esp), %eax
        movb    (%eax), %al
        andl    $1, %eax
        xorl    $1, %eax
        movzbl  %al, %eax
        ret
        .p2align 2,,3
.L7:
        xorl    %eax, %eax
        movzbl  %al, %eax
        ret

but after that patch we should be able to remove the extra movzbl in the second branch.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |21520


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
  2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
                   ` (4 preceding siblings ...)
  2005-05-12 17:52 ` pinskia at gcc dot gnu dot org
@ 2005-05-12 23:34 ` dberlin at gcc dot gnu dot org
  2005-05-13 10:07 ` uros at kss-loka dot si
  6 siblings, 0 replies; 11+ messages in thread
From: dberlin at gcc dot gnu dot org @ 2005-05-12 23:34 UTC (permalink / raw)
  To: gcc-bugs



-- 
Bug 17935 depends on bug 21520, which changed state.

Bug 21520 Summary: missing PRE opportunity with operand after operand
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21520

           What    |Old Value                   |New Value
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
  2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
                   ` (5 preceding siblings ...)
  2005-05-12 23:34 ` dberlin at gcc dot gnu dot org
@ 2005-05-13 10:07 ` uros at kss-loka dot si
  6 siblings, 0 replies; 11+ messages in thread
From: uros at kss-loka dot si @ 2005-05-13 10:07 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2005-05-13 10:07 -------
I think there is another optimization opportunity regarding movzbl following andl.

Consider this part:

        movb    (%eax), %al        EAX = x...xbbbbbbbb
        andl    $1, %eax           EAX = 0...00000000b
        movzbl  %al, %eax          (not needed)
        ret

If the constant operand of andl is less than 2^(bit width of the register
we would like to extend), then the andl instruction inherently performs the
zero-extension.
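
A small C check of this claim, assuming an 8-bit source register (%al) and modeling movzbl as a mask to the low byte. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "movzbl %al, %eax". */
static uint32_t zext8(uint32_t eax)
{
    return eax & 0xffu;
}

/* Returns 1 if a movzbl placed after "andl $mask, %eax" cannot change EAX. */
static int movzbl_redundant_after_andl(uint32_t eax, uint32_t mask)
{
    uint32_t after_and = eax & mask;  /* andl $mask, %eax */
    return zext8(after_and) == after_and;
}
```

For any mask below 2^8 the andl already clears bits 8..31, so the check holds for every input; with a mask of 0x100 or larger it can fail.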

So if the xorl in comment #5 is moved before the andl, we could apply the above
simplification to get:

        movb    (%eax), %al
        xorl    $1, %eax
        andl    $1, %eax
        ret
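
A sketch checking that the reordered sequence computes the same value, again modeling the instructions as 32-bit C operations. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Order from comment #5: andl $1, xorl $1, then movzbl %al, %eax. */
static uint32_t and_then_xor(uint32_t eax)
{
    eax &= 1u;
    eax ^= 1u;
    return eax & 0xffu;  /* movzbl %al, %eax */
}

/* Proposed order: xorl $1, then andl $1, with no movzbl. */
static uint32_t xor_then_and(uint32_t eax)
{
    eax ^= 1u;
    eax &= 1u;           /* the final andl also clears bits 1..31 */
    return eax;
}
```

Both orders flip bit 0 and discard everything else, so the results match for any initial contents of %eax, including the stale pointer bits left above %al by the movb.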

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
       [not found] <bug-17935-4@http.gcc.gnu.org/bugzilla/>
@ 2021-09-06  8:29 ` pinskia at gcc dot gnu.org
  0 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-06  8:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |4.4.0

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
GCC 4.4.0 produces the following (which is similar to what later versions of GCC produce):
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        je      .L2
        xorl    %eax, %eax
        ret
.L2:
        movl    8(%esp), %eax
        movb    (%eax), %al
        xorl    $1, %eax
        andl    $1, %eax
        ret

GCC 4.5.0 and 4.6.0 produce:
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L3
        movl    8(%esp), %eax
        testb   $1, (%eax)
        sete    %al
        ret
.L3:
        xorl    %eax, %eax
        ret

Which is what we want.

GCC 4.7 through 8.5.0 changed the testb to movb, xorl, and andl:
bar:
.LFB0:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L3
        movl    8(%esp), %eax
        movb    (%eax), %al
        xorl    $1, %eax
        andl    $1, %eax
        ret
.L3:
        xorl    %eax, %eax
        ret

GCC 9+ changes the xorl to a notl:
bar:
        movl    4(%esp), %eax
        testb   $1, (%eax)
        jne     .L3
        movl    8(%esp), %eax
        movb    (%eax), %al
        notl    %eax
        andl    $1, %eax
        ret
.L3:
        xorl    %eax, %eax
        ret

So this was fixed back in GCC 4.4.0.
I am not going to look for what fixed it, either.
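
For reference, a sketch checking that the three instruction sequences shown in this comment compute the same _Bool result from the byte loaded from q. It models only the low byte of the result, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* GCC 4.5/4.6: testb $1, (%eax); sete %al */
static uint8_t via_test_sete(uint8_t byte)
{
    return (byte & 1u) == 0;
}

/* GCC 4.7 through 8.5: movb (%eax), %al; xorl $1, %eax; andl $1, %eax */
static uint8_t via_xor_and(uint8_t byte)
{
    return (uint8_t)((byte ^ 1u) & 1u);
}

/* GCC 9+: movb (%eax), %al; notl %eax; andl $1, %eax */
static uint8_t via_not_and(uint8_t byte)
{
    return (uint8_t)(~(uint32_t)byte & 1u);
}

/* Returns 1 if all three sequences agree for every possible byte value. */
static int all_sequences_agree(void)
{
    for (unsigned b = 0; b < 256; b++) {
        uint8_t x = (uint8_t)b;
        if (via_test_sete(x) != via_xor_and(x) ||
            via_xor_and(x) != via_not_and(x))
            return 0;
    }
    return 1;
}
```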


* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
       [not found] <bug-17935-5009@http.gcc.gnu.org/bugzilla/>
@ 2006-01-18  5:07 ` pinskia at gcc dot gnu dot org
  0 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-01-18  5:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from pinskia at gcc dot gnu dot org  2006-01-18 05:07 -------
We now get:
        movb    (%eax), %al
        andl    $1, %eax
        xorl    $1, %eax
        andl    $1, %eax
        ret

(insn 23 22 24 4 (parallel [
            (set (reg:QI 63)
                (and:QI (mem/s:QI (reg/v/f:SI 61 [ q ]) [0 S1 A32])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) 212 {*andqi_1} (nil)
    (expr_list:REG_DEAD (reg/v/f:SI 61 [ q ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

(insn 24 23 25 4 (parallel [
            (set (reg:QI 64)
                (xor:QI (reg:QI 63)
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) 241 {*xorqi_1} (insn_list:REG_DEP_TRUE 23 (nil))
    (expr_list:REG_DEAD (reg:QI 63)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

(note 25 24 26 4 NOTE_INSN_DELETED)

(insn 26 25 27 4 (parallel [
            (set (reg:SI 58 [ prephitmp.25 ])
                (and:SI (subreg:SI (reg:QI 64) 0)
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) 208 {*andsi_1} (insn_list:REG_DEP_TRUE 24 (nil))
    (expr_list:REG_DEAD (reg:QI 64)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

We must be losing the fact that (a&1)^1 has only the one bit set.
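
A rough sketch of the kind of "possibly nonzero bits" reasoning that would make the final andl removable. This is a value-level illustration only, under the assumption that constants propagate through AND and XOR as shown; it ignores the QI/SI mode and subreg details visible in the RTL above, and it is not GCC's implementation. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Conservative mask of bits that may be nonzero after "x & c". */
static uint32_t nz_and_const(uint32_t nz_operand, uint32_t c)
{
    return nz_operand & c;
}

/* Conservative mask of bits that may be nonzero after "x ^ c". */
static uint32_t nz_xor_const(uint32_t nz_operand, uint32_t c)
{
    return nz_operand | c;
}

/* An AND with mask is redundant if no possibly-set bit lies outside it. */
static int and_is_redundant(uint32_t nz_operand, uint32_t mask)
{
    return (nz_operand & ~mask) == 0;
}

/* Mask for (a & 1) ^ 1 when nothing at all is known about a. */
static uint32_t nz_of_and1_xor1(void)
{
    uint32_t nz_a = 0xffffffffu;  /* every bit of a may be set */
    return nz_xor_const(nz_and_const(nz_a, 1u), 1u);
}
```

With this bookkeeping, (a&1)^1 can only have bit 0 set, so a following AND with 1 would be recognized as a no-op.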


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



* [Bug rtl-optimization/17935] Two consecutive movzbl are generated
       [not found] <20041011163842.17935.kazu@gcc.gnu.org>
@ 2005-09-29  3:47 ` pinskia at gcc dot gnu dot org
  0 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-09-29  3:47 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-29 03:47 -------
(In reply to comment #6)
> I think there is another optimization opportunity regarding movzbl following andl.

I think the cause is that the tracer pass runs late, which makes the movzbl appear late.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|minor                       |enhancement


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17935



end of thread, other threads:[~2021-09-06  8:29 UTC | newest]

Thread overview: 11+ messages
2004-10-11 16:38 [Bug rtl-optimization/17935] New: Two consecutive movzbl are generated kazu at cs dot umass dot edu
2004-10-11 17:24 ` [Bug rtl-optimization/17935] [4.0 Regression] " pinskia at gcc dot gnu dot org
2004-10-12 19:20 ` [Bug rtl-optimization/17935] " pinskia at gcc dot gnu dot org
2004-12-12 17:57 ` kazu at cs dot umass dot edu
2004-12-12 18:06 ` pinskia at gcc dot gnu dot org
2005-05-12 17:52 ` pinskia at gcc dot gnu dot org
2005-05-12 23:34 ` dberlin at gcc dot gnu dot org
2005-05-13 10:07 ` uros at kss-loka dot si
     [not found] <20041011163842.17935.kazu@gcc.gnu.org>
2005-09-29  3:47 ` pinskia at gcc dot gnu dot org
     [not found] <bug-17935-5009@http.gcc.gnu.org/bugzilla/>
2006-01-18  5:07 ` pinskia at gcc dot gnu dot org
     [not found] <bug-17935-4@http.gcc.gnu.org/bugzilla/>
2021-09-06  8:29 ` pinskia at gcc dot gnu.org
