public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/30315]  New: optimize unsigned-add overflow test on x86 to use cpu flags from addl
@ 2006-12-28  7:09 raeburn at raeburn dot org
  2007-06-06  3:39 ` [Bug target/30315] " scovich at gmail dot com
                   ` (18 more replies)
  0 siblings, 19 replies; 21+ messages in thread
From: raeburn at raeburn dot org @ 2006-12-28  7:09 UTC (permalink / raw)
  To: gcc-bugs

Using gcc-4.3-20061223 targeting i386-linux and options "-O9
-fomit-frame-pointer -fexpensive-optimizations -S", my test program to detect
unsigned integer overflow:

extern void abort(void);
unsigned foo(unsigned a, unsigned b) {
    unsigned sum = a + b;
    if (sum < a) abort(); /* check for overflow (wrapping) */
    if (sum < b) abort(); /* redundant */
    return sum;
}

compiles to:

foo:
        subl    $12, %esp
        movl    16(%esp), %eax
        movl    20(%esp), %edx
        addl    %eax, %edx
        cmpl    %edx, %eax
        ja      .L7
        cmpl    %edx, 20(%esp)
        ja      .L7
        movl    %edx, %eax
        addl    $12, %esp
        ret
.L7:
        call    abort

After the addition, which sets the ALU flags, the compiler issues two compare
instructions and conditional branches.  This sequence could be replaced by a
conditional branch following the addl, testing one of the flags (overflow?
carry? I forget which) set by it.

I realize for other processors (e.g., with -march=pentiumpro), the addl may be
replaced by something like leal, which may not set the flags the same way.  But
if the flags are set when building for a certain architecture, use them.... 
And is leal+cmp with data dependence still going to be better than addl?

(Also, I think a different set of register selections could eliminate the last
"movl" instruction, by putting the result of the addition into the return-value
register.)


-- 
           Summary: optimize unsigned-add overflow test on x86 to use cpu
                    flags from addl
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: raeburn at raeburn dot org
 GCC build triplet: powerpc-apple-darwin
  GCC host triplet: powerpc-apple-darwin
GCC target triplet: i386-unknown-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
@ 2007-06-06  3:39 ` scovich at gmail dot com
  2007-06-06  8:16 ` ubizjak at gmail dot com
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: scovich at gmail dot com @ 2007-06-06  3:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from scovich at gmail dot com  2007-06-06 03:39 -------
Happens on x86_64-unknown-linux-gnu as well, for both 4.2.0 and 4.3 (20070605)

The problem is even worse for 128-bit arithmetic because it has to check two
registers (with associated branches) before making a decision. This in spite of
the fact that sbb sets the flags properly AFAIK:

bool sub128(__uint128_t &dest, __uint128_t a, __uint128_t b) {
  dest = a - b;
  if(dest > a) abort();
}

_Z6sub128Rooo:
.LFB557:
        movq    %rsi, %rax
        movq    %rdx, %r10
        pushq   %rbx
.LCFI0:
        subq    %rcx, %rax
        sbbq    %r8, %rdx
        movq    %rax, (%rdi)
        cmpq    %rdx, %r10
        movq    %rdx, 8(%rdi)
        ja      .L23
        jae     .L24
.L21:
        call    abort
        .p2align 4,,7
.L24:
        cmpq    %rax, %rsi
        .p2align 4,,6
        jb      .L21
        .p2align 4,,7
.L23:
        popq    %rbx
        .p2align 4,,5
        ret

There's not really a way to work around it with inline asm, either, because of
the branch on overflow that will most likely come right afterward...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
  2007-06-06  3:39 ` [Bug target/30315] " scovich at gmail dot com
@ 2007-06-06  8:16 ` ubizjak at gmail dot com
  2007-06-06  9:10   ` Andrew Pinski
  2007-06-06  9:10 ` pinskia at gmail dot com
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 21+ messages in thread
From: ubizjak at gmail dot com @ 2007-06-06  8:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from ubizjak at gmail dot com  2007-06-06 08:15 -------
(In reply to comment #0)

> After the addition, which sets the ALU flags, the compiler issues two compare
> instructions and conditional branches.  This sequence could be replaced by a
> conditional branch following the addl, testing one of the flags (overflow?
> carry? I forget which) set by it.

in config/i386/i386-modes.def, documentation says:

   Add CCGOC to indicate comparisons against zero that allows
   unspecified garbage in the Carry and Overflow flag. This
   mode is used to simulate comparisons of (a-b) and (a+b)
   against zero using sub/cmp/add operations.

addl operates in CCGOCmode, but you are requesting GTU conditional jump that
requires Carry flag=0 and Zero flag=0. GTU is incompatible with CCGOCmode due
to Carry flag handling.

Change one of conditions to "if (sum == 0) abort();" and you will see insn
elimination in action.

And BTW - wrapping is undefined operation.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
  2007-06-06  3:39 ` [Bug target/30315] " scovich at gmail dot com
  2007-06-06  8:16 ` ubizjak at gmail dot com
@ 2007-06-06  9:10 ` pinskia at gmail dot com
  2007-06-06 10:33 ` rask at sygehus dot dk
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: pinskia at gmail dot com @ 2007-06-06  9:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from pinskia at gmail dot com  2007-06-06 09:10 -------
Subject: Re:  optimize unsigned-add overflow test on x86 to use cpu flags from
addl

> And BTW - wrapping is undefined operation.

One for signed integers for unsigned it is actually defined as
wrapping and the reporter used unsigned integers here.

Thanks,
Andrew Pinski


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2007-06-06  8:16 ` ubizjak at gmail dot com
@ 2007-06-06  9:10   ` Andrew Pinski
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Pinski @ 2007-06-06  9:10 UTC (permalink / raw)
  To: gcc-bugzilla; +Cc: gcc-bugs

> And BTW - wrapping is undefined operation.

One for signed integers for unsigned it is actually defined as
wrapping and the reporter used unsigned integers here.

Thanks,
Andrew Pinski


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (2 preceding siblings ...)
  2007-06-06  9:10 ` pinskia at gmail dot com
@ 2007-06-06 10:33 ` rask at sygehus dot dk
  2007-06-06 13:00 ` ubizjak at gmail dot com
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: rask at sygehus dot dk @ 2007-06-06 10:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from rask at sygehus dot dk  2007-06-06 10:33 -------
   I see no reason to mark this enhancement request as invalid.

   As to generating reasonable x86 code for overflow checks written in C, it
isn't completely hopeless. I did an experiment with my 16-bit x86 port. First,
I wrote the overflow check differently (and of course changed the data size to
16-bits):

#include <limits.h>

void abort (void);

unsigned short int foo (unsigned short int a, unsigned short int b)
{
  unsigned short int sum;

  sum = a + b;
  if ((unsigned long int) a + (unsigned long int) b > USHRT_MAX)
    abort ();
  return sum;
}

   Second, I added new instructions, of which combine was asking for the first
one:

(define_insn_and_split "*addsi3_cconly_highpart_ccz_nc"
  [(set (reg:CCZ_NC CC_REG)
        (compare:CCZ_NC
            (subreg:HI (plus:SI (match_operand:SI 1 "general_operand" "0")
                                (match_operand:SI 2 "general_operand" "g")) 2)
            (const_int 0)))
   (clobber (match_scratch:HI 0 "=r"))]
  "ia16_arith_operands_p (PLUS, operands)"
  "#"
  ""
  [(parallel
  [(set (match_dup 0) (plus:HI (match_dup 3) (match_dup 4)))
   (set (reg:CCZ_NC CC_REG)
        (compare:CCZ_NC (subreg:HI (plus:SI (zero_extend:SI (match_dup 3))
                                            (zero_extend:SI (match_dup 4))) 2)
                        (const_int 0)))]
  )]
{
  operands[3] = simplify_gen_subreg (HImode, operands[1], SImode, 0);
  operands[4] = simplify_gen_subreg (HImode, operands[2], SImode, 0);
})

(define_insn "*addhi3_cc_carry_ccz_nc"
  [(set (match_operand:HI 0 "nonimmediate_operand" "=qm,r,m")
        (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,0")
                 (match_operand:HI 2 "general_operand" "Uo,g,ri")))
   (set (reg:CCZ_NC CC_REG)
        (compare:CCZ_NC
            (subreg:HI (plus:SI (zero_extend:SI (match_dup 1))
                                (zero_extend:SI (match_dup 2))) 2)
            (const_int 0)))]
  "ia16_arith_operands_p (PLUS, operands)"
  "@
   addb\t%H2,\t%H0
   addw\t%2,\t%0
   addw\t%2,\t%0"
)

(define_insn "*addhi3_cconly_carry_ccz_nc"
  [(set (match_scratch:HI 0 "=qm,r,m")
        (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,0")
                 (match_operand:HI 2 "general_operand" "Uo,g,ri")))
   (set (reg:CCZ_NC CC_REG)
        (compare:CCZ_NC
            (subreg:HI (plus:SI (zero_extend:SI (match_dup 1))
                                (zero_extend:SI (match_dup 2))) 2)
            (const_int 0)))]
  "ia16_arith_operands_p (PLUS, operands)"
  "@
   addb\t%H2,\t%H0
   addw\t%2,\t%0
   addw\t%2,\t%0"
)

(define_insn "*beq_ccz_nc"
        [(set (pc)
              (if_then_else (eq (reg:CCZ_NC CC_REG)
                                (const_int 0))
                            (label_ref (match_operand 0))
                            (pc)))]
        ""
        "jnc\t%l0"
)

   Third, I created CCZ_NCmode and modified my SELECT_CC_MODE callback to use
it for the comparison in "*addsi3_cconly_highpart_ccz_nc". Result:

foo:
        pushw   %cx             ;# 69   *pushhi1_nonimm
        pushw   %si             ;# 70   *pushhi1_nonimm
        pushw   %di             ;# 71   *pushhi1_nonimm
        pushw   %bp             ;# 72   *pushhi1_nonimm
        movw    %sp,    %bp     ;# 67   *movhi/1
        movw    10(%bp),%di     ;# 2    *movhi/1
        movw    %sp,    %si     ;# 68   *movhi/1
        movw    12(%si),%bp     ;# 3    *movhi/1
        movw    %bp,    %ax     ;# 7    *movhi/1
        addw    %di,    %ax     ;# 63   *addhi3_cc_carry_ccz_nc/2
        jnc     .L7             ;# 15   *beq_ccz_nc
        call    abort           ;# 25   *call
.L7:
        popw    %bp             ;# 75   *pophi1
        popw    %di             ;# 76   *pophi1
        popw    %si             ;# 77   *pophi1
        popw    %cx             ;# 78   *pophi1
        ret                     ;# 79   *return

   I see no reason the i386 back end can't be improved in a similiar fashion.

   The original example is a lot more difficult because it uses two
comparisons. The middle end would have to recognize that the two comparisons
test the same thing.


-- 

rask at sygehus dot dk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rask at sygehus dot dk


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (3 preceding siblings ...)
  2007-06-06 10:33 ` rask at sygehus dot dk
@ 2007-06-06 13:00 ` ubizjak at gmail dot com
  2007-06-06 14:51 ` raeburn at raeburn dot org
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: ubizjak at gmail dot com @ 2007-06-06 13:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from ubizjak at gmail dot com  2007-06-06 13:00 -------
(In reply to comment #4)

>    I see no reason the i386 back end can't be improved in a similiar fashion.

Interesting...

I agree that this bugreport can be an enhancement request for the code you
provided. We can perhaps operate in just introduced CCCmode, but it looks a lot
of added functionality to cover this corner case. I'll try to code something.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (4 preceding siblings ...)
  2007-06-06 13:00 ` ubizjak at gmail dot com
@ 2007-06-06 14:51 ` raeburn at raeburn dot org
  2007-06-13 13:37 ` rask at sygehus dot dk
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: raeburn at raeburn dot org @ 2007-06-06 14:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from raeburn at raeburn dot org  2007-06-06 14:51 -------
Subject: Re:  optimize unsigned-add overflow test on x86 to use cpu flags from
addl

On Jun 6, 2007, at 04:15, ubizjak at gmail dot com wrote:
> in config/i386/i386-modes.def, documentation says:
>
>    Add CCGOC to indicate comparisons against zero that allows
>    unspecified garbage in the Carry and Overflow flag. This
>    mode is used to simulate comparisons of (a-b) and (a+b)
>    against zero using sub/cmp/add operations.
>
> addl operates in CCGOCmode, but you are requesting GTU conditional  
> jump that
> requires Carry flag=0 and Zero flag=0. GTU is incompatible with  
> CCGOCmode due
> to Carry flag handling.

Actually, I think the test I want is just for the carry flag, which  
can be done with one branch instruction using the flags set by addl.   
If gcc's description of addl doesn't support that, well, then that's  
what I'm requesting be enhanced. :-)

Rask's approach looks interesting, though I wouldn't want to have to  
rewrite simple, portable tests for unsigned overflow to use wider  
types, especially if I'm adding unsigned long long values, where I  
don't have portable wider types.

Ken


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (5 preceding siblings ...)
  2007-06-06 14:51 ` raeburn at raeburn dot org
@ 2007-06-13 13:37 ` rask at sygehus dot dk
  2007-08-03 19:40 ` patchapp at dberlin dot org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: rask at sygehus dot dk @ 2007-06-13 13:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from rask at sygehus dot dk  2007-06-13 13:36 -------
Looking at this again, I don't think the transformation I'm making with the
splitter is valid, because I'm making up a zero extension which wasn't there to
begin with. The upper part could have been non-zero before the instruction.
As to the original example, it should be possible to optimize

         addl    %eax, %edx
         cmpl    %edx, %eax
         ja      .L7
into
         addl    %eax, %edx
         jc      .L7

with a CCmode expressing that the carry flag is set if %edx is smaller than
%eax.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (6 preceding siblings ...)
  2007-06-13 13:37 ` rask at sygehus dot dk
@ 2007-08-03 19:40 ` patchapp at dberlin dot org
  2007-08-04 13:10 ` patchapp at dberlin dot org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: patchapp at dberlin dot org @ 2007-08-03 19:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from patchapp at dberlin dot org  2007-08-03 19:40 -------
Subject: Bug number PR target/30315

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2007-08/msg00204.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (7 preceding siblings ...)
  2007-08-03 19:40 ` patchapp at dberlin dot org
@ 2007-08-04 13:10 ` patchapp at dberlin dot org
  2007-08-14 14:39 ` rask at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: patchapp at dberlin dot org @ 2007-08-04 13:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from patchapp at dberlin dot org  2007-08-04 13:10 -------
Subject: Bug number PR target/30315

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2007-08/msg00242.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (8 preceding siblings ...)
  2007-08-04 13:10 ` patchapp at dberlin dot org
@ 2007-08-14 14:39 ` rask at gcc dot gnu dot org
  2007-08-18 17:56 ` raeburn at raeburn dot org
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: rask at gcc dot gnu dot org @ 2007-08-14 14:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from rask at gcc dot gnu dot org  2007-08-14 14:39 -------
Subject: Bug 30315

Author: rask
Date: Tue Aug 14 14:39:24 2007
New Revision: 127481

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=127481
Log:
        PR target/30315
        * config/i386/i386.h (CANONICALIZE_COMPARISON): New.
        * config/i386/i386.md (plusminus)(addsub)(SWI): New.
        (*<addsub><mode>3_cc_overflow): New.
        (*add<mode>3_cconly_overflow): New.
        (*sub<mode>3_cconly_overflow): New.
        (*<addsub>si3_zext_cc_overflow): New.
        * config/i386/predicates.md (fcmov_comparison_operator): Accept
        CCCmode for LTU, GTU, LEU and GEU.
        (ix86_comparison_operator): Likewise.
        (ix86_carry_flag_operator): Carry flag is set if LTU or GTU in CCCmode.
        * gcc/config/i386/i386.c (put_condition_code): Support CCCmode.
        (ix86_cc_mode): Use CCCmode when testing for overflow of PLUS
        or MINUS expressions.

testsuite/
        PR target/30315
        * gcc.target/i386/pr30315.c: New.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr30315.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h
    trunk/gcc/config/i386/i386.md
    trunk/gcc/config/i386/predicates.md
    trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (9 preceding siblings ...)
  2007-08-14 14:39 ` rask at gcc dot gnu dot org
@ 2007-08-18 17:56 ` raeburn at raeburn dot org
  2007-08-19 13:01 ` rask at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: raeburn at raeburn dot org @ 2007-08-18 17:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from raeburn at raeburn dot org  2007-08-18 17:55 -------
Snapshot gcc-4.3-20070817 seems to do just fine with the sample code I
supplied.

Though perhaps I should've given a bigger test case... :-)

One of the places where I'd want to check for overflow is in computing memory
allocation sizes (hence my filing #30314 at the same time, and wanting a
carry-flag check there too).  This patch lets gcc check the carry flag after
addition, which is great.  However, it seems to prevent gcc from being able to
look at the zero flag afterwards.  As in, "if the allocation size is zero, ask
for one byte so malloc won't return a null pointer":

void *foo(unsigned a, unsigned b) {
    unsigned sum = a + b;
    if (sum < a) return 0;
    if (sum == 0) sum = 1;
    return malloc(sum);
}

Without the overflow check, gcc just uses the zero flag resulting from the
addition.  With the overflow check in there, gcc uses the carry flag for the
overflow check, but then re-tests the value to see if it's zero:

        addl    %eax, %edx
        jb      L3 ; L3 just returns null
        testl   %edx, %edx
        movl    $1, %eax
        cmove   %eax, %edx

Still, this is better than doing an explicit comparison for the overflow check
and then the test for zero.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (10 preceding siblings ...)
  2007-08-18 17:56 ` raeburn at raeburn dot org
@ 2007-08-19 13:01 ` rask at gcc dot gnu dot org
  2007-08-19 18:04 ` ubizjak at gmail dot com
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: rask at gcc dot gnu dot org @ 2007-08-19 13:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from rask at gcc dot gnu dot org  2007-08-19 13:00 -------
 void *foo(unsigned a, unsigned b) {
     unsigned sum = a + b;
     if (sum < a) return 0;
     if (sum == 0) sum = 1;

To be able to optimize both comparisons, we need to have the same comparison
operands. The only way I can see this happening is to canonicalize the overflow
check as "a + b < 0" with the obvious fear that since a and b are both
unsigned, GCC will optimize the comparison away as always false - unless taught
not to do so. I'll see what I can do.

Additionally, the resulting asm seems to be a bit stupid:

        testl   %edx,   %edx
        movl    $1,     %eax
        cmove   %eax,   %edx

should be just

        testl   %edx,   %edx
        sete    %dl


-- 

rask at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rask at gcc dot gnu dot org
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-08-19 13:00:47
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (11 preceding siblings ...)
  2007-08-19 13:01 ` rask at gcc dot gnu dot org
@ 2007-08-19 18:04 ` ubizjak at gmail dot com
  2007-08-30 19:45 ` patchapp at dberlin dot org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: ubizjak at gmail dot com @ 2007-08-19 18:04 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from ubizjak at gmail dot com  2007-08-19 18:03 -------
(In reply to comment #12)

>      if (sum == 0) sum = 1;
> 

> Additionally, the resulting asm seems to be a bit stupid:
> 
>         testl   %edx,   %edx
>         movl    $1,     %eax
>         cmove   %eax,   %edx
> 
> should be just
> 
>         testl   %edx,   %edx
>         sete    %dl

Not, because sete sets dl to 0 (zero) when condition is not met. The c code
above doesn't change sum when sum != 0.

Regarding the problem, described in comment #11:

Perhaps we can add CCCZmode as a common mode to otherwise orthogonal CCCmode
and CCZmode. Playing with i386_cc_modes_compatible() and ix86_match_cc_mode()
we can perhaps teach combine to CSE and merge two comparisons. To achieve this,
I think we should use (enhanced) ix86_match_cc_mode() in just added RTL
patterns.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (12 preceding siblings ...)
  2007-08-19 18:04 ` ubizjak at gmail dot com
@ 2007-08-30 19:45 ` patchapp at dberlin dot org
  2007-09-09 19:22 ` rask at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: patchapp at dberlin dot org @ 2007-08-30 19:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from patchapp at dberlin dot org  2007-08-30 19:45 -------
Subject: Bug number PR 30315

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2007-08/msg02217.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (13 preceding siblings ...)
  2007-08-30 19:45 ` patchapp at dberlin dot org
@ 2007-09-09 19:22 ` rask at gcc dot gnu dot org
  2007-11-10  1:32 ` rask at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: rask at gcc dot gnu dot org @ 2007-09-09 19:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from rask at gcc dot gnu dot org  2007-09-09 19:22 -------
Subject: Bug 30315

Author: rask
Date: Sun Sep  9 19:21:59 2007
New Revision: 128305

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128305
Log:
        PR target/30315
        * config/i386/i386.h (CANONICALIZE_COMPARISON): Delete.
        * simplify-rtx.c (simplify_relational_operation_1): Add the
        canonicalization from i386.h.
        * doc/md.texi (Canonicalization of Instructions): Document it.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.h
    trunk/gcc/doc/md.texi
    trunk/gcc/simplify-rtx.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (14 preceding siblings ...)
  2007-09-09 19:22 ` rask at gcc dot gnu dot org
@ 2007-11-10  1:32 ` rask at gcc dot gnu dot org
  2009-01-02 19:30 ` pinskia at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: rask at gcc dot gnu dot org @ 2007-11-10  1:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from rask at gcc dot gnu dot org  2007-11-10 01:32 -------
Two testcases which aren't optimized:

unsigned int bad1 (unsigned int a)
{
  unsigned int c = a - 1;
  if (c > a)
    abort ();
  else
    return c;
}

unsigned int bad2 (unsigned int a)
{
  unsigned int c = a - 2;
  if (c > a)
    abort ();
  else
    return c;
}

See also <URL:http://gcc.gnu.org/ml/gcc-patches/2007-10/msg01359.html>.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (15 preceding siblings ...)
  2007-11-10  1:32 ` rask at gcc dot gnu dot org
@ 2009-01-02 19:30 ` pinskia at gcc dot gnu dot org
  2009-11-12 17:29 ` jparmele at wildbear dot com
  2009-11-16 19:05 ` ubizjak at gmail dot com
  18 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-02 19:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from pinskia at gcc dot gnu dot org  2009-01-02 19:29 -------
>* gcc.target/i386/pr30315.c: New.

This testcase fails with -march=pentium-m turned on.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (16 preceding siblings ...)
  2009-01-02 19:30 ` pinskia at gcc dot gnu dot org
@ 2009-11-12 17:29 ` jparmele at wildbear dot com
  2009-11-16 19:05 ` ubizjak at gmail dot com
  18 siblings, 0 replies; 21+ messages in thread
From: jparmele at wildbear dot com @ 2009-11-12 17:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from jparmele at wildbear dot com  2009-11-12 17:28 -------
This bug also still appears in 4.4.2 with --with-arch=pentium3.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
  2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
                   ` (17 preceding siblings ...)
  2009-11-12 17:29 ` jparmele at wildbear dot com
@ 2009-11-16 19:05 ` ubizjak at gmail dot com
  18 siblings, 0 replies; 21+ messages in thread
From: ubizjak at gmail dot com @ 2009-11-16 19:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from ubizjak at gmail dot com  2009-11-16 19:05 -------
(In reply to comment #18)
> This bug also still appears in 4.4.2 with --with-arch=pentium3.

pentium3 fails because it is not TARGET_HIMODE_MATH and TARGET_QIMODE_MATH.

So, it fails following testcase:

unsigned short
plusccsa (unsigned short a, unsigned short b)
{
  unsigned short sum = a + b;
  if (sum < a)
    abort ();
  return sum;
}

It used to generate:

plusccsa:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        movzwl  8(%ebp), %edx
        movzwl  12(%ebp), %eax
        addl    %edx, %eax
        movzwl  %ax, %eax
>>	cmpw	%ax, %dx
        ja      .L97
        leave
        ret
.L97:
        call    abort

And after the fix for wrong mode of compare insn for these targets [1]:

plusccsa:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        movzwl  8(%ebp), %edx
        movzwl  12(%ebp), %eax
        addl    %edx, %eax
        movzwl  %ax, %eax
>>	cmpl	%eax, %edx
        ja      .L52
        leave
        ret
.L52:
        call    abort

You don't want instructions that operate on bytes or words on pentium3...

At the end of the day, no bug here.

[1] http://gcc.gnu.org/ml/gcc-patches/2009-11/msg00787.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2009-11-16 19:05 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-12-28  7:09 [Bug target/30315] New: optimize unsigned-add overflow test on x86 to use cpu flags from addl raeburn at raeburn dot org
2007-06-06  3:39 ` [Bug target/30315] " scovich at gmail dot com
2007-06-06  8:16 ` ubizjak at gmail dot com
2007-06-06  9:10   ` Andrew Pinski
2007-06-06  9:10 ` pinskia at gmail dot com
2007-06-06 10:33 ` rask at sygehus dot dk
2007-06-06 13:00 ` ubizjak at gmail dot com
2007-06-06 14:51 ` raeburn at raeburn dot org
2007-06-13 13:37 ` rask at sygehus dot dk
2007-08-03 19:40 ` patchapp at dberlin dot org
2007-08-04 13:10 ` patchapp at dberlin dot org
2007-08-14 14:39 ` rask at gcc dot gnu dot org
2007-08-18 17:56 ` raeburn at raeburn dot org
2007-08-19 13:01 ` rask at gcc dot gnu dot org
2007-08-19 18:04 ` ubizjak at gmail dot com
2007-08-30 19:45 ` patchapp at dberlin dot org
2007-09-09 19:22 ` rask at gcc dot gnu dot org
2007-11-10  1:32 ` rask at gcc dot gnu dot org
2009-01-02 19:30 ` pinskia at gcc dot gnu dot org
2009-11-12 17:29 ` jparmele at wildbear dot com
2009-11-16 19:05 ` ubizjak at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).