From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-477293-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 22436 invoked by alias); 15 Feb 2015 06:17:35 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 22379 invoked by uid 48); 15 Feb 2015 06:17:30 -0000
From: "daniel.santos at pobox dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/54829] bad optimization: sub followed by cmp w/ zero (x86 & ARM)
Date: Sun, 15 Feb 2015 06:17:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 4.7.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: daniel.santos at pobox dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID: <bug-54829-4-jyKp7Cse4u@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-54829-4@http.gcc.gnu.org/bugzilla/>
References: <bug-54829-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-02/txt/msg01626.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54829

--- Comment #9 from Daniel Santos <daniel.santos at pobox dot com> ---
I apologize for my late response.

(In reply to Richard Earnshaw from comment #8)
> Unfortunately, computers don't do infinite precision arithmetic by default. 
> That would perform a different comparison in that it checks that r0 > r1,
> not whether r0 - r1 > 0.  The difference, for signed comparisons, is when
> overflow occurs.
> 
> Consider the case where (in your original code) a has the value INT_MIN (ie
> -2147483648) and b has the value 1.
> 
> Now clearly a < b and by the normal rules of arithmetic (infinite precision)
> we would expect a - b to be less than zero.
> 
> However, INT_MIN - 1 cannot be represented in a 32-bit long value and
> becomes INT_MAX due to overflow; the result is that for these values a - b >
> 0!
> 
> On ARM and x86, the flag setting that results from a subtract operation is,
> in effect a comparison of the original operands, rather than a comparison of
> the result; that is on ARM
> 
>    subs rd, rn, rm
> 
> is equivalent to 
> 
>    cmp rn, rm
> 
> except that the register rd is not written by the comparison.
> 
> Power PC is different: its subtract and compare instruction really does use
> the result of the subtraction to form the comparison.
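
To make the overflow case above concrete, here is a minimal sketch in C,
assuming a 32-bit two's-complement integer type. The helper names
(wrapping_sub, cmp_direct, cmp_via_sub, check_overflow_case) are made up for
illustration and are not part of the test program in this bug. Signed overflow
is undefined behaviour in C, so the sketch does the subtraction in unsigned
arithmetic to model what the hardware subtract does:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a - b with 32-bit wraparound, computed in unsigned
 * arithmetic so that no signed overflow (undefined behaviour) occurs. */
static int32_t wrapping_sub(int32_t a, int32_t b)
{
    uint32_t u = (uint32_t)a - (uint32_t)b;   /* wraps modulo 2^32 */

    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    /* Map 0x80000000..0xFFFFFFFF onto INT32_MIN..-1 without relying on an
     * implementation-defined unsigned-to-signed conversion. */
    return (int32_t)(u - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* Compares the original operands: 1 if a > b, -1 if a < b, 0 if equal. */
static int cmp_direct(int32_t a, int32_t b)
{
    return (a > b) - (a < b);
}

/* Compares the wrapped difference against zero instead. */
static int cmp_via_sub(int32_t a, int32_t b)
{
    int32_t d = wrapping_sub(a, b);
    return (d > 0) - (d < 0);
}

/* The two approaches agree for small values but disagree once the
 * difference overflows, e.g. a = INT32_MIN and b = 1. */
static void check_overflow_case(void)
{
    assert(cmp_direct(5, 3) == cmp_via_sub(5, 3));
    assert(cmp_direct(INT32_MIN, 1) == -1);   /* a < b */
    assert(cmp_via_sub(INT32_MIN, 1) == 1);   /* wrapped a - b is INT32_MAX */
}
```

Calling check_overflow_case() from a main() should pass silently if the
assumptions above hold. The point is only that "a - b > 0" and "a > b" are
different tests once the subtraction can wrap.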

Thank you very much for your work on this. On re-examination, I suspect that
this may be an invalid bug. :( I have modified the test program slightly:

extern void print_gt(void);
extern void print_lt(void);
extern void print_eq(void);

void cmp_and_branch(long a, long b)
{
    long diff = a > b ? 1 : (a < b ? -1 : 0);

    if (diff > 0) {
        print_gt();
    } else if (diff < 0) {
        print_lt();
    } else {
        print_eq();
    }
}

I thought that I had originally tried this and gotten worse results (although
the diff was being computed through a complicated -findirect-inlining
situation), but this version of the program restricts diff to three possible
values (1, -1 or 0). Compiled for x86_64 and ARM, the output is flawless on
both:

ARM
cmp_and_branch:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        cmp     r0, r1
        bgt     .L2
        blt     .L5
        b       print_eq
.L2:
        b       print_gt
.L5:
        b       print_lt
        .size   cmp_and_branch, .-cmp_and_branch
        .ident  "GCC: (Gentoo 4.8.3 p1.1, pie-0.5.9) 4.8.3"
        .section        .note.GNU-stack,"",%progbits


x86_64
cmp_and_branch:
.LFB0:
        .cfi_startproc
        cmpq    %rsi, %rdi
        jg      .L2
        jl      .L5
        jmp     print_eq
        .p2align 4,,10
        .p2align 3
.L2:
        jmp     print_gt
        .p2align 4,,10
        .p2align 3
.L5:
        jmp     print_lt
        .cfi_endproc

I don't want to close this bug just yet; I want to retest in my other code
first. This will certainly help clean up some of my code!!