public inbox for gcc-bugs@sourceware.org
* [Bug target/59371] New: Performance regression in GCC 4.8 and later versions.
@ 2013-12-02 17:59 sje at gcc dot gnu.org
  2013-12-03  9:37 ` [Bug target/59371] [4.8/4.9 Regression] " rguenth at gcc dot gnu.org
                   ` (22 more replies)
  0 siblings, 23 replies; 24+ messages in thread
From: sje at gcc dot gnu.org @ 2013-12-02 17:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

            Bug ID: 59371
           Summary: Performance regression in GCC 4.8 and later versions.
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sje at gcc dot gnu.org
            Target: mips*-*-*

If I compile this program with -O2 on MIPS:

int foo(int *p, unsigned short c)
{
    signed short i;
    int x = 0;
    for (i = 0; i < c; i++) {
        x = x + *p; p++;
    }
    return x;
}

With GCC 4.7.* or earlier I get loop code that looks like:

$L3:
    lw    $5,0($4)
    addiu    $3,$3,1
    seh    $3,$3
    addu    $2,$2,$5
    bne    $3,$6,$L3
    addiu    $4,$4,4

With GCC 4.8 and later I get:

$L3:
    lw    $7,0($4)
    addiu    $3,$3,1
    seh    $3,$3
    slt    $6,$3,$5
    addu    $2,$2,$7
    bne    $6,$0,$L3
    addiu    $4,$4,4

This loop has one more instruction in it and is slower.
A version of this bug appears in EEMBC 1.1.  If I change the loop
index to be unsigned then I get the better code, but I can't change
the benchmark I am testing, so I am trying to figure out what changed
in GCC and how to generate the faster code.
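
For illustration, the two forms of the loop can be put side by side.  This
is only a sketch: `foo_unsigned' and `same_sum' are hypothetical names, not
part of the benchmark, and the two functions are only interchangeable while
`c' does not exceed 32767 (assuming a 16-bit `short').

```c
#include <assert.h>

/* Loop as written in the report: signed 16-bit index, unsigned 16-bit bound. */
int foo(int *p, unsigned short c)
{
    signed short i;
    int x = 0;
    for (i = 0; i < c; i++) {
        x = x + *p; p++;
    }
    return x;
}

/* Hypothetical variant with an unsigned index (not in the benchmark). */
int foo_unsigned(int *p, unsigned short c)
{
    unsigned short i;
    int x = 0;
    for (i = 0; i < c; i++) {
        x = x + *p; p++;
    }
    return x;
}

/* Compare both forms on the first c elements of p.  Only meaningful for
   c <= 32767, where the signed index never has to wrap. */
int same_sum(int *p, unsigned short c)
{
    assert(c <= 32767);
    return foo(p, c) == foo_unsigned(p, c);
}
```

Compiling both functions with -O2 -S and comparing the loop bodies is one
way to see which termination test a given GCC version picks for each form.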



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: rguenth at gcc dot gnu.org @ 2013-12-03  9:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |4.8.3
            Summary|Performance regression in   |[4.8/4.9 Regression]
                   |GCC 4.8 and later versions. |Performance regression in
                   |                            |GCC 4.8 and later versions.

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The IVOPTs dump looks different on x86_64; it looks like 4.7 was able to do
a better job here.

Can you provide a runnable testcase so we can see if x86_64 regressed
in speed as well?



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: sje at gcc dot gnu.org @ 2013-12-03 16:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #2 from Steve Ellcey <sje at gcc dot gnu.org> ---
Created attachment 31365
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31365&action=edit
runnable performance test case

Here is a runnable test case.  You may need to increase the loop counts
depending on the speed of your machine.  On the MIPS system I used, GCC 4.7.3
took 36.127 seconds and GCC 4.8.1 took 38.435 seconds (little endian, -O2,
static linking).



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: pinskia at gcc dot gnu.org @ 2013-12-03 23:10 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Maciej W. Rozycki from comment #3)
> Caused by:

Well that corrects how i++ is done.



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: macro@linux-mips.org @ 2013-12-04  1:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #5 from Maciej W. Rozycki <macro@linux-mips.org> ---
(In reply to Andrew Pinski from comment #4)
> Well that corrects how i++ is done.

Old MIPS assembly code produced was AFAICT correct.  The loop termination
condition was expressed as:

    bne    $3,$6,$L3

that represented (i != c) rather than (i < c), but we start `i' from 0
and increment by one at a time, so both expressions are equivalent in
this context.

Here I believe the following C language standard clause applies[1]:

"Otherwise, if the operand that has unsigned integer type has rank
greater or equal to the rank of the type of the other operand, then the
operand with signed integer type is converted to the type of the operand
with unsigned integer type."

so for both operands the expression is supposed to use the "unsigned
short" type, which is 16 bits wide on the MIPS target.  There are no 16-bit
ALU operations defined in the MIPS architecture though, so at the assembly
(and therefore machine) level both `c' and `i' were sign-extended
to 32 bits:

    andi    $5,$5,0xffff
    seh    $6,$5

and:

    seh    $3,$3

respectively (of course ANDI is redundant here, there's no need to
zero-extend before sign-extending, SEH does not require it), before the
BNE comparison quoted above was made.  That correctly mimicked 16-bit
operations required by the language here (of course zero-extension of
both `c' and `i' would do as well).

Now after the change `c' is zero-extended only (no sign-extension
afterwards):

    andi    $5,$5,0xffff

while `i' is still sign-extended:

    seh    $3,$3

Then the loop termination condition is expressed as:

    slt    $6,$3,$5
    bne    $6,$0,$L3

instead.  Notice the SLT instruction, which accurately represents the
(i < c) termination condition, but using *signed* arithmetic.  This
means that for `c' equal to e.g. 32768 the loop will never terminate.  I
believe this is not what the clause of the C language standard quoted
above implies.  For unsigned arithmetic SLTU would have to be used
instead.
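
The instruction-level behaviour described above can be modelled in plain C.
This is a minimal sketch, not GCC or simulator code: the helper names are
hypothetical, registers are modelled as 32-bit values, and it only shows
what the SLT and SLTU comparisons compute on the extended operands, not
which of them the C standard calls for.

```c
#include <assert.h>
#include <stdint.h>

/* SEH: sign-extend the low 16 bits of a register value to 32 bits.
   Computed arithmetically to avoid implementation-defined narrowing. */
static int32_t seh(uint32_t r)
{
    uint32_t low = r & 0xffffu;
    return (low & 0x8000u) ? (int32_t)low - 65536 : (int32_t)low;
}

/* ANDI with 0xffff: zero-extend the low 16 bits. */
static uint32_t andi_ffff(uint32_t r) { return r & 0xffffu; }

/* SLT: signed 32-bit "set on less than". */
static int slt(int32_t a, int32_t b) { return a < b; }

/* SLTU: unsigned 32-bit "set on less than". */
static int sltu(uint32_t a, uint32_t b) { return a < b; }

/* Returns 1 if SLT on (seh(i), andi(c)) is set for every 16-bit pattern
   of i, i.e. if the modelled branch back to $L3 would always be taken. */
int slt_always_set(unsigned short c)
{
    uint32_t bits;
    for (bits = 0; bits <= 0xffffu; bits++)
        if (!slt(seh(bits), (int32_t)andi_ffff(c)))
            return 0;
    return 1;
}

/* Same question with SLTU applied to the same register contents. */
int sltu_always_set(unsigned short c)
{
    uint32_t bits;
    for (bits = 0; bits <= 0xffffu; bits++)
        if (!sltu((uint32_t)seh(bits), andi_ffff(c)))
            return 0;
    return 1;
}

/* Spot checks of the model itself. */
void check_model(void)
{
    assert(seh(0x7fffu) == 32767);
    assert(seh(0x8000u) == -32768);
    assert(andi_ffff(0xffff8000u) == 0x8000u);
}
```

In this model `slt_always_set(32768)' holds, matching the observation that
the signed comparison can never end the loop for that value of `c'.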

So it looks to me like the performance regression merely happens to be
a visible sign of a bigger correctness problem.  Have I missed anything?

[1] "Programming languages -- C", ISO/IEC 9899:1999(E), Section 6.3.1.8
    "Usual arithmetic conversions".



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: pinskia at gcc dot gnu.org @ 2013-12-04  1:52 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Maciej W. Rozycki from comment #5)
> (In reply to Andrew Pinski from comment #4)
> > Well that corrects how i++ is done.
> 
> Old MIPS assembly code produced was AFAICT correct.  The loop termination
> condition was expressed as:

I meant correctly represented in the GIMPLE IR, rather than in the final
assembly.



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: rguenth at gcc dot gnu.org @ 2013-12-05 11:12 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-05
     Ever confirmed|0                           |1

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, the change suggests a workaround - use -fwrapv.  And indeed, the
loop does not terminate for c == 32768, but that's a "bug" in the testcase,
  signed short < 32768
is always true (both promote to int, and i++ wraps at 32767).

That i++ now has correctly defined wrapping behaviour also means that IV
analysis is pessimized, as it can assume neither that 'i' does not wrap
nor that the loop terminates.

But that's just the awkward way the testcase is written (without the
C standard in mind ...).

You may also try -funsafe-loop-optimizations that makes us optimistically
assume IVs don't overflow and loops terminate.

"confirmed", but I think this one needs more analysis on _what_ transform
we want to happen and then get assessment on if that is a valid transform
at all.
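
The promotion rule can be checked directly.  A minimal sketch, assuming
`int' is wider than `short' (as on MIPS, with 32-bit `int' and 16-bit
`short'); the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Evaluate `i < c' with the same operand types as the loop condition.
   Both operands undergo integer promotion to int before the compare. */
int loop_condition(signed short i, unsigned short c)
{
    return i < c;
}

/* Returns 1 if `i < c' holds for every value a signed short can take,
   i.e. if the loop condition can never become false for this c. */
int condition_always_true(unsigned short c)
{
    int v;

    /* Assumption: int can represent every unsigned short value, so
       unsigned short promotes to int rather than to unsigned int. */
    assert(INT_MAX > USHRT_MAX);

    for (v = SHRT_MIN; v <= SHRT_MAX; v++) {
        if (!loop_condition((signed short)v, c))
            return 0;
    }
    return 1;
}
```

Note that `loop_condition(-1, 1)' is true under this assumption: -1 is
compared as an int, not converted to 65535 first.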



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: macro@linux-mips.org @ 2013-12-05 23:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #8 from Maciej W. Rozycki <macro@linux-mips.org> ---
Richard,

 I wasn't aware integer promotions applied here, thanks for pointing it
out.  New code is therefore correct while old one was not.  Unfortunately
neither -fwrapv nor -funsafe-loop-optimizations changes anything.

  Maciej



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: jakub at gcc dot gnu.org @ 2013-12-17 10:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Maciej W. Rozycki from comment #8)
> Richard,
> 
>  I wasn't aware integer promotions applied here, thanks for pointing it
> out.  New code is therefore correct while old one was not.  Unfortunately
> neither -fwrapv nor -funsafe-loop-optimizations changes anything.

But then it must be a target-specific thing, because -fwrapv certainly changes
it to the same IL as was emitted before that change (also with -fwrapv, of
course).

So, any reason not to close this PR, because while we generate slower code, the
slower code is actually correct while the old one was wrong?



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: rguenth at gcc dot gnu.org @ 2013-12-19 15:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2



* [Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.
From: macro@linux-mips.org @ 2014-01-09 20:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #10 from Maciej W. Rozycki <macro@linux-mips.org> ---
(In reply to Jakub Jelinek from comment #9)

Jakub,

 The fix has corrected the evaluation of `i++'; however, it has regressed
the evaluation of `i < c'.  This is because in the loop `i' is only ever
assigned values that are lower than or equal to the value of `c' and
here the `int' type can represent all values representable with the
`signed short' and `unsigned short' types.

 Specifically, assuming the properties of the MIPS target, where `int' is
32-bit wide and `short' is 16-bit wide, we have the following cases of
the `c' input value here:

1. 0 -- the loop is not entered because `i' is preset to 0 and equals `c'
   right away.

2. [1,32767] -- `i' is incremented from 0 until it reaches the value of
   `c', at which point the loop terminates.

3. [32768,65535] -- `i' is incremented from 0 up to 32767, at which point
   it wraps around to -32768 and continues being incremented to 32767
   again.  And so on, and so on.  It never reaches the value of `c' or
   any higher value and therefore the loop never terminates.

Based on the above observations it is enough to check for `i == c' as the
loop termination condition.
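
The three cases can be exercised with a small model of the loop counter.
This is a sketch only: `i' is held in a `long' and wrapped by hand from
32767 to -32768, as described above for the MIPS target, rather than
relying on the real `short' conversion.  All names and the iteration cap
are hypothetical.

```c
#include <assert.h>

/* Hypothetical cap: i has only 65536 distinct states, so a loop still
   running after this many iterations is back at i == 0 and will cycle. */
#define ITERATION_CAP 65536L

/* Advance the modelled 16-bit signed counter, wrapping 32767 -> -32768. */
static long next_i(long i)
{
    return (i == 32767) ? -32768L : i + 1;
}

/* Trip count of `for (i = 0; i < c; i++)', or -1 if it never exits. */
long trips_less_than(unsigned short c)
{
    long n = 0;
    long i = 0;
    while (i < (long)c) {
        if (n == ITERATION_CAP)
            return -1;
        i = next_i(i);
        n++;
    }
    return n;
}

/* Trip count of `for (i = 0; i != c; i++)', or -1 if it never exits. */
long trips_not_equal(unsigned short c)
{
    long n = 0;
    long i = 0;
    while (i != (long)c) {
        if (n == ITERATION_CAP)
            return -1;
        i = next_i(i);
        n++;
    }
    return n;
}

/* One representative value of c for each of the three cases. */
void check_cases(void)
{
    assert(trips_less_than(0) == 0 && trips_not_equal(0) == 0);
    assert(trips_less_than(32767) == 32767 && trips_not_equal(32767) == 32767);
    assert(trips_less_than(32768) == -1 && trips_not_equal(32768) == -1);
}
```

In this model both conditions give the same trip count for the sampled
values of `c', including the non-terminating range.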

 So I think this is still a performance regression from the user's point
of view even though I realise this may not necessarily be an optimisation
GCC has been previously designed for.  Therefore I'd prefer to keep the
bug open until/unless we decide it's impractical to implement a code
transformation that would restore previous performance.

  Maciej



* [Bug target/59371] [4.8/4.9/4.10 Regression] Performance regression in GCC 4.8 and later versions.
From: rguenth at gcc dot gnu.org @ 2014-05-22  9:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.3                       |4.8.4

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 4.8.3 is being released, adjusting target milestone.



* [Bug target/59371] [4.8/4.9/5 Regression] Performance regression in GCC 4.8 and later versions.
From: jakub at gcc dot gnu.org @ 2014-12-19 13:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.4                       |4.8.5

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.8.4 has been released.



* [Bug target/59371] [4.8/4.9/5 Regression] Performance regression in GCC 4.8 and later versions.
From: sje at gcc dot gnu.org @ 2015-03-13 20:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #15 from Steve Ellcey <sje at gcc dot gnu.org> ---
I am not sure yet where and how to improve this automatically but I have found
an interesting hand optimization that could point to a way to fix this.  If I
change the original function:

int foo(int *p, unsigned short c)
{
    signed short i;
    int x = 0;
    for (i = 0; i < c; i++) {
        x = x + *p; p++;
    }
    return x;
}

To:

int foo(int *p, unsigned short c)
{
    signed short i;
    unsigned short new_i;
    int x = 0;

    if (c > 32767)
        for (i = 0; i < c; i++) {
            x = x + *p; p++;
        }
    else
        for (new_i = 0; new_i < c; new_i++) {
            x = x + *p; p++;
        }
    return x;
}


Then GCC 5.0 generates an empty infinite loop for the first for loop and
a compact 4-instruction loop (better even than 4.7) for the second for loop.

I am not sure where or if we can do this optimization in GCC but I am going
to investigate some more.



* [Bug target/59371] [4.8/4.9/5/6 Regression] Performance regression in GCC 4.8 and later versions.
From: rguenth at gcc dot gnu.org @ 2015-06-23  8:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.5                       |4.9.3

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.



* [Bug target/59371] [4.9/5/6 Regression] Performance regression in GCC 4.8 and later versions.
From: jakub at gcc dot gnu.org @ 2015-06-26 20:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.9.3 has been released.



* [Bug target/59371] [4.9/5/6 Regression] Performance regression in GCC 4.8 and later versions.
From: jakub at gcc dot gnu.org @ 2015-06-26 20:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.9.3                       |4.9.4



* [Bug target/59371] [9/10/11/12 Regression] Performance regression in GCC 4.8/9/10/11/12 and later versions.
From: jakub at gcc dot gnu.org @ 2021-05-14  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.5                         |9.4

--- Comment #26 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 8 branch is being closed.


* [Bug target/59371] [9/10/11/12 Regression] Performance regression in GCC 4.8/9/10/11/12 and later versions.
From: guojiufu at gcc dot gnu.org @ 2021-05-17  2:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Jiu Fu Guo <guojiufu at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guojiufu at gcc dot gnu.org

--- Comment #27 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
At -O2, a few optimizations (e.g. some loop-based ones) are not enabled, so
the code is not optimized much.

At -O3, GCC can now vectorize it, whereas with GCC 4.8 the code was not
vectorized.  I guess the performance pain may be mitigated.


* [Bug target/59371] [9/10/11/12 Regression] Performance regression in GCC 4.8/9/10/11/12 and later versions.
From: guojiufu at gcc dot gnu.org @ 2021-05-17  3:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

--- Comment #28 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
If the code is changed as below, so that 'i' does not start from 0 and the
comparison code is '!=', then wrap/overflow on 'i' may happen and
optimizations (e.g. vectorization) are not applied.
The patch below tries to optimize this kind of loop:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570424.html

int foo (int *p, unsigned short u_n, unsigned short i)
{
  int x = 0;
  for (; i != u_n; i++) {
    x = x + p[i];
  }
  return x;
}


* [Bug target/59371] [9/10/11/12 Regression] Performance regression in GCC 4.8/9/10/11/12 and later versions.
From: rguenth at gcc dot gnu.org @ 2021-06-01  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.4                         |9.5

--- Comment #29 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9.4 is being released, retargeting bugs to GCC 9.5.


* [Bug target/59371] [10/11/12/13 Regression] Performance regression in GCC 4.8/9/10/11/12 and later versions.
From: rguenth at gcc dot gnu.org @ 2022-05-27  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.5                         |10.4

--- Comment #30 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9 branch is being closed.


* [Bug target/59371] [10/11/12/13 Regression] Performance regression in GCC 4.8/9/10/11/12 and later versions.
From: jakub at gcc dot gnu.org @ 2022-06-28 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #31 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.


* [Bug target/59371] [11/12/13/14 Regression] Performance regression in GCC 4.8/9/10/11/12 and later versions.
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |11.5

--- Comment #32 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.
