public inbox for gcc-bugs@sourceware.org
* [Bug regression/19672] New: Performance regression in simple loop code
@ 2005-01-28 16:20 stephan dot bergmann at sun dot com
  2005-01-28 19:03 ` [Bug regression/19672] [3.4/4.0? Regression] " pinskia at gcc dot gnu dot org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: stephan dot bergmann at sun dot com @ 2005-01-28 16:20 UTC (permalink / raw)
  To: gcc-bugs

In the following C source file test.c,

  int compare(char const * p1, int n1, char const * p2, int n2) {
    char const * q1 = p1 + n1;
    char const * q2 = p2 + n2;
    while (p1 < q1 && p2 < q2) {
      int n = *--q1 - *--q2;
      if (n) {
        return n;
      }
    }
    return n1 - n2;
  }
  int main(void) {
    char str[1000];
    int i;
    for (i = 0; i < 1000000; ++i) {
      compare(str, 1000, str, 1000);
    }
  }

compiled with

  gcc -O2 test.c

the loop body within compare takes 14 instructions on GCC 3.4.3 Linux x86,
compared to only 11 instructions on GCC 3.2.2 Linux x86 (see disassemblies
below), and the GCC 3.4.3 output takes substantially longer to run than the GCC
3.2.2 output:

  3.4.3> time ./a.out
  4.690u 0.001s 0:04.71 99.5%

  3.2.2> time ./a.out
  3.533u 0.002s 0:03.55 99.4%

This seems to bite us on OpenOffice.org, which contains an oft-called function
similar to compare above, and new versions of OOo (built with GCC 3.4) run
slower than old versions (built with GCC 3.2).  The question that remains for us
is whether this performance loss is specific to the given function, or could be
a general problem.

The disassemblies:
08048360 <compare>: ! gcc 3.4.3
8048360: 55                      push   %ebp
8048361: 89 e5                   mov    %esp,%ebp
8048363: 57                      push   %edi
8048364: 56                      push   %esi
8048365: 53                      push   %ebx
8048366: 8b 45 0c                mov    0xc(%ebp),%eax
8048369: 8b 7d 08                mov    0x8(%ebp),%edi
804836c: 8b 75 10                mov    0x10(%ebp),%esi
804836f: 8d 1c 07                lea    (%edi,%eax,1),%ebx
8048372: 8b 45 14                mov    0x14(%ebp),%eax
8048375: 8d 0c 06                lea    (%esi,%eax,1),%ecx
8048378: 90                      nop
8048379: 8d b4 26 00 00 00 00    lea    0x0(%esi),%esi
8048380: 39 df                   cmp    %ebx,%edi      ! <-+
8048382: 0f 92 c0                setb   %al            !   |
8048385: 31 d2                   xor    %edx,%edx      !   |
8048387: 39 ce                   cmp    %ecx,%esi      !   |
8048389: 0f 92 c2                setb   %dl            !   |
804838c: 85 d0                   test   %edx,%eax      !   |
804838e: 74 13                   je     80483a3        !   |
8048390: 4b                      dec    %ebx           !   |
8048391: 49                      dec    %ecx           !   |
8048392: 0f be 01                movsbl (%ecx),%eax    !   |
8048395: 0f be 13                movsbl (%ebx),%edx    !   |
8048398: 29 c2                   sub    %eax,%edx      !   |
804839a: 89 d0                   mov    %edx,%eax      !   |
804839c: 74 e2                   je     8048380        ! --+
804839e: 5b                      pop    %ebx
804839f: 5e                      pop    %esi
80483a0: 5f                      pop    %edi
80483a1: 5d                      pop    %ebp
80483a2: c3                      ret
80483a3: 5b                      pop    %ebx
80483a4: 8b 45 0c                mov    0xc(%ebp),%eax
80483a7: 8b 55 14                mov    0x14(%ebp),%edx
80483aa: 5e                      pop    %esi
80483ab: 29 d0                   sub    %edx,%eax
80483ad: 5f                      pop    %edi
80483ae: 5d                      pop    %ebp
80483af: c3                      ret

08048370 <compare>: ! gcc 3.2.2
8048370: 55                      push   %ebp
8048371: 89 e5                   mov    %esp,%ebp
8048373: 57                      push   %edi
8048374: 8b 45 0c                mov    0xc(%ebp),%eax
8048377: 56                      push   %esi
8048378: 8b 7d 08                mov    0x8(%ebp),%edi
804837b: 53                      push   %ebx
804837c: 8b 75 10                mov    0x10(%ebp),%esi
804837f: 8d 1c 38                lea    (%eax,%edi,1),%ebx
8048382: 8b 45 14                mov    0x14(%ebp),%eax
8048385: 39 df                   cmp    %ebx,%edi
8048387: 8d 0c 30                lea    (%eax,%esi,1),%ecx
804838a: 73 1a                   jae    80483a6
804838c: 39 ce                   cmp    %ecx,%esi
804838e: 73 16                   jae    80483a6
8048390: 4b                      dec    %ebx           ! <-+
8048391: 49                      dec    %ecx           !   |
8048392: 0f be 01                movsbl (%ecx),%eax    !   |
8048395: 0f be 13                movsbl (%ebx),%edx    !   |
8048398: 29 c2                   sub    %eax,%edx      !   |
804839a: 89 d0                   mov    %edx,%eax      !   |
804839c: 75 10                   jne    80483ae        !   |
804839e: 39 df                   cmp    %ebx,%edi      !   |
80483a0: 73 04                   jae    80483a6        !   |
80483a2: 39 ce                   cmp    %ecx,%esi      !   |
80483a4: 72 ea                   jb     8048390        ! --+
80483a6: 8b 45 0c                mov    0xc(%ebp),%eax
80483a9: 8b 55 14                mov    0x14(%ebp),%edx
80483ac: 29 d0                   sub    %edx,%eax
80483ae: 5b                      pop    %ebx
80483af: 5e                      pop    %esi
80483b0: 5f                      pop    %edi
80483b1: 5d                      pop    %ebp
80483b2: c3                      ret

-- 
           Summary: Performance regression in simple loop code
           Product: gcc
           Version: 3.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: regression
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: stephan dot bergmann at sun dot com
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug regression/19672] [3.4/4.0? Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
@ 2005-01-28 19:03 ` pinskia at gcc dot gnu dot org
  2005-01-30  6:39 ` [Bug target/19672] " pinskia at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-28 19:03 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-28 19:03 -------
I cannot test this right now but this might be fixed on the mainline.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
            Summary|Performance regression in   |[3.4/4.0? Regression]
                   |simple loop code            |Performance regression in
                   |                            |simple loop code


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
  2005-01-28 19:03 ` [Bug regression/19672] [3.4/4.0? Regression] " pinskia at gcc dot gnu dot org
@ 2005-01-30  6:39 ` pinskia at gcc dot gnu dot org
  2005-01-31  9:09 ` stephan dot bergmann at sun dot com
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-30  6:39 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-30 06:39 -------
Hmm, this looks like a branch cost problem or related to that.

I think you can get all the speed back by supplying -mbranch-cost=1 but I could be wrong.

Hmm, I wonder if 3.2.x was compiled for i486 (or i386) instead of i686, which
means this might not be a regression.
Can you give the output of "gcc -v" for both GCC's?

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|regression                  |target
  GCC build triplet|i686-pc-linux-gnu           |
   GCC host triplet|i686-pc-linux-gnu           |
            Summary|[3.4/4.0? Regression]       |Performance regression in
                   |Performance regression in   |simple loop code
                   |simple loop code            |
   Target Milestone|---                         |3.4.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
  2005-01-28 19:03 ` [Bug regression/19672] [3.4/4.0? Regression] " pinskia at gcc dot gnu dot org
  2005-01-30  6:39 ` [Bug target/19672] " pinskia at gcc dot gnu dot org
@ 2005-01-31  9:09 ` stephan dot bergmann at sun dot com
  2005-04-07  7:51 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: stephan dot bergmann at sun dot com @ 2005-01-31  9:09 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From stephan dot bergmann at sun dot com  2005-01-31 09:09 -------
"I think you can get all the speed back by supplying -mbranch-cost=1 but I could
be wrong."

No, adding -mbranch-cost=1 leads to only a very minor performance improvement.

"Can you give the output of "gcc -v" for both GCC's?"

3.4.3> gcc -v
Reading specs from
/export/home/sb93797/gcc-3.4.3-inst/lib/gcc/i686-pc-linux-gnu/3.4.3/specs
Configured with: ../gcc-3.4.3/configure --prefix=/export/home/sb93797/gcc-3.4.3-inst
Thread model: posix
gcc version 3.4.3

3.2.2> gcc -v
Reading specs from
/so/env/gcc_3.2.2_linux_libc2.20/bin/../lib/gcc-lib/i686-pc-linux-gnu/3.2.2/specs
Configured with: /export/home/obo/gcc-3.2.2/configure
--prefix=/net/grande.germany/develop6/update/dev/gcc_3.2.2_linux_libc2.20
--with-as=/net/grande.germany/develop6/update/dev/gcc_3.2.2_linux_libc2.20/bin/as
--with-ld=/net/grande.germany/develop6/update/dev/gcc_3.2.2_linux_libc2.20/bin/ld
--enable-languages=c,c++
Thread model: posix
gcc version 3.2.2

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (2 preceding siblings ...)
  2005-01-31  9:09 ` stephan dot bergmann at sun dot com
@ 2005-04-07  7:51 ` pinskia at gcc dot gnu dot org
  2005-06-13  4:06 ` [Bug target/19672] [3.4/4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-04-07  7:51 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|3.4.4                       |---


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (3 preceding siblings ...)
  2005-04-07  7:51 ` pinskia at gcc dot gnu dot org
@ 2005-06-13  4:06 ` pinskia at gcc dot gnu dot org
  2005-06-27  6:59 ` dank at kegel dot com
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-06-13  4:06 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-06-13 04:06 -------
Confirmed.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |3.4.0 4.0.0 4.1.0 3.3.3
      Known to work|                            |3.2.3
            Summary|Performance regression in   |[3.4/4.0/4.1 Regression]
                   |simple loop code            |Performance regression in
                   |                            |simple loop code
   Target Milestone|---                         |3.4.5


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (4 preceding siblings ...)
  2005-06-13  4:06 ` [Bug target/19672] [3.4/4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
@ 2005-06-27  6:59 ` dank at kegel dot com
  2005-06-27  7:39 ` steven at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dank at kegel dot com @ 2005-06-27  6:59 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dank at kegel dot com  2005-06-27 06:59 -------
I just reproduced this on two flavors of Pentium 4 using -O3 -mtune=pentium.
The regression is worse, sometimes much worse, with -fPIC.

Times for gcc-2.95.3, gcc-3.4.3, gcc-4.0.0, gcc-4.1-20050603:

test run 1:
/proc/cpuinfo: cpu family 15, model 2, Intel Pentium 4 CPU 2.80GHz.
no -fPIC:   2.2, 2.7, 0.6, 0.6 seconds
with -fPIC: 2.2, 2.9, 2.9, 2.9 seconds

test run 2:
/proc/cpuinfo: cpu family 15, model 4, Intel Pentium 4 CPU 3.20GHz
no -fPIC:   1.9, 3.2, 0.5, n/a seconds
with -fPIC: 1.9, 3.6, 3.6, n/a seconds


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (5 preceding siblings ...)
  2005-06-27  6:59 ` dank at kegel dot com
@ 2005-06-27  7:39 ` steven at gcc dot gnu dot org
  2005-06-27  7:42 ` steven at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-06-27  7:39 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2005-06-27 07:39:38
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (6 preceding siblings ...)
  2005-06-27  7:39 ` steven at gcc dot gnu dot org
@ 2005-06-27  7:42 ` steven at gcc dot gnu dot org
  2005-06-27  7:59 ` steven at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-06-27  7:42 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-06-27 07:41 -------
For gcc4, the no-PIC case looks pretty good to me ;-) 

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|3.4.0 4.0.0 4.1.0 3.3.3     |3.3.3 3.4.0 4.0.0 4.1.0
      Known to work|3.2.3                       |2.95.3 3.2.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (7 preceding siblings ...)
  2005-06-27  7:42 ` steven at gcc dot gnu dot org
@ 2005-06-27  7:59 ` steven at gcc dot gnu dot org
  2005-07-22 21:15 ` pinskia at gcc dot gnu dot org
  2005-09-27 16:06 ` mmitchel at gcc dot gnu dot org
  10 siblings, 0 replies; 12+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-06-27  7:59 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-06-27 07:59 -------
Dan, can you show the assembler output for 2.95.3 and 4.0 (why is 4.1 "n/a"??) 
for the -fPIC case? 

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dank at kegel dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (8 preceding siblings ...)
  2005-06-27  7:59 ` steven at gcc dot gnu dot org
@ 2005-07-22 21:15 ` pinskia at gcc dot gnu dot org
  2005-09-27 16:06 ` mmitchel at gcc dot gnu dot org
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-07-22 21:15 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-07-22 21:13 -------
Moving to 4.0.2 per Mark.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|3.4.5                       |4.0.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



* [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
  2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
                   ` (9 preceding siblings ...)
  2005-07-22 21:15 ` pinskia at gcc dot gnu dot org
@ 2005-09-27 16:06 ` mmitchel at gcc dot gnu dot org
  10 siblings, 0 replies; 12+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-09-27 16:06 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.2                       |4.0.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672



Thread overview: 12+ messages
2005-01-28 16:20 [Bug regression/19672] New: Performance regression in simple loop code stephan dot bergmann at sun dot com
2005-01-28 19:03 ` [Bug regression/19672] [3.4/4.0? Regression] " pinskia at gcc dot gnu dot org
2005-01-30  6:39 ` [Bug target/19672] " pinskia at gcc dot gnu dot org
2005-01-31  9:09 ` stephan dot bergmann at sun dot com
2005-04-07  7:51 ` pinskia at gcc dot gnu dot org
2005-06-13  4:06 ` [Bug target/19672] [3.4/4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
2005-06-27  6:59 ` dank at kegel dot com
2005-06-27  7:39 ` steven at gcc dot gnu dot org
2005-06-27  7:42 ` steven at gcc dot gnu dot org
2005-06-27  7:59 ` steven at gcc dot gnu dot org
2005-07-22 21:15 ` pinskia at gcc dot gnu dot org
2005-09-27 16:06 ` mmitchel at gcc dot gnu dot org
