public inbox for gcc@gcc.gnu.org
* Re: gcc-2.7 creates faster code than pgcc-1.1.1
@ 1999-03-04 14:31 H.J. Lu
       [not found] ` < m10Igdx-000AUaC@shanghai.varesearch.com >
  1999-03-31 23:46 ` H.J. Lu
  0 siblings, 2 replies; 52+ messages in thread
From: H.J. Lu @ 1999-03-04 14:31 UTC
  To: medtekh; +Cc: egcs

Hi,

It seems that "movzb? %al,%?ax" may be faster than "and? $255,%?ax".
This patch for egcs 1.1.2 seems to make gzip faster.
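For readers following along, here is a minimal C sketch of the two ways to zero-extend a byte that the instructions above correspond to. The function names are illustrative only, and which instruction a given compiler emits for each form is not guaranteed; the point is just that both compute the same value.

```c
#include <stdint.h>

/* Zero-extend the low byte by masking. On i386 this can be done with
   "andl $255, %eax" when source and destination are the same register. */
uint32_t zext_by_mask(uint32_t x)
{
    return x & 255u;
}

/* Zero-extend the low byte by narrowing to 8 bits and widening again.
   On i386 this can be done with "movzbl %al, %eax". */
uint32_t zext_by_cast(uint32_t x)
{
    return (uint32_t)(uint8_t)x;
}
```

Since the results are identical for every input, any speed difference has to come from how the processor handles the two encodings, not from the arithmetic.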

Thanks.


-- 
H.J. Lu (hjl@gnu.org)
---
Thu Mar  4 14:04:49 1999  H.J. Lu  (hjl@gnu.org)

	* config/i386/i386.md (zero_extendqihi2): Use "and" when target
	and source are both "ax" only if TARGET_ZERO_EXTEND_WITH_AND is
	true.
	(zero_extendqisi2): Likewise.

--- ../../../import/egcs-1.1.x/egcs/gcc/config/i386/i386.md	Sun Feb 14 08:30:40 1999
+++ config/i386/i386.md	Thu Mar  4 13:46:07 1999
@@ -1738,7 +1741,7 @@
   {
   rtx xops[2];
 
-  if ((TARGET_ZERO_EXTEND_WITH_AND || REGNO (operands[0]) == 0)
+  if ((TARGET_ZERO_EXTEND_WITH_AND || (0 & REGNO (operands[0]) == 0))
       && REG_P (operands[1]) 
       && REGNO (operands[0]) == REGNO (operands[1]))
     {
@@ -1819,7 +1822,7 @@
   {
   rtx xops[2];
 
-  if ((TARGET_ZERO_EXTEND_WITH_AND || REGNO (operands[0]) == 0)
+  if ((TARGET_ZERO_EXTEND_WITH_AND || (0 & REGNO (operands[0]) == 0))
       && REG_P (operands[1]) 
       && REGNO (operands[0]) == REGNO (operands[1]))
     {

* Re: gcc-2.7 creates faster code than pgcc-1.1.1
@ 1999-03-09  1:19 Терехин Вячеслав
  1999-03-31 23:46 ` Терехин Вячеслав
  0 siblings, 1 reply; 52+ messages in thread
From: Терехин Вячеслав @ 1999-03-09  1:19 UTC
  To: Richard Henderson, Alfred Perlstein; +Cc: H.J. Lu, egcs

>On Fri, Mar 05, 1999 at 12:23:46PM -0500, Alfred Perlstein wrote:
>> > Anyway, "movzb? %al,%?ax" and "and? $255,%?ax" both take 1 tick.
>> > So these instructions are something of a mystery.
>>
>> I think the magic lies in register renaming, instruction
>> caches and all the 'behind the scenes' optimizations that PPro and later
>> versions of x86 chips can do.  It really should be investigated more.
>
>It has nothing to do with register renaming.
>
>It is most likely to be related to instruction alignment -- some
>important insn in the loop is straddling a 16-byte boundary, which
>requires an extra cycle to decode.
>
>I've seen such create up to a 20% difference in runtime on a small loop.
>


It has nothing to do with paragraph (16-byte) boundaries. In the movz case
the xorb insn crosses a paragraph boundary, while with andl no insn
crosses a paragraph boundary.
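To make the boundary check concrete, here is a minimal C sketch of testing whether an instruction straddles an alignment boundary. The function name and the example addresses in the note below are made up for illustration; they are not taken from the gzip binary discussed here.

```c
#include <stdint.h>

/* Return 1 if an instruction of `len` bytes starting at `addr` spans two
   different `align`-byte blocks, 0 otherwise.
   Assumes len >= 1 and align is a nonzero power of two (16 for a
   paragraph boundary). */
int crosses_boundary(uintptr_t addr, unsigned len, unsigned align)
{
    uintptr_t first_block = addr / align;
    uintptr_t last_block = (addr + len - 1) / align;
    return first_block != last_block;
}
```

For example, a 3-byte instruction at 0x0e occupies bytes 0x0e to 0x10 and crosses a 16-byte boundary, while the same instruction at 0x0d ends at 0x0f and does not.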

Sincerely Yours, Eugene.

P.S. For H.J. Lu: I am not claiming that things get slower with movz. The
slowdown I get is about 1% (this could be statistical error). Nevertheless,
in most cases there is no speedup either (or at least none as large as with
decompression).
We should try to find out why and how this happens.
BTW, I have a PPro 180MHz.

* Re: gcc-2.7 creates faster code than pgcc-1.1.1
@ 1999-03-04 23:11 Терехин Вячеслав
       [not found] ` < 005601be66d7$ae033480$288230d4@main.medtech.ru >
  1999-03-31 23:46 ` Терехин Вячеслав
  0 siblings, 2 replies; 52+ messages in thread
From: Терехин Вячеслав @ 1999-03-04 23:11 UTC
  To: H.J. Lu; +Cc: egcs

>Hi,
>
>It seems that "movzb? %al,%?ax" may be faster than "and? $255,%?ax".
>This patch for egcs 1.1.2 seems to make gzip faster.
>
>Thanks.
>
>


Yes, it may be, but not always, as you can see from my message:
decompression becomes faster, while compression becomes slower.

Moreover, this generally slows code down. I have my own patch to egcs
that does the same thing as yours. To suppress andl in favor of movz,
use -mextendz-with-movz.
Compiling several programs with it shows a general slowdown.

Anyway, "movzb? %al,%?ax" and "and? $255,%?ax" both take 1 tick.
So these instructions are something of a mystery.

As you can see from my message, this change in the decompression code
yields a 20% performance boost. At the same time, the whole loop dealing
with the crc is 0x15 bytes long and takes 50% of the time. The one
instruction in it that changes, 5 or 3 bytes long, saves 20% of the total
time, or 40% of the loop time. This should not be possible at all. But it is.
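The arithmetic behind the 40% figure can be sketched in C as follows. The function name is illustrative; the only inputs are the two fractions quoted above (the loop takes 50% of the run time and the total saving is 20%).

```c
/* If a loop accounts for `loop_fraction` of the total run time and the
   whole program gets faster by `total_saving` (both as fractions of the
   original run time), and all of the saving comes from the loop, then the
   loop itself must have become faster by total_saving / loop_fraction.
   Assumes 0 < loop_fraction <= 1. */
double required_loop_saving(double loop_fraction, double total_saving)
{
    return total_saving / loop_fraction;
}
```

With loop_fraction = 0.5 and total_saving = 0.2 this gives 0.4, so a single changed instruction would have to remove 40% of the loop's time, which is why the result looks so implausible.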

Sincerely Yours, Eugene.

* gcc-2.7 creates faster code than pgcc-1.1.1
@ 1999-03-04  3:40 Терехин Вячеслав
       [not found] ` < 001401be6633$fed21a60$a18330d4@main.medtech.ru >
  1999-03-31 23:46 ` Терехин Вячеслав
  0 siblings, 2 replies; 52+ messages in thread
From: Терехин Вячеслав @ 1999-03-04  3:40 UTC
  To: egcs

As I wrote previously, gcc-2.7.2.3 generates a faster gzip
than egcs-1.1.1/pgcc-1.1.1 on PentiumPro.
The slowdown is greater than 10% on decompression.
This can be easily checked if you have RedHat 5.2.
The shipped gzip is compiled with gcc-2.7.2.3.

After several days of searching I finally found the offending
instruction that slows down gzip compiled with egcs-1.1.1/pgcc-1.1.1
on a PentiumPro 180MHz (132MB RAM), but the result seems crazy to me.

The instruction is:
andl $255, %eax
in the body of the flush_window function (util.c), where it is inlined
from updcrc.

If you manually replace it with
movzbl %al, %eax
decompression becomes about 20% faster.
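For context, here is a minimal sketch of a table-driven CRC-32 update loop of the general kind that updcrc implements. It is not gzip's actual source: the names make_crc_table, crc_table and update_crc are chosen for this example, and the table is built at run time with the standard reflected CRC-32 polynomial. The "& 0xffu" in the loop is the kind of byte masking for which a compiler may emit either "andl $255, %eax" or "movzbl %al, %eax".

```c
#include <stddef.h>
#include <stdint.h>

uint32_t crc_table[256];

/* Fill the lookup table for the reflected CRC-32 polynomial 0xEDB88320.
   Must be called once before update_crc. */
void make_crc_table(void)
{
    uint32_t n, c;
    int k;

    for (n = 0; n < 256; n++) {
        c = n;
        for (k = 0; k < 8; k++)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[n] = c;
    }
}

/* Update a running CRC-32 with `len` bytes from `buf`.
   Pass 0 as `crc` for the first chunk. */
uint32_t update_crc(uint32_t crc, const unsigned char *buf, size_t len)
{
    uint32_t c = crc ^ 0xFFFFFFFFu;

    while (len--) {
        /* Index the table with the low byte only: this is the masking step. */
        c = crc_table[(c ^ *buf++) & 0xffu] ^ (c >> 8);
    }
    return c ^ 0xFFFFFFFFu;
}
```

The loop body is only a handful of instructions per input byte, so a small change in how one of them is decoded or scheduled is repeated for every byte of output.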

Everything below was done in the gzip-1.2.4a source directory.

$ make CFLAGS="-O6 -mpentiumpro"
$ time ./gzip -cd egcs-1.1.1.tar.gz > /dev/null

real    0m8.047s
user    0m7.970s
sys     0m0.070s

$ time ./gzip -c egcs-1.1.1.tar.gz > /dev/null

real    0m12.646s
user    0m12.470s
sys     0m0.160s

$ gcc -c -DASMV -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DDIRENT=1 -O6 -mpentiumpro util.c -S
$ sed 's/andl $255,%eax/movzbl %al, %eax/g' util.s > util.S
$ gcc -c -DASMV -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DDIRENT=1 -O6 -mpentiumpro util.S
$ make CFLAGS="-O6 -mpentiumpro"

$ time ./gzip -cd egcs-1.1.1.tar.gz > /dev/null

real    0m6.658s
user    0m6.540s
sys     0m0.110s

$ time ./gzip -c egcs-1.1.1.tar.gz > /dev/null

real    0m12.688s
user    0m12.490s
sys     0m0.180s
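The size of the change can be read off the "real" timings above with a little arithmetic, sketched here in C. The function names are illustrative; the inputs are the wall-clock times from the two runs.

```c
/* Fraction of the original run time that was saved
   (negative if the second run was slower). */
double time_saved_fraction(double before, double after)
{
    return (before - after) / before;
}

/* Speedup factor: how many times faster the second run was. */
double speedup(double before, double after)
{
    return before / after;
}
```

For decompression, 8.047 s falling to 6.658 s is about 17% less wall-clock time, or a speedup factor of about 1.21. For compression, 12.646 s against 12.688 s is a difference of roughly 0.3%, which is small enough to be noise.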

None of this applies to the Pentium processor as far as I know
(I tested it on a Pentium MMX 200MHz).

I do not know why this happens.
If anybody knows how to deal with it, please reply to me
as soon as possible.

And finally, if you have a Pentium Pro or Pentium II, please
run this check and report the result to me.
I wonder whether I have a brain-damaged Pentium Pro.

Sincerely Yours, Eugene.


PS I am not on this mailing list.
It would also be better if you sent replies to bom@classic.iki.rssi.ru
I cannot post from that address directly, as mail from it cannot be
delivered to this list.



end of thread, other threads:[~1999-03-31 23:46 UTC | newest]

Thread overview: 52+ messages
1999-03-04 14:31 gcc-2.7 creates faster code than pgcc-1.1.1 H.J. Lu
     [not found] ` < m10Igdx-000AUaC@shanghai.varesearch.com >
1999-03-04 16:04   ` Martin v. Loewis
     [not found]     ` < 199903050001.BAA00973@mira.isdn.cs.tu-berlin.de >
1999-03-04 16:46       ` H.J. Lu
     [not found]         ` < m10Iil3-000393C@ocean.lucon.org >
1999-03-04 17:03           ` Joe Buck
     [not found]             ` < 199903050102.RAA06944@atrus.synopsys.com >
1999-03-04 17:06               ` H.J. Lu
1999-03-31 23:46                 ` H.J. Lu
1999-03-31 23:46             ` Joe Buck
1999-03-05  6:52           ` craig
     [not found]             ` < 19990305145306.4796.qmail@deer >
1999-03-05  9:18               ` Jeffrey A Law
1999-03-31 23:46                 ` Jeffrey A Law
1999-03-31 23:46             ` craig
1999-03-31 23:46         ` H.J. Lu
1999-03-04 18:08       ` Jeffrey A Law
     [not found]         ` < 13494.920599668@hurl.cygnus.com >
1999-03-04 20:03           ` H.J. Lu
     [not found]             ` < m10Ilq3-00000YC@ocean.lucon.org >
1999-03-04 20:14               ` Jeffrey A Law
1999-03-31 23:46                 ` Jeffrey A Law
1999-03-31 23:46             ` H.J. Lu
1999-03-31 23:46         ` Jeffrey A Law
1999-03-31 23:46     ` Martin v. Loewis
1999-03-31 23:46 ` H.J. Lu
  -- strict thread matches above, loose matches on Subject: below --
1999-03-09  1:19 Терехин Вячеслав
1999-03-31 23:46 ` Терехин Вячеслав
1999-03-04 23:11 Терехин Вячеслав
     [not found] ` < 005601be66d7$ae033480$288230d4@main.medtech.ru >
1999-03-05  9:22   ` Alfred Perlstein
     [not found]     ` < Pine.BSF.3.96.990305121935.7355C-100000@cygnus.rush.net >
1999-03-05 13:02       ` Richard Henderson
1999-03-31 23:46         ` Richard Henderson
1999-03-31 23:46     ` Alfred Perlstein
1999-03-05 15:52   ` H.J. Lu
1999-03-31 23:46     ` H.J. Lu
1999-03-31 23:46 ` Терехин Вячеслав
1999-03-04  3:40 Терехин Вячеслав
     [not found] ` < 001401be6633$fed21a60$a18330d4@main.medtech.ru >
1999-03-04 13:20   ` Jamie Lokier
     [not found]     ` < 19990304222018.A21939@pcep-jamie.cern.ch >
1999-03-04 17:05       ` Zack Weinberg
     [not found]         ` < 199903050104.UAA15335@octiron.phys.columbia.edu >
1999-03-04 18:09           ` Jeffrey A Law
     [not found]             ` <law@hurl.cygnus.com>
     [not found]               ` < 13506.920599740@hurl.cygnus.com >
1999-03-04 20:04                 ` David Edelsohn
     [not found]                   ` < 9903050403.AA36338@marc.watson.ibm.com >
1999-03-04 20:31                     ` Jeffrey A Law
     [not found]                       ` < 13939.920608288@hurl.cygnus.com >
1999-03-05  6:53                         ` craig
     [not found]                           ` < 19990305143358.4747.qmail@deer >
1999-03-05  9:30                             ` Jeffrey A Law
     [not found]                               ` < 15755.920655014@hurl.cygnus.com >
1999-03-05 10:18                                 ` Joe Buck
1999-03-31 23:46                                   ` Joe Buck
1999-03-05 10:19                                 ` craig
1999-03-31 23:46                                   ` craig
1999-03-31 23:46                               ` Jeffrey A Law
1999-03-31 23:46                           ` craig
1999-03-31 23:46                       ` Jeffrey A Law
1999-03-07 11:01                     ` Zack Weinberg
1999-03-31 23:46                       ` Zack Weinberg
1999-03-31 23:46                   ` David Edelsohn
1999-03-31 23:46             ` Jeffrey A Law
1999-03-31 23:46         ` Zack Weinberg
1999-03-31 23:46     ` Jamie Lokier
1999-03-31 23:46 ` Терехин Вячеслав
