1.0 sucessfull, install params questions

public inbox for gcc@gcc.gnu.org

* 1.0 sucessfull, install params questions
@ 1997-12-12  3:55 Hermann Lauer
  1997-12-12  8:55 ` Jeffrey A Law
  0 siblings, 1 reply; 38+ messages in thread
From: Hermann Lauer @ 1997-12-12  3:55 UTC (permalink / raw)
  To: egcs

Hello,

MANY THANKS FOR YOUR WORK ON EGCS !

On an i686-pc-linux-gnulibc1 system (something between Red Hat 4.2 and
Red Hat 5.0, with libc 5.3.12 as the development library), I built egcs
from its source directory as follows:

mkdir objekts
cd objekts

../configure --program-prefix=e
make bootstrap

make install prefix=/tmp/usr/local program-prefix=e

The resulting egcs was then packaged with rpm and installed to /usr/local.

egcs works, but --program-prefix=e seems to be ignored: all binaries in
/tmp/usr/local/bin (that is, /usr/local/bin after installation) have their
default names!

Any comments on this? Should this work in principle?

If the "make install prefix=/tmp/usr/local" trick is not legal with egcs (it
works with gcc-2.7.x), please tell me the correct way to achieve the same result.

Thanks for any help.

Greetings
   Hermann


Output from tests:

                === libio Summary ===

# of expected passes            40

                === libstdc++ Summary ===

# of expected passes            30

                === gcc Summary ===

# of expected passes            4883
# of expected failures          5
# of unsupported tests          7

                === g++ Summary ===

# of expected passes            3400
# of unexpected successes       3
# of expected failures          80
# of untested testcases         6

                === g77 tests ===

FAIL: g77.f-torture/execute/dnrm2.f execution,  -O2 -fomit-frame-pointer
-finline-functions -funroll-loops
FAIL: g77.f-torture/execute/dnrm2.f execution,  -O2 -fomit-frame-pointer
-finline-functions -funroll-all-loops

                === g77 Summary ===

# of expected passes            130
# of unexpected failures        2



-- 
	Hermann Lauer

Bildverarbeitungsgruppe des Interdiziplinaeren Zentrums fuer
wissenschaftliches Rechnen, Universitaet Heidelberg
INF 368; 69120 Heidelberg; Tel: (06221)548826  Fax: (06221)548850
Email: Hermann.Lauer@iwr.uni-heidelberg.de


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: 1.0 sucessfull, install params questions
  1997-12-12  3:55 1.0 sucessfull, install params questions Hermann Lauer
@ 1997-12-12  8:55 ` Jeffrey A Law
  1997-12-12 10:18   ` Michael Poole
       [not found]   ` <law@hurl.cygnus.com>
  0 siblings, 2 replies; 38+ messages in thread
From: Jeffrey A Law @ 1997-12-12  8:55 UTC (permalink / raw)
  To: Hermann.Lauer; +Cc: egcs

  In message < 9712121155.ZM312@giotto.iwr.uni-heidelberg.de >you write:
  > on a i686-pc-linux-gnulibc1 (something between Redhat 4.2 and Redhat 5.0,
  > with libc 5.3.12 as development lib):
Congrats.

  > make install prefix=/tmp/usr/local program-prefix=e
--prefix must be used at configure time, not at install time.  Trying
to do it at install time is just going to lead to problems later.
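To make the distinction concrete, here is a toy model of the path bookkeeping involved when a tree is staged under one root and later unpacked at its real location. It is only a sketch: `staged_path`, `final_path`, and the `/tmp` staging root are illustrative names and values, `/usr/local` is assumed because it is the usual configure default, and nothing here models what the egcs makefiles do with an install-time prefix.

```python
import posixpath


def staged_path(path, staging_root):
    """Path of a file while it sits in a staging tree rooted at staging_root."""
    return posixpath.join(staging_root, path.lstrip("/"))


def final_path(path_in_staging, staging_root):
    """Inverse of staged_path: strip the staging root again when unpacking."""
    rel = posixpath.relpath(path_in_staging, staging_root)
    return posixpath.join("/", rel)


configured_prefix = "/usr/local"   # assumption: the configure default
staging_root = "/tmp"              # assumption: matches prefix=/tmp/usr/local

driver = posixpath.join(configured_prefix, "bin", "gcc")
in_staging = staged_path(driver, staging_root)

print(in_staging)
# The unpacked location agrees with the configured one only if the package
# is finally installed under the prefix that configure was given.
print(final_path(in_staging, staging_root) == driver)
```

Any path that the build records at configure time refers to `configured_prefix`, not to the staging root, so the staged tree is only usable once it has been moved to that prefix.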

Test results look good.

jeff


* Re: 1.0 sucessfull, install params questions
  1997-12-12  8:55 ` Jeffrey A Law
@ 1997-12-12 10:18   ` Michael Poole
       [not found]   ` <law@hurl.cygnus.com>
  1 sibling, 0 replies; 38+ messages in thread
From: Michael Poole @ 1997-12-12 10:18 UTC (permalink / raw)
  To: egcs

On Fri, 12 Dec 1997, Jeffrey A Law wrote:

> 
>   In message < 9712121155.ZM312@giotto.iwr.uni-heidelberg.de >you write:
>   > on a i686-pc-linux-gnulibc1 (something between Redhat 4.2 and Redhat 5.0,
>   > with libc 5.3.12 as development lib):
> Congrats.
> 
>   > make install prefix=/tmp/usr/local program-prefix=e
> --prefix must be used at configure time, not at install time.  Trying
> to do it at install time is just going to lead to problems later.

	Why is this?  Tools like Stow (which manages symlink trees
for you) and Depot (which does quite a bit more) generally require a
separate prefix and install-prefix.  Is having $(prefix)/foo be a symlink to
some other location (for each foo in the installed egcs files) going to
cause problems with egcs in and of itself?

- Michael



* Re: 1.0 sucessfull, install params questions
       [not found]   ` <law@hurl.cygnus.com>
@ 1997-12-12 15:46     ` Hermann Lauer
  1998-07-14 14:29     ` porting EGCS to the Cray T3E Julian C. Cummings
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Hermann Lauer @ 1997-12-12 15:46 UTC (permalink / raw)
  To: egcs; +Cc: egcs

On Dec 12,  9:10am, Jeffrey A Law wrote:
...
>   > make install prefix=/tmp/usr/local program-prefix=e
> --prefix must be used at configure time, not at install time.  Trying
> to do it at install time is just going to lead to problems later.

So what is the recommended way to compile for a given location but first
install to another location (for example, if /usr/local is exported read-only
via NFS; I have heard similar things can happen with AFS)?
This should also be a possible option for package builders, as you don't want
to destroy another egcs at the same location...

Thanks for any advice.

  Hermann


* Re: porting EGCS to the Cray T3E
  1998-07-14 11:29 porting EGCS to the Cray T3E Julian C. Cummings
@ 1998-07-14 11:29 ` Jeffrey A Law
  0 siblings, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 1998-07-14 11:29 UTC (permalink / raw)
  To: Julian C. Cummings; +Cc: egcs

  In message < 9807141122.ZM4885@vapor.acl.lanl.gov >you write:
  > I guess I don't understand your comment.  There are other Cray platforms listed
  > as possibilities in the config.sub file, such as the Cray X-MP, Y-MP, Cray  2,
  > and Cray [C,J,T]90.  Why would these options be listed if Cray is
  > not supported?
They may be supported as host or build machines, but that does not
mean they are supported as a target.

ie, gcc does not know how to generate code for any cray that I'm aware of.
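The build/host/target distinction can be sketched as a small classifier. The function name and the labels are illustrative, the example triplets are ones that appear elsewhere in this thread, and the sketch says nothing about which targets egcs actually supports.

```python
def classify(build, host, target):
    """Name the kind of compiler a (build, host, target) triple describes.

    build  = machine the compiler is compiled on
    host   = machine the compiler will run on
    target = machine the compiler generates code for
    """
    if build == host == target:
        return "native"
    if build == host:
        return "cross"            # runs here, emits code for another machine
    if host == target:
        return "crossed native"   # a native compiler built on another machine
    if build == target:
        return "crossback"
    return "canadian cross"       # all three differ


linux = "i686-pc-linux-gnulibc1"
print(classify(linux, linux, linux))         # native
print(classify(linux, linux, "m68k-coff"))   # cross
```

A machine can be acceptable in the build or host position without there being a code generator for it in the target position, which is the case being described for the Cray.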

jeff


* Re: porting EGCS to the Cray T3E
@ 1998-07-14 11:29 Julian C. Cummings
  1998-07-14 11:29 ` Jeffrey A Law
  0 siblings, 1 reply; 38+ messages in thread
From: Julian C. Cummings @ 1998-07-14 11:29 UTC (permalink / raw)
  To: law; +Cc: egcs

>> It probably failed because it thought you also wanted to *target* for
>> a cray, which isn't supported.
>>
>> jeff

I guess I don't understand your comment.  There are other Cray platforms listed
as possibilities in the config.sub file, such as the Cray X-MP, Y-MP, Cray 2,
and Cray [C,J,T]90.  Why would these options be listed if Cray is not
supported?

Julian C.

-- 
Julian C. Cummings
Advanced Computing Laboratory
Los Alamos National Laboratory
(505) 667-6064
julianc@acl.lanl.gov


* Re: porting EGCS to the Cray T3E
  1998-07-14 14:29     ` porting EGCS to the Cray T3E Julian C. Cummings
@ 1998-07-14 13:20       ` Jeffrey A Law
  0 siblings, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 1998-07-14 13:20 UTC (permalink / raw)
  To: Julian C. Cummings; +Cc: egcs

  In message < 9807141146.ZM5055@vapor.acl.lanl.gov >you write:

  > OK.  As far as I can tell, the egcs build instructions do not make any sort of
  > distinction like this between "host" and "target".  The instructions imply that
  > these are the same thing.  Nevertheless, I tried the suggestion I received of
  > setting the host to "alpha-cray-unicosmk" to see if that works.  It does not.
No, look at configure.html



To configure egcs: 


     % mkdir objdir 
     % cd objdir 
     % srcdir/configure [target] [options] 

target specification 

     egcs has code to correctly determine the correct value for target for nearly all native
     systems. Therefore, we highly recommend you not provide a configure target when
     configuring a native compiler. 
     target must be specified when configuring a cross compiler; examples of valid targets
     would be i960-rtems, m68k-coff, sh-elf, etc. 


  >  I get the exact same behavior as before.  It works for a while, then says
  > 
  > Configuration alpha-cray-unicosmk not supported
This isn't going to help your problem -- there is no support for cray
targets.  As long as you continue to try and build for a cray target
this is going to fail.
jeff


* Re: porting EGCS to the Cray T3E
  1998-07-14 16:57     ` Julian C. Cummings
@ 1998-07-14 14:29       ` Jeffrey A Law
  0 siblings, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 1998-07-14 14:29 UTC (permalink / raw)
  To: Julian C. Cummings; +Cc: egcs

  In message < 9807141424.ZM5539@vapor.acl.lanl.gov >you write:
  > So what would it take for Cygnus to make egcs portable to the Cray T3E?
  > (i.e., how hard is it?, what would it cost?, etc.)
  > egcs would be extremely useful on Cray machines, since Cray CC is terrible
  > and KAI is very slow to update versions of KCC on the T3E.
Note that Cygnus != egcs.

egcs is a project to help build a better free compiler.  Cygnus happens
to be donating various resources to the project (both manpower and
physical resources).

--

Now, having said that, you would need to contact sales@cygnus.com to
get information about a port to the Cray T3E.


jeff


* Re: porting EGCS to the Cray T3E
       [not found]   ` <law@hurl.cygnus.com>
  1997-12-12 15:46     ` Hermann Lauer
@ 1998-07-14 14:29     ` Julian C. Cummings
  1998-07-14 13:20       ` Jeffrey A Law
  1998-07-14 16:57     ` Julian C. Cummings
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: Julian C. Cummings @ 1998-07-14 14:29 UTC (permalink / raw)
  To: law; +Cc: egcs

On Jul 14, 11:20am, Jeffrey A Law wrote:
> Subject: Re: porting EGCS to the Cray T3E
>
>   In message < 9807141122.ZM4885@vapor.acl.lanl.gov >you write:
>   > I guess I don't understand your comment.  There are other Cray platforms listed
>   > as possibilities in the config.sub file, such as the Cray X-MP, Y-MP, Cray 2,
>   > and Cray [C,J,T]90.  Why would these options be listed if Cray is
>   > not supported?
> They may be supported as host or build machines, but that does not
> mean they are supported as a target.
>
> ie, gcc does not know how to generate code for any cray that I'm aware of.

OK.  As far as I can tell, the egcs build instructions do not make any sort of
distinction like this between "host" and "target".  The instructions imply that
these are the same thing.  Nevertheless, I tried the suggestion I received of
setting the host to "alpha-cray-unicosmk" to see if that works.  It does not.
 I get the exact same behavior as before.  It works for a while, then says

Configuration alpha-cray-unicosmk not supported
Configure in /usr/tmp/julianc/EGCS/egcs-objdir-alpha/gcc failed, exiting.

It sounds like you're saying that egcs cannot generate code for a Cray.  But
these are DEC Alpha processors, so I don't see why not.

Is there some sort of verbosity switch I can throw that might tell me why this
is failing?  There is nothing telltale in the config.log file, as you saw.

Julian C.

-- 
Julian C. Cummings
Advanced Computing Laboratory
Los Alamos National Laboratory
(505) 667-6064
julianc@acl.lanl.gov


* Re: porting EGCS to the Cray T3E
       [not found]   ` <law@hurl.cygnus.com>
  1997-12-12 15:46     ` Hermann Lauer
  1998-07-14 14:29     ` porting EGCS to the Cray T3E Julian C. Cummings
@ 1998-07-14 16:57     ` Julian C. Cummings
  1998-07-14 14:29       ` Jeffrey A Law
       [not found]     ` < 1845.919451010@hurl.cygnus.com >
       [not found]     ` < 13506.920599740@hurl.cygnus.com >
  4 siblings, 1 reply; 38+ messages in thread
From: Julian C. Cummings @ 1998-07-14 16:57 UTC (permalink / raw)
  To: law; +Cc: egcs

On Jul 14, 12:00pm, Jeffrey A Law wrote:
> Subject: Re: porting EGCS to the Cray T3E
>
>   >  I get the exact same behavior as before.  It works for a while, then says
>   >
>   > Configuration alpha-cray-unicosmk not supported
> This isn't going to help your problem -- there is no support for cray
> targets.  As long as you continue to try and build for a cray target
> this is going to fail.
> jeff

I see now why choosing "alpha" as a target won't work.  There is a small
assembly language program in config.guess used to distinguish different
types of Alpha processors.  But this does not compile on the T3E; the
assembly language is different.

So what would it take for Cygnus to make egcs portable to the Cray T3E?
(i.e., how hard is it?, what would it cost?, etc.)
egcs would be extremely useful on Cray machines, since Cray CC is terrible
and KAI is very slow to update versions of KCC on the T3E.

Julian C.

-- 
Julian C. Cummings
Advanced Computing Laboratory
Los Alamos National Laboratory
(505) 667-6064
julianc@acl.lanl.gov


* Re: [Q] alpha egc -> motorolla dragonball
       [not found] <19990218210259.A720@loki.midheimar>
@ 1999-02-19  0:09 ` Scott Howard
       [not found]   ` < 36CD1CD3.1FC47334@objsw.com >
  1999-02-28 22:53   ` Scott Howard
  0 siblings, 2 replies; 38+ messages in thread
From: Scott Howard @ 1999-02-19  0:09 UTC (permalink / raw)
  To: crossgcc, egcs

I haven't tried it, so I'm not really on top of the details, but the gnu
documentation warns about problems when you host a cross-compiler for a
32-bit target on a 64-bit host (which I believe applies to the Alpha).

These issues may soon be (or may have already been) resolved by the EGCS
project.  Can anyone on the EGCS list provide some insight?

Kari Davidsson wrote:
> 
> Hi
> 
> I understand that a cross-compiler hosted on Alpha (Linux Alpha) is somewhat
> difficult to build. Is this something that is absolutely undoable?
> Target would be Motorola DragonBall CPU.
> 
> Thanks,
> 
> K.D.
> _______________________________________________
> New CrossGCC FAQ: http://www.objsw.com/CrossGCC
> _______________________________________________
> To remove yourself from the crossgcc list, send
> mail to crossgcc-request@cygnus.com with the
> text 'unsubscribe' (without the quotes) in the
> body of the message.


* Re: [Q] alpha egc -> motorolla dragonball
       [not found]   ` < 36CD1CD3.1FC47334@objsw.com >
@ 1999-02-19 11:04     ` Jeffrey A Law
  1999-02-28 22:53       ` Jeffrey A Law
  0 siblings, 1 reply; 38+ messages in thread
From: Jeffrey A Law @ 1999-02-19 11:04 UTC (permalink / raw)
  To: Scott Howard; +Cc: crossgcc, egcs

  In message < 36CD1CD3.1FC47334@objsw.com >you write:
  > I haven't tried it, so I'm not really on top of the details, but the gnu
  > documentation warns about problems when you host a cross-compiler for a
  > 32-bit target on a 64-bit host (which I believe applies to the Alpha).
  > 
  > These issues may soon be (or may have already been) resolved by the EGCS
  > project.  Can anyone on the EGCS list provide some insight?
What problems are you referring to?  As far as I know it's supposed to work.

It's had bugs in the past, and no doubt it'll have bugs in the future, but
that's no different than building a 32x32 or 32x64 cross compiler.



jeff


* Re: [Q] alpha egc -> motorola dragonball
       [not found]     ` < 1845.919451010@hurl.cygnus.com >
@ 1999-02-19 11:09       ` David Edelsohn
  1999-02-28 22:53         ` David Edelsohn
  0 siblings, 1 reply; 38+ messages in thread
From: David Edelsohn @ 1999-02-19 11:09 UTC (permalink / raw)
  To: Scott Howard; +Cc: crossgcc, egcs

	64-bit -> 32-bit cross-compiling is not inherently a problem.
Some ports are not 64-bit safe.  The PowerPC port in egcs-1.1 release is
not 64-bit safe although the development sources should be fixed.  The
ability to cross-compile from a 64-bit host to a 32-bit target is not a
fundamental limitation in EGCS, but it does depend on the particular port
in question.

David





* gcc-2.7 creates faster code than pgcc-1.1.1
@ 1999-03-04  3:40 Терехин Вячеслав
       [not found] ` < 001401be6633$fed21a60$a18330d4@main.medtech.ru >
  1999-03-31 23:46 ` Терехин Вячеслав
  0 siblings, 2 replies; 38+ messages in thread
From: Терехин Вячеслав @ 1999-03-04  3:40 UTC (permalink / raw)
  To: egcs

As I wrote previously, gcc-2.7.2.3 generates a faster gzip
than egcs-1.1.1/pgcc-1.1.1 on a Pentium Pro.
The slowdown is greater than 10% on decompression.
This can be easily checked if you have Red Hat 5.2:
the shipped gzip is compiled with gcc-2.7.2.3.

After several days of searching I finally found the offending
instruction that slows down gzip compiled with egcs-1.1.1/pgcc-1.1.1
on a Pentium Pro 180MHz (132MB RAM), but the result seems crazy to me.

This instruction is:
andl $255, %eax
in flush_window (util.c) function body (it is inlined from updcrc)

If you manually replace it with
movzbl %al, %eax
this will boost decompression by 20%.
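The substitution is safe because the two instructions leave the same value in %eax. A minimal sketch of that equivalence, modelling the register as a 32-bit unsigned integer; the function names are made up for illustration, and the model ignores the condition flags, which andl sets and movzbl leaves alone.

```python
def andl_255(eax):
    """andl $255, %eax: mask off everything but the low 8 bits."""
    return eax & 0xFF


def movzbl_al(eax):
    """movzbl %al, %eax: take the low byte and zero-extend it to 32 bits."""
    al = eax.to_bytes(4, "little")[0]   # %al is the lowest byte of %eax
    return int.from_bytes(bytes([al, 0, 0, 0]), "little")


samples = [0x00000000, 0x000000FF, 0x12345678, 0xFFFFFFFF, 0xDEADBE00]
print(all(andl_255(v) == movzbl_al(v) for v in samples))
```

The sketch only shows that the results agree; it does not explain the timing difference, which depends on how the processor executes each instruction.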

Everything below was done in the gzip-1.2.4a source directory.

$ make CFLAGS="-O6 -mpentiumpro"
$ time ./gzip -cd egcs-1.1.1.tar.gz > /dev/null

real    0m8.047s
user    0m7.970s
sys     0m0.070s

$ time ./gzip -c egcs-1.1.1.tar.gz > /dev/null

real    0m12.646s
user    0m12.470s
sys     0m0.160s

$ gcc -c -DASMV -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DDIRENT=1 -O6 -mpentiumpro \
    util.c -S
$ sed 's/andl $255,%eax/movzbl %al, %eax/g' util.s > util.S
$ gcc -c -DASMV -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DDIRENT=1 -O6 -mpentiumpro \
    util.S
$ make CFLAGS="-O6 -mpentiumpro"

$ time ./gzip -cd egcs-1.1.1.tar.gz > /dev/null

real    0m6.658s
user    0m6.540s
sys     0m0.110s

$ time ./gzip -c egcs-1.1.1.tar.gz > /dev/null

real    0m12.688s
user    0m12.490s
sys     0m0.180s
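The size of the effect can be worked out directly from the "real" times reported in the two runs. The helper names below are illustrative; the numbers are copied from the timings in this message.

```python
# Wall-clock ("real") times in seconds, copied from the runs above.
before = {"decompress": 8.047, "compress": 12.646}   # andl $255, %eax
after = {"decompress": 6.658, "compress": 12.688}    # movzbl %al, %eax


def speedup(old, new):
    """How many times faster the new run is (values above 1.0 are faster)."""
    return old / new


def time_saved(old, new):
    """Fraction of the old run time that the new run no longer needs."""
    return (old - new) / old


for op in before:
    print(f"{op}: {speedup(before[op], after[op]):.3f}x, "
          f"{time_saved(before[op], after[op]):+.1%} time saved")
```

By this arithmetic the decompression run is about 1.21 times as fast (roughly 17% less wall time), which is consistent with the 20% figure, while the compression time changes by well under 1%.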

As far as I know, none of this applies to the Pentium processor
(I tested it on a Pentium MMX 200MHz).

I do not know why this happens.
If anybody knows how to deal with it, please reply to me
as soon as possible.

And finally, if you have a Pentium Pro or Pentium II, please
run this check and report the result to me.
I wonder whether my Pentium Pro is brain-damaged.

Sincerely yours, Eugene.

PS I am not on this mailing list.
It would also be better if you sent replies to bom@classic.iki.rssi.ru.
I cannot post from that address directly, as mail from it cannot be delivered to this list.


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found] ` < 001401be6633$fed21a60$a18330d4@main.medtech.ru >
@ 1999-03-04 13:20   ` Jamie Lokier
       [not found]     ` < 19990304222018.A21939@pcep-jamie.cern.ch >
  1999-03-31 23:46     ` Jamie Lokier
  0 siblings, 2 replies; 38+ messages in thread
From: Jamie Lokier @ 1999-03-04 13:20 UTC (permalink / raw)
  To: Терехин Вячеслав, egcs

Терехин Вячеслав wrote:
> After several day of search I finally find out offending
> instruction that slow down gzip compiled with egcs-1.1.1/pgcc-1.1.1
> on PentiumPro 180MHz (132MB RAM) but the result seems crazy to me.
> 
> This instruction is:
> andl $255, %eax
> in flush_window (util.c) function body (it is inlined from updcrc)
> 
> if you manually replace it with
> movzbl %al, %eax
> this will boost decompression by 20%.

In the past I have written hand-optimised assembly language, tuned for
the different x86 families, and I found movzbl to be a very effective
instruction on the Pentium Pro.  So what you describe sounds correct.

Another is to do xorl %eax,%eax just before loading something into %al.
That is fast on the PPro too.

-- Jamie


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]     ` < 19990304222018.A21939@pcep-jamie.cern.ch >
@ 1999-03-04 17:05       ` Zack Weinberg
       [not found]         ` < 199903050104.UAA15335@octiron.phys.columbia.edu >
  1999-03-31 23:46         ` Zack Weinberg
  0 siblings, 2 replies; 38+ messages in thread
From: Zack Weinberg @ 1999-03-04 17:05 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: egcs

On Thu, 4 Mar 1999 22:20:18 +0100, Jamie Lokier wrote:
>> After several day of search I finally find out offending
>> instruction that slow down gzip compiled with egcs-1.1.1/pgcc-1.1.1
>> on PentiumPro 180MHz (132MB RAM) but the result seems crazy to me.
>> 
>> This instruction is:
>> andl $255, %eax
>> in flush_window (util.c) function body (it is inlined from updcrc)
>> 
>> if you manually replace it with
>> movzbl %al, %eax
>> this will boost decompression by 20%.
>
>In the past I have written hand-optimised assembly language, tuned for
>the different x86 families, and I found movzbl to be a very effective
>instruction on the Pentium Pro.  So what you describe sounds correct.
>
>Another is to do xorl %eax,%eax just before loading something into %al.
>That is fast on the PPro too.

A related issue:  I see us generate a lot of code like this for loops over
strings:

loop:
	xorl %eax, %eax
	movb (%esi), %al
	incl %esi
	movb %al, (%edi)
	incl %edi
	testl %eax, %eax
	jne loop

After the first iteration, the xorl is unnecessary.  We ought to be able to
hoist it out of the loop.
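The claim can be checked with a small simulation of the loop. This is a sketch under stated assumptions: %eax is modelled as a 32-bit integer, `copy_string` and `hoist_xor` are made-up names, and the entry value of %eax is arbitrary garbage. Because movb writes only %al, the upper 24 bits stay zero once they have been cleared.

```python
def copy_string(src, hoist_xor):
    """Simulate the copy loop; return (bytes copied, %eax values at the test)."""
    eax = 0xDEADBEEF                      # arbitrary garbage on entry
    out = bytearray()
    seen = []
    if hoist_xor:
        eax = 0                           # xorl %eax, %eax, once before the loop
    for byte in src:
        if not hoist_xor:
            eax = 0                       # xorl %eax, %eax, on every iteration
        eax = (eax & 0xFFFFFF00) | byte   # movb (%esi), %al
        out.append(eax & 0xFF)            # movb %al, (%edi)
        seen.append(eax)
        if eax == 0:                      # test of %eax, then jne loop
            break
    return bytes(out), seen


data = b"egcs\0"
print(copy_string(data, hoist_xor=True) == copy_string(data, hoist_xor=False))
```

Both variants copy the same bytes and see the same register values at the loop test, so in this model clearing the register once before the loop is enough.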

zw


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]         ` < 199903050104.UAA15335@octiron.phys.columbia.edu >
@ 1999-03-04 18:09           ` Jeffrey A Law
  1999-03-31 23:46             ` Jeffrey A Law
  0 siblings, 1 reply; 38+ messages in thread
From: Jeffrey A Law @ 1999-03-04 18:09 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Jamie Lokier, egcs

  In message < 199903050104.UAA15335@octiron.phys.columbia.edu >you write:
  > A related issue:  I see us generate a lot of code like this for loops over
  > strings:
  > 
  > loop:
  > 	xorl %eax, %eax
  > 	movb (%esi), %al
  > 	incl %esi
  > 	movb %al, (%edi)
  > 	incl %edi
  > 	testl %eax, %eax
  > 	jne loop
  > 
  > After the first iteration, the xorl is unnecessary.  We ought to be able to
  > hoist it out of the loop.
True, but I don't believe the loop optimizer is prepared to do that since it
doesn't keep track of what bits are active vs what bits are inactive in a 
value.
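For illustration, tracking which bits are known to be zero could look roughly like the following. This is a hypothetical sketch, not a description of any pass in egcs: the state is a 32-bit mask of bits known to be zero, and all function names are invented.

```python
ALL = 0xFFFFFFFF


def after_xor_self(_known_zero):
    """xorl %eax, %eax: afterwards every bit is known to be zero."""
    return ALL


def after_movb_to_al(known_zero):
    """movb mem, %al: the low byte becomes unknown, the rest is untouched."""
    return known_zero & 0xFFFFFF00


def meet(a, b):
    """A bit is known zero at a join only if it is known zero on both paths."""
    return a & b


def loop_header_state(entry):
    """Iterate to a fixed point for a loop whose body is a single movb to %al."""
    state = entry
    while True:
        back_edge = after_movb_to_al(state)
        new_state = meet(entry, back_edge)
        if new_state == state:
            return state
        state = new_state


cleared_before_loop = after_xor_self(0)
print(hex(loop_header_state(cleared_before_loop)))   # upper 24 bits known zero
print(hex(loop_header_state(0)))                     # nothing known
```

With the register cleared before the loop, the fixed point keeps the upper 24 bits known zero on every iteration, which is the fact an optimizer would need in order to drop the xorl from the loop body.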

jeff


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]     ` < 13506.920599740@hurl.cygnus.com >
@ 1999-03-04 20:04       ` David Edelsohn
       [not found]         ` < 9903050403.AA36338@marc.watson.ibm.com >
  1999-03-31 23:46         ` David Edelsohn
  0 siblings, 2 replies; 38+ messages in thread
From: David Edelsohn @ 1999-03-04 20:04 UTC (permalink / raw)
  To: law; +Cc: Zack Weinberg, Jamie Lokier, egcs

>>>>> Jeffrey A Law writes:

Jeff> True, but I don't believe the loop optimizer is prepared to do that since it
Jeff> doesn't keep track of what bits are active vs what bits are inactive in a 
Jeff> value.

	GCC is missing a general feature of value propagation which would
help with a lot of optimizations like this.  Hopefully this infrastructure
will be added or contributed someday.

David


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]         ` < 9903050403.AA36338@marc.watson.ibm.com >
@ 1999-03-04 20:31           ` Jeffrey A Law
       [not found]             ` < 13939.920608288@hurl.cygnus.com >
  1999-03-31 23:46             ` Jeffrey A Law
  1999-03-07 11:01           ` Zack Weinberg
  1 sibling, 2 replies; 38+ messages in thread
From: Jeffrey A Law @ 1999-03-04 20:31 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, Jamie Lokier, egcs

  In message < 9903050403.AA36338@marc.watson.ibm.com >you write:
  > 	GCC is missing a general feature of value propagation which would
  > help with a lot of optimizations like this.  Hopefully this infrastructure
  > will be added or contributed someday.
Most of the papers I've read have indicated only trivial gains from value
range propagation.  I've also had discussions with folks that have implemented
this opt in a commercial compiler -- it's so minor of a win that they didn't
consider it worth the effort.

Its best use appears to be for optimizing array bounds checking in languages
that require such checks.


jeff


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]             ` < 13939.920608288@hurl.cygnus.com >
@ 1999-03-05  6:53               ` craig
       [not found]                 ` < 19990305143358.4747.qmail@deer >
  1999-03-31 23:46                 ` craig
  0 siblings, 2 replies; 38+ messages in thread
From: craig @ 1999-03-05  6:53 UTC (permalink / raw)
  To: law; +Cc: craig

>Most of the papers I've read have indicated only trivial gains from value
>range propagation.  I've also had discussions with folks that have implemented
>this opt in a commercial compiler -- it's so minor of a win that they didn't
>consider it worth the effort.

If that were really true *generally*, i.e. for floating-point as well,
then we could happily make -mieee the default for egcs/gcc on Alphas,
for example.  (This would make Alpha-generated floating-point code
much slower, of course, which is why I mention it: if it was possible
to use value-range propagation to determine when the special code-
generation normally needed for full IEEE range wasn't needed after all,
then some of that performance could be regained.)

But, I have to admit I can't document any really clear performance
wins from my own hand-tuned assembly code (written over the past few
decades) deriving solely from value-range propagation, especially
for *integer* (versus floating-point) values.

Generally, I think being able to avoid even checking for exceptional
values might be one source of performance wins from this: e.g. if
we knew it was "impossible" to divide by zero here, square-root a
negative number there, and so on, we could save generated extra
code, thus making Icache misses theoretically less frequent.
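A minimal sketch of the idea, assuming each value carries a known closed range: the `Range` type and both predicates are hypothetical and only show when a guard could be dropped, not how a compiler would derive the ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    lo: float   # smallest value the expression can take
    hi: float   # largest value the expression can take


def needs_zero_check(divisor: Range) -> bool:
    """A divide-by-zero guard is needed only if 0 lies inside the range."""
    return divisor.lo <= 0 <= divisor.hi


def needs_negative_check(arg: Range) -> bool:
    """A sqrt domain guard is needed only if the argument can be negative."""
    return arg.lo < 0


print(needs_zero_check(Range(1, 10)))       # False: guard can be omitted
print(needs_zero_check(Range(-1, 1)))       # True
print(needs_negative_check(Range(0, 4)))    # False: guard can be omitted
print(needs_negative_check(Range(-2, 4)))   # True
```

Every guard that can be proven unnecessary this way is code that never has to be generated or fetched.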

        tq vm, (burley)


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]                 ` < 19990305143358.4747.qmail@deer >
@ 1999-03-05  9:30                   ` Jeffrey A Law
       [not found]                     ` < 15755.920655014@hurl.cygnus.com >
  1999-03-31 23:46                     ` Jeffrey A Law
  0 siblings, 2 replies; 38+ messages in thread
From: Jeffrey A Law @ 1999-03-05  9:30 UTC (permalink / raw)
  To: craig; +Cc: dje, zack, egcs, egcs

  In message < 19990305143358.4747.qmail@deer >you write:
  > If that were really true *generally*, i.e. for floating-point as well,
  > then we could happily make -mieee the default for egcs/gcc on Alphas,
  > for example.  (This would make Alpha-generated floating-point code
  > much slower, of course, which is why I mention it: if it was possible
  > to use value-range propagation to determine when the special code-
  > generation normally needed for full IEEE range wasn't needed after all,
  > then some of that performance could be regained.)
None of the papers discussed it in a floating point context, merely from an
integer context.  The basics were you had a range [min,max] and a bit which
indicated that the value was in or out of the range which was built up on a
basic block level for each expression.

The ranges were then propagated through the flow graph like any other local
property.  A trivial example: at a flow merge point where one path had
a range 0..4 inclusive and the other a range 6..12 inclusive, the
resulting range would be 0..12 inclusive. [ It didn't try to track the hole
at 5. ]
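The merge rule in that example can be written in a few lines. This sketch covers only the "value is inside the range" case and leaves out the in/out bit; `merge` is an illustrative name, not something taken from the papers.

```python
def merge(a, b):
    """Combine two inclusive (min, max) ranges at a flow merge point."""
    return (min(a[0], b[0]), max(a[1], b[1]))


merged = merge((0, 4), (6, 12))
print(merged)                           # (0, 12)
print(merged[0] <= 5 <= merged[1])      # True: the hole at 5 is not tracked
```

The result is a single interval, so it is conservative: it admits 5 even though neither incoming path can produce it.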

As you mention it could be used to detect and eliminate things like domain
checks which depend solely on the range, not the precision.

Knowing that certain numbers can't be a NaN or Inf would lead to being able
to apply more identity operations on floating point values.

It'd be a lot of work though.  I suspect there's other lower hanging fruit
we can/should go after (assignment motion, partial dead code elimination,
sparse conditional constant propagation, etc).

jeff


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]                     ` < 15755.920655014@hurl.cygnus.com >
@ 1999-03-05 10:18                       ` Joe Buck
  1999-03-31 23:46                         ` Joe Buck
  1999-03-05 10:19                       ` craig
  1 sibling, 1 reply; 38+ messages in thread
From: Joe Buck @ 1999-03-05 10:18 UTC (permalink / raw)
  To: law; +Cc: craig, dje, zack, egcs, egcs

> None of the papers discussed it in a floating point context, merely from an
> integer context.  The basics were you had a range [min,max] and a bit which
> indicated that the value was in or out of the range which was built up on a
> basic block level for each expression.
> ...
> As you mention it could be used to detect and eliminate things like domain
> checks which depend solely on the range, not the precision.

There's been work on fixed point optimization for embedded DSP that also
propagates precision information.  For example, we have [min,max,prec],
which means that the value is known to be a multiple of pow(2,prec) as well
as being in [min,max].  The idea is to eliminate both the overflow-check
operations and the rounding operations in generated code.
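
A rough C sketch of how such a [min,max,prec] fact could be used, assuming
integer-scaled values and a non-negative prec smaller than the width of
long. The type and function names are hypothetical and are not taken from
any particular DSP compiler:

```c
/* Hypothetical fact: the value lies in [min, max] and is a
   multiple of 2^prec. */
struct fixed_range {
    long min;
    long max;
    unsigned prec;
};

/* Does a concrete value agree with the fact? */
int satisfies(struct fixed_range r, long v)
{
    long step = 1L << r.prec;

    return v >= r.min && v <= r.max && v % step == 0;
}

/* Rounding to a multiple of 2^target_prec is a no-op when the value
   is already known to be at least that coarse. */
int rounding_is_redundant(struct fixed_range r, unsigned target_prec)
{
    return r.prec >= target_prec;
}

/* An overflow (saturation) check against [lo, hi] is a no-op when
   the known range already fits inside it. */
int overflow_check_is_redundant(struct fixed_range r, long lo, long hi)
{
    return r.min >= lo && r.max <= hi;
}
```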


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]                     ` < 15755.920655014@hurl.cygnus.com >
  1999-03-05 10:18                       ` Joe Buck
@ 1999-03-05 10:19                       ` craig
  1999-03-31 23:46                         ` craig
  1 sibling, 1 reply; 38+ messages in thread
From: craig @ 1999-03-05 10:19 UTC (permalink / raw)
  To: law; +Cc: craig

>It'd be a lot of work though.  I suspect there's other lower hanging fruit
>we can/should go after (assignment motion, partial dead code elimination,
>sparse conditional constant propagation, etc).

No analysis to offer, but: my instincts say you're right.

        tq vm, (burley)


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
       [not found]         ` < 9903050403.AA36338@marc.watson.ibm.com >
  1999-03-04 20:31           ` Jeffrey A Law
@ 1999-03-07 11:01           ` Zack Weinberg
  1999-03-31 23:46             ` Zack Weinberg
  1 sibling, 1 reply; 38+ messages in thread
From: Zack Weinberg @ 1999-03-07 11:01 UTC (permalink / raw)
  To: David Edelsohn; +Cc: law, egcs

On Thu, 04 Mar 1999 23:03:58 -0500, David Edelsohn wrote:
>>>>>> Jeffrey A Law writes:
>
>Jeff> True, but I don't believe the loop optimizer is prepared to do that since it
>Jeff> doesn't keep track of what bits are active vs what bits are inactive in a
>Jeff> value.
>
>	GCC is missing a general feature of value propagation which would
>help with a lot of optimizations like this.  Hopefully this infrastructure
>will be added or contributed someday.

For the case I'm interested in, we don't need general value propagation,
only to recognize that we are loading QImode values into a register without
sign extension.  Mode-based range analysis ought to be simpler than full
value propagation.
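
As an illustration of what a mode-based range could look like, here is a
small C sketch. It assumes hypothetical mode tags loosely named after GCC's
QImode/HImode/SImode and is not how GCC represents modes internally. The
point is only that a zero-extended 8-bit load is known to lie in 0..255, so
a later "and" with 255 changes nothing:

```c
/* Hypothetical mode tags; the enumerator value is the width in bits. */
enum int_mode { MODE_QI = 8, MODE_HI = 16, MODE_SI = 32 };

struct bit_range {
    unsigned long min;
    unsigned long max;
};

/* Range of a value zero-extended from the given mode: 0 .. 2^bits - 1. */
struct bit_range zero_extended_range(enum int_mode mode)
{
    struct bit_range r;

    r.min = 0;
    r.max = (mode == MODE_SI) ? 0xffffffffUL
                              : (1UL << (unsigned)mode) - 1;
    return r;
}

/* An "and" with a mask of the form 2^k - 1 is a no-op when every
   possible value is already <= mask. */
int mask_is_redundant(struct bit_range r, unsigned long mask)
{
    return (mask & (mask + 1)) == 0 && r.max <= mask;
}
```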

If we knew how to generate QImode compares, we wouldn't need to clear the
register at all.  That may be something fixable in i386.md, I'm not sure.

zw


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-05  9:30                   ` Jeffrey A Law
       [not found]                     ` < 15755.920655014@hurl.cygnus.com >
@ 1999-03-31 23:46                     ` Jeffrey A Law
  1 sibling, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 1999-03-31 23:46 UTC (permalink / raw)
  To: craig; +Cc: dje, zack, egcs, egcs

  In message < 19990305143358.4747.qmail@deer >you write:
  > If that were really true *generally*, i.e. for floating-point as well,
  > then we could happily make -mieee the default for egcs/gcc on Alphas,
  > for example.  (This would make Alpha-generated floating-point code
  > much slower, of course, which is why I mention it: if it was possible
  > to use value-range propagation to determine when the special code-
  > generation normally needed for full IEEE range wasn't needed after all,
  > then some of that performance could be regained.)
None of the papers discussed it in a floating point context, only in an
integer context.  The basic idea was that you had a range [min,max] plus a
bit indicating whether the value was inside or outside that range; this was
built up at the basic block level for each expression.

The ranges were then propagated through the flow graph like any other
local property.  A trivial example: at a flow merge point where one path
had a range 0..4 inclusive and the other a range 6..12 inclusive, the
resulting range would be 0..12 inclusive. [ It didn't try to track the hole
at 5. ]

As you mention it could be used to detect and eliminate things like domain
checks which depend solely on the range, not the precision.

Knowing that certain numbers can't be a NaN or Inf would lead to being able
to apply more identity operations on floating point values.

It'd be a lot of work though.  I suspect there's other lower hanging fruit
we can/should go after (assignment motion, partial dead code elimination,
sparse conditional constant propagation, etc).

jeff


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-05 10:19                       ` craig
@ 1999-03-31 23:46                         ` craig
  0 siblings, 0 replies; 38+ messages in thread
From: craig @ 1999-03-31 23:46 UTC (permalink / raw)
  To: law; +Cc: craig

>It'd be a lot of work though.  I suspect there's other lower hanging fruit
>we can/should go after (assignment motion, partial dead code elimination,
>sparse conditional constant propagation, etc).

No analysis to offer, but: my instincts say you're right.

        tq vm, (burley)


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-04 20:04       ` gcc-2.7 creates faster code than pgcc-1.1.1 David Edelsohn
       [not found]         ` < 9903050403.AA36338@marc.watson.ibm.com >
@ 1999-03-31 23:46         ` David Edelsohn
  1 sibling, 0 replies; 38+ messages in thread
From: David Edelsohn @ 1999-03-31 23:46 UTC (permalink / raw)
  To: law; +Cc: Zack Weinberg, Jamie Lokier, egcs

>>>>> Jeffrey A Law writes:

Jeff> True, but I don't believe the loop optimizer is prepared to do that since it
Jeff> doesn't keep track of what bits are active vs what bits are inactive in a 
Jeff> value.

	GCC is missing a general feature of value propagation which would
help with a lot of optimizations like this.  Hopefully this infrastructure
will be added or contributed someday.

David


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-04 13:20   ` Jamie Lokier
       [not found]     ` < 19990304222018.A21939@pcep-jamie.cern.ch >
@ 1999-03-31 23:46     ` Jamie Lokier
  1 sibling, 0 replies; 38+ messages in thread
From: Jamie Lokier @ 1999-03-31 23:46 UTC (permalink / raw)
  To: Терехин Вячеслав, egcs

Терехин Вячеслав wrote:
> After several days of searching I finally found the offending
> instruction that slows down gzip compiled with egcs-1.1.1/pgcc-1.1.1
> on a PentiumPro 180MHz (132MB RAM), but the result seems crazy to me.
> 
> The instruction is:
> andl $255, %eax
> in the body of the flush_window function (util.c); it is inlined from updcrc.
> 
> If you manually replace it with
> movzbl %al, %eax
> this will boost decompression by 20%.

In the past I have written hand-optimised assembly language, tuned for
the different x86 families, and I found movzbl to be a very effective
instruction on the Pentium Pro.  So what you describe sounds correct.

Another is to do xorl %eax,%eax just before loading something into %al.
That is fast on the PPro too.
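
For reference, andl $255 and movzbl compute the same value; only their cost
on a given processor differs. A small C sketch of the two source-level
forms, with hypothetical function names. Which instruction a compiler emits
for either form is its own choice, so this illustrates the equivalence and
nothing about code generation:

```c
#include <stdint.h>

/* Keep the low 8 bits by masking (the "andl $255" form). */
uint32_t low_byte_mask(uint32_t x)
{
    return x & 0xffu;
}

/* Keep the low 8 bits by truncating to a byte and zero-extending
   back to 32 bits (the "movzbl" form). */
uint32_t low_byte_zext(uint32_t x)
{
    return (uint8_t)x;
}
```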

-- Jamie


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-04 20:31           ` Jeffrey A Law
       [not found]             ` < 13939.920608288@hurl.cygnus.com >
@ 1999-03-31 23:46             ` Jeffrey A Law
  1 sibling, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 1999-03-31 23:46 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, Jamie Lokier, egcs

  In message < 9903050403.AA36338@marc.watson.ibm.com >you write:
  > 	GCC is missing a general feature of value propagation which would
  > help with a lot of optimizations like this.  Hopefully this infrastructure
  > will be added or contributed someday.
Most of the papers I've read have indicated only trivial gains from value
range propagation.  I've also had discussions with folks that have implemented
this opt in a commercial compiler -- it's so minor of a win that they didn't
consider it worth the effort.

Its best use appears to be for optimizing array bounds checking in languages
that require such checks.
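
A minimal C sketch of that use, assuming a hypothetical index_range type
that holds the range proven for an index. It is not taken from any real
compiler; it only shows the test that lets a bounds check be dropped:

```c
/* Hypothetical inclusive range proven for an array index. */
struct index_range {
    long min;
    long max;
};

/* The run-time check "0 <= i && i < length" can be removed when the
   whole proven range of i already lies inside the array. */
int bounds_check_is_redundant(struct index_range i, long length)
{
    return i.min >= 0 && i.max < length;
}
```

For an index known to be in 0..9, the check is redundant for an array of
length 10 but must stay for an array of length 9.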


jeff


* gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-04  3:40 gcc-2.7 creates faster code than pgcc-1.1.1 Терехин Вячеслав
       [not found] ` < 001401be6633$fed21a60$a18330d4@main.medtech.ru >
@ 1999-03-31 23:46 ` Терехин Вячеслав
  1 sibling, 0 replies; 38+ messages in thread
From: Терехин Вячеслав @ 1999-03-31 23:46 UTC (permalink / raw)
  To: egcs

As I wrote previously, gcc-2.7.2.3 generates a faster gzip
than egcs-1.1.1/pgcc-1.1.1 on the PentiumPro.
The slowdown is greater than 10% on decompression.
This can easily be checked if you have RedHat 5.2:
the shipped gzip is compiled with gcc-2.7.2.3.

After several days of searching I finally found the offending
instruction that slows down gzip compiled with egcs-1.1.1/pgcc-1.1.1
on a PentiumPro 180MHz (132MB RAM), but the result seems crazy to me.

The instruction is:
andl $255, %eax
in the body of the flush_window function (util.c); it is inlined from updcrc.

If you manually replace it with
movzbl %al, %eax
this will boost decompression by 20%.

Everything below was done in the gzip-1.2.4a source directory.

$ make CFLAGS="-O6 -mpentiumpro"
$ time ./gzip -cd egcs-1.1.1.tar.gz > /dev/null

real    0m8.047s
user    0m7.970s
sys     0m0.070s

$ time ./gzip -c egcs-1.1.1.tar.gz > /dev/null

real    0m12.646s
user    0m12.470s
sys     0m0.160s

$ gcc -c -DASMV -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DDIRENT=1 -O6 -mpentiumpro \
      util.c -S
$ sed 's/andl $255,%eax/movzbl %al, %eax/g' util.s > util.S
$ gcc -c -DASMV -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DDIRENT=1 -O6 -mpentiumpro \
      util.S
$ make CFLAGS="-O6 -mpentiumpro"

$ time ./gzip -cd egcs-1.1.1.tar.gz > /dev/null

real    0m6.658s
user    0m6.540s
sys     0m0.110s

$ time ./gzip -c egcs-1.1.1.tar.gz > /dev/null

real    0m12.688s
user    0m12.490s
sys     0m0.180s

None of this applies to the Pentium processor as far as I know
(I tested it on a Pentium MMX 200MHz).

I do not know why this happens.
If anybody knows how to deal with it, please reply to me
as soon as possible.

And finally, if you have a Pentium Pro or Pentium II, please
run this check and report the result to me.
I wonder whether I have a brain-damaged Pentium Pro.

Sincerely Yours, Eugene.

PS I am not on this mailing list.
It would also be better if you sent replies to bom@classic.iki.rssi.ru
I cannot use it directly, as mail from it cannot be delivered to this list.


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-04 18:09           ` Jeffrey A Law
@ 1999-03-31 23:46             ` Jeffrey A Law
  0 siblings, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 1999-03-31 23:46 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Jamie Lokier, egcs

  In message < 199903050104.UAA15335@octiron.phys.columbia.edu >you write:
  > A related issue:  I see us generate a lot of code like this for loops over
  > strings:
  > 
  > loop:
  > 	xorl %eax, %eax
  > 	movb (%esi), %al
  > 	incl %esi
  > 	movb %al, (%edi)
  > 	incl %edi
  > 	testl %eax, %eax
  > 	jne loop
  > 
  > After the first iteration, the xorl is unnecessary.  We ought to be able to
  > hoist it out of the loop.
True, but I don't believe the loop optimizer is prepared to do that since it
doesn't keep track of what bits are active vs what bits are inactive in a 
value.

jeff


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-05 10:18                       ` Joe Buck
@ 1999-03-31 23:46                         ` Joe Buck
  0 siblings, 0 replies; 38+ messages in thread
From: Joe Buck @ 1999-03-31 23:46 UTC (permalink / raw)
  To: law; +Cc: craig, dje, zack, egcs, egcs

> None of the papers discussed it in a floating point context, merely from an
> integer context.  The basics were you had a range [min,max] and a bit which
> indicated that the value was in or out of the range which was built up on a
> basic block level for each expression.
> ...
> As you mention it could be used to detect and eliminate things like domain
> checks which depend solely on the range, not the precision.

There's been work on fixed point optimization for embedded DSP that also
propagates precision information.  For example, we have [min,max,prec],
which means that the value is known to be a multiple of pow(2,prec) as well
as being in [min,max].  The idea is to eliminate both the overflow-check
operations and the rounding operations in generated code.


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-04 17:05       ` Zack Weinberg
       [not found]         ` < 199903050104.UAA15335@octiron.phys.columbia.edu >
@ 1999-03-31 23:46         ` Zack Weinberg
  1 sibling, 0 replies; 38+ messages in thread
From: Zack Weinberg @ 1999-03-31 23:46 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: egcs

On Thu, 4 Mar 1999 22:20:18 +0100, Jamie Lokier wrote:
>> After several days of searching I finally found the offending
>> instruction that slows down gzip compiled with egcs-1.1.1/pgcc-1.1.1
>> on a PentiumPro 180MHz (132MB RAM), but the result seems crazy to me.
>> 
>> The instruction is:
>> andl $255, %eax
>> in the body of the flush_window function (util.c); it is inlined from updcrc.
>> 
>> If you manually replace it with
>> movzbl %al, %eax
>> this will boost decompression by 20%.
>
>In the past I have written hand-optimised assembly language, tuned for
>the different x86 families, and I found movzbl to be a very effective
>instruction on the Pentium Pro.  So what you describe sounds correct.
>
>Another is to do xorl %eax,%eax just before loading something into %al.
>That is fast on the PPro too.

A related issue:  I see us generate a lot of code like this for loops over
strings:

loop:
	xorl %eax, %eax
	movb (%esi), %al
	incl %esi
	movb %al, (%edi)
	incl %edi
	testl %eax, %eax
	jne loop

After the first iteration, the xorl is unnecessary.  We ought to be able to
hoist it out of the loop.
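
For context, a C loop of roughly the following shape is the kind of source
that could give rise to such code. This is a sketch with a hypothetical
function name, not the code the listing above was actually compiled from.
The hoisting is valid because movb writes only the low 8 bits of %eax, so
once the upper 24 bits have been cleared they stay zero on every later
iteration:

```c
/* Copy a NUL-terminated string, terminator included.  The comments
   map each step onto the instructions in the loop shown above. */
char *copy_string(char *dst, const char *src)
{
    char *d = dst;
    unsigned char c;

    do {
        c = (unsigned char)*src++;  /* movb (%esi), %al ; incl %esi */
        *d++ = (char)c;             /* movb %al, (%edi) ; incl %edi */
    } while (c != 0);               /* test, then jne loop */

    return dst;
}
```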

zw


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-07 11:01           ` Zack Weinberg
@ 1999-03-31 23:46             ` Zack Weinberg
  0 siblings, 0 replies; 38+ messages in thread
From: Zack Weinberg @ 1999-03-31 23:46 UTC (permalink / raw)
  To: David Edelsohn; +Cc: law, egcs

On Thu, 04 Mar 1999 23:03:58 -0500, David Edelsohn wrote:
>>>>>> Jeffrey A Law writes:
>
>Jeff> True, but I don't believe the loop optimizer is prepared to do that since it
>Jeff> doesn't keep track of what bits are active vs what bits are inactive in a
>Jeff> value.
>
>	GCC is missing a general feature of value propagation which would
>help with a lot of optimizations like this.  Hopefully this infrastructure
>will be added or contributed someday.

For the case I'm interested in, we don't need general value propagation,
only to recognize that we are loading QImode values into a register without
sign extension.  Mode-based range analysis ought to be simpler than full
value propagation.

If we knew how to generate QImode compares, we wouldn't need to clear the
register at all.  That may be something fixable in i386.md, I'm not sure.

zw


* Re: gcc-2.7 creates faster code than pgcc-1.1.1
  1999-03-05  6:53               ` craig
       [not found]                 ` < 19990305143358.4747.qmail@deer >
@ 1999-03-31 23:46                 ` craig
  1 sibling, 0 replies; 38+ messages in thread
From: craig @ 1999-03-31 23:46 UTC (permalink / raw)
  To: law; +Cc: craig

>Most of the papers I've read have indicated only trivial gains from value
>range propagation.  I've also had discussions with folks that have implemented
>this opt in a commercial compiler -- it's so minor of a win that they didn't
>consider it worth the effort.

If that were really true *generally*, i.e. for floating-point as well,
then we could happily make -mieee the default for egcs/gcc on Alphas,
for example.  (This would make Alpha-generated floating-point code
much slower, of course, which is why I mention it: if it was possible
to use value-range propagation to determine when the special code-
generation normally needed for full IEEE range wasn't needed after all,
then some of that performance could be regained.)

But, I have to admit I can't document any really clear performance
wins from my own hand-tuned assembly code (written over the past few
decades) deriving solely from value-range propagation, especially
for *integer* (versus floating-point) values.

Generally, I think being able to avoid even checking for exceptional
values might be one source of performance wins from this: e.g. if
we knew it was "impossible" to divide by zero here, square-root a
negative number there, and so on, we could save generated extra
code, thus making Icache misses theoretically less frequent.
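
A small C sketch of the two checks mentioned, assuming a hypothetical
operand_range type for the range proven for an operand. It only shows the
conditions under which the exceptional-value checks could be left out of
the generated code:

```c
/* Hypothetical inclusive range proven for an operand. */
struct operand_range {
    long min;
    long max;
};

/* The divide-by-zero check can be omitted when zero lies outside
   the divisor's proven range. */
int zero_check_is_redundant(struct operand_range divisor)
{
    return divisor.min > 0 || divisor.max < 0;
}

/* The negative-argument check before a square root can be omitted
   when the argument is proven non-negative. */
int negative_check_is_redundant(struct operand_range arg)
{
    return arg.min >= 0;
}
```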

        tq vm, (burley)


end of thread, other threads:[~1999-03-31 23:46 UTC | newest]

Thread overview: 38+ messages
     [not found] <19990218210259.A720@loki.midheimar>
1999-02-19  0:09 ` [Q] alpha egc -> motorolla dragonball Scott Howard
     [not found]   ` < 36CD1CD3.1FC47334@objsw.com >
1999-02-19 11:04     ` Jeffrey A Law
1999-02-28 22:53       ` Jeffrey A Law
1999-02-28 22:53   ` Scott Howard
1999-03-04  3:40 gcc-2.7 creates faster code than pgcc-1.1.1 Терехин Вячеслав
     [not found] ` < 001401be6633$fed21a60$a18330d4@main.medtech.ru >
1999-03-04 13:20   ` Jamie Lokier
     [not found]     ` < 19990304222018.A21939@pcep-jamie.cern.ch >
1999-03-04 17:05       ` Zack Weinberg
     [not found]         ` < 199903050104.UAA15335@octiron.phys.columbia.edu >
1999-03-04 18:09           ` Jeffrey A Law
1999-03-31 23:46             ` Jeffrey A Law
1999-03-31 23:46         ` Zack Weinberg
1999-03-31 23:46     ` Jamie Lokier
1999-03-31 23:46 ` Терехин Вячеслав
  -- strict thread matches above, loose matches on Subject: below --
1998-07-14 11:29 porting EGCS to the Cray T3E Julian C. Cummings
1998-07-14 11:29 ` Jeffrey A Law
1997-12-12  3:55 1.0 sucessfull, install params questions Hermann Lauer
1997-12-12  8:55 ` Jeffrey A Law
1997-12-12 10:18   ` Michael Poole
     [not found]   ` <law@hurl.cygnus.com>
1997-12-12 15:46     ` Hermann Lauer
1998-07-14 14:29     ` porting EGCS to the Cray T3E Julian C. Cummings
1998-07-14 13:20       ` Jeffrey A Law
1998-07-14 16:57     ` Julian C. Cummings
1998-07-14 14:29       ` Jeffrey A Law
     [not found]     ` < 1845.919451010@hurl.cygnus.com >
1999-02-19 11:09       ` [Q] alpha egc -> motorola dragonball David Edelsohn
1999-02-28 22:53         ` David Edelsohn
     [not found]     ` < 13506.920599740@hurl.cygnus.com >
1999-03-04 20:04       ` gcc-2.7 creates faster code than pgcc-1.1.1 David Edelsohn
     [not found]         ` < 9903050403.AA36338@marc.watson.ibm.com >
1999-03-04 20:31           ` Jeffrey A Law
     [not found]             ` < 13939.920608288@hurl.cygnus.com >
1999-03-05  6:53               ` craig
     [not found]                 ` < 19990305143358.4747.qmail@deer >
1999-03-05  9:30                   ` Jeffrey A Law
     [not found]                     ` < 15755.920655014@hurl.cygnus.com >
1999-03-05 10:18                       ` Joe Buck
1999-03-31 23:46                         ` Joe Buck
1999-03-05 10:19                       ` craig
1999-03-31 23:46                         ` craig
1999-03-31 23:46                     ` Jeffrey A Law
1999-03-31 23:46                 ` craig
1999-03-31 23:46             ` Jeffrey A Law
1999-03-07 11:01           ` Zack Weinberg
1999-03-31 23:46             ` Zack Weinberg
1999-03-31 23:46         ` David Edelsohn
