* Re: fixing the c++/f77 circular dependency
@ 1997-08-22 19:47 H.J. Lu
1997-08-22 22:49 ` Splitting configure stages Alexandre Oliva
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: H.J. Lu @ 1997-08-22 19:47 UTC (permalink / raw)
To: egcs
>
> Why not fix it by moving the c++ part of libgcc.a into cp/Makefile?
>
> That is one of the options I mentioned at the beginning of this discussion,
> and is still an open issue.
>
> However, this does not fix the problem unless we move the c++ stuff out of
> libgcc.a and into a new library. This means changing the c++ compiler to
> link in yet another library, and this isn't very desirable. But now that
> we have added libstdc++ to the egcs distribution, this stuff could perhaps
> be moved into libstdc++.
>
I'd like to see them in libgcc.a. It should not be hard to put
something in libgcc.a from cp/Makefile.
--
H.J. Lu (hjl@gnu.ai.mit.edu)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Splitting configure stages
1997-08-22 19:47 fixing the c++/f77 circular dependency H.J. Lu
@ 1997-08-22 22:49 ` Alexandre Oliva
1997-08-22 23:04 ` f771 signal 6 in latest snapshot Oleg Krivosheev
1997-08-22 23:04 ` multiple_of_p change Jeffrey A Law
2 siblings, 0 replies; 7+ messages in thread
From: Alexandre Oliva @ 1997-08-22 22:49 UTC (permalink / raw)
To: egcs
Hi there!
There's been a lot of talk about testing at configure-time such and
such property of the host/build/target system.
There's something that annoys me a bit about the compilation model of
gcc: unlike other programs, you can't just run make all and it's
done.
I've just realised a way to fix both of these problems. Every stage
directory should have its own configuration files. The global
makefile would only make sure every stage has been configured and
built as appropriate, like this:
all: stage1 stage2 stage3
stage1: stage1/config.h
cd stage1 && $(MAKE) all
stage2: stage2/config.h
cd stage2 && $(MAKE) all
stage3: stage3/config.h
cd stage3 && $(MAKE) all
stage1/config.h:
[ -d stage1 ] || mkdir stage1
cd stage1 && \
CC=$(CC) \
../configure $(CONFIGURE_FLAGS_FOR_STAGE1)
stage2/config.h: stage1/xgcc
[ -d stage2 ] || mkdir stage2
cd stage2 && \
CC="../stage1/xgcc -B../stage1" \
../configure $(CONFIGURE_FLAGS_FOR_STAGE2)
stage3/config.h: stage2/xgcc
[ -d stage3 ] || mkdir stage3
cd stage3 && \
CC="../stage3/xgcc -B../stage1" \
../configure $(CONFIGURE_FLAGS_FOR_STAGE3)
--
Alexandre Oliva
mailto:oliva@dcc.unicamp.br mailto:aoliva@acm.org
Universidade Estadual de Campinas, SP, Brasil
^ permalink raw reply [flat|nested] 7+ messages in thread
* multiple_of_p change
1997-08-22 19:47 fixing the c++/f77 circular dependency H.J. Lu
1997-08-22 22:49 ` Splitting configure stages Alexandre Oliva
1997-08-22 23:04 ` f771 signal 6 in latest snapshot Oleg Krivosheev
@ 1997-08-22 23:04 ` Jeffrey A Law
2 siblings, 0 replies; 7+ messages in thread
From: Jeffrey A Law @ 1997-08-22 23:04 UTC (permalink / raw)
To: egcs
I _finally_ got around to looking at the multiple_of_p change.
The canonical C example from Toon is:
void copy3(int n; float a[n][n][n], float b[n][n][n], int n)
{
int i, j, k;
for (i = 0; i < n; i++)
for (j = 0; j < n; j++)
for (k = 0; k < n; k++)
b[i][j][k] = a[i][j][k];
}
On my PA, the inner loop expands to about 25 insns; with the
multiple_of_p patch it's only 5 insns.
I don't know how often this kind of thing normally happens in C,
but apparently it's quite common in Fortran. So it's worth trying
to optimize.
As the comments in the patch suggest, there are other cases which
potentially we might want to optimize, but would need further work
in expand_divmod (ie cases where code generated for EXACT_DIV_EXPR
is slower than than TRUNC_DIV or FLOOR_DIV).
[ BTW can someone provide code samples where TRUNC_DIV or FLOOR_DIV
produces better code than EXACT_DIV that we can examine? ]
However, since one of things we want to do with this project is
be more pragmatic about this kind of stuff, does anyone see a
reason why this patch couldn't go in (after some comment updates
of course).
Index: fold-const.c
===================================================================
RCS file: /cvs/cvsfiles/devo/gcc/fold-const.c,v
retrieving revision 1.105
diff -c -3 -p -r1.105 fold-const.c
*** fold-const.c 1997/08/08 17:58:39 1.105
--- fold-const.c 1997/08/22 23:00:33
*************** static tree fold_range_test PROTO((tree)
*** 107,112 ****
--- 107,113 ----
static tree unextend PROTO((tree, int, int, tree));
static tree fold_truthop PROTO((enum tree_code, tree, tree, tree));
static tree strip_compound_expr PROTO((tree, tree));
+ static int multiple_of_p PROTO((tree, tree, tree));
#ifndef BRANCH_COST
#define BRANCH_COST 1
*************** fold (expr)
*** 4571,4576 ****
--- 4572,4589 ----
if (integer_zerop (arg1))
return t;
+ /* If arg0 is a multiple of arg1, then rewrite to the fastest div
+ operation, EXACT_DIV_EXPR. Otherwise, handle folding of
+ general divide. Note that only CEIL_DIV_EXPR is rewritten now,
+ only because the others seem to be faster in some cases, e.g. the
+ nonoptimized TRUNC_DIV_EXPR or FLOOR_DIV_EXPR on DEC Alpha. This
+ is probably just due to more work being done on it in expmed.c than
+ on EXACT_DIV_EXPR, and could presumably be fixed, since
+ EXACT_DIV_EXPR should _never_ be slower than *_DIV_EXPR. */
+ if (code == CEIL_DIV_EXPR
+ && multiple_of_p (type, arg0, arg1))
+ return fold (build (EXACT_DIV_EXPR, type, arg0, arg1));
+
/* If we have ((a / C1) / C2) where both division are the same type, try
to simplify. First see if C1 * C2 overflows or not. */
if (TREE_CODE (arg0) == code && TREE_CODE (arg1) == INTEGER_CST
*************** fold (expr)
*** 5746,5748 ****
--- 5759,5862 ----
return t;
} /* switch (code) */
}
+
+ /* Determine if first argument is a multiple of second argument.
+ Return 0 if it is not, or is not easily determined to so be.
+
+ An example of the sort of thing we care about (at this point --
+ this routine could surely be made more general, and expanded
+ to do what the *_DIV_EXPR's fold() cases do now) is discovering
+ that
+
+ SAVE_EXPR (I) * SAVE_EXPR (J * 8)
+
+ is a multiple of
+
+ SAVE_EXPR (J * 8)
+
+ when we know that the two `SAVE_EXPR (J * 8)' nodes are the
+ same node (which means they will have the same value at run
+ time, even though we don't know when they'll be assigned).
+
+ This code also handles discovering that
+
+ SAVE_EXPR (I) * SAVE_EXPR (J * 8)
+
+ is a multiple of
+
+ 8
+
+ (of course) so we don't have to worry about dealing with a
+ possible remainder.
+
+ Note that we _look_ inside a SAVE_EXPR only to determine
+ how it was calculated; it is not safe for fold() to do much
+ of anything else with the internals of a SAVE_EXPR, since
+ fold() cannot know when it will be evaluated at run time.
+ For example, the latter example above _cannot_ be implemented
+ as
+
+ SAVE_EXPR (I) * J
+
+ or any variant thereof, since the value of J at evaluation time
+ of the original SAVE_EXPR is not necessarily the same at the time
+ the new expression is evaluated. The only optimization of this
+ sort that would be valid is changing
+
+ SAVE_EXPR (I) * SAVE_EXPR (SAVE_EXPR (J) * 8)
+ divided by
+ 8
+
+ to
+
+ SAVE_EXPR (I) * SAVE_EXPR (J)
+
+ (where the same SAVE_EXPR (J) is used in the original and the
+ transformed version). */
+
+ static int
+ multiple_of_p (type, top, bottom)
+ tree type;
+ tree top;
+ tree bottom;
+ {
+ if (operand_equal_p (top, bottom, 0))
+ return 1;
+
+ if (TREE_CODE (type) != INTEGER_TYPE)
+ return 0;
+
+ switch (TREE_CODE (top))
+ {
+ case MULT_EXPR:
+ return (multiple_of_p (type, TREE_OPERAND (top, 0), bottom)
+ || multiple_of_p (type, TREE_OPERAND (top, 1), bottom));
+
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+ return (multiple_of_p (type, TREE_OPERAND (top, 0), bottom)
+ && multiple_of_p (type, TREE_OPERAND (top, 1), bottom));
+
+ case NOP_EXPR:
+ /* Punt if conversion from non-integral or wider integral type. */
+ if ((TREE_CODE (TREE_TYPE (TREE_OPERAND (top, 0))) != INTEGER_TYPE)
+ || (TYPE_PRECISION (type)
+ < TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (top, 0)))))
+ return 0;
+ /* Fall through. */
+ case SAVE_EXPR:
+ return multiple_of_p (type, TREE_OPERAND (top, 0), bottom);
+
+ case INTEGER_CST:
+ if ((TREE_CODE (bottom) != INTEGER_CST)
+ || (tree_int_cst_sgn (top) < 0)
+ || (tree_int_cst_sgn (bottom) < 0))
+ return 0;
+ return integer_zerop (const_binop (TRUNC_MOD_EXPR,
+ top, bottom, 0));
+
+ default:
+ return 0;
+ }
+ }
+ \f
^ permalink raw reply [flat|nested] 7+ messages in thread
* f771 signal 6 in latest snapshot
1997-08-22 19:47 fixing the c++/f77 circular dependency H.J. Lu
1997-08-22 22:49 ` Splitting configure stages Alexandre Oliva
@ 1997-08-22 23:04 ` Oleg Krivosheev
1997-08-22 23:04 ` multiple_of_p change Jeffrey A Law
2 siblings, 0 replies; 7+ messages in thread
From: Oleg Krivosheev @ 1997-08-22 23:04 UTC (permalink / raw)
To: egcs
Hi,
i've bootstrapped latest snapshot and tried
to compile mdbench.f with maximum optimization,
Ultra2/200, solaris 2.5.1.
well, looks like there is regression since
previous snapshot - i was unable to
finish compilation. Bootstrap flags
were -O3 -pipe -g -mv8
here gdb backtrace
jor-el ~/work/Project/tmp$ g77 -O6 -funroll-all-loops -fomit-frame-pointer
-ffast-math -fstrength-reduce -fthread-jumps -mcpu=ultrasparc
-mtune=ultrasparc mdbnch.f
gcc: Internal compiler error: program f771 got fatal signal 6
jor-el ~/work/Project/tmp$ gdb
~/local/lib/gcc-lib/sparc-sun-solaris2.5.1/egcs-2.90.
01/f771 core
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (sparc-sun-solaris2.4), Copyright 1996 Free Software Foundation,
Inc...
Core was generated by
`/home/room1/kriol/local/lib/gcc-lib/sparc-sun-solaris2.5.1/eg
cs-2.90.01/f771 md'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/lib/libc.so.1...done.
Reading symbols from /usr/lib/libdl.so.1...done.
Reading symbols from /usr/platform/SUNW,Ultra-2/lib/libc_psr.so.1...done.
#0 0xef77452c in kill ()
(gdb) bt
#0 0xef77452c in kill ()
#1 0xef73a620 in abort ()
#2 0x1c40d8 in update_equiv_regs () at local-alloc.c:1145
#3 0x1c2bdc in local_alloc () at local-alloc.c:408
#4 0xd5f48 in rest_of_compilation (decl=0x323410) at toplev.c:3416
#5 0x4a818 in finish_function (nested=0) at f/com.c:14113
#6 0x460d0 in ffecom_finish_progunit () at f/com.c:11555
#7 0xb076c in ffestd_exec_end () at f/std.c:1414
#8 0xae4f8 in ffestc_R1225 (name=0x0) at f/stc.c:4935
#9 0xa558c in ffestc_end () at f/stc.c:5658
#10 0x8573c in ffestb_end3_ (t=0x344758) at f/stb.c:3308
#11 0x850a8 in ffestb_end (t=0x344758) at f/stb.c:2986
#12 0x8177c in ffesta_second_ (t=0x344758) at f/sta.c:1358
#13 0x798a4 in ffelex_send_token_ () at f/lex.c:1652
#14 0x77fb4 in ffelex_finish_statement_ () at f/lex.c:973
#15 0x7b490 in ffelex_file_fixed (wf=0x9, f=0x2a5188) at f/lex.c:2994
#16 0xce084 in ffe_file (wf=0x2c44f0, f=0x2a5188) at f/top.c:555
#17 0x7e734 in yyparse () at f/parse.c:74
#18 0xd19f0 in compile_file (name=0xeffff09d "mdbnch.f") at toplev.c:2470
#19 0xd826c in main (argc=16, argv=0xefffee74, envp=0xefffeeb8) at
toplev.c:4309
(gdb)
regards
OK
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: multiple_of_p change
1997-10-24 16:16 ` multiple_of_p change Christian Iseli
@ 1997-11-02 21:29 ` Jeffrey A Law
0 siblings, 0 replies; 7+ messages in thread
From: Jeffrey A Law @ 1997-11-02 21:29 UTC (permalink / raw)
To: Christian Iseli; +Cc: egcs
In message <199710242313.BAA08816@Rivendell.MiddleEarth.net>you write:
> During the processing, the subreg case around line 7390 of cse.c is entered.
> a classp is found, and its exp field is (subreg:SI (reg/v:SF 35) 0). At the
> next loop iteration, new_src is set from gen_lowpart_if_possible to the same
> expression (subreg:SI (reg/v:SF 35) 0).
> The *bad* thing is that HASH produces a different value than it did when
> classp->exp was hashed.
OK. This is where I'd suggest you concentrate your effort -- figure
out how/why the same RTL hashes differently from the two different
call sites.
jeff
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: multiple_of_p change
1997-10-20 9:43 Small problem in cse Jeffrey A Law
@ 1997-10-24 16:16 ` Christian Iseli
1997-11-02 21:29 ` Jeffrey A Law
0 siblings, 1 reply; 7+ messages in thread
From: Christian Iseli @ 1997-10-24 16:16 UTC (permalink / raw)
To: law; +Cc: egcs
>> Sorry, I meant to say why is classp->first_same_value NULL?
>> From my review of the code I don't see that classp->first_same_value
>> should ever be NULL -- thus I suspect something has gone wrong
>> elsewhere that needs to be investigated.
>> But I could be wrong, since you've got a target & testcase which
>> triggers this problem you'll need to do some of the analysis.
>Ok, I'll do my best and keep you posted...
Well, it seems you were right about the fact that first_same_value should never
be null... but here is what I managed to observe so far...
cse_insn is called with the following insn
(insn 4708 4707 4709 (set (subreg:SF (reg/v:SI 47) 0)
(const_double:SF (const_int 0) 0 1076953088)) 4 {movsf} (nil)
(expr_list:REG_EQUAL (minus:SF (const_double:SF (const_int 0) 0 1077018624)
(const_double:SF (cc0) 0 1072693248))
(nil)))
This is for an 8-bit target, where the source is attempting to do a
pre-decrement of a long double
number. The target defines float as TQF (24 bits), double and long double are
both SF (32 bits).
During the processing, the subreg case around line 7390 of cse.c is entered.
a classp is found, and its exp field is (subreg:SI (reg/v:SF 35) 0). At the
next loop iteration,
new_src is set from gen_lowpart_if_possible to the same expression (subreg:SI
(reg/v:SF 35) 0).
The *bad* thing is that HASH produces a different value than it did when
classp->exp was hashed.
So, when insert_regs is called, the element pointed to by classp is deleted
from the hash table
and thus classp->first_same_value becomes 0...
But now I'm stuck (and tired... ;-). What would be the right thing(tm) to do
now?
Christian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: multiple_of_p change
@ 1997-08-23 0:01 Torbjorn Granlund
0 siblings, 0 replies; 7+ messages in thread
From: Torbjorn Granlund @ 1997-08-23 0:01 UTC (permalink / raw)
To: egcs
As the comments in the patch suggest, there are other cases which
potentially we might want to optimize, but would need further work
in expand_divmod (ie cases where code generated for EXACT_DIV_EXPR
is slower than than TRUNC_DIV or FLOOR_DIV).
[ BTW can someone provide code samples where TRUNC_DIV or FLOOR_DIV
produces better code than EXACT_DIV that we can examine? ]
I suppose that could happen for powers-of-two. If there are other cases
that give worse code for EXACT_DIV_EXPR, please let me know.
Here is an untested patch:
* expmed.c (expand_divmod): Make op1_is_pow2 depend on unsignedp
for negative constants. Promote EXACT_DIV_EXPR to TRUNC_DIV_EXPR
when op1_is_pow2.
*** /u/gcc/ss/970803/expmed.c Mon May 26 00:54:03 1997
--- expmed.c Sat Aug 23 01:49:18 1997
***************
*** 2719,2723 ****
op1_is_pow2 = (op1_is_constant
&& ((EXACT_POWER_OF_2_OR_ZERO_P (INTVAL (op1))
! || EXACT_POWER_OF_2_OR_ZERO_P (-INTVAL (op1)))));
/*
--- 2719,2723 ----
op1_is_pow2 = (op1_is_constant
&& ((EXACT_POWER_OF_2_OR_ZERO_P (INTVAL (op1))
! || (! unsignedp && EXACT_POWER_OF_2_OR_ZERO_P (-INTVAL (op1))))));
/*
***************
*** 2855,2858 ****
--- 2855,2860 ----
if (code == FLOOR_MOD_EXPR)
code = TRUNC_MOD_EXPR;
+ if (code == EXACT_DIV_EXPR && op1_is_pow2)
+ code = TRUNC_DIV_EXPR;
}
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~1997-11-02 21:29 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1997-08-22 19:47 fixing the c++/f77 circular dependency H.J. Lu
1997-08-22 22:49 ` Splitting configure stages Alexandre Oliva
1997-08-22 23:04 ` f771 signal 6 in latest snapshot Oleg Krivosheev
1997-08-22 23:04 ` multiple_of_p change Jeffrey A Law
1997-08-23 0:01 Torbjorn Granlund
1997-10-20 9:43 Small problem in cse Jeffrey A Law
1997-10-24 16:16 ` multiple_of_p change Christian Iseli
1997-11-02 21:29 ` Jeffrey A Law
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).