public inbox for gcc@gcc.gnu.org
* RE: Reconsidering gcjx
@ 2006-01-27 23:26 Boehm, Hans
  0 siblings, 0 replies; 43+ messages in thread
From: Boehm, Hans @ 2006-01-27 23:26 UTC (permalink / raw)
  To: Laurent GUERBY, Kaveh R. Ghazi; +Cc: tromey, gcc, java


> From:  Laurent GUERBY
> Whether C++, Java or Ada, a new language requirement looks the same to
> me: having a good enough base compiler and runtime installed 
> for the language, I do not see anything special to Java or 
> Ada over C++ here. The base compiler I use for building GCC 
> has only c,ada (4.0) because that's what is needed, if c++ is 
> needed I'll add the recommended c++ compiler, if java and 
> some JVM is needed, I'll add java and the recommended JVM, no 
> big difference.

As others have pointed out, there's potentially a small difference in
the case of Java, in that I believe the .class -> .o part of the
compiler would still be buildable without an existing JVM, and perhaps
even somewhat tested without one.  And that's the part that's likely to
break if other parts of the compiler are changed.  I don't think Ada has
an analog to that.

Hans

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-02-02 18:19       ` Thorsten Glaser
@ 2006-02-05 19:28         ` Tom Tromey
  0 siblings, 0 replies; 43+ messages in thread
From: Tom Tromey @ 2006-02-05 19:28 UTC (permalink / raw)
  To: Thorsten Glaser; +Cc: GCC Mailing List

>>>>> "Thorsten" == Thorsten Glaser <tg@mirbsd.de> writes:

Thorsten> Why not keep enough support in jc1 to bootstrap ecj?

>> > We don't know how much of the language that would be.

>> And we can't tell _a priori_.  As I understand it, the intention is to
>> use upstream sources, and they will change.

Thorsten> Just keep the current state then - maybe in a separate frontend
Thorsten> only used for bootstrapping, sharing some code with the final,
Thorsten> class-only, frontend. And expand it if needed for ecj.

This really is not practical.

First, on occasion a change to the Eclipse compiler will cause it to
stop building with the current gcj.  That is, the existing java front
end is already too buggy for this plan to work.

Second, if we look down the road we can see that there's no subset of
the language that we can implement that will let this plan work.
E.g., what is to stop the eclipse compiler authors from using
generics?

Tom


* Re: Reconsidering gcjx
  2006-01-31 14:27     ` Andrew Haley
  2006-01-31 21:23       ` Kevin Handy
@ 2006-02-02 18:19       ` Thorsten Glaser
  2006-02-05 19:28         ` Tom Tromey
  1 sibling, 1 reply; 43+ messages in thread
From: Thorsten Glaser @ 2006-02-02 18:19 UTC (permalink / raw)
  Cc: GCC Mailing List

Andrew Haley dixit:

> > Thorsten> Why not keep enough support in jc1 to bootstrap ecj?
> > 
> > We don't know how much of the language that would be.
>
>And we can't tell _a priori_.  As I understand it, the intention is to
>use upstream sources, and they will change.

Just keep the current state then - maybe in a separate frontend
only used for bootstrapping, sharing some code with the final,
class-only, frontend. And expand it if needed for ecj.

bye,
//mirabile
-- 
I believe no one can invent an algorithm. One just happens to hit upon it
when God enlightens him. Or only God invents algorithms, we merely copy them.
If you don't believe in God, just consider God as Nature if you won't deny
existence.		-- Coywolf Qi Hunt


* Re: Reconsidering gcjx
  2006-01-31 14:27     ` Andrew Haley
@ 2006-01-31 21:23       ` Kevin Handy
  2006-02-02 18:19       ` Thorsten Glaser
  1 sibling, 0 replies; 43+ messages in thread
From: Kevin Handy @ 2006-01-31 21:23 UTC (permalink / raw)
  Cc: GCC Mailing List

Andrew Haley wrote:

>Tom Tromey writes:
> > >>>>> "Thorsten" == Thorsten Glaser <tg@mirbsd.de> writes:
> > 
> > >> ecj is written in java.  This will complicate the bootstrap process.
> > 
> > Thorsten> Why not keep enough support in jc1 to bootstrap ecj?
> > 
> > We don't know how much of the language that would be.
>
>And we can't tell _a priori_.  As I understand it, the intention is to
>use upstream sources, and they will change.
>
>  
>
Don't you just need to have a functional JVM, and .class (.jar) files
for ecj and all its libraries? That would change the question to what
language the JVM is written in.


* Re: Reconsidering gcjx
  2006-01-30 20:54   ` Tom Tromey
@ 2006-01-31 14:27     ` Andrew Haley
  2006-01-31 21:23       ` Kevin Handy
  2006-02-02 18:19       ` Thorsten Glaser
  0 siblings, 2 replies; 43+ messages in thread
From: Andrew Haley @ 2006-01-31 14:27 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Thorsten Glaser, GCC Mailing List

Tom Tromey writes:
 > >>>>> "Thorsten" == Thorsten Glaser <tg@mirbsd.de> writes:
 > 
 > >> ecj is written in java.  This will complicate the bootstrap process.
 > 
 > Thorsten> Why not keep enough support in jc1 to bootstrap ecj?
 > 
 > We don't know how much of the language that would be.

And we can't tell _a priori_.  As I understand it, the intention is to
use upstream sources, and they will change.

Andrew.


* Re: Reconsidering gcjx
  2006-01-30 14:30 ` Thorsten Glaser
@ 2006-01-30 20:54   ` Tom Tromey
  2006-01-31 14:27     ` Andrew Haley
  0 siblings, 1 reply; 43+ messages in thread
From: Tom Tromey @ 2006-01-30 20:54 UTC (permalink / raw)
  To: Thorsten Glaser; +Cc: GCC Mailing List

>>>>> "Thorsten" == Thorsten Glaser <tg@mirbsd.de> writes:

>> ecj is written in java.  This will complicate the bootstrap process.

Thorsten> Why not keep enough support in jc1 to bootstrap ecj?

We don't know how much of the language that would be.

Tom


* Re: Reconsidering gcjx
  2006-01-27 19:22   ` Tom Tromey
  2006-01-27 19:55     ` Daniel Jacobowitz
@ 2006-01-30 17:50     ` Andrew Haley
  1 sibling, 0 replies; 43+ messages in thread
From: Andrew Haley @ 2006-01-30 17:50 UTC (permalink / raw)
  To: Tom Tromey; +Cc: GCJ Hackers, GCC Mailing List

Tom Tromey writes:
 > >>>>> "Andrew" == Andrew Haley <aph@redhat.com> writes:
 > 
 > Andrew> In particular, the type system and the rules for exception
 > Andrew> regions are different.  Also, a "slot" in the .class format
 > Andrew> doesn't necessarily correspond to a variable in the source
 > Andrew> language.
 > 
 > One way to look at this is: we really want to fix these things,

We can't.  They're inherent.  As far as I'm aware all bugs caused by
this difference have been fixed in the current bytecode compiler
(after much rewriting :-), but it's an "impedance mismatch" that
doesn't go away.

 > because, e.g., we'll probably be compiling all the java code in FC to
 > objects from class files for the foreseeable future anyway (to avoid
 > huge divergences from upstream build setups).
 > 
 > Moving to gcjx would turn "writing a 1.5 compiler" from a project
 > taking all of my time for the next year into a project that takes all
 > of my time for 2 months.  Then we'd have more time to fix other
 > things.

I understand that, but I don't believe it.  IMO, integrating ecj in
such an inelegant way would lead to perpetual maintenance problems,
especially with bootstrapping.  The great thing about the possibility
of gcjx is that it's possible to get things right, with a clean
separation of interfaces.

I would contrast your proposed ecj interface with a "correct" (IMO,
YMMV, etc) way to do it -- pass the parse trees from ecj to gcj, with
none of this messing about with bytecode.  I'm sure that's possible
with ecj.  But it still would lead to some fairly painful problems
bootstrapping.  This wouldn't affect us, because we always have a Java
compiler of some kind on every system we use.

 > Andrew> In particular,
 > Andrew> .class files don't contain the full pathnames to source files.

Apparently (says Per) there's a JSR around that we can use to fix this
in a standard way.

Andrew.


* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
                   ` (6 preceding siblings ...)
  2006-01-29  2:06 ` Adam Megacz
@ 2006-01-30 14:30 ` Thorsten Glaser
  2006-01-30 20:54   ` Tom Tromey
  7 siblings, 1 reply; 43+ messages in thread
From: Thorsten Glaser @ 2006-01-30 14:30 UTC (permalink / raw)
  To: gcc

Tom Tromey dixit:

>In my preferred approach we would simply delete a portion of the
>existing gcj and turn jc1 into a purely bytecode-based compiler.

>ecj is written in java.  This will complicate the bootstrap process.

Why not keep enough support in jc1 to bootstrap ecj?
Maybe split out so that it can be used for only bootstrapping
(calling it jc1source or something, and being built only once
during a make bootstrap)?

>However, the situation will not be quite as severe as the Ada

Indeed. Try to build gcc 3.4 with gcc 4.0 (the Ada part)...

But having front-ends written in languages other than
C is really not a good idea. On the other hand, in this
case the technical and maintenance (dropping off the
work to other people) benefits may outweigh it.

bye,
//mirabile
-- 
I believe no one can invent an algorithm. One just happens to hit upon it
when God enlightens him. Or only God invents algorithms, we merely copy them.
If you don't believe in God, just consider God as Nature if you won't deny
existence.		-- Coywolf Qi Hunt


* Re: Reconsidering gcjx
  2006-01-29 19:41       ` Tom Tromey
@ 2006-01-29 20:29         ` David Daney
  0 siblings, 0 replies; 43+ messages in thread
From: David Daney @ 2006-01-29 20:29 UTC (permalink / raw)
  To: tromey; +Cc: Per Bothner, java, gcc

Tom Tromey wrote:
>>>>>>"Per" == Per Bothner <per@bothner.com> writes:
> 
> 
> Per> Tom Tromey wrote:
> 
>>>While investigating I realized that we would also lose a small
>>>optimization related to String "+" operations.  When translating from
>>>.java we currently use a non-synchronizing variant of StringBuffer to
>>>do this.
> 
> 
> Per> In Java-5-mode I would expect ecj to use the unsynchronized
> Per> java.lang.StringBuilder.  If not, I'd consider it a bug.
> 
> Yeah.  StringBuilder isn't as nice as our private StringBuffer,
> though, because it requires copying the character data when toString
> is invoked.  IMNSHO this is a design bug, but we're stuck with it.
> 
> Per> A desirable optimization would be to convert local (non-escaping)
> Per> uses of StringBuffer/StringBuilder to be stack allocated.  You
> Per> could even stack-allocate the actual buffer up to a limited size,
> Per> only heap-allocating when it gets over that size.
> 
> Yeah, that would be good.  We could fix the toString semantics thing
> using the same machinery.  I know David Daney was looking in this area
> a bit, I don't know what became of it though.

I started looking at it, but decided I was trying to hack it in at the 
wrong level (tree-ssa).  If we were to use ecj I would do the following:

1a) As Adam Megacz suggested, at the byte code level (between ecj and 
jc1) do some code analysis and annotating.  I think using something like 
BCEL or similar we could analyze the class files and add gcj private 
method attributes describing how parameters escaped and if they return 
non-null.

1b) Do transformations like converting java.lang.String[Buffer|Builder] 
-> gcj.gnu.StringBuffer (or whatever it is called).  I was running into
too many problems trying to rename classes after the TREEs had been 
generated, so I think the proper place to do the transformation is 
before the TREEs are generated (like at the classfile level).

2) In jc1 while generating trees from the bytecode we can use attributes 
generated in 1a to decide to allocate on the stack instead of the heap.

David Daney


* Re: Reconsidering gcjx
  2006-01-29 19:05     ` Per Bothner
@ 2006-01-29 19:41       ` Tom Tromey
  2006-01-29 20:29         ` David Daney
  0 siblings, 1 reply; 43+ messages in thread
From: Tom Tromey @ 2006-01-29 19:41 UTC (permalink / raw)
  To: Per Bothner; +Cc: java, gcc

>>>>> "Per" == Per Bothner <per@bothner.com> writes:

Per> Tom Tromey wrote:
>> While investigating I realized that we would also lose a small
>> optimization related to String "+" operations.  When translating from
>> .java we currently use a non-synchronizing variant of StringBuffer to
>> do this.

Per> In Java-5-mode I would expect ecj to use the unsynchronized
Per> java.lang.StringBuilder.  If not, I'd consider it a bug.

Yeah.  StringBuilder isn't as nice as our private StringBuffer,
though, because it requires copying the character data when toString
is invoked.  IMNSHO this is a design bug, but we're stuck with it.

Per> A desirable optimization would be to convert local (non-escaping)
Per> uses of StringBuffer/StringBuilder to be stack allocated.  You
Per> could even stack-allocate the actual buffer up to a limited size,
Per> only heap-allocating when it gets over that size.

Yeah, that would be good.  We could fix the toString semantics thing
using the same machinery.  I know David Daney was looking in this area
a bit, I don't know what became of it though.

Tom


* Re: Reconsidering gcjx
  2006-01-29 18:54   ` Tom Tromey
@ 2006-01-29 19:05     ` Per Bothner
  2006-01-29 19:41       ` Tom Tromey
  0 siblings, 1 reply; 43+ messages in thread
From: Per Bothner @ 2006-01-29 19:05 UTC (permalink / raw)
  To: tromey; +Cc: java, gcc

Tom Tromey wrote:
> While investigating I realized that we would also lose a small
> optimization related to String "+" operations.  When translating from
> .java we currently use a non-synchronizing variant of StringBuffer to
> do this.

In Java-5-mode I would expect ecj to use the unsynchronized
java.lang.StringBuilder.  If not, I'd consider it a bug.

A desirable optimization would be to convert local (non-escaping)
uses of StringBuffer/StringBuilder to be stack allocated.  You
could even stack-allocate the actual buffer up to a limited size,
only heap-allocating when it gets over that size.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/
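
To make the String "+" point concrete, here is a minimal sketch of the two forms being discussed. The class and method names (ConcatSketch, viaPlus, viaBuilder) are made up for illustration, and the expansion shown is only roughly what a Java 5 era compiler is expected to emit, not a dump of actual ecj output.

```java
public class ConcatSketch {
    // What the programmer writes: String "+" on mixed operands.
    static String viaPlus(String name, int n) {
        return "hello " + name + " #" + n;
    }

    // Roughly the hand-expanded equivalent using the unsynchronized
    // java.lang.StringBuilder. The builder is local and never escapes
    // this method, which is the case the stack-allocation idea targets.
    static String viaBuilder(String name, int n) {
        StringBuilder sb = new StringBuilder();
        sb.append("hello ").append(name).append(" #").append(n);
        // Per the thread, toString() is where the character data gets copied.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaPlus("gcj", 5));
        System.out.println(viaBuilder("gcj", 5));
    }
}
```

Both methods produce the same string; the difference the thread cares about is which buffer class is used underneath and whether the final copy can be avoided.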


* Re: Reconsidering gcjx
  2006-01-29  2:06 ` Adam Megacz
  2006-01-29 16:55   ` Mike Emmel
@ 2006-01-29 18:54   ` Tom Tromey
  2006-01-29 19:05     ` Per Bothner
  1 sibling, 1 reply; 43+ messages in thread
From: Tom Tromey @ 2006-01-29 18:54 UTC (permalink / raw)
  To: Adam Megacz; +Cc: java, gcc

>>>>> "Adam" == Adam Megacz <megacz@cs.berkeley.edu> writes:

>> I think our technical approach should be to have ecj emit class files,
>> which would then be compiled by jc1.  In particular I think we could
>> change ecj to emit a single .jar file. 

Adam> I (and David Crawshaw) have actually done this.
Adam>   http://tool.ibex.org/

Nice!

Last night I modified the eclipse compiler to emit a zip file and also
to use GNU-style error formatting.  I also went through the other gcj
options to see what would require upstream changes.

The big thing seems to be handling the -M family of options.

There are also little impedance mismatches, like translating gcc's -W
options to the style ecj wants; or handling -I.  There are different
ways to approach these kinds of problems, though; e.g. we could write
a custom argument translating program, or we could somehow wedge it
into the gcj driver.
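
The "custom argument translating program" idea could look roughly like the sketch below. It is only an illustration: the class name ArgTranslator and the three mappings shown (-Idir to a -classpath entry, -w to -nowarn, -Wno-deprecated to -warn:-deprecation) are assumptions about what a plausible table might contain, not something taken from the gcj driver. The real list would have to be checked against the documentation of both tools.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArgTranslator {
    // Translates a gcc-style argument list into an ecj-style one.
    // The mapping table here is hypothetical and deliberately tiny.
    static List<String> translate(List<String> gccArgs) {
        List<String> out = new ArrayList<>();
        List<String> classpath = new ArrayList<>();
        for (String arg : gccArgs) {
            if (arg.startsWith("-I") && arg.length() > 2) {
                classpath.add(arg.substring(2));   // collect -Idir entries in order
            } else if (arg.equals("-w")) {
                out.add("-nowarn");                // assumed equivalent
            } else if (arg.equals("-Wno-deprecated")) {
                out.add("-warn:-deprecation");     // assumed equivalent
            } else {
                out.add(arg);                      // pass everything else through
            }
        }
        if (!classpath.isEmpty()) {
            // Prepend a single -classpath option built from all -I entries.
            out.add(0, String.join(File.pathSeparator, classpath));
            out.add(0, "-classpath");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(translate(Arrays.asList("-Ilib", "-w", "Foo.java")));
    }
}
```

Keeping the table in a separate program rather than in the driver would make it easy to test on its own, which is one argument for the first of the two approaches mentioned above.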


While investigating I realized that we would also lose a small
optimization related to String "+" operations.  When translating from
.java we currently use a non-synchronizing variant of StringBuffer to
do this.

There are a couple possible fixes for this.

Tom


* Re: Reconsidering gcjx
  2006-01-29  2:06 ` Adam Megacz
@ 2006-01-29 16:55   ` Mike Emmel
  2006-01-29 18:54   ` Tom Tromey
  1 sibling, 0 replies; 43+ messages in thread
From: Mike Emmel @ 2006-01-29 16:55 UTC (permalink / raw)
  To: Adam Megacz; +Cc: java, gcc

Sorry to reply late to this thread. First, I think concentrating on a
native bytecode compiler for java makes excellent sense; it decouples
you from the front end implementation. And I agree that the eclipse
compiler is a good choice. I'd have to add that jikes is also
reasonable.

I would like to say that my interest in gcjx is not so much as a java
compiler but as a framework for developing native compilers for
object-oriented languages like ruby, python, etc.
Thus I think the bigger context, where gcjx becomes more of a compiler
suite for languages that generally are only implemented as interpreted,
is important. So I think there is a lot of value in gcjx from this
viewpoint, and a compiler target for bytecode is not the solution for
this class of problems. Also C++ probably makes more sense for a
generic frontend than java does.

Now it may be possible to extend the bytecode to handle efficient
compilation of languages such as ruby, and that's pretty interesting, but
generally bytecode makes implicit restrictions on the language. MS CLR
is not truly generic, for example, but certainly it is more complete than
java bytecode. I really don't see bytecode->native compilers ever
being really generic.


* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
                   ` (5 preceding siblings ...)
  2006-01-28  0:13 ` Mark Mitchell
@ 2006-01-29  2:06 ` Adam Megacz
  2006-01-29 16:55   ` Mike Emmel
  2006-01-29 18:54   ` Tom Tromey
  2006-01-30 14:30 ` Thorsten Glaser
  7 siblings, 2 replies; 43+ messages in thread
From: Adam Megacz @ 2006-01-29  2:06 UTC (permalink / raw)
  To: gcc; +Cc: java


Tom Tromey <tromey@redhat.com> writes:
> I think our technical approach should be to have ecj emit class files,
> which would then be compiled by jc1.  In particular I think we could
> change ecj to emit a single .jar file. 

I (and David Crawshaw) have actually done this.

  http://tool.ibex.org/

The Eclipse compiler is great.  I really like it.

Plus, lots of tools are designed to use Java bytecode as an
intermediate format (however ill-suited it may be for that task).
Making bytecode the primary input format for gcj would make coupling
such tools to gcj a less out-there-in-left-field sort of thing.

  - a


* Re: Reconsidering gcjx
  2006-01-28  0:57         ` Per Bothner
  2006-01-28  1:00           ` Tom Tromey
@ 2006-01-29  1:00           ` Anthony Green
  1 sibling, 0 replies; 43+ messages in thread
From: Anthony Green @ 2006-01-29  1:00 UTC (permalink / raw)
  To: Per Bothner; +Cc: java, gcc, tromey

On Fri, 2006-01-27 at 16:41 -0800, Per Bothner wrote:
> I.e. I'm hoping one can *statically* link ecj without any
> dependencies on (say) the SWT toolkit, or the debugger?

Yes, you can.  And when references have crept in by mistake, the Eclipse
guys were pretty quick about removing them.

BTW, the compiler can be checked out and built from this module...

cvs -z3 -d:pserver:anonymous@dev.eclipse.org:/home/eclipse co org.eclipse.jdt.core

(For the record, I think if people ask the Eclipse compiler people about
"ecj" they'll have no idea what you're talking about!  It was just
invented when we first started building it as a stand-alone batch
compiler in order to build Eclipse via RPM.  It was a convenient name to
use since upstream didn't provide one.)

AG



* Re: Reconsidering gcjx
  2006-01-28  2:42         ` Kaveh R. Ghazi
@ 2006-01-28 17:51           ` Gabriel Dos Reis
  0 siblings, 0 replies; 43+ messages in thread
From: Gabriel Dos Reis @ 2006-01-28 17:51 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: gcc, java, laurent, tromey

"Kaveh R. Ghazi" <ghazi@caipclassic.rutgers.edu> writes:

|  > "Kaveh R. Ghazi" <ghazi@caipclassic.rutgers.edu> writes:
|  > 
|  > | However with Tom's proposal, we need an existing java compiler for
|  > | our target.
|  > 
|  > I don't believe the issues at hand here (Java specific case) are as
|  > severe as they sound from your messages.  
| 
| Okay fine, let's quantify it.  I downloaded the Dec 2005
| gcc-testresults archive from:
| ftp://gcc.gnu.org/pub/gcc/mail-archives/gcc-testresults/gcc-testresults-2005-12.bz2
| 
| Then I ran this shell pipeline:
| 
| grep '\--enable-languages=' gcc-testresults-2005-12 | sed 's/.*--enable-languages=//; s/ .*$//' | tr ',' '\n' | sort | uniq -c | sort -nr
| 
| and I got:
| 
|    1690 c
|    1659 c++
|    1379 objc
|    1233 java
|     945 fortran
|     451 ada
|     292 treelang
|     229 obj-c++
|     228 f95
|     185 f77
|      14 pascal
|      13 for
|       3
|       2 3Dc
|       1 treela=
|       1 c+
|       1 ;t
|       1 3Dfortran
| 
| (Note: fortran + f95 + f77 = 1358 or about on par with Java.)

Thanks for the data.

| As you can see, Java currently gets less testing than c/c++ but it's
| still about 3x the testing that Ada gets.  Part of the reason for
| Ada's low numbers is the extra prerequisite placed on bootstrapping.
| I'd like to avoid having Java fall lower than it already is.

I understand your goal.  However, I do not believe that the reasons
you give to explain the Ada situation carry verbatim to the Java
situation.  From the description I've seen and following the regular
Ada bootstrapping issues, it strikes me that the situations are quite
dissimilar, even though they bear some points of resemblance.

Tom has provided a data point that they used the Eclipse compiler to
build javac.  It is not like we only have one source of widely used
java compiler.  And we may not even need a full-blown one.

|  > In 2006, I believe the availability of java front-ends for
|  > bootstrapping the GNU Java is sufficiently widespread enough to
|  > outweigh and overcome the potential problems you're anticipating.
| 
| I don't think it matters how available it is.

That has been one of the fundamental issues with the Ada front-end,
with its requirements on specific versions.

| Many testers and developers just won't bother.

See, my conclusion then is that the issue must not be the language in
which it is written.
Having the compiler written in C does not automatically drag in hundreds
of testers or battalions of developers. But it does have the disadvantage
of not stressing the front-end as one written in Java would.

Letting the Java front-end be more integrated with the Java community
tools has the potential of drawing in more interested people (its
community and developers) than the already scarce C-only GCC developers.

|  > We desperately need to get GCC more supported, more integrated into
|  > widely used development tools.  We cannot sustain improvements,
|  > competition by isolating and painting ourselves into corners.
|  > -- Gaby
| 
| I think we agree on that goal and I've said my piece.  If others think
| the benefits of using Java in the Java FE are worthwhile I won't
| oppose it.

FWIW, given the scarce resources we -- and especially the Java folks --
have, we should not put the bar higher than necessary.  The idea
outlined by Tom sounds sensible to me and is worth exploring.  He
provided data points that indicate that the idea is workable.

-- Gaby


* Re: Reconsidering gcjx
  2006-01-28  0:41       ` Gabriel Dos Reis
  2006-01-28  0:57         ` Per Bothner
@ 2006-01-28  2:42         ` Kaveh R. Ghazi
  2006-01-28 17:51           ` Gabriel Dos Reis
  1 sibling, 1 reply; 43+ messages in thread
From: Kaveh R. Ghazi @ 2006-01-28  2:42 UTC (permalink / raw)
  To: gdr; +Cc: gcc, java, laurent, tromey

 > "Kaveh R. Ghazi" <ghazi@caipclassic.rutgers.edu> writes:
 > 
 > | However with Tom's proposal, we need an existing java compiler for
 > | our target.
 > 
 > I don't believe the issues at hand here (Java specific case) are as
 > severe as they sound from your messages.  

Okay fine, let's quantify it.  I downloaded the Dec 2005
gcc-testresults archive from:
ftp://gcc.gnu.org/pub/gcc/mail-archives/gcc-testresults/gcc-testresults-2005-12.bz2

Then I ran this shell pipeline:

grep '\--enable-languages=' gcc-testresults-2005-12 | sed 's/.*--enable-languages=//; s/ .*$//' | tr ',' '\n' | sort | uniq -c | sort -nr

and I got:

   1690 c
   1659 c++
   1379 objc
   1233 java
    945 fortran
    451 ada
    292 treelang
    229 obj-c++
    228 f95
    185 f77
     14 pascal
     13 for
      3
      2 3Dc
      1 treela=
      1 c+
      1 ;t
      1 3Dfortran

(Note: fortran + f95 + f77 = 1358 or about on par with Java.)

As you can see, Java currently gets less testing than c/c++ but it's
still about 3x the testing that Ada gets.  Part of the reason for
Ada's low numbers is the extra prerequisite placed on bootstrapping.
I'd like to avoid having Java fall lower than it already is.
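
For readers less used to sed and tr, the pipeline above can be restated step by step as the sketch below. It is an illustration only: the class name LanguageTally and the sample input lines are made up, and it skips the final sort by count. The odd entries in the table such as "3Dc" look like quoted-printable residue in the archive ("=3D" encodes "="), though that is a reading of the output, not something the message states.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LanguageTally {
    static final String FLAG = "--enable-languages=";

    // Counts how often each language appears in --enable-languages= lists.
    static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // grep: keep only lines containing the flag.
            // sed 's/.*--enable-languages=//': the greedy .* strips
            // everything up to the last occurrence of the flag.
            int at = line.lastIndexOf(FLAG);
            if (at < 0) continue;
            String rest = line.substring(at + FLAG.length());
            // sed 's/ .*$//': drop everything from the first space onward.
            int space = rest.indexOf(' ');
            if (space >= 0) rest = rest.substring(0, space);
            // tr ',' '\n' followed by sort | uniq -c: one count per entry.
            // The -1 limit keeps empty entries, as tr would.
            for (String lang : rest.split(",", -1)) {
                counts.merge(lang, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "configure --enable-languages=c,c++,java --prefix=/x",
            "configure --enable-languages=c,ada");
        System.out.println(tally(sample));
    }
}
```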


 > In 2006, I believe the availability of java front-ends for
 > bootstrapping the GNU Java is sufficiently widespread enough to
 > outweigh and overcome the potential problems you're anticipating.

I don't think it matters how available it is.  Many testers and
developers just won't bother.

 > We desperately need to get GCC more supported, more integrated into
 > widely used development tools.  We cannot sustain improvements,
 > competition by isolating and painting ourselves into corners.
 > -- Gaby

I think we agree on that goal and I've said my piece.  If others think
the benefits of using Java in the Java FE are worthwhile I won't
oppose it.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu


* Re: Reconsidering gcjx
  2006-01-28  0:57         ` Per Bothner
@ 2006-01-28  1:00           ` Tom Tromey
  2006-01-29  1:00           ` Anthony Green
  1 sibling, 0 replies; 43+ messages in thread
From: Tom Tromey @ 2006-01-28  1:00 UTC (permalink / raw)
  To: Per Bothner; +Cc: java, gcc

>>>>> "Per" == Per Bothner <per@bothner.com> writes:

Per> Another concern: I gather there are lots of dependencies
Per> between Eclipse libraries.  Does ecj depend on any other
Per> Eclipse libraries?

Good point.  I forgot to mention this.

The Eclipse compiler is standalone by design.

The project to look at is 'org.eclipse.jdt.core'.
You can get it from
:pserver:anonymous@dev.eclipse.org:/cvsroot/eclipse

There are a lot of directories in this module; the compiler core
itself is in 'compiler' and 'batch'.  The only dependency these files
have is J2SE.  (The code outside these directories does depend on the
rest of Eclipse, but we would not build that.)

In fact we exploit this to build the system 'javac' for Fedora Core.
We also are currently building the Eclipse compiler nightly on the
Classpath build machine.

One minor risk for this project is that the upstream compiler will
change this for some reason.  However, I don't think that is a very
major risk; they've actually been quite interested in what we've done
for FC.  Also, due to the builder, we'll know the same day that
anything breaks :-)

Tom


* Re: Reconsidering gcjx
  2006-01-28  0:41       ` Gabriel Dos Reis
@ 2006-01-28  0:57         ` Per Bothner
  2006-01-28  1:00           ` Tom Tromey
  2006-01-29  1:00           ` Anthony Green
  2006-01-28  2:42         ` Kaveh R. Ghazi
  1 sibling, 2 replies; 43+ messages in thread
From: Per Bothner @ 2006-01-28  0:57 UTC (permalink / raw)
  To: java; +Cc: gcc, tromey

Another concern: I gather there are lots of dependencies
between Eclipse libraries.  Does ecj depend on any other
Eclipse libraries?  Even if there are no run-time dependencies,
it's awkward if a class statically references some random
Eclipse class that somehow pulls in large parts of Eclipse.

I.e. I'm hoping one can *statically* link ecj without any
dependencies on (say) the SWT toolkit, or the debugger?

This is not an absolute requirement, and if there are just
a couple of troublesome dependencies, it is normally possible
(if ugly) to turn a static dependency into a run-time check,
using reflection.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/
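
The "turn a static dependency into a run-time check, using reflection" technique mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not code from ecj: the class name OptionalDependency, both helper methods, and the org.example class name are hypothetical.

```java
import java.lang.reflect.Method;

public class OptionalDependency {
    // Looks a class up by name at run time. Returns null when it is
    // absent, so nothing here links statically against the optional class.
    static Class<?> findOptional(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    // Calls a public static no-argument method reflectively, or returns
    // the fallback when the class or method is not available.
    static Object callStaticOrDefault(String className, String method, Object fallback) {
        Class<?> c = findOptional(className);
        if (c == null) return fallback;
        try {
            Method m = c.getMethod(method);
            return m.invoke(null);
        } catch (ReflectiveOperationException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Present in every JRE, so the reflective call succeeds.
        System.out.println(callStaticOrDefault("java.lang.Runtime", "getRuntime", "absent"));
        // Hypothetical class name standing in for an optional component.
        System.out.println(callStaticOrDefault("org.example.hypothetical.Debugger", "version", "absent"));
    }
}
```

The cost is the one noted above: the dependency becomes invisible to the linker and to the compiler's type checking, which is why it is described as ugly.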


* Re: Reconsidering gcjx
  2006-01-28  0:23     ` Kaveh R. Ghazi
@ 2006-01-28  0:41       ` Gabriel Dos Reis
  2006-01-28  0:57         ` Per Bothner
  2006-01-28  2:42         ` Kaveh R. Ghazi
  0 siblings, 2 replies; 43+ messages in thread
From: Gabriel Dos Reis @ 2006-01-28  0:41 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: laurent, gcc, java, tromey

"Kaveh R. Ghazi" <ghazi@caipclassic.rutgers.edu> writes:

[...]

| However with Tom's proposal, we need an existing java compiler for our
| target.

I don't believe the issues at hand here (Java specific case) are as
severe as they sound from your messages.  If GCC creators had to
follow the reasoning developed in your messages, they should have had
gcc to start with.

In 2006, I believe the availability of java front-ends for
bootstrapping the GNU Java is sufficiently widespread enough to
outweigh and overcome the potential problems you're anticipating.

We desperately need to get GCC more supported, more integrated into
widely used development tools.  We cannot sustain improvements,
competition by isolating and painting ourselves into corners.

-- Gaby


* Re: Reconsidering gcjx
  2006-01-27 22:17   ` Laurent GUERBY
  2006-01-27 22:24     ` Daniel Jacobowitz
@ 2006-01-28  0:23     ` Kaveh R. Ghazi
  2006-01-28  0:41       ` Gabriel Dos Reis
  1 sibling, 1 reply; 43+ messages in thread
From: Kaveh R. Ghazi @ 2006-01-28  0:23 UTC (permalink / raw)
  To: laurent; +Cc: gcc, java, tromey

 > Also I believe not allowing new languages for new front-ends might
 > limit the increase of language front-ends in the GNU Compiler
 > Collection:

I think "not allowing" is too strong a characterization of my previous
message.  I have neither the inclination nor the power to do that on my
own.  I do however feel extra caution is in order...

 > Whether C++, Java or Ada, a new language requirement looks the same to
 > me: having a good enough base compiler and runtime installed for the
 > language, I do not see anything special to Java or Ada over C++ here.

The issue for me is not C++ vs Java vs Ada.  The issue is writing the
language frontend in the same language it parses so that it depends on
itself to bootstrap.

If we wrote the G++ FE in java and the java FE in C that would satisfy
my concern just as well as the reverse of writing java FE in C++ and
the G++ FE in C.  Either way we can get from start to finish with only
a C compiler to start with.

However with Tom's proposal, we need an existing java compiler for our
target.  This same requirement for Ada has caused lots of confusion.
(Which prior versions of Ada work?  Does our configure infrastructure
handle everything correctly?)

Looking at the testsuite results, many of the people who don't bother
to compile Ada do currently compile and test java.  I suspect if we
make it harder to boot/test java then we'll see its testing and
support decline.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu


* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
                   ` (4 preceding siblings ...)
  2006-01-27 17:08 ` Joe Buck
@ 2006-01-28  0:13 ` Mark Mitchell
  2006-01-29  2:06 ` Adam Megacz
  2006-01-30 14:30 ` Thorsten Glaser
  7 siblings, 0 replies; 43+ messages in thread
From: Mark Mitchell @ 2006-01-28  0:13 UTC (permalink / raw)
  To: tromey; +Cc: GCJ Hackers, GCC Mailing List

Tom Tromey wrote:
> Now that the GPL v3 looks as though it may be EPL-compatible, the time
> has come to reconsider using the Eclipse java compiler ("ecj") as our
> primary gcj front end.  This has both political and technical
> ramifications, I discuss them below.

First, I'd like to commend and thank you just for bringing this issue up.
It takes a lot of courage to look at your own hard work and consider
tossing it out.

My personal feeling is that using ecj would be an excellent approach.
I'm not qualified to talk about all the particular technical issues, but
I think that focusing free software developers on a single Java front
end is an excellent idea, and since you believe ecj is an actively
maintained high-quality compiler, that seems fine.

As others have said, the bootstrapping issues are much more minor than
with GNAT, since there are free java runtimes that can run ecj; the
severity of the "you must have version X of <lang> to build version Y"
problem is rather less.

The FSF's web site lists the EPL as a GPL-incompatible free software
license, but I don't think that's an issue.  We could argue about
whether or not invoking ecj from the driver, before feeding the class
file to gcj, somehow constitutes a combination under GPLv2, but I think
that's relatively pointless.  I think the general consensus is that
nobody knows for sure (lack of case law), but that most people assume
the GPL applies to programs linked together, and we certainly don't
assume a GPL problem because GCC on Solaris invokes the Solaris assembler!

Of course, we should definitely get the SC's buy-in before making a
change of this magnitude.

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: Reconsidering gcjx
@ 2006-01-27 23:44 Richard Kenner
  0 siblings, 0 replies; 43+ messages in thread
From: Richard Kenner @ 2006-01-27 23:44 UTC (permalink / raw)
  To: hans.boehm; +Cc: gcc, java

     As others have pointed out, there's potentially a small difference in
     the case of Java, in that I believe the .class -> .o part of the
     compiler would still be buildable without an existing JVM, and perhaps
     even somewhat tested without one.  And that's the part that's likely
     to break if other parts of the compiler are changed.  I don't think
     Ada has an analog to that.

This is a historical aside, but interestingly enough Ada (GNAT) *did* have
such a thing in the *very* early days, but it was decided that the trouble
involved in maintaining and using the mechanism was much larger than
other available bootstrap methods.  The difference between this and Java
is that the mechanism in question has other uses for Java, but none for Ada.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 22:53       ` Laurent GUERBY
@ 2006-01-27 23:20         ` Tom Tromey
  0 siblings, 0 replies; 43+ messages in thread
From: Tom Tromey @ 2006-01-27 23:20 UTC (permalink / raw)
  To: Laurent GUERBY; +Cc: gcc

>>>>> "Laurent" == Laurent GUERBY <laurent@guerby.net> writes:

Laurent> If someone comes up with an old JVM that misruns java in GCC
Laurent> and there's no easy obvious workaround, will you cancel the
Laurent> java project or just tell the user to install a known-to-work
Laurent> JVM from the GCC install documentation? Will the 3.0 gcj and
Laurent> runtime be enough for it to work?

I assume these questions come from previous Ada bootstrap issues.

Please, let's not get overly involved in comparing this plan to Ada.
I know it is similar, but it is also different in some important ways.


The fix for the problem you outline is not very hard.  All you need
are the .class files corresponding to the .java files in your source
tree.  These can be made on any machine with any java bytecode
compiler that works.  Currently this would include the already
existing gcj... but seeing as one of the major features of this
proposed change is fixing a ton of front end bugs, I would guess that
eventually this will no longer work.  But even in this case you can
run ecj today on gij.  And, you can download ecj jars from
eclipse.org.

This is pretty much like ordinary compiler bootstrapping, except you
don't need a cross toolchain, since the needed intermediate results
are machine-independent.
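
As a rough illustration of the check described above, here is a minimal
Python sketch that reports which .java files in a tree still lack a
prebuilt .class file.  The function name and the example paths are
hypothetical, and the one-to-one name mapping is a simplification: a
single source file can produce more than one class file, so this only
covers the top-level class.

```python
from pathlib import PurePosixPath

def missing_class_files(java_sources, class_files):
    """Return the .java paths that have no same-named .class file."""
    available = set(class_files)
    missing = []
    for src in java_sources:
        # foo/Bar.java is expected to have a prebuilt foo/Bar.class
        expected = str(PurePosixPath(src).with_suffix(".class"))
        if expected not in available:
            missing.append(src)
    return missing

# Hypothetical source tree and set of prebuilt class files.
sources = ["java/lang/Object.java", "java/util/List.java"]
prebuilt = ["java/lang/Object.class"]
print(missing_class_files(sources, prebuilt))
```

Because the class files are the same on every platform, a list like
this could be satisfied from any machine that has a working java
bytecode compiler.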

Tom

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 22:24     ` Daniel Jacobowitz
@ 2006-01-27 22:53       ` Laurent GUERBY
  2006-01-27 23:20         ` Tom Tromey
  0 siblings, 1 reply; 43+ messages in thread
From: Laurent GUERBY @ 2006-01-27 22:53 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gcc

On Fri, 2006-01-27 at 17:17 -0500, Daniel Jacobowitz wrote:
> Two interesting things I'd like to point out here:
> 
>   - I don't know if it's been a problem lately, but GNAT definitely
>     used to have issues with not just the language it was written
>     in, but what specific version of the compiler was used to
>     bootstrap.  ECJ, hopefully, will be less problematic.

It's indeed likely that Java will benefit from more mature (less likely
to have bugs affecting the bootstrap process - most of the GNAT
requirements came from here) and established (less likely not to be
installed - also a GNAT issue for many platforms) compilers and JVMs
when it goes in GCC than Ada at the time, but that does not change the
basic requirement for adding a new bootstrapped language.

If someone comes up with an old JVM that misruns java in GCC and
there's no easy obvious workaround, will you cancel the java project
or just tell the user to install a known-to-work JVM from the GCC
install documentation? Will the 3.0 gcj and runtime be enough for it to
work?

Laurent


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 22:17   ` Laurent GUERBY
@ 2006-01-27 22:24     ` Daniel Jacobowitz
  2006-01-27 22:53       ` Laurent GUERBY
  2006-01-28  0:23     ` Kaveh R. Ghazi
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Jacobowitz @ 2006-01-27 22:24 UTC (permalink / raw)
  To: gcc

On Fri, Jan 27, 2006 at 11:09:11PM +0100, Laurent GUERBY wrote:
> On Fri, 2006-01-27 at 09:25 -0500, Kaveh R. Ghazi wrote:
> >  > ecj is written in java.  This will complicate the bootstrap process.
> >  > However, the situation will not be quite as severe as the Ada
> >  > situation, in that it ought to be possible to bootstrap gcj using any
> >  > java runtime, including mini ones such as JamVM -- at least, assuming
> >  > that the suggested implementation route is taken.
> > 
> > I would really hesitate to follow Ada in this regard.
> >
> > IMHO, writing your frontend in the same language it's intended to
> > compile causes it to be marginalized.  It no longer becomes part of
> > the default bootstrap sequence and gets much less testing.  You'll
> > find patches that were supposedly "bootstrapped and regtested" will
> > quite often break java because it didn't get tested as part of the
> > default.

Two interesting things I'd like to point out here:

  - I don't know if it's been a problem lately, but GNAT definitely
    used to have issues with not just the language it was written
    in, but what specific version of the compiler was used to
    bootstrap.  ECJ, hopefully, will be less problematic.

  - People already marginalize gcj because libjava takes so bloody
    long to build.  If ECJ can be noticeably faster than GCJ is,
    this might have the opposite effect.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 16:24 ` Kaveh R. Ghazi
  2006-01-27 16:27   ` Frank Ch. Eigler
@ 2006-01-27 22:17   ` Laurent GUERBY
  2006-01-27 22:24     ` Daniel Jacobowitz
  2006-01-28  0:23     ` Kaveh R. Ghazi
  1 sibling, 2 replies; 43+ messages in thread
From: Laurent GUERBY @ 2006-01-27 22:17 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: tromey, gcc, java

On Fri, 2006-01-27 at 09:25 -0500, Kaveh R. Ghazi wrote:
>  > ecj is written in java.  This will complicate the bootstrap process.
>  > However, the situation will not be quite as severe as the Ada
>  > situation, in that it ought to be possible to bootstrap gcj using any
>  > java runtime, including mini ones such as JamVM -- at least, assuming
>  > that the suggested implementation route is taken.
> 
> I would really hesitate to follow Ada in this regard.
>
> IMHO, writing your frontend in the same language it's intended to
> compile causes it to be marginalized.  It no longer becomes part of
> the default bootstrap sequence and gets much less testing.  You'll
> find patches that were supposedly "bootstrapped and regtested" will
> quite often break java because it didn't get tested as part of the
> default.

It's a bit annoying but not really blocking: all GCC developers do fix
regressions in Ada when they're properly attributed to a patch, and do
follow the project policy that exposing a latent bug elsewhere is not an
excuse (and of course if it's traced back to dubious front-end
behaviour, front-end maintainers do step in). The consequence is that you
have to test trunk and your development tree separately, and merge and
submit patches only when they both work; otherwise you might lose a lot
of time to unrelated issues (and that means batch submitting if you do
lots of development).

Also I believe not allowing new languages for new front-ends might
limit the increase of language front-ends in the GNU Compiler
Collection: not everyone likes coding parsing and tree algorithms in C
in 2006 and I assume a developer wanting to add a new language does like
to use this new language :).

Whether C++, Java or Ada, a new language requirement looks the same to
me: having a good enough base compiler and runtime installed for the
language.  I do not see anything special to Java or Ada over C++ here.
The base compiler I use for building GCC has only c,ada (4.0) because
that's what is needed; if c++ is needed I'll add the recommended c++
compiler, if java and some JVM is needed, I'll add java and the
recommended JVM, no big difference.

Currently building trunk with --enable-checking and testing nearly all
languages (c,ada,c++,fortran,java,objc,treelang, so excluding Objective
C++) on a Pentium III 1GHz machine with 1GB of RAM takes between 11 and
12 hours. It could be more if some languages did have a more substantial
testsuite, which is desirable when a front-end matures. Restricting the
required number of languages for the default patch testing procedure is
unavoidable if we leave it to individual developers to run on their
machine: in 12 hours, about 10 patches are committed to trunk on average.

Laurent


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 17:08 ` Joe Buck
@ 2006-01-27 21:23   ` Tom Tromey
  0 siblings, 0 replies; 43+ messages in thread
From: Tom Tromey @ 2006-01-27 21:23 UTC (permalink / raw)
  To: Joe Buck; +Cc: GCJ Hackers, GCC Mailing List

>>>>> "Joe" == Joe Buck <Joe.Buck@synopsys.COM> writes:

Joe> But before fighting the political battles, we should first figure out if
Joe> this is what we really want to do if there weren't political obstacles.
Joe> Let's try coming to a technical consensus first.

I made a list of things which would have to be addressed, based
mostly on this thread.

- Make ecj emit .jar files
- Change ecj's error reporting format
  Right now it is quite ugly and doesn't conform to GNU standards.
- Consider putting column numbers in debug info
  Though as far as I know, nothing uses this today, so I think this
  is low priority.
- Check compile time performance.
  We don't want to slow gcj down too much.
- Make bootstrapping simpler.
  Some small library refactorings would make it quite simple,
  amounting to downloading a single jar file.
- Exception regions as mentioned by Andrew.
  I'm not sure what we need to do here.
- Fix variable slot tracking for bytecode
  I don't recall exactly what the problem here was.

These last two are already present in today's compiler, though
switching to ecj would exacerbate the situation.

Tom

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 19:22   ` Tom Tromey
@ 2006-01-27 19:55     ` Daniel Jacobowitz
  2006-01-30 17:50     ` Andrew Haley
  1 sibling, 0 replies; 43+ messages in thread
From: Daniel Jacobowitz @ 2006-01-27 19:55 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Andrew Haley, GCJ Hackers, GCC Mailing List

On Fri, Jan 27, 2006 at 11:59:06AM -0700, Tom Tromey wrote:
> Column numbers, as Per mentioned, are trickier.  We know that ecj has
> this information (since Eclipse itself uses it), but there is no
> standard way to pass it via the class file format.  But does gdb
> actually use column numbers?

Not today, no.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 11:17 ` Andrew Haley
  2006-01-27 13:29   ` Per Bothner
@ 2006-01-27 19:22   ` Tom Tromey
  2006-01-27 19:55     ` Daniel Jacobowitz
  2006-01-30 17:50     ` Andrew Haley
  1 sibling, 2 replies; 43+ messages in thread
From: Tom Tromey @ 2006-01-27 19:22 UTC (permalink / raw)
  To: Andrew Haley; +Cc: GCJ Hackers, GCC Mailing List

>>>>> "Andrew" == Andrew Haley <aph@redhat.com> writes:

Andrew> In particular, the type system and the rules for exception
Andrew> regions are different.  Also, a "slot" in the .class format
Andrew> doesn't necessarily correspond to a variable in the source
Andrew> language.

One way to look at this is: we really want to fix these things,
because, e.g., we'll probably be compiling all the java code in FC to
objects from class files for the foreseeable future anyway (to avoid
huge divergences from upstream build setups).

Moving to gcjx would turn "writing a 1.5 compiler" from a project
taking all of my time for the next year into a project that takes all
of my time for 2 months.  Then we'd have more time to fix other
things.

Andrew> In particular,
Andrew> .class files don't contain the full pathnames to source files.

I think this one we can handle via the specs.

Consider a simple compilation like "gcj -g -c foo.java".
In my proposal we would turn this into a series of invocations:

    ecj -g -Xjar /tmp/tmpfile.jar foo.java
    jc1 blah blah blah -o /tmp/tmpfile2.o tmpfile.jar

... but we could easily have the specs pass the real original file
name to jc1.

Just as a side note for non-java experts ... the reason we use an
intermediate jar file is that a single java compilation may result in
multiple .class files.  A jar file is convenient because it is simple
to make in java, and because gcj already has all the code needed to
read and compile one.
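
To make the point about jar files concrete, here is a small Python
sketch, not anything from gcj or ecj.  A jar is a zip archive, so one
file can carry every .class produced by a single compilation.  The
class names are hypothetical and the entry contents are placeholders
(only the four-byte class-file magic number), not valid class files.

```python
import io
import zipfile

CLASS_MAGIC = b"\xca\xfe\xba\xbe"  # first four bytes of every .class file

def build_jar(classes):
    """Pack {entry name: bytes} into an in-memory jar (a zip archive)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name, data in classes.items():
            jar.writestr(name, data)
    return buf.getvalue()

def list_classes(jar_bytes):
    """Return the sorted names of the .class entries in a jar."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return sorted(n for n in jar.namelist() if n.endswith(".class"))

# Hypothetical result of compiling one source file with a nested class.
jar_bytes = build_jar({
    "Foo.class": CLASS_MAGIC,        # placeholder, not a full class file
    "Foo$Inner.class": CLASS_MAGIC,  # nested classes get their own file
})
print(list_classes(jar_bytes))
```

The second stage then only has to be handed one path, however many
classes the source file declared.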

It occurs to me now that we'll probably need a way to make ecj not
compile eagerly.  I haven't looked to see whether it already has one.

Column numbers, as Per mentioned, are trickier.  We know that ecj has
this information (since Eclipse itself uses it), but there is no
standard way to pass it via the class file format.  But does gdb
actually use column numbers?

Tom

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27  1:44 ` Per Bothner
       [not found]   ` <drckhs$j5p$1@sea.gmane.org>
@ 2006-01-27 19:03   ` Tom Tromey
  1 sibling, 0 replies; 43+ messages in thread
From: Tom Tromey @ 2006-01-27 19:03 UTC (permalink / raw)
  To: Per Bothner; +Cc: GCJ Hackers, GCC Mailing List

>>>>> "Per" == Per Bothner <per@bothner.com> writes:

Per> A couple of other factors:

Thanks for bringing these up.

Per> * Compile time.

Yeah, this is a potential problem.  If it is severe it could be fixed
by linking ecj into GCC.  FWIW, at least for all the packaging we do
in Fedora, we have to compile things separately, because we're using
unmodified upstream build systems.

I realize this isn't the only gcj use case.  I can try to come up with
some timings.

Per> * Debugging.  Historically Java debugging information is pretty limited.
Per>    Even with the latest specifications there is (for example) no support
Per>    for column numbers.  However, the classfile format is extensible, and
Per>    so if needed we can define extra "attribute" sections.

Yeah, this could be fixed with some upstream cooperation.

Per> * The .classfile format is quite inefficient.  For example there is
Per>    no sharing of "symbols" between classes, so there is a lot of
Per>    duplication.  However, this is a problem we share with everybody
Per>    else, and it could be solved at the bytecode level, co-operating
Per>    with other parties, ideally as a Java Specification Request.

1.5 includes Pack200, which addresses this in a somewhat odd way.
This seems like a subset of the performance problem though... if
performance is ok, this inefficiency won't matter.

Per> (2) A bytecode version of ecj.  This is only useful if we also make
Per> available a bytecode version of libgcj, I think.

Yeah.  Right now we have a few java files that we compile based on the
target platform, eg the Process stuff.  I think we could easily move
these around and move to a slightly different approach which would let
us have a single libgcj.jar that works for all platforms.

For folks not involved in libgcj, the build is essentially done in two
steps at the moment.  First we compile to class files (this is the
classpath part), then we go back and compile to native (using the
class files as a kind of precompiled header).

In the new setup I think we would compile the class files to object
files directly in the second step (since that is what we'd get with
ecj anyway).  Bringing up libgcj on a new system would involve getting
those class files from "somewhere"; with some relatively minor build
changes we could just make this a download from gcc.gnu.org.

Tom

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
       [not found]   ` <drckhs$j5p$1@sea.gmane.org>
  2006-01-27 14:17     ` Chris Gray
@ 2006-01-27 18:49     ` Tom Tromey
  1 sibling, 0 replies; 43+ messages in thread
From: Tom Tromey @ 2006-01-27 18:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: java, gcc

>>>>> "Paolo" == Paolo Bonzini <bonzini@gnu.org> writes:

Paolo> How big would the mini Java-runtime be?  A bytecode-interpreter, with
Paolo> only support for two or three packages, using a simple Baker or
Paolo> mark'n'sweep GC, could be done in 10,000 lines of C code or maybe less.

The problem with a mini JVM in the bootstrap is that it will also need
a class library, which has to come from somewhere.

Bootstrapping an ecj-based gcj is not as hard as bootstrapping Ada.
The class->object compiler will still be C code, so all you will need
to bootstrap is the class files for the library.  These are platform
independent, and can be built with any java compiler on any machine.

We could even check in the class files, if need be.

Tom

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
                   ` (3 preceding siblings ...)
  2006-01-27 16:24 ` Kaveh R. Ghazi
@ 2006-01-27 17:08 ` Joe Buck
  2006-01-27 21:23   ` Tom Tromey
  2006-01-28  0:13 ` Mark Mitchell
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 43+ messages in thread
From: Joe Buck @ 2006-01-27 17:08 UTC (permalink / raw)
  To: Tom Tromey; +Cc: GCJ Hackers, GCC Mailing List

On Thu, Jan 26, 2006 at 05:08:20PM -0700, Tom Tromey wrote:
> Now that the GPL v3 looks as though it may be EPL-compatible, the time
> has come to reconsider using the Eclipse java compiler ("ecj") as our
> primary gcj front end.  This has both political and technical
> ramifications, I discuss them below.

> Steering committee members, please read through if you would.  I think
> this requires some resolution at the SC/FSF level.

The political/legal issues include:

* License compatibility: if we have to wait for GPLv3, we're talking 2007,
  even if the FSF is fine with it.  Even development before then might
  be problematic if developers must link GPL and EPL code.

* Using a language other than C.  At least you're talking Java; RMS
  so strongly dislikes C++ that he goes nonlinear when the topic is
  brought up, though his arguments against it sometimes seem
  to reflect the state of C++ 10 years ago.

* The FSF not owning significant parts of the compiler.

But before fighting the political battles, we should first figure out if
this is what we really want to do if there weren't political obstacles.
Let's try coming to a technical consensus first.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 16:27   ` Frank Ch. Eigler
@ 2006-01-27 16:57     ` Gabriel Dos Reis
  0 siblings, 0 replies; 43+ messages in thread
From: Gabriel Dos Reis @ 2006-01-27 16:57 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Kaveh R. Ghazi, gcc, java

fche@redhat.com (Frank Ch. Eigler) writes:

| - supplies "virtuous circle" motivation for improvement (speed,
|   quality, ...), since it itself directly benefits

Fully agreed.  Witness: GNU C :-)

-- Gaby

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 16:24 ` Kaveh R. Ghazi
@ 2006-01-27 16:27   ` Frank Ch. Eigler
  2006-01-27 16:57     ` Gabriel Dos Reis
  2006-01-27 22:17   ` Laurent GUERBY
  1 sibling, 1 reply; 43+ messages in thread
From: Frank Ch. Eigler @ 2006-01-27 16:27 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: gcc, java

"Kaveh R. Ghazi" <ghazi@caipclassic.rutgers.edu> writes:

> [...] IMHO, writing your frontend in the same language it's intended
> to compile causes it to be marginalized.  It no longer becomes part
> of the default bootstrap sequence and gets much less testing.  [..]

Even if so, it may be worth spelling out some of the obvious benefits
of writing a compiler in its own language:

- genuine bootstrapping capability (to compile itself, being a source
  of realistic test coverage)

- supplies "virtuous circle" motivation for improvement (speed,
  quality, ...), since it itself directly benefits

- providing a concrete, educational systems application of the
  language (rather than saying "language X would be great for writing
  compilers, but by the way here is an X compiler written in Y").


- FChE

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
                   ` (2 preceding siblings ...)
  2006-01-27 11:17 ` Andrew Haley
@ 2006-01-27 16:24 ` Kaveh R. Ghazi
  2006-01-27 16:27   ` Frank Ch. Eigler
  2006-01-27 22:17   ` Laurent GUERBY
  2006-01-27 17:08 ` Joe Buck
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 43+ messages in thread
From: Kaveh R. Ghazi @ 2006-01-27 16:24 UTC (permalink / raw)
  To: tromey; +Cc: gcc, java

 > ecj is written in java.  This will complicate the bootstrap process.
 > However, the situation will not be quite as severe as the Ada
 > situation, in that it ought to be possible to bootstrap gcj using any
 > java runtime, including mini ones such as JamVM -- at least, assuming
 > that the suggested implementation route is taken.

I would really hesitate to follow Ada in this regard.

IMHO, writing your frontend in the same language it's intended to
compile causes it to be marginalized.  It no longer becomes part of
the default bootstrap sequence and gets much less testing.  You'll
find patches that were supposedly "bootstrapped and regtested" will
quite often break java because it didn't get tested as part of the
default.

If you want to use a non-C language for java, then C++ is a better
choice because you can use G++ in the local tree to compile the java
frontend as if the java FE was a target library.  No extra
dependencies are introduced for bootstrap; it just modifies the order
in which things get done.

It'll be more complicated for a cross-config, but at least everything
you need is in the local src tree.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
       [not found]   ` <drckhs$j5p$1@sea.gmane.org>
@ 2006-01-27 14:17     ` Chris Gray
  2006-01-27 18:49     ` Tom Tromey
  1 sibling, 0 replies; 43+ messages in thread
From: Chris Gray @ 2006-01-27 14:17 UTC (permalink / raw)
  To: Paolo Bonzini, java; +Cc: gcc

On Friday 27 January 2006 09:10, Paolo Bonzini wrote:

> How big would the mini Java-runtime be?  A bytecode-interpreter, with
> only support for two or three packages, using a simple Baker or
> mark'n'sweep GC, could be done in 10,000 lines of C code or maybe less.

JamVM is pretty small, I doubt it would be worth the effort of trying to make 
something smaller.

-- 
Chris Gray        /k/ Embedded Java Solutions  BE0503765045
Embedded & Mobile Java, OSGi        http://www.kiffer.be/k/
chris.gray@kiffer.be                         +32 3 216 0369
See us at Embedded World 2006 in Nuernberg, 14--16 Feb. 2006.
We're on the DSP Valley stand, no. 12-650 (opposite Intel).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 11:17 ` Andrew Haley
@ 2006-01-27 13:29   ` Per Bothner
  2006-01-27 19:22   ` Tom Tromey
  1 sibling, 0 replies; 43+ messages in thread
From: Per Bothner @ 2006-01-27 13:29 UTC (permalink / raw)
  To: Andrew Haley; +Cc: Tom Tromey, GCJ Hackers, GCC Mailing List

Andrew Haley wrote:
> I think that from a maintenance point of view this would be a PITA.
> Also, as Per mentioned we'd need to extend the .class file format in a
> non-standard way to get full debugging information.  In particular,
> .class files don't contain the full pathnames to source files.

The "SourceDebugExtension" attribute specified by JSR-45
(http://jcp.org/en/jsr/detail?id=45) provides one way to
provide full pathnames.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
  2006-01-27  1:44 ` Per Bothner
  2006-01-27 10:21 ` Thomas Hallgren
@ 2006-01-27 11:17 ` Andrew Haley
  2006-01-27 13:29   ` Per Bothner
  2006-01-27 19:22   ` Tom Tromey
  2006-01-27 16:24 ` Kaveh R. Ghazi
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 43+ messages in thread
From: Andrew Haley @ 2006-01-27 11:17 UTC (permalink / raw)
  To: Tom Tromey; +Cc: GCJ Hackers, GCC Mailing List

Tom Tromey writes:

 > Historically we've wanted to have a 'native' java-source-code-reading
 > compiler, that is, one which parses java sources and converts them
 > directly to trees.  From what I can remember this was based on 3
 > things:
 > 
 > * In the past the compiler handled loops built with LOOP_EXPR better
 >   than it handled loops built "by hand" out of GOTO_EXPRs.  My
 >   understanding is that this has changed since tree-ssa.  The issue
 >   here was that we made no attempt to rebuild a LOOP_EXPR from java
 >   bytecode.
 > 
 > * The .java front end could do a "constant array" optimization.  This
 >   optimization has not worked for quite some time (there's a PR).  In
 >   any case we could implement this for bytecode if it matters.
 > 
 > * The .java front end could more efficiently handle class literals.
 >   With the new 1.5 'ldc' bytecode extension, this is no longer a
 >   problem.
 > 
 > In other words, as far as I can remember, our old reasons for wanting
 > this are obsolete.

True, but there is still some information lost when going via the
.class format.  Per mentions debugging information, but there are some
other problems.  In particular, the type system and the rules for
exception regions are different.  Also, a "slot" in the .class format
doesn't necessarily correspond to a variable in the source language.
We work around all of this fairly successfully, but from an engineering
POV it's something of a kludge.

 > I think our technical approach should be to have ecj emit class files,
 > which would then be compiled by jc1.  In particular I think we could
 > change ecj to emit a single .jar file.  This has a few benefits: it
 > would give -save-temps meaning for gcj, it would let us more easily
 > drop ecj into the existing specs mechanism, and it would require very
 > few changes to the upstream compiler.

I think that from a maintenance point of view this would be a PITA.
Also, as Per mentioned we'd need to extend the .class file format in a
non-standard way to get full debugging information.  In particular,
.class files don't contain the full pathnames to source files.

If we were starting from scratch it would be good to start with ecj.
But we aren't starting from scratch, and it looks to me as though gcjx
has the potential to be the best long-term route.

Andrew.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Reconsidering gcjx
  2006-01-27 10:21 ` Thomas Hallgren
@ 2006-01-27 10:38   ` Thomas Hallgren
  0 siblings, 0 replies; 43+ messages in thread
From: Thomas Hallgren @ 2006-01-27 10:38 UTC (permalink / raw)
  To: gcc; +Cc: java

Hear, hear!

I think using ecj as a gcj front end sounds like a terrific idea!

Kind regards,
Thomas Hallgren

Tom Tromey wrote:
> Now that the GPL v3 looks as though it may be EPL-compatible, the time
> has come to reconsider using the Eclipse java compiler ("ecj") as our
> primary gcj front end.  This has both political and technical
> ramifications, I discuss them below.
> 
> Steering committee members, please read through if you would.  I think
> this requires some resolution at the SC/FSF level.
> 
> First, a brief note on gcjx.  I had intended gcjx to serve not only as
> a cleanly written replacement for the current gcj, but also as a model
> for how GCC front ends should be written in the future; in particular
> I think writing it as a library and separating out the tree-generating
> code from the bulk of the compiler remain good ideas.  I enjoyed, and
> continue to enjoy, the writing of gcjx.  However, in this case I think
> that pleasure must give way to the greater needs of efficiency and
> cross-community cooperation.
> 
> 
> Motivation.
> 
> The motivation for this investigation is simple: sharing code is
> preferable to working in isolation.  In particular this change would
> let us offload much of the front end maintenance onto a different
> group.
> 
> Ecj has a good front end (much better than the current gcj) and decent
> bytecode generation.  It is fully 1.5-compliant and, apparently, is
> tested against the TCK by the upstream maintainers (us gcj developers
> don't have TCK access).  It also has some improvements for 1.6 (stack
> maps).  Upstream is very active.
> 
> gcjx by comparison is unfinished and really has just a single
> full-time developer, me.
> 
> 
> Technical approach.
> 
> Historically we've wanted to have a 'native' java-source-code-reading
> compiler, that is, one which parses java sources and converts them
> directly to trees.  From what I can remember this was based on 3
> things:
> 
> * In the past the compiler handled loops built with LOOP_EXPR better
>   than it handled loops built "by hand" out of GOTO_EXPRs.  My
>   understanding is that this has changed since tree-ssa.  The issue
>   here was that we made no attempt to rebuild a LOOP_EXPR from java
>   bytecode.
> 
> * The .java front end could do a "constant array" optimization.  This
>   optimization has not worked for quite some time (there's a PR).  In
>   any case we could implement this for bytecode if it matters.
> 
> * The .java front end could more efficiently handle class literals.
>   With the new 1.5 'ldc' bytecode extension, this is no longer a
>   problem.
> 
> In other words, as far as I can remember, our old reasons for wanting
> this are obsolete.
> 
> I think our technical approach should be to have ecj emit class files,
> which would then be compiled by jc1.  In particular I think we could
> change ecj to emit a single .jar file.  This has a few benefits: it
> would give -save-temps meaning for gcj, it would let us more easily
> drop ecj into the existing specs mechanism, and it would require very
> few changes to the upstream compiler.
> 
> An alternative approach would be to directly link ecj to the gcc back
> end.  However, this looks like significantly more work, requiring much
> more hacking on the internals of the upstream compiler.  I suspect
> that this won't be worth the effort.
> 
> In my preferred approach we would simply delete a portion of the
> existing gcj and turn jc1 into a purely bytecode-based compiler.  Then
> we would proceed to augment it with all the bits needed for proper 1.5
> support.
> 
> ecj is written in java.  This will complicate the bootstrap process.
> However, the situation will not be quite as severe as the Ada
> situation, in that it ought to be possible to bootstrap gcj using any
> java runtime, including mini ones such as JamVM -- at least, assuming
> that the suggested implementation route is taken.
> 
> 
> Politics.
> 
> I don't know whether the FSF or the GCC SC would let us import ecj,
> even assuming it is actually GPL compatible.  SC members, please
> discuss.
> 
> We don't know how upstream would react.  I think this is a fairly
> minor risk.
> 
> It is unclear to me whether we must even rely on GPL v3 if we went
> with the separate-ecj route.  Any comments here?  In the
> exec-via-specs approach we're invoking ecj as a separate executable,
> much the same way we exec 'as' or 'ld'.  Comments on this from
> license-oriented folks would be appreciated.
> 
> 
> Summary.
> 
> I think this would be the most efficient way to achieve 1.5 language
> compatibility for gcj, and it would also make future language changes
> less expensive.  Given the scope of the entire gcj project, especially
> when the scarcity of resources devoted to it is taken into account,
> this is significant enough to warrant the change.
> 
> Tom
> 


* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
  2006-01-27  1:44 ` Per Bothner
@ 2006-01-27 10:21 ` Thomas Hallgren
  2006-01-27 10:38   ` Thomas Hallgren
  2006-01-27 11:17 ` Andrew Haley
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 43+ messages in thread
From: Thomas Hallgren @ 2006-01-27 10:21 UTC (permalink / raw)
  To: tromey; +Cc: GCC Mailing List

Hear, hear!

I think using ecj as a gcj front end sounds like a terrific idea!

Kind regards,
Thomas Hallgren



* Re: Reconsidering gcjx
  2006-01-27  0:12 Tom Tromey
@ 2006-01-27  1:44 ` Per Bothner
       [not found]   ` <drckhs$j5p$1@sea.gmane.org>
  2006-01-27 19:03   ` Tom Tromey
  2006-01-27 10:21 ` Thomas Hallgren
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 43+ messages in thread
From: Per Bothner @ 2006-01-27  1:44 UTC (permalink / raw)
  To: tromey; +Cc: GCJ Hackers, GCC Mailing List

> Technical approach.
> 
> Historically we've wanted to have a 'native' java-source-code-reading
> compiler, that is, one which parses java sources and converts them
> directly to trees.  From what I can remember this was based on 3
> things:

A couple of other factors:

* Compile time.  It is at least potentially faster to compile directly
   to trees.  However, this is negated in many cases since you want to
   generate both bytecode and native code.  (This includes libgcj.)
   So generating bytecode first and then generating native code from
   the bytecode is actually faster.

* Debugging.  Historically Java debugging information has been limited.
   Even with the latest specifications there is (for example) no support
   for column numbers.  However, the classfile format is extensible, and
   so if needed we can define extra "attribute" sections.

* The .class file format is quite inefficient.  For example there is
   no sharing of "symbols" between classes, so there is a lot of
   duplication.  However, this is a problem we share with everybody
   else, and it could be solved at the bytecode level, co-operating
   with other parties, ideally as a Java Specification Request.
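
The extensibility mentioned in the debugging point can be sketched as
follows.  The attribute_info layout (u2 name index, u4 length, then the
payload, all big-endian) is the one the class file format defines.  The
column-table payload and the function names are purely hypothetical,
invented here for illustration; no such attribute is standardized.

```python
import struct


def encode_attribute(name_index: int, info: bytes) -> bytes:
    """Encode one attribute_info: u2 name index, u4 length, payload."""
    return struct.pack(">HI", name_index, len(info)) + info


def encode_column_table(entries) -> bytes:
    """Hypothetical payload: u2 count, then (start_pc, line, column) as u2s."""
    payload = struct.pack(">H", len(entries))
    for start_pc, line, column in entries:
        payload += struct.pack(">HHH", start_pc, line, column)
    return payload


def decode_attribute(data: bytes):
    """Split an encoded attribute back into (name_index, payload)."""
    name_index, length = struct.unpack_from(">HI", data, 0)
    return name_index, data[6:6 + length]
```

A reader that does not recognize the attribute name can skip the
payload using the length field, which is what makes this kind of
extension safe to add.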

> An alternative approach would be to directly link ecj to the gcc back
> end.  However, this looks like significantly more work, requiring much
> more hacking on the internals of the upstream compiler.  I suspect
> that this won't be worth the effort.

I think you're right.  It could be a project for somebody to tackle
later.

> ecj is written in java.  This will complicate the bootstrap process.
> However, the situation will not be quite as severe as the Ada
> situation, in that it ought to be possible to bootstrap gcj using any
> java runtime, including mini ones such as JamVM -- at least, assuming
> that the suggested implementation route is taken.

I don't think a "mini java runtime" would be useful.  We could offer two
bootstrap solutions:
(1) An existing (installed) Java run-time, which would be an older
version of gcj.
(2) A bytecode version of ecj.  This is only useful if we also make
available a bytecode version of libgcj, I think.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/


* Reconsidering gcjx
@ 2006-01-27  0:12 Tom Tromey
  2006-01-27  1:44 ` Per Bothner
                   ` (7 more replies)
  0 siblings, 8 replies; 43+ messages in thread
From: Tom Tromey @ 2006-01-27  0:12 UTC (permalink / raw)
  To: GCJ Hackers; +Cc: GCC Mailing List

Now that the GPL v3 looks as though it may be EPL-compatible, the time
has come to reconsider using the Eclipse java compiler ("ecj") as our
primary gcj front end.  This has both political and technical
ramifications; I discuss them below.

Steering committee members, please read through if you would.  I think
this requires some resolution at the SC/FSF level.

First, a brief note on gcjx.  I had intended gcjx to serve not only as
a cleanly written replacement for the current gcj, but also as a model
for how GCC front ends should be written in the future; in particular
I think writing it as a library and separating out the tree-generating
code from the bulk of the compiler remain good ideas.  I enjoyed, and
continue to enjoy, the writing of gcjx.  However, in this case I think
that pleasure must give way to the greater needs of efficiency and
cross-community cooperation.


Motivation.

The motivation for this investigation is simple: sharing code is
preferable to working in isolation.  In particular this change would
let us offload much of the front end maintenance onto a different
group.

Ecj has a good front end (much better than the current gcj) and decent
bytecode generation.  It is fully 1.5-compliant and, apparently, is
tested against the TCK by the upstream maintainers (we gcj developers
don't have TCK access).  It also has some improvements for 1.6 (stack
maps).  Upstream is very active.

gcjx by comparison is unfinished and really has just a single
full-time developer, me.


Technical approach.

Historically we've wanted to have a 'native' java-source-code-reading
compiler, that is, one which parses java sources and converts them
directly to trees.  From what I can remember this was based on 3
things:

* In the past the compiler handled loops built with LOOP_EXPR better
  than it handled loops built "by hand" out of GOTO_EXPRs.  My
  understanding is that this has changed since tree-ssa.  The issue
  here was that we made no attempt to rebuild a LOOP_EXPR from java
  bytecode.

* The .java front end could do a "constant array" optimization.  This
  optimization has not worked for quite some time (there's a PR).  In
  any case we could implement this for bytecode if it matters.

* The .java front end could more efficiently handle class literals.
  With the new 1.5 'ldc' bytecode extension, this is no longer a
  problem.

In other words, as far as I can remember, our old reasons for wanting
this are obsolete.
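
The class-literal point depends on the class file version: to the best
of my knowledge, 'ldc' may load a class constant only in class files of
version 49.0 (the 1.5 format) or later.  A minimal sketch of that
check, reading only the 8-byte header (u4 magic, u2 minor, u2 major);
the function names are made up for illustration:

```python
import struct

CLASS_MAGIC = 0xCAFEBABE
MAJOR_JAVA_5 = 49  # major version of the 1.5 class file format


def class_file_version(data: bytes):
    """Return (major, minor) read from the start of a class file."""
    magic, minor, major = struct.unpack_from(">IHH", data, 0)
    if magic != CLASS_MAGIC:
        raise ValueError("not a class file")
    return major, minor


def ldc_may_load_class(data: bytes) -> bool:
    """True if the class file is new enough for 'ldc' of a class constant."""
    major, _minor = class_file_version(data)
    return major >= MAJOR_JAVA_5
```

A bytecode-only jc1 could use a check of this kind to decide whether it
must still recognize the older, method-call-based class literal idiom.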

I think our technical approach should be to have ecj emit class files,
which would then be compiled by jc1.  In particular I think we could
change ecj to emit a single .jar file.  This has a few benefits: it
would give -save-temps meaning for gcj, it would let us more easily
drop ecj into the existing specs mechanism, and it would require very
few changes to the upstream compiler.
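
A rough sketch of that pipeline, written as the command lines a driver
might run.  It assumes an unmodified ecj reachable under the launcher
name "ecj" (an assumption), so a separate 'jar' step stands in for the
proposed change to have ecj emit the .jar itself.  The function name
and the temporary file names are hypothetical.  Nothing is executed;
the function only builds argument lists.

```python
from pathlib import Path


def build_pipeline(sources, workdir, output_object):
    """Return the three command lines: .java -> .class -> .jar -> .o."""
    workdir = Path(workdir)
    classes_dir = workdir / "classes"
    jar_path = workdir / "temp.jar"  # the file -save-temps would keep

    # Step 1: ecj compiles the sources to class files.
    ecj_step = ["ecj", "-source", "1.5", "-d", str(classes_dir),
                *map(str, sources)]
    # Step 2: bundle the class files into a single archive.
    jar_step = ["jar", "cf", str(jar_path), "-C", str(classes_dir), "."]
    # Step 3: gcj (and so jc1) compiles the archive to native code.
    native_step = ["gcj", "-c", str(jar_path), "-o", str(output_object)]
    return [ecj_step, jar_step, native_step]
```

Each list could be handed to subprocess.run in turn; keeping the
intermediate .jar on disk is what would give -save-temps something to
preserve.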

An alternative approach would be to directly link ecj to the gcc back
end.  However, this looks like significantly more work, requiring much
more hacking on the internals of the upstream compiler.  I suspect
that this won't be worth the effort.

In my preferred approach we would simply delete a portion of the
existing gcj and turn jc1 into a purely bytecode-based compiler.  Then
we would proceed to augment it with all the bits needed for proper 1.5
support.

ecj is written in java.  This will complicate the bootstrap process.
However, the situation will not be quite as severe as the Ada
situation, in that it ought to be possible to bootstrap gcj using any
java runtime, including mini ones such as JamVM -- at least, assuming
that the suggested implementation route is taken.
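
The "any java runtime" idea can be sketched as a configure-time probe.
The candidate launcher names (java, gij, jamvm) and the assumption that
each accepts -classpath are illustrative rather than verified against
every version; the main class is the one ecj's batch compiler ships as
far as I know.  The lookup function is injectable so that the probe can
be exercised without any runtime installed.

```python
import shutil

ECJ_MAIN = "org.eclipse.jdt.internal.compiler.batch.Main"


def find_runtime(candidates=("java", "gij", "jamvm"), which=shutil.which):
    """Return the path of the first runtime found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


def ecj_command(runtime, ecj_jar, ecj_args):
    """Build the command line that runs a bytecode ecj under `runtime`."""
    return [runtime, "-classpath", ecj_jar, ECJ_MAIN, *ecj_args]
```

If find_runtime returns None, the build would have to fall back to a
previously installed native ecj or skip the Java front end entirely.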


Politics.

I don't know whether the FSF or the GCC SC would let us import ecj,
even assuming it is actually GPL compatible.  SC members, please
discuss.

We don't know how upstream would react.  I think this is a fairly
minor risk.

It is unclear to me whether we must even rely on GPL v3 if we went
with the separate-ecj route.  Any comments here?  In the
exec-via-specs approach we're invoking ecj as a separate executable,
much the same way we exec 'as' or 'ld'.  Comments on this from
license-oriented folks would be appreciated.


Summary.

I think this would be the most efficient way to achieve 1.5 language
compatibility for gcj, and it would also make future language changes
less expensive.  Given the scope of the entire gcj project, especially
when the scarcity of resources devoted to it is taken into account,
this is significant enough to warrant the change.

Tom



Thread overview: 43+ messages
2006-01-27 23:26 Reconsidering gcjx Boehm, Hans
  -- strict thread matches above, loose matches on Subject: below --
2006-01-27 23:44 Richard Kenner
2006-01-27  0:12 Tom Tromey
2006-01-27  1:44 ` Per Bothner
     [not found]   ` <drckhs$j5p$1@sea.gmane.org>
2006-01-27 14:17     ` Chris Gray
2006-01-27 18:49     ` Tom Tromey
2006-01-27 19:03   ` Tom Tromey
2006-01-27 10:21 ` Thomas Hallgren
2006-01-27 10:38   ` Thomas Hallgren
2006-01-27 11:17 ` Andrew Haley
2006-01-27 13:29   ` Per Bothner
2006-01-27 19:22   ` Tom Tromey
2006-01-27 19:55     ` Daniel Jacobowitz
2006-01-30 17:50     ` Andrew Haley
2006-01-27 16:24 ` Kaveh R. Ghazi
2006-01-27 16:27   ` Frank Ch. Eigler
2006-01-27 16:57     ` Gabriel Dos Reis
2006-01-27 22:17   ` Laurent GUERBY
2006-01-27 22:24     ` Daniel Jacobowitz
2006-01-27 22:53       ` Laurent GUERBY
2006-01-27 23:20         ` Tom Tromey
2006-01-28  0:23     ` Kaveh R. Ghazi
2006-01-28  0:41       ` Gabriel Dos Reis
2006-01-28  0:57         ` Per Bothner
2006-01-28  1:00           ` Tom Tromey
2006-01-29  1:00           ` Anthony Green
2006-01-28  2:42         ` Kaveh R. Ghazi
2006-01-28 17:51           ` Gabriel Dos Reis
2006-01-27 17:08 ` Joe Buck
2006-01-27 21:23   ` Tom Tromey
2006-01-28  0:13 ` Mark Mitchell
2006-01-29  2:06 ` Adam Megacz
2006-01-29 16:55   ` Mike Emmel
2006-01-29 18:54   ` Tom Tromey
2006-01-29 19:05     ` Per Bothner
2006-01-29 19:41       ` Tom Tromey
2006-01-29 20:29         ` David Daney
2006-01-30 14:30 ` Thorsten Glaser
2006-01-30 20:54   ` Tom Tromey
2006-01-31 14:27     ` Andrew Haley
2006-01-31 21:23       ` Kevin Handy
2006-02-02 18:19       ` Thorsten Glaser
2006-02-05 19:28         ` Tom Tromey
