public inbox for gcc@gcc.gnu.org
* [RFC] Adding Python as a possible language and it's usage
       [not found] <1531832440.64499.ezmlm@gcc.gnu.org>
@ 2018-07-17 17:13 ` Basile Starynkevitch
  2018-07-17 23:52   ` David Malcolm
  0 siblings, 1 reply; 58+ messages in thread
From: Basile Starynkevitch @ 2018-07-17 17:13 UTC (permalink / raw)
  To: gcc

Hello All,


In https://gcc.gnu.org/ml/gcc/2018-07/msg00233.html  Martin Liška wrote:

> I've recently touched AWK option generate machinery and it's quite unpleasant
> to make any adjustments. My question is simple: can we starting using a scripting
> language like Python and replace usage of the AWK scripts? It's probably question
> for Steering committee, but I would like to see feedback from community.

I would suggest also (and perhaps instead) considering using GNU Guile 
https://www.gnu.org/software/guile/

(personally, I prefer Guile to Python, but that is just my preference)

Guile is the preferred GNU scripting language (for example, Guile 
is a GNU project, but AFAIK Python is not).

BTW, I dislike Python syntax (my personal taste is an allergy to 
significant spaces, but I admit it is just a matter of taste and I could 
contribute some Python code in the future if it becomes needed). Also, I 
am noticing that these days the Python project might have some 
governance issues (see e.g. https://lwn.net/Articles/759654/ in case you 
have not heard about it).


However, the idea of depending more deeply on a good scripting language 
in GCC is very pleasant.


Regards

-- 
Basile STARYNKEVITCH   == http://starynkevitch.net/Basile
opinions are mine only - les opinions sont seulement miennes
Bourg La Reine, France

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 17:13 ` [RFC] Adding Python as a possible language and it's usage Basile Starynkevitch
@ 2018-07-17 23:52   ` David Malcolm
  0 siblings, 0 replies; 58+ messages in thread
From: David Malcolm @ 2018-07-17 23:52 UTC (permalink / raw)
  To: Basile Starynkevitch, gcc

On Tue, 2018-07-17 at 19:13 +0200, Basile Starynkevitch wrote:
> Hello All,
> 
> 
> In https://gcc.gnu.org/ml/gcc/2018-07/msg00233.html  Martin Liška
> wrote:
> 
> > I've recently touched AWK option generate machinery and it's quite
> > unpleasant
> > to make any adjustments. My question is simple: can we starting
> > using a scripting
> > language like Python and replace usage of the AWK scripts? It's
> > probably question
> > for Steering committee, but I would like to see feedback from
> > community.
> 
> I would suggest also (and perhaps instead) considering using GNU
> Guile 
> https://www.gnu.org/software/guile/
> 
> (personally, I prefer Guile to Python, but that is just my
> preference)
> 
> Since Guile is the preferred GNU scripting language (for example
> Guile 
> is a GNU project, but AFAIK Python is not).
> 
> BTW, I dislike Python syntax (my personal taste is an allergy to 
> significant spaces, but I admit it is just a matter of taste and I
> could 
> contribute some Python code in the future if it becomes needed).
> Also, I 
> am noticing that these days the Python project might have some 
> governance issues (see e.g. https://lwn.net/Articles/759654/ in case
> you 
> did not heard about it).

[disclosure: I'm a CPython core developer, albeit a rather dormant one]

"Governance issues" seems a little strong to me: yes, Guido is stepping
down as BDFL, but will still participate, and CPython is one of the
best-run FLOSS projects I've had the pleasure of participating in.  I'm
sure that the project will continue to be well-run.

> However, the idea of depending more deeply on a good scripting
> language 
> in GCC is very pleasant.

Indeed.  I'm a fan of Python in this regard, as you might have guessed
:)

Dave

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-30 14:51       ` Joseph Myers
@ 2018-07-30 16:29         ` Andreas Schwab
  0 siblings, 0 replies; 58+ messages in thread
From: Andreas Schwab @ 2018-07-30 16:29 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Ramana Radhakrishnan, Michael Matz, Martin Liška, GCC Development

On Jul 30 2018, Joseph Myers <joseph@codesourcery.com> wrote:

> Python has been used for some glibc tests for some time.

Using it for tests is ok, since they are not part of the bootstrap
cycle.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-28 12:11     ` Ramana Radhakrishnan
  2018-07-28 17:23       ` David Malcolm
@ 2018-07-30 14:51       ` Joseph Myers
  2018-07-30 16:29         ` Andreas Schwab
  1 sibling, 1 reply; 58+ messages in thread
From: Joseph Myers @ 2018-07-30 14:51 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Michael Matz, Martin Liška, GCC Development

On Sat, 28 Jul 2018, Ramana Radhakrishnan wrote:

> > Obviously if you're bootstrapping core packages and their build
> > dependencies, use in glibc is more or less equivalent to use in GCC.  (But
> > if build dependencies include those involved in testing, you already have
> > python as one for glibc, and Tcl for GCC, for example.)
> 
> This implies that the decision for glibc has been made. while you
> imply above that the discussion is still on going ?

Python has been used for some glibc tests for some time.  It is its use to 
replace other Perl and Awk scripts (and especially those required for the 
build) that is under discussion - though as no-one has objected to 
such a change we may effectively have consensus.

https://sourceware.org/ml/libc-alpha/2018-07/msg00559.html

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-28  0:26       ` Paul Smith
@ 2018-07-30 14:34         ` Joseph Myers
  0 siblings, 0 replies; 58+ messages in thread
From: Joseph Myers @ 2018-07-30 14:34 UTC (permalink / raw)
  To: Paul Smith; +Cc: Michael Matz, Martin Liška, GCC Development

On Fri, 27 Jul 2018, Paul Smith wrote:

> If Perl is already in the bootstrap set and the awk scripts are hard to
> maintain then why can't the awk scripts be rewritten in Perl instead of
> Python?  That would avoid adding more prerequisites and surely Perl is
> sufficiently expressive that it can perform these translations just as
> well as Python.

At least in the glibc community we find the current developers generally 
prefer Python for such code, so using it in place of Perl (or Awk) works 
better for maintainability now.

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-28 12:11     ` Ramana Radhakrishnan
@ 2018-07-28 17:23       ` David Malcolm
  2018-07-30 14:51       ` Joseph Myers
  1 sibling, 0 replies; 58+ messages in thread
From: David Malcolm @ 2018-07-28 17:23 UTC (permalink / raw)
  To: Ramana Radhakrishnan, Joseph Myers
  Cc: Michael Matz, Martin Liška, GCC Development

On Sat, 2018-07-28 at 10:55 +0100, Ramana Radhakrishnan wrote:
> On Fri, Jul 27, 2018 at 3:38 PM, Joseph Myers
> <joseph@codesourcery.com> wrote:
> > On Fri, 27 Jul 2018, Michael Matz wrote:
> > 
> > > Using any python scripts as part of generally building GCC (i.e.
> > > where the
> > > generated files aren't prepackaged) will introduce a python
> > > dependency for
> > > distro packages.  And for those distros that bootstrap a core
> > > cycle of
> > > packages (e.g. *SUSE) this will include python (and all its
> > > dependencies)
> > > into that bootstrap cycle.
> > 
> > I would have expected most concerns to be about builds on non-GNU
> > hosts -
> > not about builds on GNU/Linux where Python is generally already
> > available
> > (and differences in Python versions should definitely *not* affect
> > the
> > generated output, so there should be no increases in the number of
> > iterations required for any bootstrap cycle to converge).
> > 
> > We've been having a similar discussion for glibc, both about
> > replacing
> > uses of perl (optional, but required to build the manual and to run
> > various tests - python is also already required to run various
> > tests) with
> > python and about replacing uses of awk (required) with python as
> > well, in
> > the interests of easier maintainability - and I didn't see any
> > concerns
> > raised about such a change at all.  Of course in the glibc case
> > pretty
> > much all building is done on GNU hosts (although theoretically you
> > can
> > cross-compile from non-GNU systems, in practice that's liable to be
> > broken
> > with e.g. cross-rpcgen not building with random systems' headers,
> > and
> > probable dependencies on GNU versions of various host tools).
> 
> I can certainly remember quite a number of painful issues getting
> shaken out by python during the AArch64 bootstrap much before we
> published the port upstream that not much other testing was able to
> find. It was a good test of the toolchain but if it is required that
> you need to have working python on the target *before* you get a
> bootstrapped GCC on the system, I'm not sure how helpful /
> frustrating
> that is really to folks trying to bring up a GNU / Linux system
> natively. I am concerned that we are increasing the barrier on entry
> for such developers.
> 
> It is not the majority of developers (but put another way) we do need
> to answer the question whether the dependency on python makes it
> harder for folks to bring up a new GNU/Linux system on a new
> architecture even though it may make life easier in other areas for
> working on the compiler.
> 
> What are the other areas where we envisage using python in the longer
> term for GCC ?  option processing  is one area, where else ?

FWIW I have a Python module for working with the output of
-fsave-optimization-record (a JSON-based format).  It's not clear to me
if that should live in the gcc source tree (and thus tarball) or as
part of a 3rd-party repository.

Related to that, I'd like to use Python in the testsuite, for verifying
the output of -fsave-optimization-record.

See
  https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01546.html
for more info on both of these.

Dave

> > Obviously if you're bootstrapping core packages and their build
> > dependencies, use in glibc is more or less equivalent to use in
> > GCC.  (But
> > if build dependencies include those involved in testing, you
> > already have
> > python as one for glibc, and Tcl for GCC, for example.)
> 
> This implies that the decision for glibc has been made. while you
> imply above that the discussion is still on going ?
> 
> regards
> Ramana
> 
> > 
> > 
> > Joseph S. Myers
> > joseph@codesourcery.com

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-27 14:54   ` Joseph Myers
  2018-07-27 15:11     ` Michael Matz
@ 2018-07-28 12:11     ` Ramana Radhakrishnan
  2018-07-28 17:23       ` David Malcolm
  2018-07-30 14:51       ` Joseph Myers
  1 sibling, 2 replies; 58+ messages in thread
From: Ramana Radhakrishnan @ 2018-07-28 12:11 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Michael Matz, Martin Liška, GCC Development

On Fri, Jul 27, 2018 at 3:38 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Fri, 27 Jul 2018, Michael Matz wrote:
>
>> Using any python scripts as part of generally building GCC (i.e. where the
>> generated files aren't prepackaged) will introduce a python dependency for
>> distro packages.  And for those distros that bootstrap a core cycle of
>> packages (e.g. *SUSE) this will include python (and all its dependencies)
>> into that bootstrap cycle.
>
> I would have expected most concerns to be about builds on non-GNU hosts -
> not about builds on GNU/Linux where Python is generally already available
> (and differences in Python versions should definitely *not* affect the
> generated output, so there should be no increases in the number of
> iterations required for any bootstrap cycle to converge).
>
> We've been having a similar discussion for glibc, both about replacing
> uses of perl (optional, but required to build the manual and to run
> various tests - python is also already required to run various tests) with
> python and about replacing uses of awk (required) with python as well, in
> the interests of easier maintainability - and I didn't see any concerns
> raised about such a change at all.  Of course in the glibc case pretty
> much all building is done on GNU hosts (although theoretically you can
> cross-compile from non-GNU systems, in practice that's liable to be broken
> with e.g. cross-rpcgen not building with random systems' headers, and
> probable dependencies on GNU versions of various host tools).

I can certainly remember quite a number of painful issues getting
shaken out by python during the AArch64 bootstrap, long before we
published the port upstream, that not much other testing was able to
find. It was a good test of the toolchain, but if you are required to
have a working python on the target *before* you get a
bootstrapped GCC on the system, I'm not sure how helpful or frustrating
that really is to folks trying to bring up a GNU/Linux system
natively. I am concerned that we are raising the barrier to entry
for such developers.

This does not affect the majority of developers, but put another way,
we do need to answer the question of whether the dependency on python
makes it harder for folks to bring up a new GNU/Linux system on a new
architecture, even though it may make life easier in other areas for
those working on the compiler.

What are the other areas where we envisage using python in the longer
term for GCC?  Option processing is one area; where else?


> Obviously if you're bootstrapping core packages and their build
> dependencies, use in glibc is more or less equivalent to use in GCC.  (But
> if build dependencies include those involved in testing, you already have
> python as one for glibc, and Tcl for GCC, for example.)

This implies that the decision for glibc has been made, while you
imply above that the discussion is still ongoing?

regards
Ramana

>
>
> Joseph S. Myers
> joseph@codesourcery.com

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-27 14:38   ` Michael Matz
@ 2018-07-28  3:01     ` Matthias Klose
  0 siblings, 0 replies; 58+ messages in thread
From: Matthias Klose @ 2018-07-28  3:01 UTC (permalink / raw)
  To: Michael Matz, Martin Liška; +Cc: GCC Development

On 27.07.2018 16:31, Michael Matz wrote:
> Hi,
> 
> On Fri, 27 Jul 2018, Michael Matz wrote:
> 
>> Using any python scripts as part of generally building GCC (i.e. where 
>> the generated files aren't prepackaged) will introduce a python 
>> dependency for distro packages.  And for those distros that bootstrap a 
>> core cycle of packages (e.g. *SUSE) this will include python (and all 
>> its dependencies) into that bootstrap cycle.
>>
>> That will be terrible.
> 
> Oh, and of course, I haven't read any really convincing arguments for 
> why python would be so much better than awk to counter the disadvantages.
> 
> Building a compiler (especially one that regards itself as a 
> multi-target/host one) should have extremely few prerequisites (ideally 
> only a compiler and runtime for the language its written in), and I 
> wouldn't call a full python distro that (no matter how much people claim 
> that getting the necessary subset of python is mostly trivial.  compiling 
> any random awk is trivial, especially given a compiler you already need 
> anyway; python is not).

That very much depends on your bootstrap system supporting staged builds.  You
already have to do that for glibc/gcc anyway.  But yes, if you think that adding
a staged python build is more complicated ...

> Hell, if anything I'd say we should rewrite the awk scripts into POSIX sh 
> (!).  I'll concede that for text processing AWK is nicer ;-)
> 
> So, if it's only for a minor convenience of writing some text 
> processing scripts, no, that's not a good reason to complicate our 
> prerequisites.  (The helper scripts in contrib/ as long as they aren't 
> used during GCC build can use any fancy language they want)

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 12:49 Martin Liška
                   ` (4 preceding siblings ...)
  2018-07-27 14:31 ` Michael Matz
@ 2018-07-28  2:29 ` konsolebox
  5 siblings, 0 replies; 58+ messages in thread
From: konsolebox @ 2018-07-28  2:29 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Development

Just another user here.

I'm not a fan of Python and I don't want it added as a dependency to my
favorite compiler. If I would build a minimal system with a toolchain, I
wouldn't want Python to be a mandatory component, so please don't. Thanks.

P.S. I don't mind Perl. It's a legacy tool next to Awk.


On Tue, Jul 17, 2018, 8:49 PM Martin Liška <mliska@suse.cz> wrote:

> Hi.
>
> I've recently touched AWK option generate machinery and it's quite
> unpleasant
> to make any adjustments. My question is simple: can we starting using a
> scripting
> language like Python and replace usage of the AWK scripts? It's probably
> question
> for Steering committee, but I would like to see feedback from community.
>
> There are some bulletins why I would like to replace current AWK scripts:
>
> 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags
> type classes multiple
> global variables are created (var_opt_char, var_opt_string, ...)
>
> 2) similar happens in gcc/opth-gen.awk
>
> 3) we do very many regex matches (mainly in gcc/opt-functions.awk), I
> believe
>    we should come up with a structured option format that will make
> parsing and
>    processing much simpler.
>
> 4) we can come up with new sanity checks of options:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
>
> 5) there are various targets that generate *.opt files, one example is ARM:
> gcc/config/arm/parsecpu.awk
>
> where transforms:
> ./gcc/config/arm/arm-cpus.in
>
> I guess having a well-defined structured format for *.opt files will make
> it easier to write generated opt files?
>
> I'm attaching a prototype that can transform optionlist into options-save.c
> that can be compiled and works.
>
> I'm looking forward to a feedback.
> Martin
>

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-27 15:11     ` Michael Matz
@ 2018-07-28  0:26       ` Paul Smith
  2018-07-30 14:34         ` Joseph Myers
  0 siblings, 1 reply; 58+ messages in thread
From: Paul Smith @ 2018-07-28  0:26 UTC (permalink / raw)
  To: Michael Matz, Joseph Myers; +Cc: Martin Liška, GCC Development

On Fri, 2018-07-27 at 14:53 +0000, Michael Matz wrote:
> perl is currently included in the bootstrap set.  There's no reason
> why python couldn't be included as well,

If Perl is already in the bootstrap set and the awk scripts are hard to
maintain then why can't the awk scripts be rewritten in Perl instead of
Python?  That would avoid adding more prerequisites and surely Perl is
sufficiently expressive that it can perform these translations just as
well as Python.

I understand some people have an issue with Perl's maintainability but
just because you CAN write difficult to maintain code in Perl doesn't
mean you HAVE to.

I've seen plenty of difficult to understand and maintain Python
scripting... just saying "use Python" is not a panacea for
supportability problems.

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-27 14:54   ` Joseph Myers
@ 2018-07-27 15:11     ` Michael Matz
  2018-07-28  0:26       ` Paul Smith
  2018-07-28 12:11     ` Ramana Radhakrishnan
  1 sibling, 1 reply; 58+ messages in thread
From: Michael Matz @ 2018-07-27 15:11 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Martin Liška, GCC Development

Hi,

On Fri, 27 Jul 2018, Joseph Myers wrote:

> I would have expected most concerns to be about builds on non-GNU hosts - 
> not about builds on GNU/Linux where Python is generally already available 
> (and differences in Python versions should definitely *not* affect the 
> generated output, so there should be no increases in the number of 
> iterations required for any bootstrap cycle to converge).
> 
> We've been having a similar discussion for glibc, both about replacing 
> uses of perl (optional, but required to build the manual and to run 
> various tests - python is also already required to run various tests) with 
> python and about replacing uses of awk (required) with python as well, in 
> the interests of easier maintainability - and I didn't see any concerns 
> raised about such a change at all.

perl is currently included in the bootstrap set.  There's no reason why 
python couldn't be included as well, but we'd have to make it a limited 
python (so that the additional builddeps become at least minimal), and 
that leads to further work (decisions and implementation around the 
existence of minimal-python and full-python).

And of course the build time of the bootstrap cycle lengthens 
non-trivially.  Maybe not by much, but still.

I don't know why no concerns were raised during those discussions, 
but that can't be taken as an indication that everything is fine with going from 
perl to python as part of the non-optional build dependencies.  (Optional 
deps are always fine; we're breaking out those parts, like the testsuite, into 
different packages that aren't then part of the bootstrap cycle).

> Obviously if you're bootstrapping core packages and their build 
> dependencies, use in glibc is more or less equivalent to use in GCC.  (But 
> if build dependencies include those involved in testing, you already have 
> python as one for glibc, and Tcl for GCC, for example.)

Testsuites aren't part of the bootstrap cycle if they would have to 
enlarge it unduly.  Tcl and expect are, though (hmm, I wonder why), as is 
perl; they all have trivial buildrequires.


Ciao,
Michael.

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-27 14:31 ` Michael Matz
  2018-07-27 14:38   ` Michael Matz
@ 2018-07-27 14:54   ` Joseph Myers
  2018-07-27 15:11     ` Michael Matz
  2018-07-28 12:11     ` Ramana Radhakrishnan
  1 sibling, 2 replies; 58+ messages in thread
From: Joseph Myers @ 2018-07-27 14:54 UTC (permalink / raw)
  To: Michael Matz; +Cc: Martin Liška, GCC Development

On Fri, 27 Jul 2018, Michael Matz wrote:

> Using any python scripts as part of generally building GCC (i.e. where the 
> generated files aren't prepackaged) will introduce a python dependency for 
> distro packages.  And for those distros that bootstrap a core cycle of 
> packages (e.g. *SUSE) this will include python (and all its dependencies) 
> into that bootstrap cycle.

I would have expected most concerns to be about builds on non-GNU hosts - 
not about builds on GNU/Linux where Python is generally already available 
(and differences in Python versions should definitely *not* affect the 
generated output, so there should be no increases in the number of 
iterations required for any bootstrap cycle to converge).

We've been having a similar discussion for glibc, both about replacing 
uses of perl (optional, but required to build the manual and to run 
various tests - python is also already required to run various tests) with 
python and about replacing uses of awk (required) with python as well, in 
the interests of easier maintainability - and I didn't see any concerns 
raised about such a change at all.  Of course in the glibc case pretty 
much all building is done on GNU hosts (although theoretically you can 
cross-compile from non-GNU systems, in practice that's liable to be broken 
with e.g. cross-rpcgen not building with random systems' headers, and 
probable dependencies on GNU versions of various host tools).

Obviously if you're bootstrapping core packages and their build 
dependencies, use in glibc is more or less equivalent to use in GCC.  (But 
if build dependencies include those involved in testing, you already have 
python as one for glibc, and Tcl for GCC, for example.)

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-27 14:31 ` Michael Matz
@ 2018-07-27 14:38   ` Michael Matz
  2018-07-28  3:01     ` Matthias Klose
  2018-07-27 14:54   ` Joseph Myers
  1 sibling, 1 reply; 58+ messages in thread
From: Michael Matz @ 2018-07-27 14:38 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Development

Hi,

On Fri, 27 Jul 2018, Michael Matz wrote:

> Using any python scripts as part of generally building GCC (i.e. where 
> the generated files aren't prepackaged) will introduce a python 
> dependency for distro packages.  And for those distros that bootstrap a 
> core cycle of packages (e.g. *SUSE) this will include python (and all 
> its dependencies) into that bootstrap cycle.
> 
> That will be terrible.

Oh, and of course, I haven't read any really convincing arguments for 
why python would be so much better than awk to counter the disadvantages.

Building a compiler (especially one that regards itself as a 
multi-target/host one) should have extremely few prerequisites (ideally 
only a compiler and runtime for the language it's written in), and I 
wouldn't call a full python distro that (no matter how much people claim 
that getting the necessary subset of python is mostly trivial.  compiling 
any random awk is trivial, especially given a compiler you already need 
anyway; python is not).

Hell, if anything I'd say we should rewrite the awk scripts into POSIX sh 
(!).  I'll concede that for text processing AWK is nicer ;-)

So, if it's only for a minor convenience of writing some text 
processing scripts, no, that's not a good reason to complicate our 
prerequisites.  (The helper scripts in contrib/ as long as they aren't 
used during GCC build can use any fancy language they want)


Ciao,
Michael.

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 12:49 Martin Liška
                   ` (3 preceding siblings ...)
  2018-07-23 14:21 ` Joseph Myers
@ 2018-07-27 14:31 ` Michael Matz
  2018-07-27 14:38   ` Michael Matz
  2018-07-27 14:54   ` Joseph Myers
  2018-07-28  2:29 ` konsolebox
  5 siblings, 2 replies; 58+ messages in thread
From: Michael Matz @ 2018-07-27 14:31 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Development

Hi,

On Tue, 17 Jul 2018, Martin Liška wrote:

> 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags type classes multiple
> global variables are created (var_opt_char, var_opt_string, ...)
> 
> 2) similar happens in gcc/opth-gen.awk
> 
> 3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
>    we should come up with a structured option format that will make parsing and
>    processing much simpler.
> 
> 4) we can come up with new sanity checks of options:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
> 
> 5) there are various targets that generate *.opt files, one example is ARM:
> gcc/config/arm/parsecpu.awk
> 
> where transforms:
> ./gcc/config/arm/arm-cpus.in
> 
> I guess having a well-defined structured format for *.opt files will make
> it easier to write generated opt files?
> 
> I'm attaching a prototype that can transform optionlist into options-save.c
> that can be compiled and works.
> 
> I'm looking forward to a feedback.

Using any python scripts as part of generally building GCC (i.e. where the 
generated files aren't prepackaged) will introduce a python dependency for 
distro packages.  And for those distros that bootstrap a core cycle of 
packages (e.g. *SUSE) this will include python (and all its dependencies) 
into that bootstrap cycle.

That will be terrible.


Ciao,
Michael.

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 10:56   ` David Malcolm
  2018-07-18 11:08     ` Jakub Jelinek
  2018-07-18 11:31     ` Jonathan Wakely
@ 2018-07-23 14:31     ` Joseph Myers
  2 siblings, 0 replies; 58+ messages in thread
From: Joseph Myers @ 2018-07-23 14:31 UTC (permalink / raw)
  To: David Malcolm; +Cc: Richard Biener, Martin Liška, GCC Development

On Wed, 18 Jul 2018, David Malcolm wrote:

> Python 3.3 reintroduced the 'u' prefix for unicode string literals (PEP
> 414), which makes it much easier to write scripts that work with both
> 2.* and 3.*.  Python 3.3 is almost 6 years old.

I can't see u'' as of any relevance to .opt parsing.  Both the .opt files, 
and the generated output from them, should be pure ASCII, and using native 
str throughout (never using Python 2 unicode) should work fine.

(I don't see much value in declaring support for EOL versions of Python, 
i.e. anything before 2.7 and 3.4, but if we do, I don't think u'' will be 
a feature that controls which versions are supported.)
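
A small sketch of why u'' need not matter here, assuming (as stated above)
that the input is pure ASCII: the check below uses only native str and
should behave the same on Python 2.7 and 3.x.  The function name
non_ascii_lines is hypothetical, chosen for illustration only.

```python
def non_ascii_lines(text):
    """Return (line_number, line) pairs for lines with non-ASCII characters.

    Works on native str in both Python 2 and Python 3: iterating a str
    yields one-character strings in either version, so ord() is valid.
    """
    bad = []
    for number, line in enumerate(text.splitlines(), 1):
        if any(ord(ch) > 127 for ch in line):
            bad.append((number, line))
    return bad


# Pure-ASCII input passes; a stray non-ASCII character is reported
# together with its line number.
clean = "fexample\nCommon Var(flag_example)\n"
dirty = "fexample\nLi\xc5\xa1ka\n"
```

No unicode literals are involved, so the set of supported Python versions
would not be constrained by PEP 414 for code of this kind.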

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 12:49 Martin Liška
                   ` (2 preceding siblings ...)
  2018-07-18 15:13 ` Boris Kolpackov
@ 2018-07-23 14:21 ` Joseph Myers
  2018-07-27 14:31 ` Michael Matz
  2018-07-28  2:29 ` konsolebox
  5 siblings, 0 replies; 58+ messages in thread
From: Joseph Myers @ 2018-07-23 14:21 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Development

On Tue, 17 Jul 2018, Martin Liška wrote:

> I've recently touched AWK option generate machinery and it's quite 
> unpleasant to make any adjustments. My question is simple: can we 
> starting using a scripting language like Python and replace usage of the 
> AWK scripts? It's probably question for Steering committee, but I would 
> like to see feedback from community.

I'd prefer Python to Awk for this code.

> 4) we can come up with new sanity checks of options:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397

More generally, I don't think there are any checks that flags specified 
for options are known flags at all; I expect a typo in a flag to result in 
it being silently ignored.

Common code that reads .opt files into some logical datastructure, 
complete with validation including that all flags specified are in the 
list of valid flags, followed by converting those structures to whatever 
output is required, seems appropriate to me.
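
As a rough illustration of that shape (read, validate, then emit), here is
a minimal Python sketch.  It assumes a simplified record layout:
blank-line-separated records consisting of an option name, a line of
flags, and help text, with lines starting with ';' treated as comments.
VALID_FLAGS, parse_opt_text, validate and emit_enum are hypothetical
names, not existing GCC code, and the flag list is a small illustrative
subset.  Nested parentheses in flag arguments are not handled.

```python
import re

# Hypothetical subset of valid flags, for illustration only.
VALID_FLAGS = {"Common", "Target", "Driver", "Var", "Init",
               "Optimization", "Warning", "Joined", "UInteger"}


def parse_opt_text(text):
    """Parse simplified .opt text into a list of option dicts."""
    options = []
    for block in re.split(r"\n\s*\n", text):
        lines = [l for l in block.splitlines()
                 if l.strip() and not l.startswith(";")]
        if len(lines) < 2:
            continue  # not a full record (name line plus flag line)
        flags = {}
        # Each flag is a word with an optional parenthesised argument.
        for m in re.finditer(r"(\w+)(?:\(([^)]*)\))?", lines[1]):
            flags[m.group(1)] = m.group(2)
        options.append({"name": lines[0].strip(),
                        "flags": flags,
                        "help": " ".join(s.strip() for s in lines[2:])})
    return options


def validate(options):
    """Return one error message per flag that is not in VALID_FLAGS."""
    errors = []
    for opt in options:
        for flag in opt["flags"]:
            if flag not in VALID_FLAGS:
                errors.append("%s: unknown flag '%s'" % (opt["name"], flag))
    return errors


def emit_enum(options):
    """Convert parsed options to one hypothetical output: enum names."""
    return ["OPT_" + re.sub(r"\W", "_", o["name"]) for o in options]


SAMPLE = """\
; a comment
fexample
Common Var(flag_example) Optimization
Enable the example pass.

ftypo
Common Optimizaton
A record with a misspelled flag.
"""

opts = parse_opt_text(SAMPLE)
errors = validate(opts)   # reports the misspelled "Optimizaton"
names = emit_enum(opts)
```

With this split, a typo in a flag is reported by validate() rather than
being silently ignored, and each output generator works from the same
parsed structures.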

-- 
Joseph S. Myers
joseph@codesourcery.com

* RE: [RFC] Adding Python as a possible language and it's usage
  2018-07-20 20:09                         ` Matthias Klose
@ 2018-07-20 20:15                           ` Konovalov, Vadim
  0 siblings, 0 replies; 58+ messages in thread
From: Konovalov, Vadim @ 2018-07-20 20:15 UTC (permalink / raw)
  To: Matthias Klose, Segher Boessenkool, Paul Koning; +Cc: Martin Liška, gcc

> From: Matthias Klose 
> To: Konovalov, Vadim; Segher Boessenkool;
> On 20.07.2018 20:53, Konovalov, Vadim wrote:
> > Sometimes those are not behind, those could have no python for other reasons - 
> > maybe those are too forward? They just don't have python yet?
> > 
> >>> it is straightforward.
> >>
> >> Installing it is not straightforward at all.
> > 
> > I also agree with this;
> 
> all == "Installing it is not straightforward" ?
> 
> I do question this. I mentioned elsewhere what is needed.

What is needed is not always available.

> > Please consider that both Python - 2 and 3 - they both do not 
> > support build chain on Windows with GCC
> > 
> > for me, it is a showstopper
> 
> This seems to be a different issue.  However I have to say
> that I'm not booting
> Windows on a regular basis.  Does build chain on Windows
> means Cygwin?  If yes,
> there surely is Python available prebuilt.

Cygwin is a very different platform.
Rebuilding Python on Cygwin is supported, yes, but that is a very
different matter.

I was talking about Windows, not Cygwin.

Rebuilding Python on Windows (without Cygwin) with GCC is not supported.
I was surprised to discover that, and I will gladly accept and use it
once it eventually supports a GCC+Windows rebuild.

There are some blog posts on the Internet about people who eventually
did a build on Windows with GCC, but
why wasn't this effort propagated into Python mainstream?

Most of those blog posts are from 2006 or 2008; they are rather obsolete
and cannot easily be reused.

https://wiki.python.org/moin/WindowsCompilers

mentions
GCC - MinGW (x86)
MinGW is an alternative C/C++ compiler that works with all Python versions up to 3.4.

BUT this is simply wrong: the instructions are unfinished and do not work, even though they are supposed to.

> Matthias

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-20 18:59                       ` Konovalov, Vadim
@ 2018-07-20 20:09                         ` Matthias Klose
  2018-07-20 20:15                           ` Konovalov, Vadim
  0 siblings, 1 reply; 58+ messages in thread
From: Matthias Klose @ 2018-07-20 20:09 UTC (permalink / raw)
  To: Konovalov, Vadim, Segher Boessenkool, Paul Koning; +Cc: Martin Liška, gcc

On 20.07.2018 20:53, Konovalov, Vadim wrote:
>> From: Segher Boessenkool
>> On Fri, Jul 20, 2018 at 12:54:36PM -0400, Paul Koning wrote:
>>>>> Fully agree with that. Coming up with a new scripts written in python2 really
>>>>> makes no sense.
>>>>
>>>> Then python cannot be a build requirement for GCC, since some of our
>>>> primary targets do not ship python3.
>>>
>>> Is it required that GCC must build with only the stock
>>> support elements on the primary target platforms?
>>
>> Not that I know.  But why
>> should we make it hugely harder for essentially
>> no benefit?
>>
>> All the arguments
>> against awk were arguments against *the current scripts*.
>>
>> And yes, we can (and
>> perhaps should) rewrite those build scripts as C code,
>> just like all the other
>> gen* we have.
> 
> +1 
> 
>>> Or is it allowed to require installing prerequisites?  Yes,
>>> some platforms are so far behind they still don't ship Python 3, but installing
> 
> Sometimes those are not behind, those could have no python for other reasons - 
> maybe those are too forward? They just don't have python yet?
> 
>>> it is straightforward.
>>
>> Installing it is not straightforward at all.
> 
> I also agree with this;

all == "Installing it is not straightforward" ?

I do question this. I mentioned elsewhere what is needed.

> Please consider that both Python - 2 and 3 - they both do not 
> support build chain on Windows with GCC
> 
> for me, it is a showstopper

This seems to be a different issue.  However I have to say that I'm not booting
Windows on a regular basis.  Does build chain on Windows mean Cygwin?  If yes,
there surely is Python available prebuilt.

Matthias

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC] Adding Python as a possible language and it's usage
  2018-07-20 17:59                     ` Segher Boessenkool
@ 2018-07-20 18:59                       ` Konovalov, Vadim
  2018-07-20 20:09                         ` Matthias Klose
  0 siblings, 1 reply; 58+ messages in thread
From: Konovalov, Vadim @ 2018-07-20 18:59 UTC (permalink / raw)
  To: Segher Boessenkool, Paul Koning; +Cc: Martin Liška, Matthias Klose, gcc

> From: Segher Boessenkool
> On Fri, Jul 20, 2018 at 12:54:36PM -0400, Paul Koning wrote:
> > >> Fully agree with that. Coming up with a new scripts written in python2 really
> > >> makes no sense.
> > > 
> > > Then python cannot be a build requirement for GCC, since some of our
> > > primary targets do not ship python3.
> > 
> > Is it required that GCC must build with only the stock
> > support elements on the primary target platforms?
> 
> Not that I know.  But why
> should we make it hugely harder for essentially
> no benefit?
> 
> All the arguments
> against awk were arguments against *the current scripts*.
> 
> And yes, we can (and
> perhaps should) rewrite those build scripts as C code,
> just like all the other
> gen* we have.

+1 

> > Or is it allowed to require installing prerequisites?  Yes,
> > some platforms are so far behind they still don't ship Python 3, but installing

Sometimes those platforms are not behind; they could lack Python for
other reasons. Maybe they are too far ahead and just don't have Python yet?

> > it is straightforward.
> 
> Installing it is not straightforward at all.

I also agree with this.

Please consider that neither Python 2 nor Python 3 supports
a build chain on Windows with GCC.

For me, that is a showstopper.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-20 17:12                   ` Paul Koning
@ 2018-07-20 17:59                     ` Segher Boessenkool
  2018-07-20 18:59                       ` Konovalov, Vadim
  0 siblings, 1 reply; 58+ messages in thread
From: Segher Boessenkool @ 2018-07-20 17:59 UTC (permalink / raw)
  To: Paul Koning; +Cc: Martin Liška, Matthias Klose, gcc

On Fri, Jul 20, 2018 at 12:54:36PM -0400, Paul Koning wrote:
> 
> 
> > On Jul 20, 2018, at 12:37 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > 
> > On Fri, Jul 20, 2018 at 11:49:05AM +0200, Martin Liška wrote:
> >> Fully agree with that. Coming up with a new scripts written in python2 really
> >> makes no sense.
> > 
> > Then python cannot be a build requirement for GCC, since some of our
> > primary targets do not ship python3.
> 
> Is it required that GCC must build with only the stock support elements on the primary target platforms?

Not that I know.  But why should we make it hugely harder for essentially
no benefit?

All the arguments against awk were arguments against *the current scripts*.

And yes, we can (and perhaps should) rewrite those build scripts as C code,
just like all the other gen* we have.

> Or is it allowed to require installing prerequisites?  Yes, some platforms are so far behind they still don't ship Python 3, but installing it is straightforward.

Installing it is not straightforward at all.


Segher

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-20 16:54                 ` Segher Boessenkool
@ 2018-07-20 17:12                   ` Paul Koning
  2018-07-20 17:59                     ` Segher Boessenkool
  0 siblings, 1 reply; 58+ messages in thread
From: Paul Koning @ 2018-07-20 17:12 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Martin Liška, Matthias Klose, gcc



> On Jul 20, 2018, at 12:37 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> 
> On Fri, Jul 20, 2018 at 11:49:05AM +0200, Martin Liška wrote:
>> Fully agree with that. Coming up with a new scripts written in python2 really
>> makes no sense.
> 
> Then python cannot be a build requirement for GCC, since some of our
> primary targets do not ship python3.

Is it required that GCC must build with only the stock support elements on the primary target platforms?  Or is it allowed to require installing prerequisites?  Yes, some platforms are so far behind they still don't ship Python 3, but installing it is straightforward.

	paul

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-20 10:01               ` Martin Liška
@ 2018-07-20 16:54                 ` Segher Boessenkool
  2018-07-20 17:12                   ` Paul Koning
  0 siblings, 1 reply; 58+ messages in thread
From: Segher Boessenkool @ 2018-07-20 16:54 UTC (permalink / raw)
  To: Martin Liška; +Cc: Matthias Klose, gcc

On Fri, Jul 20, 2018 at 11:49:05AM +0200, Martin Liška wrote:
> Fully agree with that. Coming up with a new scripts written in python2 really
> makes no sense.

Then python cannot be a build requirement for GCC, since some of our
primary targets do not ship python3.


Segher

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 18:11         ` Matthias Klose
@ 2018-07-20 11:04           ` Martin Liška
  0 siblings, 0 replies; 58+ messages in thread
From: Martin Liška @ 2018-07-20 11:04 UTC (permalink / raw)
  To: Matthias Klose, gcc

On 07/18/2018 08:03 PM, Matthias Klose wrote:
> On 18.07.2018 19:29, Paul Koning wrote:
>>
>>
>>> On Jul 18, 2018, at 1:22 PM, Boris Kolpackov <boris@codesynthesis.com> wrote:
>>>
>>> Paul Koning <paulkoning@comcast.net> writes:
>>>
>>>>> On Jul 18, 2018, at 11:13 AM, Boris Kolpackov <boris@codesynthesis.com> wrote:
>>>>>
>>>>> I wonder what will be the expected way to obtain a suitable version of
>>>>> Python if one is not available on the build machine? With awk I can
>>>>> build it from source pretty much anywhere. Is building newer versions
>>>>> of Python on older targets a similarly straightforward process (somehow
>>>>> I doubt it)? What about Windows?
>>>>
>>>> It's the same sort of thing: untar the sources, configure, make, make
>>>> install.
> 
> Windows binaries and MacOSX binaries are available from upstream.  The build
> process on *ix targets is autoconf based and easy as for awk/gawk.
> 
>>> Will this also install all the Python packages one might plausible want
>>> to use in GCC?
> 
> some extension modules depend on external libraries, but even if those don't
> exist, the build succeeds without building these extension modules. The sources
> come with embedded libs for zlib, libmpdec,  libexpat.  They don't include
> libffi (only in 3.7), libsqlite, libgdbm, libbluetooth, libdb.  I suppose the
> usage of such modules should be banned by policy.  The only needed thing is any
> of libdb (Berkley/SleepyCat) or gdbm to build the anydbm module which might be
> necessary.
> 
>> It installs the entire standard Python library (corresponding to the 1800+ pages of the library manual).  I expect that will easily cover anything GCC might want to do.
> 
> The current usage of awk and perl doesn't include any third party libraries.
> That's where the usage of Python should start with.

Thank you, Matthias, for explaining the dependency issues. I can confirm
that the option handling scripts can easily work without any fancy modules.

Martin

> 
> Matthias
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-19 20:24   ` Karsten Merker
  2018-07-20 10:02     ` Matthias Klose
@ 2018-07-20 10:07     ` Martin Liška
  1 sibling, 0 replies; 58+ messages in thread
From: Martin Liška @ 2018-07-20 10:07 UTC (permalink / raw)
  To: Karsten Merker, gcc

On 07/19/2018 10:20 PM, Karsten Merker wrote:
> David Malcolm wrote:
>> On Tue, 2018-07-17 at 14:49 +0200, Martin Liška wrote:
>>> I've recently touched AWK option generate machinery and it's
>>> quite unpleasant to make any adjustments.  My question is
>>> simple: can we starting using a scripting language like Python
>>> and replace usage of the AWK scripts?  It's probably question
>>> for Steering committee, but I would like to see feedback from
>>> community.
>>
>> As you know, I'm a fan of Python.  As I noted elsewhere in this
>> thread, one issue is Python 2 vs Python 3 (and minimum
>> versions).  Within Python 2.*, Python 2.6 onwards is broadly
>> compatible with Python 3.*, and there's a well-known common
>> subset that works in both languages.
>>
>> To what extent would this complicate bootstrap?  (I don't think
>> so, in that it would appear to be just an external build-time
>> dependency on the build machine).
>>
>> Would this make it harder for people to build GCC?  It's one
>> more dependency, but CPython is widely available and relatively
>> easy to build.  (I don't have experience of doing bring-up of a
>> new architecture, though).
> 
> Hello,
> 
> I have recently been working on bringing up a new Debian port for
> the riscv64 architecture from scratch, so I would like to add
> some of my personal experiences here.
> 
> Adding a dependency on python for building gcc would make life
> for distribution porters quite a bit harder.  There are a bunch
> of packages that are more or less essential for a modern Linux
> distribution but at the same time extremely difficult to properly
> cross-build.  For a distribution porter trying to bootstrap a new
> architecture, this means that one has to resort to native
> building sooner or later, i.e. one has to build native toolchain
> packages and then work forward from there.  During the bootstrap
> process it is often necessary to break dependency cycles and
> natively rebuild toolchain packages with different build-profiles
> enabled, or to build newer versions of the same toolchain packages
> with bugfixes for the new architecture.
> 
> A dependency on python would mean that to be able to do a native
> rebuild of the toolchain one would need a native python.  The
> problem here is that python has an enormous number of transitive
> build-dependencies and not all of them are easily cross-buildable,
> i.e. one needs a native compiler to build some of them in a
> bootstrap scenario.  This can lead to a catch-22-style situation
> where one would need a native python package and its dependencies
> for natively building the gcc package and a native gcc package
> for building (some of) the dependencies of the python package.

Hi.

This issue is largely covered in this thread. You're not CC'd, so
please take a look:

https://gcc.gnu.org/ml/gcc/2018-07/msg00233.html

So for your use case, cross-compiling Python (without fancy
modules that have dependencies) should work for you to make the
transition to a native distribution.

Martin

> 
> With awk we don't have this problem as in contrast to python awk
> doesn't pull in any dependencies that aren't required by gcc
> anyway.  From a distro porter's point of view I would therefore
> appreciate very much if it would be possible to avoid adding a
> python dependency to the gcc build process.
> 
> Regards,
> Karsten
> 
> P.S.: I am not subscribed to the list, so it would be nice
>       if you could CC me on replies.
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-19 15:56     ` Jeff Law
  2018-07-19 16:12       ` Eric Gallager
@ 2018-07-20 10:05       ` Martin Liška
  1 sibling, 0 replies; 58+ messages in thread
From: Martin Liška @ 2018-07-20 10:05 UTC (permalink / raw)
  To: Jeff Law, Segher Boessenkool, Richard Biener; +Cc: GCC Development

On 07/19/2018 04:47 PM, Jeff Law wrote:
> On 07/18/2018 03:28 PM, Segher Boessenkool wrote:
>> On Wed, Jul 18, 2018 at 11:51:36AM +0200, Richard Biener wrote:
>>> We already conditionally require Perl for building for some targets so I wonder
>>> if using perl would be better ...
>>
>> At least perl is GPL (Python is not).
>>
>>
>> What would the advantage of using Python be?  I haven't heard any yet.
>> Awk may be a bit clunky but at least it is easily readable for anyone.
> I've found python *far* easier to read than awk.  And you can actually
> run a debugger on your python code to see what it's doing.
> Jeff
> 

Yes, the main reason for using Python is the object-oriented programming
paradigm. It's handy to encapsulate functionality in methods, and one can
unit-test parts of the script. Currently the AWK scripts are a mix of
input/output transformation and various sanity checks that emit
printf('#error..'). In general the script is not easily readable and
contains multiple global arrays that simulate encapsulation in classes.
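
As a minimal sketch of that idea (not the existing scripts): a small
class can keep per-option data and checks together, and a unit test can
exercise it without running the whole generator. The names Option,
enum_name and check are hypothetical, and the whitespace rule is an
invented example of a sanity check, not a real GCC rule.

```python
import re
import unittest


class Option(object):
    """Keeps one option's data and the logic that belongs to it together."""

    def __init__(self, name, flags):
        self.name = name
        self.flags = set(flags)

    def enum_name(self):
        """Derive an enum identifier: 'OPT_' plus the name, with every
        non-alphanumeric character replaced by '_'."""
        return "OPT_" + re.sub(r"[^A-Za-z0-9]", "_", self.name)

    def check(self):
        """Return a list of problems instead of printing '#error' lines
        into the generated output."""
        problems = []
        # Hypothetical rule, used only to illustrate a testable check.
        if re.search(r"\s", self.name):
            problems.append("%s: option name contains whitespace"
                            % self.name)
        return problems


class OptionTest(unittest.TestCase):
    """Each piece of behaviour can be tested in isolation."""

    def test_enum_name(self):
        self.assertEqual(Option("std=c99", []).enum_name(), "OPT_std_c99")

    def test_check_accepts_plain_name(self):
        self.assertEqual(Option("fexample", ["Common"]).check(), [])

    def test_check_reports_whitespace(self):
        self.assertEqual(Option("bad name", []).check(),
                         ["bad name: option name contains whitespace"])
```

Saved as a module, tests of this kind can be run with
"python -m unittest <module>".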

Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-19 20:24   ` Karsten Merker
@ 2018-07-20 10:02     ` Matthias Klose
  2018-07-20 10:07     ` Martin Liška
  1 sibling, 0 replies; 58+ messages in thread
From: Matthias Klose @ 2018-07-20 10:02 UTC (permalink / raw)
  To: Karsten Merker, gcc

On 19.07.2018 22:20, Karsten Merker wrote:
> David Malcolm wrote:
>> On Tue, 2018-07-17 at 14:49 +0200, Martin Liška wrote:
>>> I've recently touched AWK option generate machinery and it's
>>> quite unpleasant to make any adjustments.  My question is
>>> simple: can we starting using a scripting language like Python
>>> and replace usage of the AWK scripts?  It's probably question
>>> for Steering committee, but I would like to see feedback from
>>> community.
>>
>> As you know, I'm a fan of Python.  As I noted elsewhere in this
>> thread, one issue is Python 2 vs Python 3 (and minimum
>> versions).  Within Python 2.*, Python 2.6 onwards is broadly
>> compatible with Python 3.*, and there's a well-known common
>> subset that works in both languages.
>>
>> To what extent would this complicate bootstrap?  (I don't think
>> so, in that it would appear to be just an external build-time
>> dependency on the build machine).
>>
>> Would this make it harder for people to build GCC?  It's one
>> more dependency, but CPython is widely available and relatively
>> easy to build.  (I don't have experience of doing bring-up of a
>> new architecture, though).
> 
> Hello,
> 
> I have recently been working on bringing up a new Debian port for
> the riscv64 architecture from scratch, so I would like to add
> some of my personal experiences here.
> 
> Adding a dependency on python for building gcc would make life
> for distribution porters quite a bit harder.  There are a bunch
> of packages that are more or less essential for a modern Linux
> distribution but at the same time extremely difficult to properly
> cross-build.  For a distribution porter trying to bootstrap a new
> architecture, this means that one has to resort to native
> building sooner or later, i.e. one has to build native toolchain
> packages and then work forward from there.  During the bootstrap
> process it is often necessary to break dependency cycles and
> natively rebuild toolchain packages with different build-profiles
> enabled, or to build newer versions of the same toolchain packages
> with bugfixes for the new architecture.
> 
> A dependency on python would mean that to be able to do a native
> rebuild of the toolchain one would need a native python.  The
> problem here is that python has an enormous number of transitive
> build-dependencies and not all of them are easily cross-buildable,
> i.e. one needs a native compiler to build some of them in a
> bootstrap scenario.  This can lead to a catch-22-style situation
> where one would need a native python package and its dependencies
> for natively building the gcc package and a native gcc package
> for building (some of) the dependencies of the python package.
> 
> With awk we don't have this problem as in contrast to python awk
> doesn't pull in any dependencies that aren't required by gcc
> anyway.  From a distro porter's point of view I would therefore
> appreciate very much if it would be possible to avoid adding a
> python dependency to the gcc build process.

I don't see that as an issue.  As said in another reply in this thread, you can
do a staged python build, which has the same build dependencies as awk (maybe
except the db/gdbm module). And if you need to, you can cross-build python as
well, more easily than for example perl or guile.

Matthias

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 14:29             ` Matthias Klose
  2018-07-18 14:46               ` Janne Blomqvist
@ 2018-07-20 10:01               ` Martin Liška
  2018-07-20 16:54                 ` Segher Boessenkool
  1 sibling, 1 reply; 58+ messages in thread
From: Martin Liška @ 2018-07-20 10:01 UTC (permalink / raw)
  To: Matthias Klose, gcc

On 07/18/2018 04:29 PM, Matthias Klose wrote:
> On 18.07.2018 14:49, Joel Sherrill wrote:
>> On Wed, Jul 18, 2018, 7:15 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>>
>>> On Wed, 18 Jul 2018 at 13:06, Eric S. Raymond wrote:
>>>>
>>>> Jonathan Wakely <jwakely.gcc@gmail.com>:
>>>>> On Wed, 18 Jul 2018 at 11:56, David Malcolm wrote:
>>>>>> Python 2.6 onwards is broadly compatible with Python 3.*. and is
>>> about
>>>>>> to be 10 years old.  (IIRC it was the system python implementation in
>>>>>> RHEL 6).
>>>>>
>>>>> It is indeed. Without some regular testing with Python 2.6 it could be
>>>>> easy to introduce code that doesn't actually work on that old version.
>>>>> I did that recently, see PR 86112.
>>>>>
>>>>> This isn't an objection to using Python (I like it, and anyway I don't
>>>>> touch the parts of GCC that you're talking about using it for). Just a
>>>>> caution that trying to restrict yourself to a portable subset isn't
>>>>> always easy for casual users of a language (also a problem with C++98
>>>>> vs C++11 vs C++14 as I'm sure many GCC devs are aware).
>>>>
>>>> It's not very difficult to write "polyglot" Python that is indifferent
>>>> to which version it runs under.  I had to solve this problem for
>>>> reposurgeon; techniques documented here...
>>>
>>> I don't see any mention of avoiding dict comprehensions (not supported
>>> until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
>>>
>>> I maintain it's easy to unwittingly use a feature (such as dict
>>> comprehensions) which works fine on your machine, but aren't supported
>>> by all versions you intend to support. Regular testing with the oldest
>>> version is needed to prevent that (which was the point I was making).
>>>
>>
>> I think the RTEMS Community may be a good precedence here. RTEMS is always
>> cross compiled and we are as host agnostic as possible. We use as close to
>> the latest release of GCC, binutils, gdb, and newlib as possible. Our host
>> side tools are in a combination of Python and C++. We use Sphinx for
>> documentation.
>>
>> We are careful to use the Python on RHEL 6 as a baseline. You can build an
>> RTEMS environment there. But at least one of the Sphinx pieces requires a
>> Python of at least RHEL 7 vintage.
>>
>> We have a lot of what I will politely call institutional and large
>> organization users who have to adhere to strict IT policies. I think RHEL 7
>> is common but can't swear there is no RHEL 6 out there and because of that,
>> we set the Python 2.x as a minimum.
>>
>> Yes these are old. And for native new distribution use, it doesn't matter.
>> But for cross and local upgrades, old distributions matter. Particularly
>> those targeting enterprise users. And those are glacially slow.
>>
>> As an aside, it was not being able to build the RTEMS documentation that
>> pushed me off RHEL 6 as my primary personal environment last year. I wanted
>> to be using the oldest distribution I thought was in use in our community.
> 
> doesn't RHEL 6 has overlays for that very reason to install a newer Python3?
> 
> Please don't start with Python2 anymore. It's discontinued in less than two
> years and then you'll have distributions not having Python2 anymore.  If you
> don't have a recent Python3, then you probably can build it for your platform
> itself.

Fully agree with that. Coming up with new scripts written in python2 really
makes no sense. Even if we agree on transitioning the option scripts to Python,
I'm planning to do that in the time frame of the GCC 10 release.
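
If new scripts target Python 3 only, one way to make that explicit is an
early interpreter check. The following is a minimal sketch; MINIMUM,
version_ok and require_minimum_python are hypothetical names, and the
(3, 0) floor is an assumption, since no minimum version has been agreed
on in this thread.

```python
import sys

# Hypothetical floor; the actual minimum version is not decided here.
MINIMUM = (3, 0)


def version_ok(version_info=None, minimum=MINIMUM):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def require_minimum_python():
    """Exit with a clear message instead of failing later on a syntax
    or library difference."""
    if not version_ok():
        sys.stderr.write("error: this script needs Python %d.%d or newer\n"
                         % MINIMUM)
        sys.exit(1)
```

The check itself uses only constructs that also parse under Python 2,
so an old interpreter reports the problem cleanly.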

Martin

> 
> Python3 is also cross-buildable, and much easier to cross-build than guile or perl.
> 
> Matthias
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-19 20:08       ` Richard Earnshaw (lists)
@ 2018-07-20  9:49         ` Michael Clark
  0 siblings, 0 replies; 58+ messages in thread
From: Michael Clark @ 2018-07-20  9:49 UTC (permalink / raw)
  To: Richard Earnshaw (lists)
  Cc: Florian Weimer, Segher Boessenkool, Richard Biener,
	Martin Liška, GCC Development



On 20/07/2018, at 4:12 AM, Richard Earnshaw (lists) <Richard.Earnshaw@arm.com> wrote:

> On 19/07/18 12:30, Florian Weimer wrote:
>> * Segher Boessenkool:
>> 
>>> What would the advantage of using Python be?  I haven't heard any yet.
>>> Awk may be a bit clunky but at least it is easily readable for anyone.
>> 
>> I'm not an experienced awk programmer, but I don't think plain awk
>> supports arrays of arrays, so there's really no good way to emulate
>> user-defined data types and structure the code.
>> 
> 
> You can do multi-dimentional arrays in awk.  They're flattened a bit
> like tcl ones are, but there are ways of iterating over a dimention.
> See, for example, config/arm/parsecpu.awk which gets up to tricks like that.

Has it occurred to anyone to write a lean and fast tool in the host C++ subset that is allowable for use in the host tools? I guess this is currently C++98/GNU++98. This adds no additional dependencies. Sure, it is a slightly higher level of effort than writing an awk script, but with modern C++ this is less the case than it has ever been in the past. I personally use C++11/14 as a substitute for python-type programs that would normally be considered script language material, mostly due to fluency and the fact that modern C++ has grown more tractable as a replacement for “fast to code in” languages, given it is much faster to code in than C.

LLVM discussed changing the host compiler language feature dependencies to C++11/C++14. There are obvious but not insurmountable bootstrap requirements. i.e. for very old systems it will require an intermediate C++11/C++14 compiler to bootstrap LLVM 6.0. Here is LLVM's new compiler baseline and it seems to require CentOS 7.

- Clang 3.1
- GCC 4.8
- Visual Studio 2015 (Update 3)

[1] https://llvm.org/docs/GettingStarted.html#getting-a-modern-host-c-toolchain

I find I can be very productive and nearly as concise in C++11 as I can in many script languages due to modern versions of <vector>, <map>, <set>, <memory>, <string>, <regex>, auto, lambdas, etc. It’s relatively easy to write memory-clean code from the get-go using std::unique_ptr<> (and sparing amounts of std::shared_ptr<>), and the new C++11 range-based for loops, initializer lists and various other enhancements can make coding in “modern C++” relatively friendly and productive. In the words of Bjarne Stroustrup: it feels like a new language. I can point to examples of small text processing utilities that I’ve written that could be written in python with a relatively similar amount of effort. What it takes is fluency with the STL and the use of lean idioms: STL containers and structs (data hiding is only a header tweak away from being broken in a language like C++, and the use of struct is similar to a language like python, which resorts to using underscores or “idiomatic enforcement”). I.e. there are lightweight, fast and productive modern C++ idioms that work well with vectors, sets, maps and unique_ptr or shared_ptr automatic memory management. I find that with modern idioms these programs are almost always valgrind clean.

Would modern C++ (for generated files) be considered for GCC 9? The new idioms may make parts of the code considerably more concise and could allow use of some of the new automatic memory management features. The reason I’m suggesting this is that for anything other than a trivial command line invocation of sed or awk, I would tend to write a modern C++ program to do text processing rather than use a script language like python. Firstly, it is faster. Secondly, I am proficient enough, and the set and map functionality combined with the new automatic memory management is sufficient that complex nested data structures and text processing can be handled with relative ease. Note: I do tend to avoid iostream and instead use stdc FILE * and fopen/fread/fwrite/fclose, or POSIX open/read/write/close if I want to avoid buffering. I find iostream performance is not that great.

How conservative are we? Is C++11 going to be available for use in GCC before C++2x in 202x? Indeed, <filesystem> would improve some of the Windows/UNIX interoperability. I’ve found that writing C++11/14 allows me to write in an idiomatic C/C++ subset that is quite stable across platforms. We now even have <cstdint> on Windows. There has been quite a bit of convergence.

Having the constraint that modern C++11/14 can only be used for generated files lessens the burden as the distribution build can maintain the same base compiler dependencies.

Michael.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18  1:01 ` David Malcolm
@ 2018-07-19 20:24   ` Karsten Merker
  2018-07-20 10:02     ` Matthias Klose
  2018-07-20 10:07     ` Martin Liška
  0 siblings, 2 replies; 58+ messages in thread
From: Karsten Merker @ 2018-07-19 20:24 UTC (permalink / raw)
  To: gcc

David Malcolm wrote:
>On Tue, 2018-07-17 at 14:49 +0200, Martin Liška wrote:
>> I've recently touched AWK option generate machinery and it's
>> quite unpleasant to make any adjustments.  My question is
>> simple: can we starting using a scripting language like Python
>> and replace usage of the AWK scripts?  It's probably question
>> for Steering committee, but I would like to see feedback from
>> community.
>
>As you know, I'm a fan of Python.  As I noted elsewhere in this
>thread, one issue is Python 2 vs Python 3 (and minimum
>versions).  Within Python 2.*, Python 2.6 onwards is broadly
>compatible with Python 3.*, and there's a well-known common
>subset that works in both languages.
>
>To what extent would this complicate bootstrap?  (I don't think
>so, in that it would appear to be just an external build-time
>dependency on the build machine).
>
>Would this make it harder for people to build GCC?  It's one
>more dependency, but CPython is widely available and relatively
>easy to build.  (I don't have experience of doing bring-up of a
>new architecture, though).

Hello,

I have recently been working on bringing up a new Debian port for
the riscv64 architecture from scratch, so I would like to add
some of my personal experiences here.

Adding a dependency on python for building gcc would make life
for distribution porters quite a bit harder.  There are a bunch
of packages that are more or less essential for a modern Linux
distribution but at the same time extremely difficult to properly
cross-build.  For a distribution porter trying to bootstrap a new
architecture, this means that one has to resort to native
building sooner or later, i.e. one has to build native toolchain
packages and then work forward from there.  During the bootstrap
process it is often necessary to break dependency cycles and
natively rebuild toolchain packages with different build-profiles
enabled, or to build newer versions of the same toolchain packages
with bugfixes for the new architecture.

A dependency on python would mean that to be able to do a native
rebuild of the toolchain one would need a native python.  The
problem here is that python has an enormous number of transitive
build-dependencies and not all of them are easily cross-buildable,
i.e. one needs a native compiler to build some of them in a
bootstrap scenario.  This can lead to a catch-22-style situation
where one would need a native python package and its dependencies
for natively building the gcc package and a native gcc package
for building (some of) the dependencies of the python package.

With awk we don't have this problem: in contrast to python, awk
doesn't pull in any dependencies that aren't required by gcc
anyway.  From a distro porter's point of view I would therefore
appreciate it very much if it were possible to avoid adding a
python dependency to the gcc build process.

Regards,
Karsten

P.S.: I am not subscribed to the list, so it would be nice
      if you could CC me on replies.
-- 
Pursuant to Section 28 (4) of the German Federal Data Protection Act
(Bundesdatenschutzgesetz), I object to the use and disclosure of my
personal data for purposes of advertising or market or opinion research.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-19 12:28     ` Florian Weimer
@ 2018-07-19 20:08       ` Richard Earnshaw (lists)
  2018-07-20  9:49         ` Michael Clark
  0 siblings, 1 reply; 58+ messages in thread
From: Richard Earnshaw (lists) @ 2018-07-19 20:08 UTC (permalink / raw)
  To: Florian Weimer, Segher Boessenkool
  Cc: Richard Biener, Martin Liška, GCC Development

On 19/07/18 12:30, Florian Weimer wrote:
> * Segher Boessenkool:
> 
>> What would the advantage of using Python be?  I haven't heard any yet.
>> Awk may be a bit clunky but at least it is easily readable for anyone.
> 
> I'm not an experienced awk programmer, but I don't think plain awk
> supports arrays of arrays, so there's really no good way to emulate
> user-defined data types and structure the code.
> 

You can do multi-dimensional arrays in awk.  They're flattened a bit
like tcl ones are, but there are ways of iterating over a dimension.
See, for example, config/arm/parsecpu.awk which gets up to tricks like that.
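
For readers who don't know awk, the flattening can be sketched roughly
as follows.  This is only an illustration in Python, not code from
parsecpu.awk: awk stores arr[i,j] under the single string key
i SUBSEP j (SUBSEP defaults to "\034"), and iterating over one
dimension means scanning the keys and splitting them again.  The helper
names put and columns_of and the table contents are made up for the
example.

```python
# Illustrative emulation of awk's flattened multi-dimensional arrays.
# Helper names and data are hypothetical, not taken from GCC.
SUBSEP = "\034"  # awk's default subscript separator


def put(table, row, col, value):
    """Store value under the flattened key, like awk's table[row, col]."""
    table[str(row) + SUBSEP + str(col)] = value


def columns_of(table, row):
    """Return the sorted second-dimension keys stored under `row`."""
    cols = []
    for key in table:
        r, c = key.split(SUBSEP)  # undo the flattening, like awk's split()
        if r == str(row):
            cols.append(c)
    return sorted(cols)


cpus = {}
put(cpus, "cortex-a53", "arch", "armv8-a")
put(cpus, "cortex-a53", "fpu", "neon")
put(cpus, "cortex-m0", "arch", "armv6-m")
print(columns_of(cpus, "cortex-a53"))  # prints ['arch', 'fpu']
```

Every lookup along one dimension walks the whole table, which is part
of why this style gets awkward once the data has real structure.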

R.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-19 15:56     ` Jeff Law
@ 2018-07-19 16:12       ` Eric Gallager
  2018-07-20 10:05       ` Martin Liška
  1 sibling, 0 replies; 58+ messages in thread
From: Eric Gallager @ 2018-07-19 16:12 UTC (permalink / raw)
  To: Jeff Law
  Cc: Segher Boessenkool, Richard Biener, Martin Liška, GCC Development

On 7/19/18, Jeff Law <law@redhat.com> wrote:
> On 07/18/2018 03:28 PM, Segher Boessenkool wrote:
>> On Wed, Jul 18, 2018 at 11:51:36AM +0200, Richard Biener wrote:
>>> We already conditionally require Perl for building for some targets so I
>>> wonder
>>> if using perl would be better ...
>>
>> At least perl is GPL (Python is not).
>>
>>
>> What would the advantage of using Python be?  I haven't heard any yet.
>> Awk may be a bit clunky but at least it is easily readable for anyone.
> I've found python *far* easier to read than awk.  And you can actually
> run a debugger on your python code to see what it's doing.
> Jeff
>

gawk comes with a debugger, too:
https://www.gnu.org/software/gawk/manual/html_node/Debugger.html

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 22:42   ` Segher Boessenkool
  2018-07-19 12:28     ` Florian Weimer
@ 2018-07-19 15:56     ` Jeff Law
  2018-07-19 16:12       ` Eric Gallager
  2018-07-20 10:05       ` Martin Liška
  1 sibling, 2 replies; 58+ messages in thread
From: Jeff Law @ 2018-07-19 15:56 UTC (permalink / raw)
  To: Segher Boessenkool, Richard Biener; +Cc: Martin Liška, GCC Development

On 07/18/2018 03:28 PM, Segher Boessenkool wrote:
> On Wed, Jul 18, 2018 at 11:51:36AM +0200, Richard Biener wrote:
>> We already conditionally require Perl for building for some targets so I wonder
>> if using perl would be better ...
> 
> At least perl is GPL (Python is not).
> 
> 
> What would the advantage of using Python be?  I haven't heard any yet.
> Awk may be a bit clunky but at least it is easily readable for anyone.
I've found python *far* easier to read than awk.  And you can actually
run a debugger on your python code to see what it's doing.
Jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 16:56   ` Paul Koning
  2018-07-18 17:29     ` Boris Kolpackov
@ 2018-07-19 14:47     ` Konovalov, Vadim
  1 sibling, 0 replies; 58+ messages in thread
From: Konovalov, Vadim @ 2018-07-19 14:47 UTC (permalink / raw)
  To: Paul Koning, Boris Kolpackov; +Cc: gcc, mliska

Boris Kolpackov  wrote:
> From: Paul Koning
> > I wonder what will be the expected way to obtain a suitable version of 
> > Python if one is not available on the build machine? With awk I can 
> > build it from source pretty much anywhere. Is building newer versions 
> > of Python on older targets a similarly straightforward process 
> > (somehow I doubt it)? What about Windows?
> 
> It's the same sort of thing: untar
> the sources, configure, make, make install.  The code is larger than awk but
> the process is no more difficult.

The Python build chain on Windows does not support building with gcc.

It was a surprise to me to discover that, but this is how it is.

Very inconvenient.

> For Windows there are pre-built kits.  Ditto
> for a number of other popular operating systems.

This works for simple cases or for "popular" platforms, but greatly complicates things otherwise.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 22:42   ` Segher Boessenkool
@ 2018-07-19 12:28     ` Florian Weimer
  2018-07-19 20:08       ` Richard Earnshaw (lists)
  2018-07-19 15:56     ` Jeff Law
  1 sibling, 1 reply; 58+ messages in thread
From: Florian Weimer @ 2018-07-19 12:28 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Richard Biener, Martin Liška, GCC Development

* Segher Boessenkool:

> What would the advantage of using Python be?  I haven't heard any yet.
> Awk may be a bit clunky but at least it is easily readable for anyone.

I'm not an experienced awk programmer, but I don't think plain awk
supports arrays of arrays, so there's really no good way to emulate
user-defined data types and structure the code.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18  9:51 ` Richard Biener
  2018-07-18 10:03   ` Richard Earnshaw (lists)
  2018-07-18 10:56   ` David Malcolm
@ 2018-07-18 22:42   ` Segher Boessenkool
  2018-07-19 12:28     ` Florian Weimer
  2018-07-19 15:56     ` Jeff Law
  2 siblings, 2 replies; 58+ messages in thread
From: Segher Boessenkool @ 2018-07-18 22:42 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Liška, GCC Development

On Wed, Jul 18, 2018 at 11:51:36AM +0200, Richard Biener wrote:
> We already conditionally require Perl for building for some targets so I wonder
> if using perl would be better ...

At least perl is GPL (Python is not).


What would the advantage of using Python be?  I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.


Segher

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 12:15         ` Jonathan Wakely
  2018-07-18 12:50           ` Joel Sherrill
@ 2018-07-18 21:28           ` Eric S. Raymond
  1 sibling, 0 replies; 58+ messages in thread
From: Eric S. Raymond @ 2018-07-18 21:28 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: David Malcolm, Richard Guenther, Martin Liška, gcc

Jonathan Wakely <jwakely.gcc@gmail.com>:
> I don't see any mention of avoiding dict comprehensions (not supported
> until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).

That is correct. The HOWTO introduction does say that its techniques
won't guarantee 2.6 compatibility.  That would have been a great deal more
difficult - some 3.x syntax backported into 2.7.2 makes a large difference
here.

In practice, no deployment of reposurgeon or src or doclifter or any
of the other polyglot Python code I maintain has tripped over this, or
at least I'm not seeing issue reports about it.

Python devteam support for Python 2.6 terminated in 2013.

> I maintain it's easy to unwittingly use a feature (such as dict
> comprehensions) which works fine on your machine, but aren't supported
> by all versions you intend to support. Regular testing with the oldest
> version is needed to prevent that (which was the point I was making).

Yes. This is why reposurgeon, doclifter, and cvs-fast-export all have
regression-test suites that exercise all Python code under both 2 and
3, a practice I strongly recommend.
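
To make the dict-comprehension point concrete, here is a small sketch
of a spelling that should run unchanged on 2.6, 2.7 and 3.x.  It is
not taken from any of the projects named above; the function name
option_lengths and its inputs are invented for illustration.

```python
# A dict comprehension such as
#     {name: len(name) for name in names}
# is a SyntaxError on Python 2.6.  Passing a generator of (key, value)
# pairs to dict() builds the same mapping on 2.6, 2.7 and 3.x.
def option_lengths(names):
    """Map each name to its length without using a dict comprehension."""
    return dict((name, len(name)) for name in names)


print(sorted(option_lengths(["-O2", "-Wall"]).items()))
# prints [('-O2', 3), ('-Wall', 5)]
```

A test suite run under the oldest supported interpreter would catch
the comprehension form at parse time, before any test executes.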

Python 2.7 is scheduled for EOL in 2020.  My plan is to retain 2.7 support
in my code until 2022.

I report that my practices are keeping the frequency of Python port
defects I hear about to zero.  I understand that GCC may have different
constraints.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 17:44       ` Paul Koning
@ 2018-07-18 18:11         ` Matthias Klose
  2018-07-20 11:04           ` Martin Liška
  0 siblings, 1 reply; 58+ messages in thread
From: Matthias Klose @ 2018-07-18 18:11 UTC (permalink / raw)
  To: gcc

On 18.07.2018 19:29, Paul Koning wrote:
> 
> 
>> On Jul 18, 2018, at 1:22 PM, Boris Kolpackov <boris@codesynthesis.com> wrote:
>>
>> Paul Koning <paulkoning@comcast.net> writes:
>>
>>>> On Jul 18, 2018, at 11:13 AM, Boris Kolpackov <boris@codesynthesis.com> wrote:
>>>>
>>>> I wonder what will be the expected way to obtain a suitable version of
>>>> Python if one is not available on the build machine? With awk I can
>>>> build it from source pretty much anywhere. Is building newer versions
>>>> of Python on older targets a similarly straightforward process (somehow
>>>> I doubt it)? What about Windows?
>>>
>>> It's the same sort of thing: untar the sources, configure, make, make
>>> install.

Windows binaries and MacOSX binaries are available from upstream.  The build
process on *ix targets is autoconf based and as easy as for awk/gawk.

>> Will this also install all the Python packages one might plausibly want
>> to use in GCC?

Some extension modules depend on external libraries, but even if those don't
exist, the build succeeds without building these extension modules. The sources
come with embedded libs for zlib, libmpdec, libexpat.  They don't include
libffi (only in 3.7), libsqlite, libgdbm, libbluetooth, libdb.  I suppose the
usage of such modules should be banned by policy.  The only needed thing is any
of libdb (Berkeley/SleepyCat) or gdbm to build the anydbm module which might be
necessary.

> It installs the entire standard Python library (corresponding to the 1800+ pages of the library manual).  I expect that will easily cover anything GCC might want to do.

The current usage of awk and perl doesn't include any third party libraries.
That's where the usage of Python should start.

Matthias

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 17:29     ` Boris Kolpackov
@ 2018-07-18 17:44       ` Paul Koning
  2018-07-18 18:11         ` Matthias Klose
  0 siblings, 1 reply; 58+ messages in thread
From: Paul Koning @ 2018-07-18 17:44 UTC (permalink / raw)
  To: Boris Kolpackov; +Cc: gcc, mliska



> On Jul 18, 2018, at 1:22 PM, Boris Kolpackov <boris@codesynthesis.com> wrote:
> 
> Paul Koning <paulkoning@comcast.net> writes:
> 
>>> On Jul 18, 2018, at 11:13 AM, Boris Kolpackov <boris@codesynthesis.com> wrote:
>>> 
>>> I wonder what will be the expected way to obtain a suitable version of
>>> Python if one is not available on the build machine? With awk I can
>>> build it from source pretty much anywhere. Is building newer versions
>>> of Python on older targets a similarly straightforward process (somehow
>>> I doubt it)? What about Windows?
>> 
>> It's the same sort of thing: untar the sources, configure, make, make
>> install.
> 
> Will this also install all the Python packages one might plausibly want
> to use in GCC?

It installs the entire standard Python library (corresponding to the 1800+ pages of the library manual).  I expect that will easily cover anything GCC might want to do.

	paul

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 16:56   ` Paul Koning
@ 2018-07-18 17:29     ` Boris Kolpackov
  2018-07-18 17:44       ` Paul Koning
  2018-07-19 14:47     ` Konovalov, Vadim
  1 sibling, 1 reply; 58+ messages in thread
From: Boris Kolpackov @ 2018-07-18 17:29 UTC (permalink / raw)
  To: Paul Koning; +Cc: gcc, mliska

Paul Koning <paulkoning@comcast.net> writes:

> > On Jul 18, 2018, at 11:13 AM, Boris Kolpackov <boris@codesynthesis.com> wrote:
> >
> > I wonder what will be the expected way to obtain a suitable version of
> > Python if one is not available on the build machine? With awk I can
> > build it from source pretty much anywhere. Is building newer versions
> > of Python on older targets a similarly straightforward process (somehow
> > I doubt it)? What about Windows?
> 
> It's the same sort of thing: untar the sources, configure, make, make
> install.

Will this also install all the Python packages one might plausibly want
to use in GCC?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 16:41   ` doark
@ 2018-07-18 17:22     ` doark
  0 siblings, 0 replies; 58+ messages in thread
From: doark @ 2018-07-18 17:22 UTC (permalink / raw)
  To: gcc

On Tue, 17 Jul 2018 20:23:36 -0400
David Malcolm <dmalcolm@redhat.com> wrote:
> On Tue, 2018-07-17 at 16:37 -0400, David Niklas wrote:
> > > Hi.
> > > 
> > > I've recently touched AWK option generate machinery and it's quite
> > > unpleasant to make any adjustments. My question is simple: can we
> > > start using a scripting language like Python and replace usage
> > > of
> > > the AWK scripts? It's probably question for Steering committee, but
> > > I
> > > would like to see feedback from community.
<snip>

> [disclosure: I'm a CPython core developer, albeit a rather dormant one,
> and have made contributions to PyPy]

Very good.

> > As a FLOSS dev and someone who is familiar with both languages in
> > question, I'd like to point out that python is an unstable
> > language.   
> 
> > It
> > has matured and changed a lot over the years.   
> 
> Depends on your meaning of "unstable".   The changes are, IMHO,
> extremely well-documented e.g.:
> 
>   https://docs.python.org/3/whatsnew/3.7.html
> 
> and the documentation tells you precisely in which version each feature
> became available; see e.g.:
>   https://docs.python.org/3/library/re.html#re.subn
> for examples of this.

And that is what I mean. It changes. I have compiled C code from 20 years
ago and it works as expected. Many Python packages are still awaiting
migration from 2 to 3 and 3.x series does change things.
My argument is based on the fact that maintaining python code requires
much more work than some other langs.

> > The tools like python's
> > 2to3 tool have gained an infamous reputation.
> > OTOH, awk is very stable. I have been on the GNU variant's ML for
> > some
> > time and I have noticed that when a question over implementation
> > arises
> > they go looking at and, when necessary, consulting what the other
> > awks are
> > doing. For Python there is only one implementation, thus only one way
> > of
> > thinking about how it works unless you want to change something in
> > the
> > core language.  
> 
> There are multiple implementations of Python.
> 
> CPython is the original one, but of the actively-developed
> implementations there's also PyPy and IronPython, along with Jython,
> and others.  And yes, people talk to each other.

If memory serves, ~1 year ago PyPy was not recommended by the Gentoo devs
for a python implementation because it was considered unstable. Jython
is integrating python with java so I did not consider it a "pure" python
implementation. I did not know that CPython was the original. I seem to
remember that it was intended to convert python to C and was not yet
complete. I can't comment about the IronPython, but it is good to know
that crosstalk does occur.
I use python3 when I need python.

> > Gentoo's portage is an excellent example of a good language gone bad
> > through less than ideal programming in python and it seems to me
> > that,
> > based on the description above, the awk code in gcc needs a code base
> > cleanup and decrustification, not rewritten in the latest and
> > greatest 
> > language simply because it is *the fad* of the day.  
> 
> I get the impression you've had a bad experience with Python in the
> past, and that this is why you sent this email.

Not really... For the curious my story is this: I wanted to learn to
program and C was the dreaded language of the day. Ruby and Python3 were
recommended. I tried to learn first ruby and then python with little
success. I decided to try the hardest language I could find, since
2 years in, the "easy" ones were not working out. I learned C in no time,
even a perfect understanding of pointers came to me in 6 months' time and I
realized that OO and my brain did not like each other. I can program in
python and other OO langs, but I am always running into 2 vs. 3 problems
and each version seems to add something that I know other users might not
have the correct version of python to support or breaks something that
may or may not require changing one's program. Awk (my 4th lang), is a
scripting language that I am also quite good at. I learned it because I
needed to develop simple things faster.

> If it's "the fad of the day", then according to:
>   https://www.tiobe.com/tiobe-index/python/
> it's been the fad of the year in 2007 and 2010, and is currently the #4
> programming language.  Maybe there's some inherent quality underlying
> that long-term popularity that makes it more than, say just a "fad".

Not to argue your point, but I have sadly witnessed language after
language being promoted by employers and educators, so I fear that the
number of devs interested in a particular language is often skewed
rather than reflecting interests that developers formed organically.

> Using a popular programming language will make it easier for GCC to get
> new contributors.

Until it becomes less popular...
And gcc is for compiling C code, so we need more C devs than any other
language :)

> > And yes, by spelling
> > python out as *the* language of choice without any other options Mr.
> > Martin is recommending to us what to choose without any reason
> > whatsoever
> > given.  
> 
> Martin is offering to do the work (and, in fact, already has prototyped
> it), and that counts for a lot in my book.

Forgive the misinterpretation of your email on my part, it looked like
Martin was trying to prototype and then ask everyone else to do most of
the work for him.

> > Why not ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust,
> > (it's
> > also all the rage)? Or tex? Or SQL (that would at least be
> > interesting to
> > read :) ?  
> 
> Because I never want to maintain another non-trivial awk script if I
> can help it, and the thought of being able to do more stuff in Python
> makes me happy.

Good enough.

> Oh, and Python is more likely to be available on the developer's
> machine or build box than at least half of the languages you mention.

Probably.

> Admittedly there's the Python 2 vs Python 3 issue, but Python 2.6
> onwards is broadly compatible with Python 3.*, and there's a well-known 
> common subset that works in both languages.  Python 2.6 is almost 10
> years old at this point.

Well known? Wish I knew. And I did read all the standard library and
included docs cover to cover, plus a bunch of internet tutorials too...

> > A fast development cycle is the typical cry of python enthusiasts
> > (and my
> > foolish self at one point in time), but there are plenty of other
> > fast
> > development languages out there.   
> 
> And Python is superior to them all, in my opinion.  For example, Python
> makes it easy to embed unit tests in the support scripts.

Yes, including unit tests is a big advantage to any program, if they ever
get written :)
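
As a rough sketch of what "embedding unit tests in the support
scripts" can look like, the standard doctest module runs examples
written in a function's docstring.  The function normalize_option
below is hypothetical and is not part of GCC's option machinery.

```python
import doctest


def normalize_option(name):
    """Strip leading dashes and turn the rest into an identifier.

    >>> normalize_option("-fno-inline")
    'fno_inline'
    >>> normalize_option("--param")
    'param'
    """
    return name.lstrip("-").replace("-", "_")


if __name__ == "__main__":
    # Runs the docstring examples above; silent when they all pass.
    doctest.testmod()
```

The tests live next to the code they check, so running the script
directly is enough to exercise them.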

> Also, the
> Python standard library is "batteries included".

...

> > In my not so humble opinion, this ought to be approached with some
> > degree
> > of wisdom and intelligence as opposed to a zest for something new for
> > newness's sake.
> 
> Python is older than Java, and is almost as old as GCC itself.
<snip>

I make no objection. Your arguments are sound enough. Just bear in mind
that I worry that you will end up envying Sisyphus, or breaking things on
older platforms.

See for example, this bug I submitted:
https://bugs.gentoo.org/show_bug.cgi?id=634712
It remains unsolved. Furthermore, it was introduced in a recent version
and persists in the latest version of bind. Bind is not a trivial piece of
SW. Nor is it a small or infrequently used one. More than 50% of bugs I
find are in python packages. Yes, I did count at one point in time.
Yes, the fixes for these packages normally evade me.

Sincerely,
David

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 15:13 ` Boris Kolpackov
@ 2018-07-18 16:56   ` Paul Koning
  2018-07-18 17:29     ` Boris Kolpackov
  2018-07-19 14:47     ` Konovalov, Vadim
  0 siblings, 2 replies; 58+ messages in thread
From: Paul Koning @ 2018-07-18 16:56 UTC (permalink / raw)
  To: Boris Kolpackov; +Cc: gcc, mliska



> On Jul 18, 2018, at 11:13 AM, Boris Kolpackov <boris@codesynthesis.com> wrote:
> 
> On Tue, 2018-07-17 at 14:49 +0200, Martin Liška wrote:
> 
>> My question is simple: can we start using a scripting language like
>> Python and replace usage of the AWK scripts?
> 
> I wonder what will be the expected way to obtain a suitable version of
> Python if one is not available on the build machine? With awk I can
> build it from source pretty much anywhere. Is building newer versions
> of Python on older targets a similarly straightforward process (somehow
> I doubt it)? What about Windows?

It's the same sort of thing: untar the sources, configure, make, make install.  The code is larger than awk but the process is no more difficult.

For Windows there are pre-built kits.  Ditto for a number of other popular operating systems.

	paul

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18  0:23 ` David Malcolm
  2018-07-18  0:38   ` Paul Koning
@ 2018-07-18 16:41   ` doark
  2018-07-18 17:22     ` doark
  1 sibling, 1 reply; 58+ messages in thread
From: doark @ 2018-07-18 16:41 UTC (permalink / raw)
  To: gcc

On Tue, 17 Jul 2018 20:23:36 -0400
David Malcolm <dmalcolm@redhat.com> wrote:
> On Tue, 2018-07-17 at 16:37 -0400, David Niklas wrote:
> > > Hi.
> > > 
> > > I've recently touched AWK option generate machinery and it's quite
> > > unpleasant to make any adjustments. My question is simple: can we
> > > start using a scripting language like Python and replace usage
> > > of
> > > the AWK scripts? It's probably question for Steering committee, but
> > > I
> > > would like to see feedback from community.
<snip>

> [disclosure: I'm a CPython core developer, albeit a rather dormant one,
> and have made contributions to PyPy]

Very good.

> > As a FLOSS dev and someone who is familiar with both languages in
> > question, I'd like to point out that python is an unstable
> > language.   
> 
> > It
> > has matured and changed a lot over the years.   
> 
> Depends on your meaning of "unstable".   The changes are, IMHO,
> extremely well-documented e.g.:
> 
>   https://docs.python.org/3/whatsnew/3.7.html
> 
> and the documentation tells you precisely in which version each feature
> became available; see e.g.:
>   https://docs.python.org/3/library/re.html#re.subn
> for examples of this.

And that is what I mean. It changes. I have compiled C code from 20 years
ago and it works as expected. Many Python packages are still awaiting
migration from 2 to 3 and 3.x series does change things.
My argument is based on the fact that maintaining python code requires
much more work than some other langs.

> > The tools like python's
> > 2to3 tool have gained an infamous reputation.
> > OTOH, awk is very stable. I have been on the GNU variant's ML for
> > some
> > time and I have noticed that when a question over implementation
> > arises
> > they go looking at and, when necessary, consulting what the other
> > awks are
> > doing. For Python there is only one implementation, thus only one way
> > of
> > thinking about how it works unless you want to change something in
> > the
> > core language.  
> 
> There are multiple implementations of Python.
> 
> CPython is the original one, but of the actively-developed
> implementations there's also PyPy and IronPython, along with Jython,
> and others.  And yes, people talk to each other.

If memory serves, ~1 year ago PyPy was not recommended by the Gentoo devs
for a python implementation because it was considered unstable. Jython
is integrating python with java so I did not consider it a "pure" python
implementation. I did not know that CPython was the original. I seem to
remember that it was intended to convert python to C and was not yet
complete. I can't comment about the IronPython, but it is good to know
that crosstalk does occur.
I use python3 when I need python.

> > Gentoo's portage is an excellent example of a good language gone bad
> > through less than ideal programming in python and it seems to me
> > that,
> > based on the description above, the awk code in gcc needs a code base
> > cleanup and decrustification, not rewritten in the latest and
> > greatest 
> > language simply because it is *the fad* of the day.  
> 
> I get the impression you've had a bad experience with Python in the
> past, and that this is why you sent this email.

Not really... For the curious my story is this: I wanted to learn to
program and C was the dreaded language of the day. Ruby and Python3 were
recommended. I tried to learn first ruby and then python with little
success. I decided to try the hardest language I could find, since
2 years in, the "easy" ones were not working out. I learned C in no time,
even a perfect understanding of pointers came to me in 6 months' time and I
realized that OO and my brain did not like each other. I can program in
python and other OO langs, but I am always running into 2 vs. 3 problems
and each version seems to add something that I know other users might not
have the correct version of python to support or breaks something that
may or may not require changing one's program. Awk (my 4th lang), is a
scripting language that I am also quite good at. I learned it because I
needed to develop simple things faster.

> If it's "the fad of the day", then according to:
>   https://www.tiobe.com/tiobe-index/python/
> it's been the fad of the year in 2007 and 2010, and is currently the #4
> programming language.  Maybe there's some inherent quality underlying
> that long-term popularity that makes it more than, say just a "fad".

Not to argue your point, but I have sadly witnessed language after
language being promoted by employers and educators, so I fear that the
number of devs interested in a particular language is often skewed
rather than reflecting interests that developers formed organically.

> Using a popular programming language will make it easier for GCC to get
> new contributors.

Until it becomes less popular...
And gcc is for compiling C code, so we need more C devs than any other
language :)

> > And yes, by spelling
> > python out as *the* language of choice without any other options Mr.
> > Martin is recommending to us what to choose without any reason
> > whatsoever
> > given.  
> 
> Martin is offering to do the work (and, in fact, already has prototyped
> it), and that counts for a lot in my book.

Forgive the misinterpretation of your email on my part, it looked like
Martin was trying to prototype and then ask everyone else to do most of
the work for him.

> > Why not ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust,
> > (it's
> > also all the rage)? Or tex? Or SQL (that would at least be
> > interesting to
> > read :) ?  
> 
> Because I never want to maintain another non-trivial awk script if I
> can help it, and the thought of being able to do more stuff in Python
> makes me happy.

Good enough.

> Oh, and Python is more likely to be available on the developer's
> machine or build box than at least half of the languages you mention.

Probably.

> Admittedly there's the Python 2 vs Python 3 issue, but Python 2.6
> onwards is broadly compatible with Python 3.*, and there's a well-known 
> common subset that works in both languages.  Python 2.6 is almost 10
> years old at this point.

Well known? Wish I knew. And I did read all the standard library and
included docs cover to cover, plus a bunch of internet tutorials too...

> > A fast development cycle is the typical cry of python enthusiasts
> > (and my
> > foolish self at one point in time), but there are plenty of other
> > fast
> > development languages out there.   
> 
> And Python is superior to them all, in my opinion.  For example, Python
> makes it easy to embed unit tests in the support scripts.

Yes, including unit tests is a big advantage to any program, if they ever
get written :)

> Also, the
> Python standard library is "batteries included".

...

> > In my not so humble opinion, this ought to be approached with some
> > degree
> > of wisdom and intelligence as opposed to a zest for something new for
> > newness's sake.
> 
> Python is older than Java, and is almost as old as GCC itself.
<snip>

I make no objection. Your arguments are sound enough. Just bear in mind
that I worry that you will end up envying Sisyphus, or breaking things on
older platforms.

See for example, this bug I submitted:
https://bugs.gentoo.org/show_bug.cgi?id=634712
It remains unsolved. Furthermore, it was introduced in a recent version
and persists in the latest version of bind. Bind is not a trivial piece of
SW. Nor is it a small or infrequently used one. More than 50% of bugs I
find are in python packages. Yes, I did count at one point in time.
Yes, the fixes for these packages normally evade me.

Sincerely,
David

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 12:49 Martin Liška
  2018-07-18  1:01 ` David Malcolm
  2018-07-18  9:51 ` Richard Biener
@ 2018-07-18 15:13 ` Boris Kolpackov
  2018-07-18 16:56   ` Paul Koning
  2018-07-23 14:21 ` Joseph Myers
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 58+ messages in thread
From: Boris Kolpackov @ 2018-07-18 15:13 UTC (permalink / raw)
  To: gcc, mliska

On Tue, 2018-07-17 at 14:49 +0200, Martin Liška wrote:

> My question is simple: can we start using a scripting language like
> Python and replace usage of the AWK scripts?

I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process (somehow
I doubt it)? What about Windows?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 14:29             ` Matthias Klose
@ 2018-07-18 14:46               ` Janne Blomqvist
  2018-07-20 10:01               ` Martin Liška
  1 sibling, 0 replies; 58+ messages in thread
From: Janne Blomqvist @ 2018-07-18 14:46 UTC (permalink / raw)
  To: Matthias Klose; +Cc: gcc mailing list

On Wed, Jul 18, 2018 at 5:29 PM, Matthias Klose <doko@ubuntu.com> wrote:

> On 18.07.2018 14:49, Joel Sherrill wrote:
> > On Wed, Jul 18, 2018, 7:15 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> >
> >> On Wed, 18 Jul 2018 at 13:06, Eric S. Raymond wrote:
> >>>
> >>> Jonathan Wakely <jwakely.gcc@gmail.com>:
> >>>> On Wed, 18 Jul 2018 at 11:56, David Malcolm wrote:
> >>>>> Python 2.6 onwards is broadly compatible with Python 3.*. and is about
> >>>>> to be 10 years old.  (IIRC it was the system python implementation in
> >>>>> RHEL 6).
> >>>>
> >>>> It is indeed. Without some regular testing with Python 2.6 it could be
> >>>> easy to introduce code that doesn't actually work on that old version.
> >>>> I did that recently, see PR 86112.
> >>>>
> >>>> This isn't an objection to using Python (I like it, and anyway I don't
> >>>> touch the parts of GCC that you're talking about using it for). Just a
> >>>> caution that trying to restrict yourself to a portable subset isn't
> >>>> always easy for casual users of a language (also a problem with C++98
> >>>> vs C++11 vs C++14 as I'm sure many GCC devs are aware).
> >>>
> >>> It's not very difficult to write "polyglot" Python that is indifferent
> >>> to which version it runs under.  I had to solve this problem for
> >>> reposurgeon; techniques documented here...
> >>
> >> I don't see any mention of avoiding dict comprehensions (not supported
> >> until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
> >>
> >> I maintain it's easy to unwittingly use a feature (such as dict
> >> comprehensions) which works fine on your machine, but aren't supported
> >> by all versions you intend to support. Regular testing with the oldest
> >> version is needed to prevent that (which was the point I was making).
> >>
> >
> > I think the RTEMS Community may be a good precedent here. RTEMS is always
> > cross compiled and we are as host agnostic as possible. We use as close to
> > the latest release of GCC, binutils, gdb, and newlib as possible. Our host
> > side tools are in a combination of Python and C++. We use Sphinx for
> > documentation.
> >
> > We are careful to use the Python on RHEL 6 as a baseline. You can build an
> > RTEMS environment there. But at least one of the Sphinx pieces requires a
> > Python of at least RHEL 7 vintage.
> >
> > We have a lot of what I will politely call institutional and large
> > organization users who have to adhere to strict IT policies. I think RHEL 7
> > is common but can't swear there is no RHEL 6 out there and because of that,
> > we set the Python 2.x as a minimum.
> >
> > Yes these are old. And for native new distribution use, it doesn't matter.
> > But for cross and local upgrades, old distributions matter. Particularly
> > those targeting enterprise users. And those are glacially slow.
> >
> > As an aside, it was not being able to build the RTEMS documentation that
> > pushed me off RHEL 6 as my primary personal environment last year. I wanted
> > to be using the oldest distribution I thought was in use in our community.
>
> Doesn't RHEL 6 have overlays for that very reason, to install a newer
> Python3?
>

EPEL provides python 3.4 for RHEL6.

(EPEL is a non-official add-on repository, but I suspect the vast majority
who aren't running some single-task server have it enabled)

Don't know if there's something equivalent for SLES.


> Please don't start with Python2 anymore. It's discontinued in less than two
> years and then you'll have distributions not having Python2 anymore.


+1


-- 
Janne Blomqvist

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 12:50           ` Joel Sherrill
@ 2018-07-18 14:29             ` Matthias Klose
  2018-07-18 14:46               ` Janne Blomqvist
  2018-07-20 10:01               ` Martin Liška
  0 siblings, 2 replies; 58+ messages in thread
From: Matthias Klose @ 2018-07-18 14:29 UTC (permalink / raw)
  To: gcc

On 18.07.2018 14:49, Joel Sherrill wrote:
> On Wed, Jul 18, 2018, 7:15 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> 
>> On Wed, 18 Jul 2018 at 13:06, Eric S. Raymond wrote:
>>>
>>> Jonathan Wakely <jwakely.gcc@gmail.com>:
>>>> On Wed, 18 Jul 2018 at 11:56, David Malcolm wrote:
>>>>> Python 2.6 onwards is broadly compatible with Python 3.*. and is about
>>>>> to be 10 years old.  (IIRC it was the system python implementation in
>>>>> RHEL 6).
>>>>
>>>> It is indeed. Without some regular testing with Python 2.6 it could be
>>>> easy to introduce code that doesn't actually work on that old version.
>>>> I did that recently, see PR 86112.
>>>>
>>>> This isn't an objection to using Python (I like it, and anyway I don't
>>>> touch the parts of GCC that you're talking about using it for). Just a
>>>> caution that trying to restrict yourself to a portable subset isn't
>>>> always easy for casual users of a language (also a problem with C++98
>>>> vs C++11 vs C++14 as I'm sure many GCC devs are aware).
>>>
>>> It's not very difficult to write "polyglot" Python that is indifferent
>>> to which version it runs under.  I had to solve this problem for
>>> reposurgeon; techniques documented here...
>>
>> I don't see any mention of avoiding dict comprehensions (not supported
>> until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
>>
>> I maintain it's easy to unwittingly use a feature (such as dict
>> comprehensions) which works fine on your machine, but aren't supported
>> by all versions you intend to support. Regular testing with the oldest
>> version is needed to prevent that (which was the point I was making).
>>
> 
> I think the RTEMS Community may be a good precedent here. RTEMS is always
> cross compiled and we are as host agnostic as possible. We use as close to
> the latest release of GCC, binutils, gdb, and newlib as possible. Our host
> side tools are in a combination of Python and C++. We use Sphinx for
> documentation.
> 
> We are careful to use the Python on RHEL 6 as a baseline. You can build an
> RTEMS environment there. But at least one of the Sphinx pieces requires a
> Python of at least RHEL 7 vintage.
> 
> We have a lot of what I will politely call institutional and large
> organization users who have to adhere to strict IT policies. I think RHEL 7
> is common but can't swear there is no RHEL 6 out there and because of that,
> we set the Python 2.x as a minimum.
> 
> Yes these are old. And for native new distribution use, it doesn't matter.
> But for cross and local upgrades, old distributions matter. Particularly
> those targeting enterprise users. And those are glacially slow.
> 
> As an aside, it was not being able to build the RTEMS documentation that
> pushed me off RHEL 6 as my primary personal environment last year. I wanted
> to be using the oldest distribution I thought was in use in our community.

Doesn't RHEL 6 have overlays for that very reason, to install a newer Python3?

Please don't start with Python2 anymore. It's discontinued in less than two
years and then you'll have distributions not having Python2 anymore.  If you
don't have a recent Python3, then you can probably build it for your platform
yourself.

Python3 is also cross-buildable, and much easier to cross-build than guile or perl.

Matthias

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 12:15         ` Jonathan Wakely
@ 2018-07-18 12:50           ` Joel Sherrill
  2018-07-18 14:29             ` Matthias Klose
  2018-07-18 21:28           ` Eric S. Raymond
  1 sibling, 1 reply; 58+ messages in thread
From: Joel Sherrill @ 2018-07-18 12:50 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Eric S. Raymond, David Malcolm, Richard Guenther, Martin Liška, gcc

On Wed, Jul 18, 2018, 7:15 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> On Wed, 18 Jul 2018 at 13:06, Eric S. Raymond wrote:
> >
> > Jonathan Wakely <jwakely.gcc@gmail.com>:
> > > On Wed, 18 Jul 2018 at 11:56, David Malcolm wrote:
> > > > Python 2.6 onwards is broadly compatible with Python 3.*. and is about
> > > > to be 10 years old.  (IIRC it was the system python implementation in
> > > > RHEL 6).
> > >
> > > It is indeed. Without some regular testing with Python 2.6 it could be
> > > easy to introduce code that doesn't actually work on that old version.
> > > I did that recently, see PR 86112.
> > >
> > > This isn't an objection to using Python (I like it, and anyway I don't
> > > touch the parts of GCC that you're talking about using it for). Just a
> > > caution that trying to restrict yourself to a portable subset isn't
> > > always easy for casual users of a language (also a problem with C++98
> > > vs C++11 vs C++14 as I'm sure many GCC devs are aware).
> >
> > It's not very difficult to write "polyglot" Python that is indifferent
> > to which version it runs under.  I had to solve this problem for
> > reposurgeon; techniques documented here...
>
> I don't see any mention of avoiding dict comprehensions (not supported
> until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
>
> I maintain it's easy to unwittingly use a feature (such as dict
> comprehensions) which works fine on your machine, but aren't supported
> by all versions you intend to support. Regular testing with the oldest
> version is needed to prevent that (which was the point I was making).
>

I think the RTEMS Community may be a good precedent here. RTEMS is always
cross compiled and we are as host agnostic as possible. We use as close to
the latest release of GCC, binutils, gdb, and newlib as possible. Our host
side tools are in a combination of Python and C++. We use Sphinx for
documentation.

We are careful to use the Python on RHEL 6 as a baseline. You can build an
RTEMS environment there. But at least one of the Sphinx pieces requires a
Python of at least RHEL 7 vintage.

We have a lot of what I will politely call institutional and large
organization users who have to adhere to strict IT policies. I think RHEL 7
is common but can't swear there is no RHEL 6 out there and because of that,
we set the Python 2.x as a minimum.

Yes these are old. And for native new distribution use, it doesn't matter.
But for cross and local upgrades, old distributions matter. Particularly
those targeting enterprise users. And those are glacially slow.

As an aside, it was not being able to build the RTEMS documentation that
pushed me off RHEL 6 as my primary personal environment last year. I wanted
to be using the oldest distribution I thought was in use in our community.

--joel
RTEMS

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 12:06       ` Eric S. Raymond
@ 2018-07-18 12:15         ` Jonathan Wakely
  2018-07-18 12:50           ` Joel Sherrill
  2018-07-18 21:28           ` Eric S. Raymond
  0 siblings, 2 replies; 58+ messages in thread
From: Jonathan Wakely @ 2018-07-18 12:15 UTC (permalink / raw)
  To: Eric Raymond; +Cc: David Malcolm, Richard Guenther, Martin Liška, gcc

On Wed, 18 Jul 2018 at 13:06, Eric S. Raymond wrote:
>
> Jonathan Wakely <jwakely.gcc@gmail.com>:
> > On Wed, 18 Jul 2018 at 11:56, David Malcolm wrote:
> > > Python 2.6 onwards is broadly compatible with Python 3.*. and is about
> > > to be 10 years old.  (IIRC it was the system python implementation in
> > > RHEL 6).
> >
> > It is indeed. Without some regular testing with Python 2.6 it could be
> > easy to introduce code that doesn't actually work on that old version.
> > I did that recently, see PR 86112.
> >
> > This isn't an objection to using Python (I like it, and anyway I don't
> > touch the parts of GCC that you're talking about using it for). Just a
> > caution that trying to restrict yourself to a portable subset isn't
> > always easy for casual users of a language (also a problem with C++98
> > vs C++11 vs C++14 as I'm sure many GCC devs are aware).
>
> It's not very difficult to write "polyglot" Python that is indifferent
> to which version it runs under.  I had to solve this problem for
> reposurgeon; techniques documented here...

I don't see any mention of avoiding dict comprehensions (not supported
until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).

I maintain it's easy to unwittingly use a feature (such as dict
comprehensions) which works fine on your machine, but isn't supported
by all versions you intend to support. Regular testing with the oldest
version is needed to prevent that (which was the point I was making).
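
To illustrate the dict-comprehension point, here is a minimal sketch of a
spelling that should also work on 2.6. The variable name is made up for the
example; the comprehension form appears only in a comment because it is a
syntax error on 2.6.

```python
# Dict comprehension syntax, rejected by the Python 2.6 parser:
#     squares = {n: n * n for n in range(4)}
#
# Portable spelling: pass a generator of (key, value) pairs to dict().
# Generator expressions predate 2.6, so this runs on 2.6, 2.7 and 3.x.
squares = dict((n, n * n) for n in range(4))

print(sorted(squares.items()))
```

Both forms read fine on a recent interpreter, which is why only a test run on
the oldest supported version is likely to catch the first one.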

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 11:31     ` Jonathan Wakely
@ 2018-07-18 12:06       ` Eric S. Raymond
  2018-07-18 12:15         ` Jonathan Wakely
  0 siblings, 1 reply; 58+ messages in thread
From: Eric S. Raymond @ 2018-07-18 12:06 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: David Malcolm, Richard Guenther, Martin Liška, gcc

Jonathan Wakely <jwakely.gcc@gmail.com>:
> On Wed, 18 Jul 2018 at 11:56, David Malcolm wrote:
> > Python 2.6 onwards is broadly compatible with Python 3.*. and is about
> > to be 10 years old.  (IIRC it was the system python implementation in
> > RHEL 6).
> 
> It is indeed. Without some regular testing with Python 2.6 it could be
> easy to introduce code that doesn't actually work on that old version.
> I did that recently, see PR 86112.
> 
> This isn't an objection to using Python (I like it, and anyway I don't
> touch the parts of GCC that you're talking about using it for). Just a
> caution that trying to restrict yourself to a portable subset isn't
> always easy for casual users of a language (also a problem with C++98
> vs C++11 vs C++14 as I'm sure many GCC devs are aware).

It's not very difficult to write "polyglot" Python that is indifferent
to which version it runs under.  I had to solve this problem for
reposurgeon; techniques documented here...

http://www.catb.org/esr/faqs/practical-python-porting/
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 10:56   ` David Malcolm
  2018-07-18 11:08     ` Jakub Jelinek
@ 2018-07-18 11:31     ` Jonathan Wakely
  2018-07-18 12:06       ` Eric S. Raymond
  2018-07-23 14:31     ` Joseph Myers
  2 siblings, 1 reply; 58+ messages in thread
From: Jonathan Wakely @ 2018-07-18 11:31 UTC (permalink / raw)
  To: David Malcolm; +Cc: Richard Guenther, Martin Liška, gcc

On Wed, 18 Jul 2018 at 11:56, David Malcolm wrote:
> Python 2.6 onwards is broadly compatible with Python 3.*. and is about
> to be 10 years old.  (IIRC it was the system python implementation in
> RHEL 6).

It is indeed. Without some regular testing with Python 2.6 it could be
easy to introduce code that doesn't actually work on that old version.
I did that recently, see PR 86112.

This isn't an objection to using Python (I like it, and anyway I don't
touch the parts of GCC that you're talking about using it for). Just a
caution that trying to restrict yourself to a portable subset isn't
always easy for casual users of a language (also a problem with C++98
vs C++11 vs C++14 as I'm sure many GCC devs are aware).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18 10:56   ` David Malcolm
@ 2018-07-18 11:08     ` Jakub Jelinek
  2018-07-18 11:31     ` Jonathan Wakely
  2018-07-23 14:31     ` Joseph Myers
  2 siblings, 0 replies; 58+ messages in thread
From: Jakub Jelinek @ 2018-07-18 11:08 UTC (permalink / raw)
  To: David Malcolm; +Cc: Richard Biener, Martin Liška, GCC Development

On Wed, Jul 18, 2018 at 06:56:31AM -0400, David Malcolm wrote:
> > alternatively we could handle the generated files like those we still
> > need flex for:

We can't, because unlike the flex output, the option handling is heavily
target specific and storing in the tarball a collection of per-target
specially generated results would be a nightmare.

> Rationale:
> 
> Python 2.6 onwards is broadly compatible with Python 3.*. and is about
> to be 10 years old.  (IIRC it was the system python implementation in
> RHEL 6).  I'm guessing that many older systems have Python 2 installed,
> but not Python 3, and anything we write is likely to be compatible with
> even older Python 2.* implementations.
> 
> Python 3.3 reintroduced the 'u' prefix for unicode string literals (PEP
> 414), which makes it much easier to write scripts that work with both
> 2.* and 3.*.  Python 3.3 is almost 6 years old.
> 
> (this is just a suggestion)

Then the question is also whether to use python2, python3 or python
binaries.  E.g. on some distros python without suffix generates ugly
warnings and that already affects dg-extract-results.sh which just runs
python -c ... rather than first looking for python2 or python3 and only
falling back to python if those don't exist.  Some other contrib/ scripts
look only for python3.

	Jakub

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18  9:51 ` Richard Biener
  2018-07-18 10:03   ` Richard Earnshaw (lists)
@ 2018-07-18 10:56   ` David Malcolm
  2018-07-18 11:08     ` Jakub Jelinek
                       ` (2 more replies)
  2018-07-18 22:42   ` Segher Boessenkool
  2 siblings, 3 replies; 58+ messages in thread
From: David Malcolm @ 2018-07-18 10:56 UTC (permalink / raw)
  To: Richard Biener, Martin Liška; +Cc: GCC Development

On Wed, 2018-07-18 at 11:51 +0200, Richard Biener wrote:
> On Tue, Jul 17, 2018 at 2:49 PM Martin Liška <mliska@suse.cz> wrote:
> > 
> > Hi.
> > 
> > I've recently touched AWK option generate machinery and it's quite
> > unpleasant
> > to make any adjustments. My question is simple: can we starting
> > using a scripting
> > language like Python and replace usage of the AWK scripts? It's
> > probably question
> > for Steering committee, but I would like to see feedback from
> > community.
> > 
> > There are some bulletins why I would like to replace current AWK
> > scripts:
> > 
> > 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack
> > of flags type classes multiple
> > global variables are created (var_opt_char, var_opt_string, ...)
> > 
> > 2) similar happens in gcc/opth-gen.awk
> > 
> > 3) we do very many regex matches (mainly in gcc/opt-functions.awk), 
> > I believe
> >    we should come up with a structured option format that will make
> > parsing and
> >    processing much simpler.
> > 
> > 4) we can come up with new sanity checks of options:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
> > 
> > 5) there are various targets that generate *.opt files, one example
> > is ARM:
> > gcc/config/arm/parsecpu.awk
> > 
> > where transforms:
> > ./gcc/config/arm/arm-cpus.in
> > 
> > I guess having a well-defined structured format for *.opt files
> > will make
> > it easier to write generated opt files?
> > 
> > I'm attaching a prototype that can transform optionlist into
> > options-save.c
> > that can be compiled and works.
> > 
> > I'm looking forward to a feedback.
> 
> I guess we either need to document python as build requirement in
> install.texi then,

> it currently has
> 
> @item A POSIX or SVR4 awk
> 
> Necessary for creating some of the generated source files for GCC@.
> If in doubt, use a recent GNU awk version, as some of the older ones
> are broken.  GNU awk version 3.1.5 is known to work.
> 
> alternatively we could handle the generated files like those we still
> need flex for:

If we go down the "document Python as a build requirement" route, we
would need to decide on a minimum version, and what to do about Python
2 vs Python 3.  We could restrict ourselves to the common subset of the
two languages, or require Python 3 (or Python 2, I suppose).

If we want somewhat conservative minimum versions, one strategy might
be to require (Python 2.* (2.6 or later) OR Python 3 (3.3 or later)),
and code to the common subset of 2.6 and 3.3.  Implicitly, this would
mean no 3rd-party modules; we'd be sticking to the Python standard
library.
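
As a rough sketch of how a script could enforce such a minimum at startup
(the function name and the error message are invented for illustration, and
the bounds are just the 2.6 / 3.3 suggestion above):

```python
import sys


def version_supported(version_info):
    """Return True for Python 2.6 or later 2.x, or Python 3.3 or later."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 6
    if major == 3:
        return minor >= 3
    # Anything newer than the 3.x series is assumed acceptable here.
    return major > 3


# Fail early with a readable message instead of an obscure SyntaxError
# or ImportError somewhere later in the script.
if not version_supported(sys.version_info):
    sys.exit("error: this script needs Python 2.6+ or Python 3.3+")
```

The check uses only tuple indexing on sys.version_info, so it should itself
run on interpreters older than the minimum it enforces.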

Rationale:

Python 2.6 onwards is broadly compatible with Python 3.* and is about
to be 10 years old.  (IIRC it was the system python implementation in
RHEL 6).  I'm guessing that many older systems have Python 2 installed,
but not Python 3, and anything we write is likely to be compatible with
even older Python 2.* implementations.

Python 3.3 reintroduced the 'u' prefix for unicode string literals (PEP
414), which makes it much easier to write scripts that work with both
2.* and 3.*.  Python 3.3 is almost 6 years old.
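
A small sketch of what the 'u' prefix buys in practice; the string is only
an example value. The same source text is accepted by 2.6+ and by 3.3+,
whereas 3.0 through 3.2 reject the prefix.

```python
# A unicode literal written with the 'u' prefix: accepted by Python 2.6+
# and, thanks to PEP 414, again by Python 3.3+.
name = u"Li\u0161ka"

# Encoding to bytes behaves the same way on both major versions.
encoded = name.encode("utf-8")

print(len(name), len(encoded))
```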

(this is just a suggestion)

Dave


> @item --enable-generated-files-in-srcdir
> Neither the .c and .h files that are generated from Bison and flex
> nor the
> info manuals and man pages that are built from the .texi files are
> present
> in the SVN development tree.  When building GCC from that development
> tree,
> or from one of our snapshots, those generated files are placed in
> your
> build directory, which allows for the source to be in a readonly
> directory.
> 
> If you configure with @option{--enable-generated-files-in-srcdir}
> then those
> generated files will go into the source directory.  This is mainly
> intended
> for generating release or prerelease tarballs of the GCC sources,
> since it
> is not a requirement that the users of source releases to have flex,
> Bison,
> or makeinfo.
> 
> We already conditionally require Perl for building for some targets
> so I wonder
> if using perl would be better ...
> 
> Do we get rid of the AWK build requirement with your changes?
> 
> Richard.
> 
> > Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18  9:51 ` Richard Biener
@ 2018-07-18 10:03   ` Richard Earnshaw (lists)
  2018-07-18 10:56   ` David Malcolm
  2018-07-18 22:42   ` Segher Boessenkool
  2 siblings, 0 replies; 58+ messages in thread
From: Richard Earnshaw (lists) @ 2018-07-18 10:03 UTC (permalink / raw)
  To: Richard Biener, Martin Liška; +Cc: GCC Development

On 18/07/18 10:51, Richard Biener wrote:
> On Tue, Jul 17, 2018 at 2:49 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> Hi.
>>
>> I've recently touched AWK option generate machinery and it's quite unpleasant
>> to make any adjustments. My question is simple: can we starting using a scripting
>> language like Python and replace usage of the AWK scripts? It's probably question
>> for Steering committee, but I would like to see feedback from community.
>>
>> There are some bulletins why I would like to replace current AWK scripts:
>>
>> 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags type classes multiple
>> global variables are created (var_opt_char, var_opt_string, ...)
>>
>> 2) similar happens in gcc/opth-gen.awk
>>
>> 3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
>>    we should come up with a structured option format that will make parsing and
>>    processing much simpler.
>>
>> 4) we can come up with new sanity checks of options:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
>>
>> 5) there are various targets that generate *.opt files, one example is ARM:
>> gcc/config/arm/parsecpu.awk
>>
>> where transforms:
>> ./gcc/config/arm/arm-cpus.in
>>
>> I guess having a well-defined structured format for *.opt files will make
>> it easier to write generated opt files?
>>
>> I'm attaching a prototype that can transform optionlist into options-save.c
>> that can be compiled and works.
>>
>> I'm looking forward to a feedback.
> 
> I guess we either need to document python as build requirement in
> install.texi then,
> it currently has
> 
> @item A POSIX or SVR4 awk
> 
> Necessary for creating some of the generated source files for GCC@.
> If in doubt, use a recent GNU awk version, as some of the older ones
> are broken.  GNU awk version 3.1.5 is known to work.
> 
> alternatively we could handle the generated files like those we still
> need flex for:
> 
> @item --enable-generated-files-in-srcdir
> Neither the .c and .h files that are generated from Bison and flex nor the
> info manuals and man pages that are built from the .texi files are present
> in the SVN development tree.  When building GCC from that development tree,
> or from one of our snapshots, those generated files are placed in your
> build directory, which allows for the source to be in a readonly
> directory.
> 
> If you configure with @option{--enable-generated-files-in-srcdir} then those
> generated files will go into the source directory.  This is mainly intended
> for generating release or prerelease tarballs of the GCC sources, since it
> is not a requirement that the users of source releases to have flex, Bison,
> or makeinfo.
> 
> We already conditionally require Perl for building for some targets so I wonder
> if using perl would be better ...
> 
> Do we get rid of the AWK build requirement with your changes?
> 

Nope, the Arm port uses AWK for handling the CPU description tables.  I
chose to use that specifically because it was already relied on for
other parts of the build system.

Please don't go down the Perl line, though...

R.

> Richard.
> 
>> Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 12:49 Martin Liška
  2018-07-18  1:01 ` David Malcolm
@ 2018-07-18  9:51 ` Richard Biener
  2018-07-18 10:03   ` Richard Earnshaw (lists)
                     ` (2 more replies)
  2018-07-18 15:13 ` Boris Kolpackov
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 58+ messages in thread
From: Richard Biener @ 2018-07-18  9:51 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Development

On Tue, Jul 17, 2018 at 2:49 PM Martin Liška <mliska@suse.cz> wrote:
>
> Hi.
>
> I've recently touched AWK option generate machinery and it's quite unpleasant
> to make any adjustments. My question is simple: can we starting using a scripting
> language like Python and replace usage of the AWK scripts? It's probably question
> for Steering committee, but I would like to see feedback from community.
>
> There are some bulletins why I would like to replace current AWK scripts:
>
> 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags type classes multiple
> global variables are created (var_opt_char, var_opt_string, ...)
>
> 2) similar happens in gcc/opth-gen.awk
>
> 3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
>    we should come up with a structured option format that will make parsing and
>    processing much simpler.
>
> 4) we can come up with new sanity checks of options:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
>
> 5) there are various targets that generate *.opt files, one example is ARM:
> gcc/config/arm/parsecpu.awk
>
> where transforms:
> ./gcc/config/arm/arm-cpus.in
>
> I guess having a well-defined structured format for *.opt files will make
> it easier to write generated opt files?
>
> I'm attaching a prototype that can transform optionlist into options-save.c
> that can be compiled and works.
>
> I'm looking forward to a feedback.

I guess we either need to document python as build requirement in
install.texi then,
it currently has

@item A POSIX or SVR4 awk

Necessary for creating some of the generated source files for GCC@.
If in doubt, use a recent GNU awk version, as some of the older ones
are broken.  GNU awk version 3.1.5 is known to work.

alternatively we could handle the generated files like those we still
need flex for:

@item --enable-generated-files-in-srcdir
Neither the .c and .h files that are generated from Bison and flex nor the
info manuals and man pages that are built from the .texi files are present
in the SVN development tree.  When building GCC from that development tree,
or from one of our snapshots, those generated files are placed in your
build directory, which allows for the source to be in a readonly
directory.

If you configure with @option{--enable-generated-files-in-srcdir} then those
generated files will go into the source directory.  This is mainly intended
for generating release or prerelease tarballs of the GCC sources, since it
is not a requirement that the users of source releases to have flex, Bison,
or makeinfo.

We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...

Do we get rid of the AWK build requirement with your changes?

Richard.

> Martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 12:49 Martin Liška
@ 2018-07-18  1:01 ` David Malcolm
  2018-07-19 20:24   ` Karsten Merker
  2018-07-18  9:51 ` Richard Biener
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 58+ messages in thread
From: David Malcolm @ 2018-07-18  1:01 UTC (permalink / raw)
  To: Martin Liška, GCC Development

On Tue, 2018-07-17 at 14:49 +0200, Martin Liška wrote:
> Hi.
> 
> I've recently touched AWK option generate machinery and it's quite
> unpleasant
> to make any adjustments. My question is simple: can we starting using
> a scripting
> language like Python and replace usage of the AWK scripts? It's
> probably question
> for Steering committee, but I would like to see feedback from
> community.

As you know, I'm a fan of Python.  As I noted elsewhere in this thread,
one issue is Python 2 vs Python 3 (and minimum versions).  Within
Python 2.*, Python 2.6 onwards is broadly compatible with Python 3.*,
and there's a well-known common subset that works in both languages.

To what extent would this complicate bootstrap?  (Not much, I think, in
that it would appear to be just an external build-time dependency on
the build machine).

Would this make it harder for people to build GCC?  It's one more
dependency, but CPython is widely available and relatively easy to
build.  (I don't have experience of doing bring-up of a new
architecture, though).

> There are some bulletins why I would like to replace current AWK
> scripts:
> 
> 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of
> flags type classes multiple
> global variables are created (var_opt_char, var_opt_string, ...)
> 
> 2) similar happens in gcc/opth-gen.awk
> 
> 3) we do very many regex matches (mainly in gcc/opt-functions.awk), I
> believe
>    we should come up with a structured option format that will make
> parsing and
>    processing much simpler.
> 
> 4) we can come up with new sanity checks of options:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397

Having some kind of .opt linting sounds useful.

> 5) there are various targets that generate *.opt files, one example
> is ARM:
> gcc/config/arm/parsecpu.awk
> 
> where transforms:
> ./gcc/config/arm/arm-cpus.in
> 
> I guess having a well-defined structured format for *.opt files will
> make
> it easier to write generated opt files?
> 
> I'm attaching a prototype that can transform optionlist into options-
> save.c
> that can be compiled and works.
> 
> I'm looking forward to a feedback.
> Martin

You named it "gcc-options.py", but I think we'll want something that
can be imported from other scripts, and this isn't valid to "import" as
a module, due to the "-".   It should have a filename that either uses
an underscore, or no separator.
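
To illustrate the naming point, here is a minimal sketch (only "gcc-options"
comes from the prototype; the other candidate names and the helper
importable_name are hypothetical).  A name has to be a valid Python
identifier to appear in a plain "import" statement, and a name containing
"-" is not one.

```python
import keyword

# Candidate module names (file names without the ".py" suffix).
candidates = ['gcc-options', 'gcc_options', 'gccoptions']

def importable_name(name):
    """Return True if NAME could appear in a plain "import NAME" statement."""
    return name.isidentifier() and not keyword.iskeyword(name)

for name in candidates:
    verdict = 'ok' if importable_name(name) else 'not importable'
    print('%-12s %s' % (name, verdict))
```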

> # parse content of optionlist
It's probably worth moving this into a class.  Maybe:

class OptionList:
    def __init__ (self, lines):
       # etc

or similar.

"optimization_flags" could be a member of that class.
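
A slightly fuller sketch of that idea, under stated assumptions: the field
delimiter and the list of skipped record names are taken from the
prototype, while Record, parse_properties and the sample option names are
hypothetical, and the property parsing is simplified compared with the
prototype's loop.

```python
import re
from collections import namedtuple

DELIMITER = '\x1c'  # field separator used by the prototype

# Record names the prototype skips.
IGNORED = frozenset(['Language', 'TargetSave', 'Variable', 'TargetVariable',
                     'HeaderInclude', 'SourceInclude', 'Enum', 'EnumValue'])

Record = namedtuple('Record', ['name', 'properties', 'description'])

def parse_properties(text):
    """Turn 'Common Var(x) Optimization' into a dict of property -> argument."""
    props = {}
    for m in re.finditer(r'(\w+)\(([^)]*)\)|(\S+)', text):
        if m.group(1) is not None:
            props[m.group(1)] = m.group(2)   # property with an argument
        else:
            props[m.group(3)] = None         # bare flag
    return props

class OptionList:
    def __init__(self, lines):
        self.records = []
        for line in lines:
            parts = line.rstrip('\n').split(DELIMITER)
            if len(parts) < 2 or parts[0] in IGNORED:
                continue
            description = ' '.join(parts[2:]) if len(parts) > 2 else None
            self.records.append(
                Record(parts[0], parse_properties(parts[1]), description))

    def optimization_flags(self):
        """Records with a Var that are Optimization or PerFunction."""
        return [r for r in self.records
                if 'Var' in r.properties
                and ('Optimization' in r.properties
                     or 'PerFunction' in r.properties)]
```

Because the constructor takes an iterable of lines, it can be fed either an
open file or a list of string literals, which is what makes the parser easy
to test in isolation.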


> # start printing
This ought to be in a function, rather than having this at the top-
level.

Moving it into a function would allow for some unittest tests:
(a) tests of parsing some lines provided as string literals, to unit-
test the parser.

(b) integration tests of parsing the actual optionlist, maybe.

perhaps via a --unit-test command-line option to trigger
unittest.main().
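
A minimal sketch of how such a switch might look, assuming a hypothetical
helper split_record and made-up sample records; only the delimiter and the
--unit-test option name come from this thread.

```python
import sys
import unittest

DELIMITER = '\x1c'  # field separator used by the prototype

def split_record(line):
    """Split one optionlist line into (name, properties, description)."""
    parts = line.rstrip('\n').split(DELIMITER)
    name = parts[0]
    properties = parts[1] if len(parts) > 1 else ''
    description = ' '.join(parts[2:]) if len(parts) > 2 else None
    return name, properties, description

class SplitRecordTests(unittest.TestCase):
    # (a) parser tests driven by string literals
    def test_full_record(self):
        line = 'fexample\x1cCommon Var(flag_example)\x1cAn example.\n'
        self.assertEqual(
            split_record(line),
            ('fexample', 'Common Var(flag_example)', 'An example.'))

    def test_missing_description(self):
        self.assertEqual(split_record('fexample\x1cCommon'),
                         ('fexample', 'Common', None))

def main(argv):
    if '--unit-test' in argv[1:]:
        # Hide our own option from unittest's argument parser.
        remaining = [a for a in argv if a != '--unit-test']
        unittest.main(argv=remaining)  # runs the tests, then exits
    print('normal code generation would happen here')
    return 0

if __name__ == '__main__':
    main(sys.argv)
```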


Maybe a way to ensure no semantic changes during the transition would
be to diff the generated .c/.h files against the ones the awk scripts
generate, and to verify that there are no changes other than
insignificant whitespace, for all supported configs?
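
One possible shape for such a check, as a sketch only: the function names
and the sample C text are hypothetical, and collapsing whitespace this
naively would also hide differences inside string literals, so a real
check might need to be stricter.

```python
import difflib
import re

def normalize(text):
    """Collapse runs of whitespace and drop blank lines."""
    lines = (re.sub(r'\s+', ' ', line).strip() for line in text.splitlines())
    return [line for line in lines if line]

def significant_diff(awk_output, python_output):
    """Return unified-diff lines that remain after whitespace normalization."""
    return list(difflib.unified_diff(
        normalize(awk_output), normalize(python_output),
        fromfile='awk', tofile='python', lineterm=''))

awk_output = ('void\nfoo (void)\n{\n'
              '  ptr->x_flag_example = opts->x_flag_example;\n}\n')
python_output = ('void\nfoo (void)\n{\n'
                 '\tptr->x_flag_example  =  opts->x_flag_example;\n}\n\n')

# An empty list means the two outputs differ only in whitespace.
print(significant_diff(awk_output, python_output))
```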

Hope this is constructive.
Dave

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-18  0:23 ` David Malcolm
@ 2018-07-18  0:38   ` Paul Koning
  2018-07-18 16:41   ` doark
  1 sibling, 0 replies; 58+ messages in thread
From: Paul Koning @ 2018-07-18  0:38 UTC (permalink / raw)
  To: David Malcolm; +Cc: GCC Mailing List



> On Jul 17, 2018, at 8:23 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> 
>>> Hi.
>>> 
>>> I've recently touched AWK option generate machinery and it's quite
>>> unpleasant to make any adjustments. My question is simple: can we
>>> starting using a scripting language like Python and replace usage
>>> of
>>> the AWK scripts? It's probably question for Steering committee, but
>>> I
>>> would like to see feedback from community....
>>> I'm looking forward to a feedback.
>>> Martin

David gave a number of good arguments.  I support Martin's proposal, both as to replacing AWK and specifically the choice of Python for that purpose.

Python fits the bill very well in my experience.  I've used it to write several large programs, including such non-obvious ones as two network protocol stack implementations.

In roughly 40 years, and roughly 40 programming languages, I've only twice encountered a language where I could go from knowing nothing at all to writing a substantial real world program in just one week: Pascal (in college) and Python (about 15 years ago).  This is why Python became my language of choice whenever I don't need the speed or small memory footprint of C/C++.

	paul


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC] Adding Python as a possible language and it's usage
  2018-07-17 20:37 David Niklas
@ 2018-07-18  0:23 ` David Malcolm
  2018-07-18  0:38   ` Paul Koning
  2018-07-18 16:41   ` doark
  0 siblings, 2 replies; 58+ messages in thread
From: David Malcolm @ 2018-07-18  0:23 UTC (permalink / raw)
  To: David Niklas, gcc; +Cc: mliska

On Tue, 2018-07-17 at 16:37 -0400, David Niklas wrote:
> > Hi.
> > 
> > I've recently touched AWK option generate machinery and it's quite
> > unpleasant to make any adjustments. My question is simple: can we
> > starting using a scripting language like Python and replace usage
> > of
> > the AWK scripts? It's probably question for Steering committee, but
> > I
> > would like to see feedback from community.
> > 
> > There are some bulletins why I would like to replace current AWK
> > scripts:
> > 
> > 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack
> > of
> > flags type classes multiple global variables are created
> > (var_opt_char,
> > var_opt_string, ...)
> > 
> > 2) similar happens in gcc/opth-gen.awk
> > 
> > 3) we do very many regex matches (mainly in gcc/opt-functions.awk), 
> > I
> > believe we should come up with a structured option format that will
> > make parsing and processing much simpler.
> > 
> > 4) we can come up with new sanity checks of options:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
> > 
> > 5) there are various targets that generate *.opt files, one example
> > is
> > ARM: gcc/config/arm/parsecpu.awk
> > 
> > where transforms:
> > ./gcc/config/arm/arm-cpus.in
> > 
> > I guess having a well-defined structured format for *.opt files
> > will
> > make it easier to write generated opt files?
> > 
> > I'm attaching a prototype that can transform optionlist into
> > options-save.c that can be compiled and works.
> > 
> > I'm looking forward to a feedback.
> > Martin
> 
> <snip>
> 
> I was reading phoronix and came upon an article about this email.

[disclosure: I'm a CPython core developer, albeit a rather dormant one,
and have made contributions to PyPy]

> As a FLOSS dev and someone who is familiar with both languages in
> question, I'd like to point out that python is an unstable language. It
> has matured and changed a lot over the years.

Depends on your meaning of "unstable".   The changes are, IMHO,
extremely well-documented e.g.:

  https://docs.python.org/3/whatsnew/3.7.html

and the documentation tells you precisely in which version each feature
became available; see e.g.:
  https://docs.python.org/3/library/re.html#re.subn
for examples of this.


> The tools like python's
> 2to3 tool have gained an infamous reputation.
> OTOH, awk is very stable. I have been on the GNU variant's ML for
> some
> time and I have noticed that when a question over implementation
> arises
> they go looking at and, when necessary, consulting what the other
> awks are
> doing. For Python there is only one implementation, thus only one way
> of
> thinking about how it works unless you want to change something in
> the
> core language.

There are multiple implementations of Python.

CPython is the original one, but of the actively-developed
implementations there's also PyPy and IronPython, along with Jython,
and others.  And yes, people talk to each other.

> Gentoo's portage is an excellent example of a good language gone bad
> through less than ideal programming in python and it seems to me
> that,
> based on the description above, the awk code in gcc needs a code base
> cleanup and decrustification, not rewritten in the latest and
> greatest 
> language simply because it is *the fad* of the day.

I get the impression you've had a bad experience with Python in the
past, and that this is why you sent this email.

If it's "the fad of the day", then according to:
  https://www.tiobe.com/tiobe-index/python/
it's been the fad of the year in 2007 and 2010, and is currently the #4
programming language.  Maybe there's some inherent quality underlying
that long-term popularity that makes it more than, say, just a "fad".

Using a popular programming language will make it easier for GCC to get
new contributors.

> And yes, by spelling
> python out as *the* language of choice without any other options Mr.
> Martin is recommending to us what to choose without any reason
> whatsoever
> given.

Martin is offering to do the work (and, in fact, already has prototyped
it), and that counts for a lot in my book.

> Why not ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust,
> (it's
> also all the rage)? Or tex? Or SQL (that would at least be
> interesting to
> read :) ?

Because I never want to maintain another non-trivial awk script if I
can help it, and the thought of being able to do more stuff in Python
makes me happy.  

Oh, and Python is more likely to be available on the developer's
machine or build box than at least half of the languages you mention.

Admittedly there's the Python 2 vs Python 3 issue, but Python 2.6
onwards is broadly compatible with Python 3.*, and there's a well-known 
common subset that works in both languages.  Python 2.6 is almost 10
years old at this point.

> A fast development cycle is the typical cry of python enthusiasts
> (and my
> foolish self at one point in time), but there are plenty of other
> fast
> development languages out there. 

And Python is superior to them all, in my opinion.  For example, Python
makes it easy to embed unit tests in the support scripts.  Also, the
Python standard library is "batteries included".
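
As a small illustration of embedding tests in a support script, here is a
sketch using the standard doctest module.  The helper first_variable and
the sample values are hypothetical; a script would more commonly just call
doctest.testmod(), and the explicit parser/runner below is used only so the
result can be inspected.

```python
import doctest

def first_variable(var_argument):
    """Return the variable name from the argument of a Var(...) property.

    >>> first_variable('flag_example')
    'flag_example'
    >>> first_variable('flag_example,0')
    'flag_example'
    """
    return var_argument.split(',')[0]

def run_embedded_tests():
    """Run the examples in first_variable's docstring; return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(first_variable.__doc__,
                              {'first_variable': first_variable},
                              'first_variable', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)   # TestResults(failed=..., attempted=...)

results = run_embedded_tests()
print('%d of %d embedded examples failed' % (results.failed, results.attempted))
```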

> In my not so humble opinion, this aught to be approached with some
> degree
> of wisdom and intelligence as opposed to a zest for something new for
> newnesses sake.

Python is older than Java, and is almost as old as GCC itself.

> Sincerely,
> David
> 
> PS: No, I am not volunteering myself.

Quite.

Dave

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC] Adding Python as a possible language and it's usage
@ 2018-07-17 20:37 David Niklas
  2018-07-18  0:23 ` David Malcolm
  0 siblings, 1 reply; 58+ messages in thread
From: David Niklas @ 2018-07-17 20:37 UTC (permalink / raw)
  To: gcc; +Cc: mliska

> Hi.
> 
> I've recently touched AWK option generate machinery and it's quite
> unpleasant to make any adjustments. My question is simple: can we
> starting using a scripting language like Python and replace usage of
> the AWK scripts? It's probably question for Steering committee, but I
> would like to see feedback from community.
> 
> There are some bulletins why I would like to replace current AWK
> scripts:
> 
> 1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of
> flags type classes multiple global variables are created (var_opt_char,
> var_opt_string, ...)
> 
> 2) similar happens in gcc/opth-gen.awk
> 
> 3) we do very many regex matches (mainly in gcc/opt-functions.awk), I
> believe we should come up with a structured option format that will
> make parsing and processing much simpler.
> 
> 4) we can come up with new sanity checks of options:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
> 
> 5) there are various targets that generate *.opt files, one example is
> ARM: gcc/config/arm/parsecpu.awk
> 
> where transforms:
> ./gcc/config/arm/arm-cpus.in
> 
> I guess having a well-defined structured format for *.opt files will
> make it easier to write generated opt files?
> 
> I'm attaching a prototype that can transform optionlist into
> options-save.c that can be compiled and works.
> 
> I'm looking forward to a feedback.
> Martin
<snip>

I was reading phoronix and came upon an article about this email.

As a FLOSS dev and someone who is familiar with both languages in
question, I'd like to point out that python is an unstable language. It
has matured and changed a lot over the years. The tools like python's
2to3 tool have gained an infamous reputation.
OTOH, awk is very stable. I have been on the GNU variant's ML for some
time and I have noticed that when a question over implementation arises
they go looking at and, when necessary, consulting what the other awks are
doing. For Python there is only one implementation, thus only one way of
thinking about how it works unless you want to change something in the
core language.
Gentoo's portage is an excellent example of a good language gone bad
through less than ideal programming in python and it seems to me that,
based on the description above, the awk code in gcc needs a code base
cleanup and decrustification, not rewritten in the latest and greatest 
language simply because it is *the fad* of the day. And yes, by spelling
python out as *the* language of choice without any other options Mr.
Martin is recommending to us what to choose without any reason whatsoever
given.
Why not ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust, (it's
also all the rage)? Or tex? Or SQL (that would at least be interesting to
read :) ?
A fast development cycle is the typical cry of python enthusiasts (and my
foolish self at one point in time), but there are plenty of other fast
development languages out there. 
In my not so humble opinion, this ought to be approached with some degree
of wisdom and intelligence as opposed to a zest for something new for
newness's sake.

Sincerely,
David

PS: No, I am not volunteering myself.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [RFC] Adding Python as a possible language and it's usage
@ 2018-07-17 12:49 Martin Liška
  2018-07-18  1:01 ` David Malcolm
                   ` (5 more replies)
  0 siblings, 6 replies; 58+ messages in thread
From: Martin Liška @ 2018-07-17 12:49 UTC (permalink / raw)
  To: GCC Development

[-- Attachment #1: Type: text/plain, Size: 1310 bytes --]

Hi.

I've recently touched the AWK option-generation machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we start using a scripting
language like Python and replace usage of the AWK scripts? It's probably a question
for the Steering Committee, but I would like to see feedback from the community.

Here are some bullet points on why I would like to replace the current AWK scripts:

1) gcc/optc-save-gen.awk is full of copy&pasted code; due to the lack of flag type classes, multiple
global variables are created (var_opt_char, var_opt_string, ...)

2) similar happens in gcc/opth-gen.awk

3) we do very many regex matches (mainly in gcc/opt-functions.awk); I believe
   we should come up with a structured option format that will make parsing and
   processing much simpler.

4) we can come up with new sanity checks of options:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397

5) there are various targets that generate *.opt files, one example is ARM:
gcc/config/arm/parsecpu.awk

which transforms:
./gcc/config/arm/arm-cpus.in

I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?

I'm attaching a prototype that can transform optionlist into options-save.c
that can be compiled and works.

I'm looking forward to feedback.
Martin

[-- Attachment #2: gcc-options.py --]
[-- Type: text/x-python, Size: 7622 bytes --]

#!/usr/bin/env python3

import re

class Option:
    def __init__(self, name, option_string, description):
        self.name = name
        self.option_string = option_string
        self.description = description
        self.options = {}

        self.parse_options()

    def parse_options(self):
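        # Strip surrounding blanks so the fallback regex below always
        # consumes at least one character per iteration.
        s = self.option_string.strip()

        while s != '':
            # Match a property with an argument, e.g. Var(flag_foo).
            m = re.search(r'^(\w+)\(([^)]*)\)', s)
            if m is not None:
                s = s[m.end():].strip()
                self.options[m.group(1)] = m.group(2)
            else:
                # Otherwise take the next blank-delimited word as a bare flag.
                m2 = re.search(r'^[^ ]*', s)
                s = s[m2.end():].strip()
                self.options[m2.group(0)] = None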

    def flag_set_p(self, flag):
        return flag in self.options

    def get(self, key):
        return self.options[key]

    def get_c_type(self):
        if self.flag_set_p('UInteger'):
            return 'int'
        elif self.flag_set_p('Enum'):
            return 'enum'
        elif not self.flag_set_p('Joined') and not self.flag_set_p('Separate'):
            if self.flag_set_p('Mask'):
                if self.flag_set_p('HOST_WIDE_INT'):
                    return 'HOST_WIDE_INT'
                else:
                    return 'int'
            else:
                return 'signed char'
        else:
            return 'const char *'

    def get_c_type_size(self):
        type = self.get_c_type()
        if type == 'const char *' or type == 'HOST_WIDE_INT':
            return 8
        elif type == 'enum' or type == 'int':
            return 4
        elif type == 'signed char':
            return 1
        else:
            assert False

    def get_variable_name(self):
        name = self.get('Var')
        return name.split(',')[0]

    def get_full_c_type(self):
        t = self.get_c_type()
        if t == 'enum':
            return 'enum %s' % self.get('Enum')
        # Every other type name is already complete.
        return t

    def generate_assignment(self, printer, lhs, rhs):
        name = self.get_variable_name()
        printer.print('%s->x_%s = %s->x_%s;' % (lhs, name, rhs, name), 2)

    def get_printf_format(self):
        t = self.get_c_type()
        return '%#x' if t != 'const char *' else '%s'

    def generate_print(self, printer):
        name = self.get_variable_name()
        format = self.get_printf_format() 
        printer.print('if (ptr->x_%s)' % name, 2)
        printer.print('fprintf (file, "%*s%s (' + format + ')\\n", indent_to, "", "' + name + '", ptr->x_' + name + ');', 4)

    def generate_print_diff(self, printer):
        name = self.get_variable_name()
        format = self.get_printf_format() 
        printer.print('if (ptr1->x_%s != ptr2->x_%s)' % (name, name), 2)
        printer.print('fprintf (file, "%*s%s (' + format + '/' + format + ')\\n", indent_to, "", "' + name + '", ptr1->x_' + name + ', ptr2->x_' + name +  ');', 4)

    def generate_hash(self, printer):
        t = self.get_c_type()
        name = self.get_variable_name()
        v = 'ptr->x_' + name
        if t == 'const char *':
            printer.print('if (%s)' % v, 2)
            printer.print('hstate.add (%s, strlen (%s));' % (v, v), 4)
            printer.print('else', 2)
            printer.print('hstate.add_int (0);', 4)
        else:
            printer.print('hstate.add_hwi (%s);' % v, 2)

    def generate_stream_out(self, printer):
        t = self.get_c_type()
        name = self.get_variable_name()
        v = 'ptr->x_' + name
        if t == 'const char *':
            printer.print('bp_pack_string (ob, bp, %s, true);' % v, 2)
        else:
            printer.print('bp_pack_value (bp, %s, 64);' % v, 2)

    def generate_stream_in(self, printer):
        t = self.get_c_type()
        name = self.get_variable_name()
        v = 'ptr->x_' + name
        if t == 'const char *':
            printer.print('%s = bp_unpack_string (data_in, bp);' % v, 2)
            printer.print('if (%s)' % v, 2)
            printer.print('%s = xstrdup (%s);' % (v, v), 4)
        else:
            cast = '' if t != 'enum' else '(%s)' % self.get_full_c_type()
            printer.print('%s = %sbp_unpack_value (bp, 64);' % (v, cast), 2)

    def print(self):
        print('%s:%s:%s' % (self.name, self.options, self.description))

class Printer:
    def print_function_header(self, comment, return_type, name, args):
        print('/* %s */' % comment)
        print(return_type)
        print('%s (%s)' % (name, ', '.join(args)))
        print('{')

    def print_function_footer(self):
        print('}\n')

    def print(self, s, indent):
        print(' ' * indent + s)

delimiter = u'\x1c'

printer = Printer()

# parse content of optionlist
with open('/dev/shm/objdir/gcc/optionlist') as optionlist:
    lines = [line.strip() for line in optionlist]

# records with these names are skipped
ignored = set(['Language', 'TargetSave', 'Variable', 'TargetVariable', 'HeaderInclude', 'SourceInclude', 'Enum', 'EnumValue'])

flags = []
for l in lines:
    parts = l.split(delimiter)

    description = None
    if len(parts) > 2:
        description = ' '.join(parts[2:])

    name = parts[0]
    if name not in ignored:
        flags.append(Option(name, parts[1], description))

optimization_flags = [f for f in flags if (f.flag_set_p('Optimization') or f.flag_set_p('PerFunction')) and f.flag_set_p('Var')]
optimization_flags = sorted(optimization_flags, key = lambda x: (x.get_c_type_size(), x.get_c_type()), reverse = True)

# start printing
printer.print_function_header('Save optimization variables into a structure.',
        'void', 'cl_optimization_save', ['cl_optimization *ptr, gcc_options *opts'])
for f in optimization_flags:
    f.generate_assignment(printer, 'ptr', 'opts')
printer.print_function_footer()

printer.print_function_header('Restore optimization options from a structure.',
        'void', 'cl_optimization_restore', ['gcc_options *opts', 'cl_optimization *ptr'])
for f in optimization_flags:
    f.generate_assignment(printer, 'opts', 'ptr')
printer.print('targetm.override_options_after_change ();', 2)
printer.print_function_footer()

printer.print_function_header('Print optimization options from a structure.',
        'void', 'cl_optimization_print', ['FILE *file', 'int indent_to', 'cl_optimization *ptr'])
printer.print('fputs ("\\n", file);', 2)
for f in optimization_flags:
    f.generate_print(printer)
printer.print_function_footer()

printer.print_function_header('Print different optimization variables from structures provided as arguments.',
        'void', 'cl_optimization_print_diff', ['FILE *file', 'int indent_to', 'cl_optimization *ptr1', 'cl_optimization *ptr2'])
for f in optimization_flags:
    f.generate_print_diff(printer)
printer.print_function_footer()

optimization_flags = list(filter(lambda x: x.flag_set_p('Optimization'), optimization_flags))

printer.print_function_header('Hash optimization options.',
        'hashval_t', 'cl_optimization_hash', ['cl_optimization const *ptr'])
printer.print('inchash::hash hstate;', 2)
for f in optimization_flags:
    f.generate_hash(printer)
printer.print('return hstate.end();', 2)
printer.print_function_footer()

printer.print_function_header('Stream out optimization options.',
        'void', 'cl_optimization_stream_out', ['output_block *ob', 'bitpack_d *bp', 'cl_optimization *ptr'])
for f in optimization_flags:
    f.generate_stream_out(printer)
printer.print_function_footer()

printer.print_function_header('Stream in optimization options.',
        'void', 'cl_optimization_stream_in', ['data_in *data_in', 'bitpack_d *bp', 'cl_optimization *ptr'])
for f in optimization_flags:
    f.generate_stream_in(printer)
printer.print_function_footer()

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2018-07-30 15:13 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1531832440.64499.ezmlm@gcc.gnu.org>
2018-07-17 17:13 ` [RFC] Adding Python as a possible language and it's usage Basile Starynkevitch
2018-07-17 23:52   ` David Malcolm
2018-07-17 20:37 David Niklas
2018-07-18  0:23 ` David Malcolm
2018-07-18  0:38   ` Paul Koning
2018-07-18 16:41   ` doark
2018-07-18 17:22     ` doark
  -- strict thread matches above, loose matches on Subject: below --
2018-07-17 12:49 Martin Liška
2018-07-18  1:01 ` David Malcolm
2018-07-19 20:24   ` Karsten Merker
2018-07-20 10:02     ` Matthias Klose
2018-07-20 10:07     ` Martin Liška
2018-07-18  9:51 ` Richard Biener
2018-07-18 10:03   ` Richard Earnshaw (lists)
2018-07-18 10:56   ` David Malcolm
2018-07-18 11:08     ` Jakub Jelinek
2018-07-18 11:31     ` Jonathan Wakely
2018-07-18 12:06       ` Eric S. Raymond
2018-07-18 12:15         ` Jonathan Wakely
2018-07-18 12:50           ` Joel Sherrill
2018-07-18 14:29             ` Matthias Klose
2018-07-18 14:46               ` Janne Blomqvist
2018-07-20 10:01               ` Martin Liška
2018-07-20 16:54                 ` Segher Boessenkool
2018-07-20 17:12                   ` Paul Koning
2018-07-20 17:59                     ` Segher Boessenkool
2018-07-20 18:59                       ` Konovalov, Vadim
2018-07-20 20:09                         ` Matthias Klose
2018-07-20 20:15                           ` Konovalov, Vadim
2018-07-18 21:28           ` Eric S. Raymond
2018-07-23 14:31     ` Joseph Myers
2018-07-18 22:42   ` Segher Boessenkool
2018-07-19 12:28     ` Florian Weimer
2018-07-19 20:08       ` Richard Earnshaw (lists)
2018-07-20  9:49         ` Michael Clark
2018-07-19 15:56     ` Jeff Law
2018-07-19 16:12       ` Eric Gallager
2018-07-20 10:05       ` Martin Liška
2018-07-18 15:13 ` Boris Kolpackov
2018-07-18 16:56   ` Paul Koning
2018-07-18 17:29     ` Boris Kolpackov
2018-07-18 17:44       ` Paul Koning
2018-07-18 18:11         ` Matthias Klose
2018-07-20 11:04           ` Martin Liška
2018-07-19 14:47     ` Konovalov, Vadim
2018-07-23 14:21 ` Joseph Myers
2018-07-27 14:31 ` Michael Matz
2018-07-27 14:38   ` Michael Matz
2018-07-28  3:01     ` Matthias Klose
2018-07-27 14:54   ` Joseph Myers
2018-07-27 15:11     ` Michael Matz
2018-07-28  0:26       ` Paul Smith
2018-07-30 14:34         ` Joseph Myers
2018-07-28 12:11     ` Ramana Radhakrishnan
2018-07-28 17:23       ` David Malcolm
2018-07-30 14:51       ` Joseph Myers
2018-07-30 16:29         ` Andreas Schwab
2018-07-28  2:29 ` konsolebox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).