public inbox for gcc@gcc.gnu.org
* [RFC] Adding Python as a possible language and it's usage
@ 2018-07-17 12:49 Martin Liška
  2018-07-18  1:01 ` David Malcolm
                   ` (5 more replies)
  0 siblings, 6 replies; 58+ messages in thread
From: Martin Liška @ 2018-07-17 12:49 UTC
  To: GCC Development

[-- Attachment #1: Type: text/plain, Size: 1310 bytes --]

Hi.

I've recently touched the AWK option-generation machinery, and it's quite unpleasant
to make any adjustments. My question is simple: can we start using a scripting
language like Python and replace the AWK scripts? It's probably a question
for the Steering Committee, but I would like to see feedback from the community.

Here are some reasons why I would like to replace the current AWK scripts:

1) gcc/optc-save-gen.awk is full of copy-and-pasted code; because there are no classes
for flag types, multiple global variables are created (var_opt_char, var_opt_string, ...)

2) the same happens in gcc/opth-gen.awk

3) we do a great many regex matches (mainly in gcc/opt-functions.awk); I believe
   we should come up with a structured option format that will make parsing and
   processing much simpler.

4) we can come up with new sanity checks of options:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397

5) there are various targets that generate *.opt files; one example is ARM:
gcc/config/arm/parsecpu.awk

which transforms:
./gcc/config/arm/arm-cpus.in

I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
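
As a minimal sketch of points 3 and 4 above: once each entry is a structured
record, simple sanity checks become a few lines of code. This is not a format
proposed in this thread; `OptionRecord`, `check_options`, the two checks, and
the example option names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OptionRecord:
    """Hypothetical structured form of one *.opt entry."""
    name: str
    # Property name -> argument; bare flags such as 'Common' map to None.
    properties: Dict[str, Optional[str]] = field(default_factory=dict)
    help_text: str = ""


def check_options(records: List[OptionRecord], known_enums: set) -> List[str]:
    """Return human-readable problems found in the records (illustrative checks only)."""
    problems = []
    seen = set()
    for rec in records:
        # Check 1: the same option name must not be defined twice.
        if rec.name in seen:
            problems.append("duplicate option: %s" % rec.name)
        seen.add(rec.name)
        # Check 2: an Enum(...) property must name a declared enum.
        enum_name = rec.properties.get("Enum")
        if enum_name is not None and enum_name not in known_enums:
            problems.append("%s: unknown enum %s" % (rec.name, enum_name))
    return problems


# Made-up records, not taken from any real *.opt file.
records = [
    OptionRecord("fexample", {"Common": None, "Var": "flag_example", "Optimization": None}),
    OptionRecord("fexample-kind=", {"Common": None, "Joined": None, "Enum": "example_kind"}),
    OptionRecord("fexample", {"Common": None}),
]
problems = check_options(records, known_enums={"other_kind"})
for p in problems:
    print(p)
```

With records like these, a generator could refuse to emit any C code until
`check_options` returns an empty list.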

I'm attaching a prototype that transforms optionlist into an options-save.c
that compiles and works.

I'm looking forward to your feedback.
Martin

[-- Attachment #2: gcc-options.py --]
[-- Type: text/x-python, Size: 7622 bytes --]

#!/usr/bin/env python3

import re

class Option:
    def __init__(self, name, option_string, description):
        self.name = name
        self.option_string = option_string
        self.description = description
        self.options = {}

        self.parse_options()

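    def parse_options(self):
        # Strip first so that leading whitespace cannot stall the loop below.
        s = self.option_string.strip()

        while s != '':
            # Try 'Name(argument)' first, e.g. Var(flag_foo).
            m = re.search(r'^(\w+)\(([^)]*)\)', s)
            if m is not None:
                s = s[m.end():].strip()
                self.options[m.group(1)] = m.group(2)
            else:
                # Otherwise consume one bare flag up to the next space.
                m2 = re.search(r'^[^ ]*', s)
                s = s[m2.end():].strip()
                self.options[m2.group(0)] = None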

    def flag_set_p(self, flag):
        return flag in self.options

    def get(self, key):
        return self.options[key]

    def get_c_type(self):
        if self.flag_set_p('UInteger'):
            return 'int'
        elif self.flag_set_p('Enum'):
            return 'enum'
        elif not self.flag_set_p('Joined') and not self.flag_set_p('Separate'):
            if self.flag_set_p('Mask'):
                if self.flag_set_p('HOST_WIDE_INT'):
                    return 'HOST_WIDE_INT'
                else:
                    return 'int'
            else:
                return 'signed char'
        else:
            return 'const char *'

    def get_c_type_size(self):
        type = self.get_c_type()
        if type == 'const char *' or type == 'HOST_WIDE_INT':
            return 8
        elif type == 'enum' or type == 'int':
            return 4
        elif type == 'signed char':
            return 1
        else:
            assert False

    def get_variable_name(self):
        name = self.get('Var')
        return name.split(',')[0]

    def get_full_c_type(self):
        t = self.get_c_type()
        if t == 'enum':
            return 'enum %s' % self.get('Enum')
        # Non-enum types need no further qualification.
        return t

    def generate_assignment(self, printer, lhs, rhs):
        name = self.get_variable_name()
        printer.print('%s->x_%s = %s->x_%s;' % (lhs, name, rhs, name), 2)

    def get_printf_format(self):
        t = self.get_c_type()
        return '%#x' if t != 'const char *' else '%s'

    def generate_print(self, printer):
        name = self.get_variable_name()
        format = self.get_printf_format() 
        printer.print('if (ptr->x_%s)' % name, 2)
        printer.print('fprintf (file, "%*s%s (' + format + ')\\n", indent_to, "", "' + name + '", ptr->x_' + name + ');', 4)

    def generate_print_diff(self, printer):
        name = self.get_variable_name()
        format = self.get_printf_format() 
        printer.print('if (ptr1->x_%s != ptr2->x_%s)' % (name, name), 2)
        printer.print('fprintf (file, "%*s%s (' + format + '/' + format + ')\\n", indent_to, "", "' + name + '", ptr1->x_' + name + ', ptr2->x_' + name +  ');', 4)

    def generate_hash(self, printer):
        t = self.get_c_type()
        name = self.get_variable_name()
        v = 'ptr->x_' + name
        if t == 'const char *':
            printer.print('if (%s)' % v, 2)
            printer.print('hstate.add (%s, strlen (%s));' % (v, v), 4)
            printer.print('else', 2)
            printer.print('hstate.add_int (0);', 4)
        else:
            printer.print('hstate.add_hwi (%s);' % v, 2)

    def generate_stream_out(self, printer):
        t = self.get_c_type()
        name = self.get_variable_name()
        v = 'ptr->x_' + name
        if t == 'const char *':
            printer.print('bp_pack_string (ob, bp, %s, true);' % v, 2)
        else:
            printer.print('bp_pack_value (bp, %s, 64);' % v, 2)

    def generate_stream_in(self, printer):
        t = self.get_c_type()
        name = self.get_variable_name()
        v = 'ptr->x_' + name
        if t == 'const char *':
            printer.print('%s = bp_unpack_string (data_in, bp);' % v, 2)
            printer.print('if (%s)' % v, 2)
            printer.print('%s = xstrdup (%s);' % (v, v), 4)
        else:
            cast = '' if t != 'enum' else '(%s)' % self.get_full_c_type()
            printer.print('%s = %sbp_unpack_value (bp, 64);' % (v, cast), 2)

    def print(self):
        print('%s:%s:%s' % (self.name, self.options, self.description))

class Printer:
    def print_function_header(self, comment, return_type, name, args):
        print('/* %s */' % comment)
        print(return_type)
        print('%s (%s)' % (name, ', '.join(args)))
        print('{')

    def print_function_footer(self):
        print('}\n')

    def print(self, s, indent):
        print(' ' * indent + s)

delimiter = u'\x1c'

printer = Printer()

# parse content of optionlist
with open('/dev/shm/objdir/gcc/optionlist') as optionlist:
    lines = [line.strip() for line in optionlist]

# record types that are skipped rather than turned into Option objects
ignored = set(['Language', 'TargetSave', 'Variable', 'TargetVariable', 'HeaderInclude', 'SourceInclude', 'Enum', 'EnumValue'])

flags = []
for l in lines:
    parts = l.split(delimiter)

    description = None
    if len(parts) > 2:
        description = ' '.join(parts[2:])

    name = parts[0]
    if name not in ignored:
        flags.append(Option(name, parts[1], description))

optimization_flags = [f for f in flags if (f.flag_set_p('Optimization') or f.flag_set_p('PerFunction')) and f.flag_set_p('Var')]
optimization_flags = sorted(optimization_flags, key = lambda x: (x.get_c_type_size(), x.get_c_type()), reverse = True)

# start printing
printer.print_function_header('Save optimization variables into a structure.',
        'void', 'cl_optimization_save', ['cl_optimization *ptr', 'gcc_options *opts'])
for f in optimization_flags:
    f.generate_assignment(printer, 'ptr', 'opts')
printer.print_function_footer()

printer.print_function_header('Restore optimization options from a structure.',
        'void', 'cl_optimization_restore', ['gcc_options *opts', 'cl_optimization *ptr'])
for f in optimization_flags:
    f.generate_assignment(printer, 'opts', 'ptr')
printer.print('targetm.override_options_after_change ();', 2)
printer.print_function_footer()

printer.print_function_header('Print optimization options from a structure.',
        'void', 'cl_optimization_print', ['FILE *file', 'int indent_to', 'cl_optimization *ptr'])
printer.print('fputs ("\\n", file);', 2)
for f in optimization_flags:
    f.generate_print(printer)
printer.print_function_footer()

printer.print_function_header('Print different optimization variables from structures provided as arguments.',
        'void', 'cl_optimization_print_diff', ['FILE *file', 'int indent_to', 'cl_optimization *ptr1', 'cl_optimization *ptr2'])
for f in optimization_flags:
    f.generate_print_diff(printer)
printer.print_function_footer()

optimization_flags = list(filter(lambda x: x.flag_set_p('Optimization'), optimization_flags))

printer.print_function_header('Hash optimization options.',
        'hashval_t', 'cl_optimization_hash', ['cl_optimization const *ptr'])
printer.print('inchash::hash hstate;', 2)
for f in optimization_flags:
    f.generate_hash(printer)
printer.print('return hstate.end();', 2)
printer.print_function_footer()

printer.print_function_header('Stream out optimization options.',
        'void', 'cl_optimization_stream_out', ['output_block *ob', 'bitpack_d *bp', 'cl_optimization *ptr'])
for f in optimization_flags:
    f.generate_stream_out(printer)
printer.print_function_footer()

printer.print_function_header('Stream in optimization options.',
        'void', 'cl_optimization_stream_in', ['data_in *data_in', 'bitpack_d *bp', 'cl_optimization *ptr'])
for f in optimization_flags:
    f.generate_stream_in(printer)
printer.print_function_footer()
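
To illustrate what the property parsing in the prototype produces, here is a
small standalone sketch of the same tokenizing idea. It is a re-implementation
for illustration, not part of the attachment; the function name
`parse_properties` and the sample property string are assumptions.

```python
import re


def parse_properties(text):
    """Split 'Name(arg) Flag ...' into a dict; bare flags map to None."""
    props = {}
    rest = text.strip()
    while rest:
        # Prefer the 'Name(argument)' form, e.g. Var(flag_example).
        m = re.match(r'(\w+)\(([^)]*)\)', rest)
        if m is None:
            # Fall back to one bare, whitespace-delimited flag.
            m = re.match(r'\S+', rest)
            props[m.group(0)] = None
        else:
            props[m.group(1)] = m.group(2)
        # Drop the consumed token and any whitespace that follows it.
        rest = rest[m.end():].strip()
    return props


# Made-up property string in the style of a *.opt record.
props = parse_properties('Common Var(flag_example) Optimization Init(1)')
print(props)
```

Properties with an argument keep it as a string, so a caller can tell
`'Var' in props` (the flag is present) apart from `props['Var']` (its argument).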

[parent not found: <1531832440.64499.ezmlm@gcc.gnu.org>]
* RE: [RFC] Adding Python as a possible language and it's usage
@ 2018-07-17 20:37 David Niklas
  2018-07-18  0:23 ` David Malcolm
  0 siblings, 1 reply; 58+ messages in thread
From: David Niklas @ 2018-07-17 20:37 UTC
  To: gcc; +Cc: mliska

> Hi.
> 
> I've recently touched the AWK option-generation machinery, and it's quite
> unpleasant to make any adjustments. My question is simple: can we
> start using a scripting language like Python and replace the
> AWK scripts? It's probably a question for the Steering Committee, but I
> would like to see feedback from the community.
> 
> Here are some reasons why I would like to replace the current AWK
> scripts:
> 
> 1) gcc/optc-save-gen.awk is full of copy-and-pasted code; because there
> are no classes for flag types, multiple global variables are created
> (var_opt_char, var_opt_string, ...)
> 
> 2) the same happens in gcc/opth-gen.awk
> 
> 3) we do a great many regex matches (mainly in gcc/opt-functions.awk); I
> believe we should come up with a structured option format that will
> make parsing and processing much simpler.
> 
> 4) we can come up with new sanity checks of options:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
> 
> 5) there are various targets that generate *.opt files; one example is
> ARM: gcc/config/arm/parsecpu.awk
> 
> which transforms:
> ./gcc/config/arm/arm-cpus.in
> 
> I guess having a well-defined structured format for *.opt files will
> make it easier to write generated opt files?
> 
> I'm attaching a prototype that transforms optionlist into an
> options-save.c that compiles and works.
> 
> I'm looking forward to your feedback.
> Martin
<snip>

I was reading Phoronix and came upon an article about this email.

As a FLOSS dev and someone who is familiar with both languages in
question, I'd like to point out that Python is an unstable language. It
has matured and changed a lot over the years, and tools like Python's
2to3 have gained an infamous reputation.
OTOH, awk is very stable. I have been on the GNU variant's ML for some
time, and I have noticed that when a question about implementation arises
they look at, and when necessary consult, what the other awks are
doing. For Python there is only one implementation, and thus only one way
of thinking about how it works unless you want to change something in the
core language.
Gentoo's Portage is an excellent example of a good language gone bad
through less than ideal programming in Python, and it seems to me that,
based on the description above, the awk code in gcc needs a code base
cleanup and decrustification, not a rewrite in the latest and greatest
language simply because it is *the fad* of the day. And yes, by spelling
Python out as *the* language of choice without any other options, Mr.
Martin is recommending to us what to choose without any reason whatsoever
given.
Why not Ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust (it's
also all the rage)? Or TeX? Or SQL (that would at least be interesting to
read :) ?
A fast development cycle is the typical cry of Python enthusiasts (and my
foolish self at one point in time), but there are plenty of other fast
development languages out there.
In my not so humble opinion, this ought to be approached with some degree
of wisdom and intelligence as opposed to a zest for something new for
newness's sake.

Sincerely,
David

PS: No, I am not volunteering myself.


Thread overview: 58+ messages
2018-07-17 12:49 [RFC] Adding Python as a possible language and it's usage Martin Liška
2018-07-18  1:01 ` David Malcolm
2018-07-19 20:24   ` Karsten Merker
2018-07-20 10:02     ` Matthias Klose
2018-07-20 10:07     ` Martin Liška
2018-07-18  9:51 ` Richard Biener
2018-07-18 10:03   ` Richard Earnshaw (lists)
2018-07-18 10:56   ` David Malcolm
2018-07-18 11:08     ` Jakub Jelinek
2018-07-18 11:31     ` Jonathan Wakely
2018-07-18 12:06       ` Eric S. Raymond
2018-07-18 12:15         ` Jonathan Wakely
2018-07-18 12:50           ` Joel Sherrill
2018-07-18 14:29             ` Matthias Klose
2018-07-18 14:46               ` Janne Blomqvist
2018-07-20 10:01               ` Martin Liška
2018-07-20 16:54                 ` Segher Boessenkool
2018-07-20 17:12                   ` Paul Koning
2018-07-20 17:59                     ` Segher Boessenkool
2018-07-20 18:59                       ` Konovalov, Vadim
2018-07-20 20:09                         ` Matthias Klose
2018-07-20 20:15                           ` Konovalov, Vadim
2018-07-18 21:28           ` Eric S. Raymond
2018-07-23 14:31     ` Joseph Myers
2018-07-18 22:42   ` Segher Boessenkool
2018-07-19 12:28     ` Florian Weimer
2018-07-19 20:08       ` Richard Earnshaw (lists)
2018-07-20  9:49         ` Michael Clark
2018-07-19 15:56     ` Jeff Law
2018-07-19 16:12       ` Eric Gallager
2018-07-20 10:05       ` Martin Liška
2018-07-18 15:13 ` Boris Kolpackov
2018-07-18 16:56   ` Paul Koning
2018-07-18 17:29     ` Boris Kolpackov
2018-07-18 17:44       ` Paul Koning
2018-07-18 18:11         ` Matthias Klose
2018-07-20 11:04           ` Martin Liška
2018-07-19 14:47     ` Konovalov, Vadim
2018-07-23 14:21 ` Joseph Myers
2018-07-27 14:31 ` Michael Matz
2018-07-27 14:38   ` Michael Matz
2018-07-28  3:01     ` Matthias Klose
2018-07-27 14:54   ` Joseph Myers
2018-07-27 15:11     ` Michael Matz
2018-07-28  0:26       ` Paul Smith
2018-07-30 14:34         ` Joseph Myers
2018-07-28 12:11     ` Ramana Radhakrishnan
2018-07-28 17:23       ` David Malcolm
2018-07-30 14:51       ` Joseph Myers
2018-07-30 16:29         ` Andreas Schwab
2018-07-28  2:29 ` konsolebox
     [not found] <1531832440.64499.ezmlm@gcc.gnu.org>
2018-07-17 17:13 ` Basile Starynkevitch
2018-07-17 23:52   ` David Malcolm
2018-07-17 20:37 David Niklas
2018-07-18  0:23 ` David Malcolm
2018-07-18  0:38   ` Paul Koning
2018-07-18 16:41   ` doark
2018-07-18 17:22     ` doark
