Elimination of all floating point code in the tiny assembler


* Elimination of all floating point code in the tiny assembler
@ 2023-09-10 17:41 jacob navia
  2023-09-13 11:18 ` Nick Clifton
  0 siblings, 1 reply; 9+ messages in thread
From: jacob navia @ 2023-09-10 17:41 UTC (permalink / raw)
  To: binutils

GAS has its own floating point code in very high precision. This code is quite cumbersome, and never used. All compilers emit floating point numbers as 32 or 64 bit integers, so, the directives for reading floating point numbers go unused.

Still, I maintained all those directives just in case. Internally, they all lead to a call to strtold(), which does all the work previously done by the floating point code in the assembler. The resulting long double is then narrowed to the appropriate precision (double, single, half).

This will cater to all the needs of an assembler.

The gains are impressive: around 1500 lines of floating point code are gone. From the standpoint of maintaining GAS this should make quite a difference, since the erased code is completely incomprehensible unless you spend at least a week of work getting into it. True, it is running, but it could become a big maintenance problem somewhere in the future.

And, above all, it is never used! I have yet to see any compiler that uses it. For people using GAS, directives such as .double are more than enough.

True, back when this code was written, maybe the C library wasn’t as advanced as it is today. strtold is, I think, C99, and the floating point code is way older.

This situation is clearly described in the (funny) comments:

/*
 * Seems atof_machine can backscan through generic_bignum and hit whatever
 * happens to be loaded before it in memory.  And its way too complicated for
 * me to fix right.  Thus a hack.  JF:  Just make generic_bignum bigger,and
 * never write into the early words,thus they'll always be zero. I hate Dean's
 * floating-point code.  Bleh.
 */

That is a very old comment. GNU developers have fought long battles to maintain this code… let’s forget it.

Of course these are personal thoughts, no need to do anything right now. But if anyone is interested, here is my new version of parse_one_float:

/* DISCLAIMER: 
 * Here I give up any copyright rights that I may or may not have for this code and declare it property of GNU 
 */

static int  parse_one_float(int float_type,char temp[MAXIMUM_NUMBER_OF_CHARS_FOR_FLOAT])
{
    int     length;

    SKIP_WHITESPACE();

    /* Accept :xxxx,where the x's are hex digits,for a floating point
     * with the exact digits specified.  */
    if (input_line_pointer[0] == ':') {
        ++input_line_pointer;
        length = hex_float(float_type,temp);
        if (length < 0) {
            ignore_rest_of_line();
            return length;
        }     
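    } else { // New code
        long double ld;
        double d; float f; _Float16 f16;
        char *end;

        errno = 0;
        ld = strtold(input_line_pointer, &end);
        /* Reject range errors and input where nothing could be converted. */
        if (errno || end == input_line_pointer) goto err;
        input_line_pointer = end;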
        switch (float_type) {
            case 'H': case 'h':
                f16 = ld; 
                length = 2;
                memcpy(temp,&f16,2);
                break;
            case 'f': case 'F': case 's': case 'S':
                f = ld; 
                length = 4;
                memcpy(temp,&f,4);
                break;
            case 'd': case 'D': case 'r': case 'R':
                d = ld; 
                length = 8;
                memcpy(temp,&d,8);
                break;

            default:
err:
                as_bad("bad floating point literal");
                ignore_rest_of_line();
                return (-1); 
        }     
    }
    SKIP_WHITESPACE();
    return length;
}

You see?
It is TRIVIAL to follow, and even a total noob will understand it!

Jacob


* Re: Elimination of all floating point code in the tiny assembler
  2023-09-10 17:41 Elimination of all floating point code in the tiny assembler jacob navia
@ 2023-09-13 11:18 ` Nick Clifton
  2023-09-13 13:54   ` Christian Groessler
  2023-09-14  7:49   ` jacob navia
  0 siblings, 2 replies; 9+ messages in thread
From: Nick Clifton @ 2023-09-13 11:18 UTC (permalink / raw)
  To: jacob navia, binutils

Hi Jacob,

> GAS has its own floating point code in very high precision. This code is quite cumbersome, and never used. All compilers emit floating point numbers as 32 or 64 bit integers, so, the directives for reading floating point numbers go unused.

Except of course when assembling hand written assembler source code.
You can bet that there is code out there that relies upon this feature
of the assembler.

A second point is that GAS actually has three different versions of the
text-to-float conversion code: atof-generic.c, atof-ieee.c and atof-vax.c.
These are to support the requirements of different architectures.  Any
replacement code would ideally remove all three of these implementations,
although of course it would have to take care to not break anything.

Another issue is that the code needs to work when running in a cross
assembly environment.  So for example it must work when running on a big
endian host but assembling for a little endian target, or when running
on a 32-bit host assembling for a 64-bit target.
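
The cross-assembly requirement above can be sketched in C. This is only an illustration and not GAS code: encode_target_double is a hypothetical helper, and it assumes that the host double is IEEE binary64 and that the host stores floats and 64-bit integers in the same byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GAS: write an IEEE 754 binary64 value
   into OUT in the byte order of the TARGET, whatever the host order is.
   Assumes the host `double` is itself IEEE binary64. */
static void encode_target_double(double value, int target_big_endian,
                                 unsigned char out[8])
{
    uint64_t bits;
    int i;

    memcpy(&bits, &value, sizeof bits);   /* bit pattern of the value */
    for (i = 0; i < 8; i++) {
        /* Shifting extracts bytes by numeric weight, so this loop does
           not depend on how the host lays the integer out in memory. */
        unsigned char byte = (unsigned char)(bits >> (8 * i));
        out[target_big_endian ? 7 - i : i] = byte;
    }
}
```

Copying the host object with memcpy(), as in the proposed parse_one_float, always emits host byte order; going through shifts is one way to make the output depend on the target only.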

All of which is not to say "don't do this".  We absolutely would be
interested in any patches to improve/simplify the assembler.  Just please
do consider that the code needs to be portable, paranoid and pleasing.

Cheers
   Nick


* Re: Elimination of all floating point code in the tiny assembler
  2023-09-13 11:18 ` Nick Clifton
@ 2023-09-13 13:54   ` Christian Groessler
  2023-09-13 14:01     ` Paul Koning
  2023-09-14  7:49   ` jacob navia
  1 sibling, 1 reply; 9+ messages in thread
From: Christian Groessler @ 2023-09-13 13:54 UTC (permalink / raw)
  To: binutils

Hello Jacob,


On 9/13/23 13:18, Nick Clifton via Binutils wrote:
>> GAS has its own floating point code in very high precision. This code 
>> is quite cumbersome, and never used. All compilers emit floating point 
>> numbers as 32 or 64 bit integers, so, the directives for reading 
>> floating point numbers go unused.
> 
> Except of course when assembling hand written assembler source code.
> You can bet that there is code out there that relies upon this feature
> of the assembler.
> 
> A second point is that GAS actually has three different versions of the
> text-to-float conversion code: atof-generic.c, atof-ieee.c and atof-vax.c.
> These are to support the requirements of different architectures.  Any
> replacement code would ideally remove all three of these implementations,
> although of course it would have to take care to not break anything.
> 
> Another issue is that the code needs to work when running in a cross
> assembly environment.  So for example it must work when running on a big
> endian host but assembling for a little endian target, or when running
> on a 32-bit host assembling for a 64-bit target.
> 
> All of which is not to say "don't do this".  We absolutely would be
> interested in any patches to improve/simplify the assembler.  Just please
> do consider that the code needs to be portable, paranoid and pleasing.

Adding to Nick's response, I want to say that it's not so
straightforward.

There are different representations of floating point numbers. So a
generic "strtold" doesn't help. It's dependent on the build machine and
not on the target.

regards,
chris




* Re: Elimination of all floating point code in the tiny assembler
  2023-09-13 13:54   ` Christian Groessler
@ 2023-09-13 14:01     ` Paul Koning
  0 siblings, 0 replies; 9+ messages in thread
From: Paul Koning @ 2023-09-13 14:01 UTC (permalink / raw)
  To: Christian Groessler; +Cc: binutils

GCC has for years had an accurate and flexible way of handling target floating point representations for any build system.  Can Binutils simply use that code?  It would seem logical to do so.  Among other things, that will painlessly handle things like VAX float on IEEE build machines, or vice versa.

	paul


* Re: Elimination of all floating point code in the tiny assembler
  2023-09-13 11:18 ` Nick Clifton
  2023-09-13 13:54   ` Christian Groessler
@ 2023-09-14  7:49   ` jacob navia
  2023-09-14  8:35     ` Jan Beulich
  2023-09-14  8:38     ` Simon Richter
  1 sibling, 2 replies; 9+ messages in thread
From: jacob navia @ 2023-09-14  7:49 UTC (permalink / raw)
  To: Nick Clifton; +Cc: binutils

> On 13 Sep 2023, at 13:18, Nick Clifton <nickc@redhat.com> wrote:
> 
> Hi Jacob,
> 
>> GAS has its own floating point code in very high precision. This code is quite cumbersome, and never used. All compilers emit floating point numbers as 32 or 64 bit integers, so, the directives for reading floating point numbers go unused.
> 
> Except of course when assembling hand written assembler source code.
> You can bet that there is code out there that relies upon this feature
> of the assembler.

To make things clear: 

I haven’t taken away ANY user-visible features. .double, etc., still continue to work as before. What changes is that instead of calling the floating point code in the assembler, I call the standard C library function "strtold". That’s all. It continues a counter-trend: GNU software tends to reinvent the C library, and that is a bad idea. "strtold" knows the machine it is running on well.

> 
> A second point is that GAS actually has three different versions of the
> text-to-float conversion code: atof-generic.c, atof-ieee.c and atof-vax.c.

I know. I have replaced all three:

The first one, atof-ieee.c, was replaced by strtold. The second, atof-generic.c, has been dropped, since I no longer support any floating point calculations in expressions. OK, you can try to support them forever, but I do not see why an assembler should support floating point calculations in assembler expressions!!!

Another problem with free software is that it is full of features nobody really uses. I just do not see in which context making floating point calculations in assembler expressions is justified. Please enlighten me about that.

The third one (atof-vax.c) is obvious. The VAX stopped production in 2000. DEC disappeared in 1998, bought by Compaq, which stopped all VAX production in 2000. We are almost in 2024 now. It has been more than 25 years since DEC disappeared, and more than 20 years since the VAX disappeared. The grace period is over.

> These are to support the requirements of different architectures.  Any
> replacement code would ideally remove all three of these implementations,
> although of course it would have to take care to not break anything.
> 

Look, free software is always trying to get funds. Normal. Declare all VAX support and floating point support OPTIONAL and subject to PAID consulting fees. :-)
You will see immediately that nobody uses those features.

> Another issue is that the code needs to work when running in a cross
> assembly environment.  So for example it must work when running on a big
> endian host but assembling for a little endian target, or when running
> on a 32-bit host assembling for a 64-bit target.

Big endian hosts have disappeared long ago. SUN Microsystems died, Motorola died, I can’t name any big endian host, but maybe there are some left, I do not know. Doing cross assembly on a 32 bit host for a 64 bit target… that looks weird but may be possible, even if I would say that cross assembly on a 64 bit host for a 32 bit target would be easier to find. Now: strtold doesn’t have anything to do with 32/64 bit stuff, since the binary format of floating point numbers is specified by the IEEE standard and should be the same on all little-endian machines.
> 
> All of which is not to say "don't do this".  We absolutely would be
> interested in any patches to improve/simplify the assembler.  Just please
> do consider that the code needs to be portable, paranoid and pleasing.
> 

Sure, I know that. But that has left the assembler frozen in a complexity that is absolutely incredible. Since the code should run on all kinds of machines that no longer exist, developers can’t test!!!!!!!!!!!

I repeat: DEVELOPERS CAN’T TEST!!!!!!!!!

And when you can’t test your changes you do not make any changes at all, and software gets ever more complex because the necessary cleanups are NEVER done!

That’s what I am trying to do now.

Have you ever run "gcc -fanalyzer as.c" ???

You can’t… GCC can’t follow all the libraries (bfd, libiberty, whatever) that are included in the code, can’t see the bugs in each of them!

But NOW it is possible. I have run "gcc -fanalyzer asm.c" and after 15 minutes (on a riscv machine) I HAD ANSWERS pointing me to some bugs that I reported here. This is NEW and represents a great step in developing code in GAS.

I have created a framework where you CAN make changes in a relatively tiny piece of software and see all effects immediately. Just download the tiny assembler and you are all set. You have a small 35 000 line asm.c and a 10 000 line asm.h. Period. Nothing else. And it compiles everywhere since it is standard C. Building it is as easy as typing "gcc -o asm asm.c -lz -lm". And it has 150 pages of TECHNICAL DOCUMENTATION!!!!!, something that is completely missing for the megabytes-long source code of GAS.

Thanks for your input.

Jacob

P.S. The technical doc has 150 pages, as I said. It would be nice if somebody would read it...

> Cheers
>  Nick


* Re: Elimination of all floating point code in the tiny assembler
  2023-09-14  7:49   ` jacob navia
@ 2023-09-14  8:35     ` Jan Beulich
  2023-09-14 11:10       ` jacob navia
  2023-09-14  8:38     ` Simon Richter
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2023-09-14  8:35 UTC (permalink / raw)
  To: jacob navia; +Cc: binutils, Nick Clifton

On 14.09.2023 09:49, jacob navia wrote:
>> On 13 Sep 2023, at 13:18, Nick Clifton <nickc@redhat.com> wrote:
>>> GAS has its own floating point code in very high precision. This code is quite cumbersome, and never used. All compilers emit floating point numbers as 32 or 64 bit integers, so, the directives for reading floating point numbers go unused.
>>
>> Except of course when assembling hand written assembler source code.
>> You can bet that there is code out there that relies upon this feature
>> of the assembler.
> 
> To make things clear: 
> 
> I haven’t taken away ANY user-visible features. .double, etc., still continue to work as before. What changes is that instead of calling the floating point code in the assembler, I call the standard C library function "strtold". That’s all. It continues a counter-trend: GNU software tends to reinvent the C library, and that is a bad idea. "strtold" knows the machine it is running on well.

But the machine the assembler is running on isn't necessarily relevant.

>> Another issue is that the code needs to work when running in a cross
>> assembly environment.  So for example it must work when running on a big
>> endian host but assembling for a little endian target, or when running
>> on a 32-bit host assembling for a 64-bit target.
> 
> Big endian hosts have disappeared long ago. SUN Microsystems died, Motorola died, I can’t name any big endian host, but maybe there are some left, I do not know.

Even the relatively new RISC-V still allows either endianness to be used,
if I'm not mistaken.

> Doing cross assembly on a 32 bit host for a 64 bit target… that looks weird but may be possible, even if I would say that cross assembly on a 64 bit host for a 32 bit target would be easier to find. Now: strtold doesn’t have anything to do with 32/64 bit stuff, since the binary format of floating point numbers is specified by the IEEE standard and should be the same on all little-endian machines.

Assuming all architectures you care about use IEEE representation only.
Think about x86's 80-bit floating point data type, which - apart from
ia64 - probably isn't used much elsewhere. In x86 compilers you can
often even control via command line option what exactly long double is.
How is that going to work for your use of strtold() in all cases?
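
To make the point concrete, here is a small sketch of how the meaning of long double can be inspected on the host. It is an illustration only: host_long_double_format is a hypothetical helper, and the list of widths covers common formats, not every possible one.

```c
#include <float.h>

/* Hypothetical helper: name the format of the HOST's long double, judged
   by the width of its significand as reported by <float.h>. */
static const char *host_long_double_format(void)
{
    switch (LDBL_MANT_DIG) {
    case 53:  return "same as double (IEEE binary64)";
    case 64:  return "x87 80-bit extended";
    case 106: return "IBM double-double";
    case 113: return "IEEE binary128";
    default:  return "unknown";
    }
}
```

Whatever this returns describes the machine and compiler the assembler was built with. It says nothing about the format the target expects, which is why a result from strtold() cannot simply be copied into the output.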

It also looks as if you don't really appreciate the value of cross tool
chains. That goes well beyond building a 32-bit assembler on a 64-bit
host (or vice versa) of otherwise the same architecture and OS (or, for
the latter, at least the same underlying ABI).

As to floating point "expressions" - I don't think gas supports these,
at least not in the general case. It supports floating point values, but
that's (about?) it. And that has been necessary - beyond just .float
and .double - for something as commonly used as glibc, until less
than a year ago. See
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=114e299ca66353fa7be1ee45bb4e1307d3de1fa2.

>> All of which is not to say "don't do this".  We absolutely would be
>> interested in any patches to improve/simplify the assembler.  Just please
>> do consider that the code needs to be portable, paranoid and pleasing.
> 
> Sure, I know that. But that has left the assembler frozen in a complexity that is absolutely incredible. Since the code should run on all kinds of machines that no longer exist, developers can’t test!!!!!!!!!!!
> 
> I repeat: DEVELOPERS CAN’T TEST!!!!!!!!!

May I please ask that you stop shouting?

Jan


* Re: Elimination of all floating point code in the tiny assembler
  2023-09-14  7:49   ` jacob navia
  2023-09-14  8:35     ` Jan Beulich
@ 2023-09-14  8:38     ` Simon Richter
  2023-09-14 13:48       ` Paul Koning
  1 sibling, 1 reply; 9+ messages in thread
From: Simon Richter @ 2023-09-14  8:38 UTC (permalink / raw)
  To: binutils

Hi,

On 9/14/23 16:49, jacob navia wrote:

> Big endian hosts have disappeared long ago.

In the hobbyist market, yes. POWER and zSeries still exist, and are in 
active use.

> Doing cross assembly on a 32 bit host for a 64 bit target… that looks weird but may be possible, even if I would say that cross assembly on a 64 bit host for a 32 bit target would be easier to find.

Both are fairly normal, and I'd argue that the ability to bootstrap a 64 
bit system from a 32 bit system is quite important.

This is also a matter of code quality. Baking assumptions into the code 
leads to maintainability issues down the line as someone will have to 
identify the problem and manually trace it back to the spot in the code 
where the assumption was made that doesn't quite hold.
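
One way to keep such assumptions from staying hidden is to state them where they are made, so that a build on an unusual host fails loudly instead of emitting wrong data. A minimal sketch, assuming a C11 compiler; the particular checks are illustrative and are not taken from GAS or from the tiny assembler:

```c
#include <float.h>
#include <limits.h>

/* Assumptions that an encoder based on host floats would rely on, written
   down so that the build stops when one of them does not hold. */
_Static_assert(CHAR_BIT == 8, "bytes must be 8 bits wide");
_Static_assert(FLT_RADIX == 2, "host floating point must be binary");
_Static_assert(sizeof(float) == 4 && FLT_MANT_DIG == 24,
               "host float must look like IEEE binary32");
_Static_assert(sizeof(double) == 8 && DBL_MANT_DIG == 53,
               "host double must look like IEEE binary64");
```

These checks only cover the host side. They do not make the output correct for a target with a different byte order or a different floating point format.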

> I have created a framework where you CAN make changes in a relatively tiny piece of software and see all effects immediately. Just download the tiny assembler and you are all set. You have a small 35 000 line asm.c and a 10 000 line asm.h. Period. Nothing else. And it compiles everywhere since it is standard C.

But does it work everywhere, or does it silently fail in a way that it 
generates broken data, causing other people to spend significant amounts 
of time finding out what the problem is, report it and be told that 
their use case is out of scope? Because we have way too many "free 
software" projects like that.

    Simon


* Re: Elimination of all floating point code in the tiny assembler
  2023-09-14  8:35     ` Jan Beulich
@ 2023-09-14 11:10       ` jacob navia
  0 siblings, 0 replies; 9+ messages in thread
From: jacob navia @ 2023-09-14 11:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils, Nick Clifton


> On 14 Sep 2023, at 10:35, Jan Beulich <jbeulich@suse.com> wrote:
> 
> Assuming all architectures you care about use IEEE representation only.
> Think about x86's 80-bit floating point data type, which - apart from
> ia64 - probably isn't used much elsewhere. In x86 compilers you can
> often even control via command line option what exactly long double is.
> How is that going to work for your use of strtold() in all cases?

This is a valid objection.
Scenario 1:
strtold yields an 80 bit long double but the user wants a 128 bit (IEEE) long double.
This will not work. In this case a 128 bit strtold should be called. Since the compiler MUST be prepared for BOTH eventualities, it has a means of calling the 128 bit strtold, which should be named strtold128, or whatever. In that case, the only modification needed here is to change "strtold" to strtold128. The libc of most targets already has something like that.

Scenario 2:
strtold yields a 128 bit long double but the user wants an 80 bit long double. In this case a quite trivial conversion is needed.

Scenario 3:
_Float16 is not implemented in every gcc. For instance, gcc 11 doesn’t support _Float16. For those cases I wrote a conversion, and now I call the conversion for older gcc versions.
In any case I am sure it is much easier to write a 16 bit number as an integer than as a _Float16. So, this is not very important.
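
For Scenario 3, a fallback conversion of the kind mentioned could look roughly like the sketch below. It is not the conversion from the tiny assembler, which is not shown in this thread: float_to_half_bits is a hypothetical name, and the code assumes the host float is IEEE binary32. It rounds to nearest, ties to even, and any NaN becomes a quiet NaN.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fallback for compilers without _Float16: convert a host
   float (assumed IEEE binary32) to the bit pattern of an IEEE binary16. */
static uint16_t float_to_half_bits(float value)
{
    uint32_t bits, mant, rem, half;
    uint16_t sign;
    int exp;

    memcpy(&bits, &value, sizeof bits);
    sign = (uint16_t)((bits >> 16) & 0x8000u);
    mant = bits & 0x007FFFFFu;
    exp  = (int)((bits >> 23) & 0xFFu) - 127 + 15;  /* rebias the exponent */

    if (exp == 128 + 15)                    /* infinity or NaN */
        return (uint16_t)(sign | 0x7C00u | (mant ? 0x0200u : 0u));
    if (exp >= 31)                          /* too large: infinity */
        return (uint16_t)(sign | 0x7C00u);
    if (exp <= 0) {                         /* subnormal half, or zero */
        unsigned shift;
        if (exp < -10)
            return sign;                    /* underflows to signed zero */
        mant |= 0x00800000u;                /* make the leading 1 explicit */
        shift = (unsigned)(14 - exp);       /* between 14 and 24 */
        half = mant >> shift;
        rem = mant & ((1u << shift) - 1u);
        if (rem > (1u << (shift - 1)) ||
            (rem == (1u << (shift - 1)) && (half & 1u)))
            half++;                         /* round to nearest, ties to even */
        return (uint16_t)(sign | half);
    }
    half = ((uint32_t)exp << 10) | (mant >> 13);
    rem = mant & 0x1FFFu;                   /* the 13 bits that are dropped */
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
        half++;                             /* a carry into the exponent is correct */
    return (uint16_t)(sign | half);
}
```

The result is an integer, so it can be written to the output in whatever byte order the target needs, independently of whether the host compiler knows _Float16.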


* Re: Elimination of all floating point code in the tiny assembler
  2023-09-14  8:38     ` Simon Richter
@ 2023-09-14 13:48       ` Paul Koning
  0 siblings, 0 replies; 9+ messages in thread
From: Paul Koning @ 2023-09-14 13:48 UTC (permalink / raw)
  To: Simon Richter; +Cc: binutils



> On Sep 14, 2023, at 4:38 AM, Simon Richter <Simon.Richter@hogyros.de> wrote:
> 
> Hi,
> 
> On 9/14/23 16:49, jacob navia wrote:
> 
>> Big endian hosts have disappeared long ago.
> 
> In the hobbyist market, yes. POWER and zSeries still exist, and are in active use.

MIPS also supports either endian, and big endian appears to be commonly viewed as the primary choice. For that matter, the documentation states that ARM is bi-endian (with little endian being the common choice).

	paul



