Re: [PATCH v3] libgcc: Thumb-1 Floating-Point Library for Cortex M0

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Christophe Lyon <christophe.lyon@linaro.org>
To: Daniel Engel <libgcc@danielengel.com>
Cc: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>,
	gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3] libgcc: Thumb-1 Floating-Point Library for Cortex M0
Date: Mon, 25 Jan 2021 18:48:54 +0100	[thread overview]
Message-ID: <CAKdteOYptCFvha9X_aOZzDptRkmdEyuiGtpC-an3+sRX+UbYig@mail.gmail.com> (raw)
In-Reply-To: <CAKdteObxRH5GbpfpWOU3DBL=HyFpL9Yc6xznWF4nHPfiaeqSGQ@mail.gmail.com>

On Fri, 22 Jan 2021 at 19:28, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
>
> On Thu, 21 Jan 2021 at 21:35, Daniel Engel <libgcc@danielengel.com> wrote:
> >
> > Hi Christophe,
> >
> > On Thu, Jan 21, 2021, at 2:29 AM, Christophe Lyon wrote:
> > > On Sat, 16 Jan 2021 at 17:13, Daniel Engel <libgcc@danielengel.com> wrote:
> > > >
> > > > Hi Christophe,
> > > >
> > > > On Fri, Jan 15, 2021, at 4:30 AM, Christophe Lyon wrote:
> > > > > On Fri, 15 Jan 2021 at 12:39, Daniel Engel <libgcc@danielengel.com> wrote:
> > > > > >
> > > > > > Hi Christophe,
> > > > > >
> > > > > > On Mon, Jan 11, 2021, at 8:39 AM, Christophe Lyon wrote:
> > > > > > > On Mon, 11 Jan 2021 at 17:18, Daniel Engel <libgcc@danielengel.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Jan 11, 2021, at 8:07 AM, Christophe Lyon wrote:
> > > > > > > > > On Sat, 9 Jan 2021 at 14:09, Christophe Lyon <christophe.lyon@linaro.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Sat, 9 Jan 2021 at 13:27, Daniel Engel <libgcc@danielengel.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jan 7, 2021, at 4:56 AM, Richard Earnshaw wrote:
> > > > > > > > > > > > On 07/01/2021 00:59, Daniel Engel wrote:
> > > > > > > > > > > > > --snip--
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Jan 6, 2021, at 9:05 AM, Richard Earnshaw wrote:
> > > > > > > > > > > > > --snip--
> > > > > > > > > > > > >
> > > > > > > > > > > > >> - finally, your popcount implementations have data in the code segment.
> > > > > > > > > > > > >>  That's going to cause problems when we have compilation options such as
> > > > > > > > > > > > >> -mpure-code.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am just following the precedent of existing lib1funcs (e.g. __clz2si).
> > > > > > > > > > > > > If this matters, you'll need to point in the right direction for the
> > > > > > > > > > > > > fix.  I'm not sure it does matter, since these functions are PIC anyway.
> > > > > > > > > > > >
> > > > > > > > > > > > That might be a bug in the clz implementations - Christophe: Any thoughts?
> > > > > > > > > > >
> > > > > > > > > > > __clzsi2() has test coverage in "gcc.c-torture/execute/builtin-bitops-1.c"
> > > > > > > > > > Thanks, I'll have a closer look at why I didn't see problems.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > So, that's because the code goes to the .text section (as opposed to
> > > > > > > > > .text.noread)
> > > > > > > > > and does not have the PURECODE flag. The compiler takes care of this
> > > > > > > > > when generating code with -mpure-code.
> > > > > > > > > And the simulator does not complain because it only checks loads from
> > > > > > > > > the segment with the PURECODE flag set.
> > > > > > > > >
> > > > > > > > This is far out of my depth, but can something like:
> > > > > > > >
> > > > > > > > ifeq (,$(findstring __symbian__,$(shell $(gcc_compile_bare) -dM -E - </dev/null)))
> > > > > > > >
> > > > > > > > be adapted to:
> > > > > > > >
> > > > > > > > a) detect the state of the -mpure-code switch, and
> > > > > > > > b) pass that flag to the preprocessor?
> > > > > > > >
> > > > > > > > If so, I can probably fix both the target section and the data usage.
> > > > > > > > Just have to add a few instructions to finish unrolling the loop.
> > > > > > >
> > > > > > > I must confess I never checked libgcc's Makefile deeply before,
> > > > > > > but it looks like you can probably detect whether -mpure-code is
> > > > > > > part of $CFLAGS.
> > > > > > >
> > > > > > > However, it might be better to write pure-code-safe code
> > > > > > > unconditionally because the toolchain will probably not
> > > > > > > be rebuilt with -mpure-code as discussed before.
> > > > > > > Or that could mean adding a -mpure-code multilib....
> > > > > >
> > > > > > I have learned a few things since the last update.  I think I know how
> > > > > > to get -mpure-code out of CFLAGS and into a macro.  However, I have hit
> > > > > > something of a wall with testing.  I can't seem to compile any flavor of
> > > > > > libgcc with CFLAGS_FOR_TARGET="-mpure-code".
> > > > > >
> > > > > > 1.  Configuring --with-multilib-list=rmprofile results in build failure:
> > > > > >
> > > > > >     checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/libgcc':
> > > > > >     configure: error: cannot compute suffix of object files: cannot compile
> > > > > >     See `config.log' for more details
> > > > > >
> > > > > >    cc1: error: -mpure-code only supports non-pic code on M-profile targets
> > > > > >
> > > > >
> > > > > Yes, I did hit that wall too :-)
> > > > >
> > > > > Hence what we discussed earlier: the toolchain is not rebuilt with -mpure-code.
> > > > >
> > > > > Note that there are problems in newlib too, but users of -mpure-code seem
> > > > > to be able to work around that (eg. using their own startup code and no stdlib)
> > > >
> > > > Is there a current side project to solve the makefile problems?
> > > None that I know of.
> > >
> > >
> > > > I think I'm back to my original question: If libgcc can't be built
> > > > with -mpure-code, and users bypass it completely with -nostdlib, then
> > > > why this conversation about pure-code compatibility of __clzsi2() etc?
> > > I think Richard noticed this pre-existing problem as part of the review
> > > of your patches. I don't think I meant fixing this is a prerequisite.
> > > But maybe I misunderstood :-)
> >
> > I might have misunderstood too then.  It was certainly a pre-existing
> > problem, but I took the comments to mean that I had to own it as part of
> > touching those functions.
> >
> > > > > > 2.  Attempting to filter the multib list results in configuration error.
> > > > > >     This might have been misguided, but it was something I tried:
> > > > > >
> > > > > >     Error: --with-multilib-list=armv6s-m not supported.
> > > > > >
> > > > > >     Error: --with-multilib-list=mthumb/march=armv6s-m/mfloat-abi=soft not supported
> > > > >
> > > > > I think only 2 values are supported: aprofile and rmprofile.
> > > >
> > > > It looks like this might require a custom t-* multilib in gcc/config/arm.
> > > Or we could add -mpure-code to the rmprofile list.
> >
> > I have no strong opinions here.  Are you proposing that the "m" versions
> > of libgcc be built with -mpure-code enabled by default, or are you
> > proposing a parallel set of multilibs?  -mpure-code by default would
> > have costs in both size and speed.
> I'm not sure how large the penalty is for thumb-2 cores?
> Maybe it's acceptable to build thumb-2 with -mpure-code by default,
> but probably not for cortex-m0.
>
> > > > > > 3.  Attempting to configure a single architecture results in a build error.
> > > > > >
> > > > > >     --with-mode=thumb --with-arch=armv6s-m --with-float=soft
> > > > > >
> > > > > >     checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc':
> > > > > >     configure: error: cannot compute suffix of object files: cannot compile
> > > > > >     See `config.log' for more details
> > > > > >
> > > > > >     conftest.c:9:10: fatal error: ac_nonexistent.h: No such file or directory
> > > > > >         9 | #include <ac_nonexistent.h>
> > > > > >           |          ^~~~~~~~~~~~~~~~~~
> > > > > I never saw that error message, but I never build using --with-arch.
> > > > > I do use --with-cpu though.
> > > > >
> > > > > > This has me wondering whether pure-code in libgcc is a real issue ...
> > > > > > If there's a way to build libgcc with -mpure-code, please enlighten me.
> > > > > I haven't done so yet. Maybe building the toolchain --with-cpu=cortex-m0
> > > > > works?
> > > >
> > > > No luck with that.  Same error message as before:
> > > >
> > > > 4.  --with-mode=thumb --with-arch=armv6s-m --with-float=soft --with-cpu=cortex-m0
> > > >
> > > >     Switch "--with-arch" may not be used with switch "--with-cpu"
> > > >
> > > > 5.  Then: --with-mode=thumb --with-float=soft --with-cpu=cortex-m0
> > > >
> > > >     checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc':
> > > >     configure: error: cannot compute suffix of object files: cannot compile
> > > >     See `config.log' for more details
> > > >
> > > >     cc1: error: -mpure-code only supports non-pic code on M-profile targets
> > > Yes that's because default multilibs include targets incompatible with
> > > -mpure-code
> > >
> > > > 6.  Finally! --with-float=soft --with-cpu=cortex-m0 --disable-multilib
> > > >
> > > > Once you know this, and read the docs sideways, the previous errors are
> > > > all probably "works as designed".  But, I can still grumble.
> > > Yes, I think it's "as designed". I faced the "incompatible multilibs" issue
> > > too some time ago. Hence testing is not easy.
> > >
> > > > With libgcc compiled with -mpure-code, I can confirm that
> > > > 'builtin-bitops-1.c' (the test for __clzsi2) passed with libgcc as-is.
> > > >
> > > > I then added the SHF_ARM_PURECODE flag to the libgcc assembly functions
> > > > and re-ran the test.  Still passed.  I then added -mpure-code to
> > > > RUNTESTFLAGS and re-ran the test.  Still passed.  readelf confirmed that
> > > > the test program is compiling as expected [1]:
> > > >
> > > >     [ 2] .text             PROGBITS        0000800c 00800c 003314 00 AXy  0   0  4
> > > >     Key to Flags:
> > > >     W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> > > >     L (link order), O (extra OS processing required), G (group), T (TLS),
> > > >     C (compressed), x (unknown), o (OS specific), E (exclude),
> > > >     y (purecode), p (processor specific)
> > > >
> > > > It was only when I started inserting pure-code test directives into
> > > > 'builtin-bitops-1.c' that 'make check' began to report errors.
> > > >
> > > >     /* { dg-do compile } */
> > > >     ...
> > > >     /* { dg-options "-mpure-code -mfp16-format=ieee" } */
> > > >     /* { dg-final { scan-assembler-not "\\.(float|l\\?double|\d?byte|short|int|long|quad|word)\\s+\[^.\]" } } */
> > > >
> > > > However, for reasons [2] [3] [4] [5], this wasn't actually useful.  It's
> > > > sufficient to say that there are many reasons that non-pure-code
> > > > compatible functions exist in libgcc.
> > >
> > > I filed a PR to better track this discussion:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98779
> >
> > Possibly worth noting that my patch series addresses __clzsi2, __ctzsi2,
> > __aeabi_ldivmod, __aeabi_uldivmod, and most of __gnu_float2h_internal as
> > long as CFLAGS_FOR_TARGET contains -mpure-code.
> >
> Great.
>
> > >
> > > >
> > > > Although I'm not sure how useful this will be in light of the previous
> > > > findings, I did take the opportunity with a working compile process to
> > > > modify the relevant assembly functions for -mpure-code compatibility.
> > > > I can manually disassemble the library and verify correct compilation.
> > > > I can manually run a non-pure-code builtin-bitops-1 with a pure-code
> > > > library to verify correct execution.  But, I don't think your standard
> > > > regression suite will be able to exercise the new paths.
> > > Thanks for doing that. With the SHF_ARM_PURECODE flag you
> > > added to clz, I think my simulator would catch problems.
> >
> > See my more detailed comments following.  Unless you're also using a
> > custom linker script, I would have expected any simulator capable of
> > catching errors to have already caught them.  Putting the
> > SHF_ARM_PURECODE flag on clz actually seems rather cosmetic.
> >
>
> Yes, I have a custom linker script with
>   .text.noread      :
>     {
>         INPUT_SECTION_FLAGS (SHF_ARM_PURECODE) *(.text*)
>     } > purecode_memory
>
> > > > The patch is below; you can consider this as 34/33 in the series.
> > > >
> > > > Regards,
> > > > Daniel
> > > >
> > > > [1] It's pretty clear that the section flags in libgcc have never really
> > > >     mattered.  When the linker strings all of the used objects together,
> > > >     the original sections disappear into a single output object. The
> > > >     compiler controls those flags regardless of what libgcc does.)
> > > Not sure what you mean? The linker creates two segments, one
> > > with and one without the SHF_ARM_PURECODE flag.
> >
> > When libgcc is compiled "normally", individual objects in libgcc.a are
> > compiled _without_ SHF_ARM_PURECODE.  Note line 4 below, with flags
> > "AX" only (no "y").
> >
> > `readelf -S arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a`
> >
> >     File: arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a(_clzsi2.o)
> >     There are 19 section headers, starting at offset 0x43c:
> >
> >     Section Headers:
> >       [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
> >       [ 0]                   NULL            00000000 000000 000000 00      0   0  0
> >       [ 1] .text             PROGBITS        00000000 000034 000000 00  AX  0   0  2
> >       [ 2] .data             PROGBITS        00000000 000034 000000 00  WA  0   0  1
> >       [ 3] .bss              NOBITS          00000000 000034 000000 00  WA  0   0  1
> >   ==> [ 4] .text.sorted[...] PROGBITS        00000000 000034 000034 00  AX  0   0  4
> >       ...
> >     Key to Flags:
> >       W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> >       L (link order), O (extra OS processing required), G (group), T (TLS),
> >       C (compressed), x (unknown), o (OS specific), E (exclude),
> >       y (purecode), p (processor specific)
> >
> > When this "normal" libgcc is linked into an -mpure-code program
> > (e.g. with RUNTESTFLAGS), the linker script flattens all of the
> > sections together into a single output.  The relevant portion of the
> > linker script:
> >
> >       .text           :
> >       {
> >         *(.text.unlikely .text.*_unlikely .text.unlikely.*)
> >         *(.text.exit .text.exit.*)
> >         *(.text.startup .text.startup.*)
> >         *(.text.hot .text.hot.*)
> >         *(SORT(.text.sorted.*)) // _clzsi2.o matches here
> >         *(.text .stub .text.* .gnu.linkonce.t.*) // main.o matches here
> >         /* .gnu.warning sections are handled specially by elf.em.  */
> >         *(.gnu.warning)
> >         *(.glue_7t) *(.glue_7) *(.vfp11_veneer) *(.v4_bx)
> >       }
> >
> > I can't pretend to know how the linker merges conflicting flags from the
> > various input sections, but the final binary has the attributes
> > "AXy" as expected from the top level compile (line 2):
> >
> > `readelf -Sl builtin-bitops-1.exe`
> >
> >     There are 22 section headers, starting at offset 0x10934:
> >
> >     Section Headers:
> >       [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
> >       [ 0]                   NULL            00000000 000000 000000 00      0   0  0
> >       [ 1] .init             PROGBITS        00008000 008000 00000c 00  AX  0   0  4
> >   ==> [ 2] .text             PROGBITS        0000800c 00800c 00455c 00 AXy  0   0  4
> >       [ 3] .fini             PROGBITS        0000c568 00c568 00000c 00  AX  0   0  4
> >       [ 4] .rodata           PROGBITS        0000c574 00c574 000050 00   A  0   0  4
> >       [ 5] .ARM.exidx        ARM_EXIDX       0000c5c4 00c5c4 000008 00  AL  2   0  4
> >       [ 6] .eh_frame         PROGBITS        0000c5cc 00c5cc 000124 00   A  0   0  4
> >       [ 7] .init_array       INIT_ARRAY      0001c6f0 00c6f0 000004 04  WA  0   0  4
> >       [ 8] .fini_array       FINI_ARRAY      0001c6f4 00c6f4 000004 04  WA  0   0  4
> >       [ 9] .data             PROGBITS        0001c6f8 00c6f8 000a30 00  WA  0   0  8
> >       [10] .bss              NOBITS          0001d128 00d128 000114 00  WA  0   0  4
> >       ...
> >     Key to Flags:
> >       W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> >       L (link order), O (extra OS processing required), G (group), T (TLS),
> >       C (compressed), x (unknown), o (OS specific), E (exclude),
> >       y (purecode), p (processor specific)
> >
> >     There are 3 program headers, starting at offset 52
> >
> >     Program Headers:
> >       Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> >       EXIDX          0x00c5c4 0x0000c5c4 0x0000c5c4 0x00008 0x00008 R   0x4
> >       LOAD           0x000000 0x00000000 0x00000000 0x0c6f0 0x0c6f0 R E 0x10000
> >       LOAD           0x00c6f0 0x0001c6f0 0x0001c6f0 0x00a38 0x00b4c RW  0x10000
> >
> >      Section to Segment mapping:
> >       Segment Sections...
> >        00     .ARM.exidx
> >   ==>   01     .init .text .fini .rodata .ARM.exidx .eh_frame
> >        02     .init_array .fini_array .data .bss
> >
> > A segment contains complete sections, not the other way around.  As you
> > can see above, the first LOAD segment contains the entire ".text" plus
> > some other sections.   Thus, SHF_ARM_PURECODE flag really appears
> > to apply to all of "text", even though the bits linked in from libgcc
> > weren't built or flagged this way.
> >
> > >
> > > > [2] The existing pure-code tests are compile-only and cover just the
> > > >     disassembled 'main.o'.  There is no test of a complete executable
> > > >     and there is no execution/simulation.
> > > That's something I did manually: run the full gcc testsuite, forcing -mpure-code
> > > in RUNTESTFLAGS. This way, all execution tests are compiled with -mpure-code,
> > > and this is how I found several bugs.
> > >
> > > > [3] While other parts of binutils may understand SHF_ARM_PURECODE, I
> > > >     don't think the simulator checks section flags or throws exceptions.
> > > Indeed, I know of no public simulator that honors this flag. I do have
> > > one though.
> > >
> > > > [4] builtin-bitops-1 modified this way will always fail due to the array
> > > >     data definitions (longs, longlongs, etc).  GCC can't translate those
> > > >     to instructions.  While the ".data" section would presumably be
> > > >     readable, scan-assembler-not doesn't know the difference.
> > > Sure, adding such scan-assembler-not is not suitable for any existing testcase.
> > > That's why it is only in place for testcases dedicated to -mpure-code.
> > >
> > > > [5] Even if the simulator were modified to throw exceptions, this will
> > > >     continue to fail because _mainCRTStartup uses a literal pool.
> > > Yes, in general, and that's why I mentioned problems with newlib
> > > earlier in this thread.
> > >
> > > However, the simulator I use only throws an exception for code executed with
> > > SHF_ARM_PURECODE. Code in the "regular" code segment is not checked.
> > > So this does not catch errors in hand-written assembly code using regular .text,
> > > but it enables to run larger validations (such as the whole GCC testsuite)
> > > without having to fix all of newlib.
> > > Not perfect, as it left the issues in libgcc we are discussing, but it
> > > helped me fix
> > > several bugs in -mpure-code.
> >
> > Yet again I suspect that you have a custom linker script, or there's
> > some other major difference.  Using the public releases of binutils,
> > newlib, etc, my experiences just aren't lining up with yours.
>
> Yep, the linker script makes the difference.
>
> >
> > > Thanks,
> > >
> > > Christophe
> > >
> > > >
> > > > > Thanks,
> > > > >
> > > > > Christophe
> > > > >
> > > > > --snip--
> >
> > If the test server farm is free at some point, would you mind running
> > another set of regression tests on my v5 patch series?
>
> Sure. Given the number of sub-patches, can you send it to me as a
> single patch file
> (git format) that I can directly apply to GCC trunk?
> My mailer does not want to help with saving each patch as a proper
> patch file :-(
>

The validation results came back clean (no regression found).
Thanks

Christophe

> Thanks
>
> Christophe
>
> >
> > Regards,
> > Daniel

next prev parent reply	other threads:[~2021-01-25 17:49 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-12 23:04 [PATCH] " Daniel Engel
2020-11-26  9:14 ` Christophe Lyon
2020-12-02  3:32   ` Daniel Engel
2020-12-16 17:15     ` Christophe Lyon
2021-01-06 11:20       ` [PATCH v3] " Daniel Engel
2021-01-06 17:05         ` Richard Earnshaw
2021-01-07  0:59           ` Daniel Engel
2021-01-07 12:56             ` Richard Earnshaw
2021-01-07 13:27               ` Christophe Lyon
2021-01-07 16:44                 ` Richard Earnshaw
2021-01-09 12:28               ` Daniel Engel
2021-01-09 13:09                 ` Christophe Lyon
2021-01-09 18:04                   ` Daniel Engel
2021-01-11 14:49                     ` Richard Earnshaw
2021-01-09 18:48                   ` Daniel Engel
2021-01-11 16:07                   ` Christophe Lyon
2021-01-11 16:18                     ` Daniel Engel
2021-01-11 16:39                       ` Christophe Lyon
2021-01-15 11:40                         ` Daniel Engel
2021-01-15 12:30                           ` Christophe Lyon
2021-01-16 16:14                             ` Daniel Engel
2021-01-21 10:29                               ` Christophe Lyon
2021-01-21 20:35                                 ` Daniel Engel
2021-01-22 18:28                                   ` Christophe Lyon
2021-01-25 17:48                                     ` Christophe Lyon [this message]
2021-01-25 23:36                                       ` Daniel Engel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKdteOYptCFvha9X_aOZzDptRkmdEyuiGtpC-an3+sRX+UbYig@mail.gmail.com \
    --to=christophe.lyon@linaro.org \
    --cc=Richard.Earnshaw@foss.arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libgcc@danielengel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).