From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x333.google.com (mail-ot1-x333.google.com [IPv6:2607:f8b0:4864:20::333]) by sourceware.org (Postfix) with ESMTPS id 58CF33861030 for ; Fri, 22 Jan 2021 18:28:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 58CF33861030 Received: by mail-ot1-x333.google.com with SMTP id 63so6024668oty.0 for ; Fri, 22 Jan 2021 10:28:18 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Y/jHlbHbUVE3uoKAZG4eE2pBb2tlGvQWsNVhNIWL3gI=; b=bCQCG4/uxBKBvDDpBkf5LvaTEU9qr2JVezMCmsQMOd8qzrpUAuZEIs7fFg/QKaqHVJ dA3uT7B3SsRP4Zzyq3UPHffi/BdBFXuJUK73Jp1rVg1He6bvDi/0WKoLdlykG7m/Ta2+ q7m+QZ7cF8PVZZz/Lxf6i7qr91bRobWvFyOu8WB1c3fXhJaNQd7GDSWPSGvdE+KRfcyu XW7fj7v8e3UK0k+hEOS9DDOrszL8ZQw54Kccb4L9Nx0m9yTAby7Cz925qztwFL9PsROq okGi0EjpjIo0Aa1ndX5KiRNz5dOQkz+oVLPnmbABjNWZhAsPKJP6fAIFZz29G57uJW9W guIw== X-Gm-Message-State: AOAM530XivYz7bjGzpJCFll5d/fOJv9vJzdCRzxKH8reRYli/lU14MdT rcCiqk0TX34G8qcCjiizQeVvFBMIdqrJEqb9yuBQC6jxyGWcEA== X-Google-Smtp-Source: ABdhPJzvrcfQz67FP7LNwGTu6B1Ynx3B+GjoYmPRXefIdT3WlW/3RSCgEGpbIO+OQ0Cj3eTJnNIthbnL5Lm1ntrLWvo= X-Received: by 2002:a05:6830:1081:: with SMTP id y1mr4220391oto.73.1611340097272; Fri, 22 Jan 2021 10:28:17 -0800 (PST) MIME-Version: 1.0 References: <3b36bc72-e92e-4372-8da4-43ade34d868b@www.fastmail.com> <962a0e7d-f431-42ee-aa42-e4e4cc823a10@www.fastmail.com> <35908b07-8822-e415-2abc-e04b8c9a692d@foss.arm.com> <7e0f7bb7-c2c4-4c77-8dc2-5f5c20a7bcec@www.fastmail.com> <3ba1767e-26f3-96d2-e5c7-987ca0a10f1d@foss.arm.com> <17c601ba-60e4-446c-b7b4-b1674d8500e5@www.fastmail.com> <85294ea5-1e59-4e7c-874b-97a62dde56b5@www.fastmail.com> <5daa8e98-3324-4a1d-9d0a-08f5d3971bc8@www.fastmail.com> <46598590-9bbe-4183-8f09-9beb964ffa31@www.fastmail.com> In-Reply-To: From: Christophe Lyon Date: Fri, 22 Jan 2021 19:28:06 +0100 Message-ID: Subject: Re: [PATCH v3] libgcc: Thumb-1 Floating-Point Library for Cortex M0 To: Daniel Engel Cc: Richard Earnshaw , gcc Patches Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-5.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jan 2021 18:28:20 -0000 On Thu, 21 Jan 2021 at 21:35, Daniel Engel wrote: > > Hi Christophe, > > On Thu, Jan 21, 2021, at 2:29 AM, Christophe Lyon wrote: > > On Sat, 16 Jan 2021 at 17:13, Daniel Engel wrote: > > > > > > Hi Christophe, > > > > > > On Fri, Jan 15, 2021, at 4:30 AM, Christophe Lyon wrote: > > > > On Fri, 15 Jan 2021 at 12:39, Daniel Engel wrote: > > > > > > > > > > Hi Christophe, > > > > > > > > > > On Mon, Jan 11, 2021, at 8:39 AM, Christophe Lyon wrote: > > > > > > On Mon, 11 Jan 2021 at 17:18, Daniel Engel wrote: > > > > > > > > > > > > > > On Mon, Jan 11, 2021, at 8:07 AM, Christophe Lyon wrote: > > > > > > > > On Sat, 9 Jan 2021 at 14:09, Christophe Lyon wrote: > > > > > > > > > > > > > > > > > > On Sat, 9 Jan 2021 at 13:27, Daniel Engel wrote: > > > > > > > > > > > > > > > > > > > > On Thu, Jan 7, 2021, at 4:56 AM, Richard Earnshaw wrote: > > > > > > > > > > > On 07/01/2021 00:59, Daniel Engel wrote: > > > > > > > > > > > > --snip-- > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jan 6, 2021, at 9:05 AM, Richard Earnshaw wrote: > > > > > > > > > > > > --snip-- > > > > > > > > > > > > > > > > > > > > > > > >> - finally, your popcount implementations have data in the code segment. > > > > > > > > > > > >> That's going to cause problems when we have compilation options such as > > > > > > > > > > > >> -mpure-code. > > > > > > > > > > > > > > > > > > > > > > > > I am just following the precedent of existing lib1funcs (e.g. __clz2si). > > > > > > > > > > > > If this matters, you'll need to point in the right direction for the > > > > > > > > > > > > fix. I'm not sure it does matter, since these functions are PIC anyway. > > > > > > > > > > > > > > > > > > > > > > That might be a bug in the clz implementations - Christophe: Any thoughts? > > > > > > > > > > > > > > > > > > > > __clzsi2() has test coverage in "gcc.c-torture/execute/builtin-bitops-1.c" > > > > > > > > > Thanks, I'll have a closer look at why I didn't see problems. > > > > > > > > > > > > > > > > > > > > > > > > > So, that's because the code goes to the .text section (as opposed to > > > > > > > > .text.noread) > > > > > > > > and does not have the PURECODE flag. The compiler takes care of this > > > > > > > > when generating code with -mpure-code. > > > > > > > > And the simulator does not complain because it only checks loads from > > > > > > > > the segment with the PURECODE flag set. > > > > > > > > > > > > > > > This is far out of my depth, but can something like: > > > > > > > > > > > > > > ifeq (,$(findstring __symbian__,$(shell $(gcc_compile_bare) -dM -E - > > > > > > > > > > > > > be adapted to: > > > > > > > > > > > > > > a) detect the state of the -mpure-code switch, and > > > > > > > b) pass that flag to the preprocessor? > > > > > > > > > > > > > > If so, I can probably fix both the target section and the data usage. > > > > > > > Just have to add a few instructions to finish unrolling the loop. > > > > > > > > > > > > I must confess I never checked libgcc's Makefile deeply before, > > > > > > but it looks like you can probably detect whether -mpure-code is > > > > > > part of $CFLAGS. > > > > > > > > > > > > However, it might be better to write pure-code-safe code > > > > > > unconditionally because the toolchain will probably not > > > > > > be rebuilt with -mpure-code as discussed before. > > > > > > Or that could mean adding a -mpure-code multilib.... > > > > > > > > > > I have learned a few things since the last update. I think I know how > > > > > to get -mpure-code out of CFLAGS and into a macro. However, I have hit > > > > > something of a wall with testing. I can't seem to compile any flavor of > > > > > libgcc with CFLAGS_FOR_TARGET="-mpure-code". > > > > > > > > > > 1. Configuring --with-multilib-list=rmprofile results in build failure: > > > > > > > > > > checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/libgcc': > > > > > configure: error: cannot compute suffix of object files: cannot compile > > > > > See `config.log' for more details > > > > > > > > > > cc1: error: -mpure-code only supports non-pic code on M-profile targets > > > > > > > > > > > > > Yes, I did hit that wall too :-) > > > > > > > > Hence what we discussed earlier: the toolchain is not rebuilt with -mpure-code. > > > > > > > > Note that there are problems in newlib too, but users of -mpure-code seem > > > > to be able to work around that (eg. using their own startup code and no stdlib) > > > > > > Is there a current side project to solve the makefile problems? > > None that I know of. > > > > > > > I think I'm back to my original question: If libgcc can't be built > > > with -mpure-code, and users bypass it completely with -nostdlib, then > > > why this conversation about pure-code compatibility of __clzsi2() etc? > > I think Richard noticed this pre-existing problem as part of the review > > of your patches. I don't think I meant fixing this is a prerequisite. > > But maybe I misunderstood :-) > > I might have misunderstood too then. It was certainly a pre-existing > problem, but I took the comments to mean that I had to own it as part of > touching those functions. > > > > > > 2. Attempting to filter the multib list results in configuration error. > > > > > This might have been misguided, but it was something I tried: > > > > > > > > > > Error: --with-multilib-list=armv6s-m not supported. > > > > > > > > > > Error: --with-multilib-list=mthumb/march=armv6s-m/mfloat-abi=soft not supported > > > > > > > > I think only 2 values are supported: aprofile and rmprofile. > > > > > > It looks like this might require a custom t-* multilib in gcc/config/arm. > > Or we could add -mpure-code to the rmprofile list. > > I have no strong opinions here. Are you proposing that the "m" versions > of libgcc be built with -mpure-code enabled by default, or are you > proposing a parallel set of multilibs? -mpure-code by default would > have costs in both size and speed. I'm not sure how large the penalty is for thumb-2 cores? Maybe it's acceptable to build thumb-2 with -mpure-code by default, but probably not for cortex-m0. > > > > > 3. Attempting to configure a single architecture results in a build error. > > > > > > > > > > --with-mode=thumb --with-arch=armv6s-m --with-float=soft > > > > > > > > > > checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc': > > > > > configure: error: cannot compute suffix of object files: cannot compile > > > > > See `config.log' for more details > > > > > > > > > > conftest.c:9:10: fatal error: ac_nonexistent.h: No such file or directory > > > > > 9 | #include > > > > > | ^~~~~~~~~~~~~~~~~~ > > > > I never saw that error message, but I never build using --with-arch. > > > > I do use --with-cpu though. > > > > > > > > > This has me wondering whether pure-code in libgcc is a real issue ... > > > > > If there's a way to build libgcc with -mpure-code, please enlighten me. > > > > I haven't done so yet. Maybe building the toolchain --with-cpu=cortex-m0 > > > > works? > > > > > > No luck with that. Same error message as before: > > > > > > 4. --with-mode=thumb --with-arch=armv6s-m --with-float=soft --with-cpu=cortex-m0 > > > > > > Switch "--with-arch" may not be used with switch "--with-cpu" > > > > > > 5. Then: --with-mode=thumb --with-float=soft --with-cpu=cortex-m0 > > > > > > checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc': > > > configure: error: cannot compute suffix of object files: cannot compile > > > See `config.log' for more details > > > > > > cc1: error: -mpure-code only supports non-pic code on M-profile targets > > Yes that's because default multilibs include targets incompatible with > > -mpure-code > > > > > 6. Finally! --with-float=soft --with-cpu=cortex-m0 --disable-multilib > > > > > > Once you know this, and read the docs sideways, the previous errors are > > > all probably "works as designed". But, I can still grumble. > > Yes, I think it's "as designed". I faced the "incompatible multilibs" issue > > too some time ago. Hence testing is not easy. > > > > > With libgcc compiled with -mpure-code, I can confirm that > > > 'builtin-bitops-1.c' (the test for __clzsi2) passed with libgcc as-is. > > > > > > I then added the SHF_ARM_PURECODE flag to the libgcc assembly functions > > > and re-ran the test. Still passed. I then added -mpure-code to > > > RUNTESTFLAGS and re-ran the test. Still passed. readelf confirmed that > > > the test program is compiling as expected [1]: > > > > > > [ 2] .text PROGBITS 0000800c 00800c 003314 00 AXy 0 0 4 > > > Key to Flags: > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > > L (link order), O (extra OS processing required), G (group), T (TLS), > > > C (compressed), x (unknown), o (OS specific), E (exclude), > > > y (purecode), p (processor specific) > > > > > > It was only when I started inserting pure-code test directives into > > > 'builtin-bitops-1.c' that 'make check' began to report errors. > > > > > > /* { dg-do compile } */ > > > ... > > > /* { dg-options "-mpure-code -mfp16-format=ieee" } */ > > > /* { dg-final { scan-assembler-not "\\.(float|l\\?double|\d?byte|short|int|long|quad|word)\\s+\[^.\]" } } */ > > > > > > However, for reasons [2] [3] [4] [5], this wasn't actually useful. It's > > > sufficient to say that there are many reasons that non-pure-code > > > compatible functions exist in libgcc. > > > > I filed a PR to better track this discussion: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98779 > > Possibly worth noting that my patch series addresses __clzsi2, __ctzsi2, > __aeabi_ldivmod, __aeabi_uldivmod, and most of __gnu_float2h_internal as > long as CFLAGS_FOR_TARGET contains -mpure-code. > Great. > > > > > > > > Although I'm not sure how useful this will be in light of the previous > > > findings, I did take the opportunity with a working compile process to > > > modify the relevant assembly functions for -mpure-code compatibility. > > > I can manually disassemble the library and verify correct compilation. > > > I can manually run a non-pure-code builtin-bitops-1 with a pure-code > > > library to verify correct execution. But, I don't think your standard > > > regression suite will be able to exercise the new paths. > > Thanks for doing that. With the SHF_ARM_PURECODE flag you > > added to clz, I think my simulator would catch problems. > > See my more detailed comments following. Unless you're also using a > custom linker script, I would have expected any simulator capable of > catching errors to have already caught them. Putting the > SHF_ARM_PURECODE flag on clz actually seems rather cosmetic. > Yes, I have a custom linker script with .text.noread : { INPUT_SECTION_FLAGS (SHF_ARM_PURECODE) *(.text*) } > purecode_memory > > > The patch is below; you can consider this as 34/33 in the series. > > > > > > Regards, > > > Daniel > > > > > > [1] It's pretty clear that the section flags in libgcc have never really > > > mattered. When the linker strings all of the used objects together, > > > the original sections disappear into a single output object. The > > > compiler controls those flags regardless of what libgcc does.) > > Not sure what you mean? The linker creates two segments, one > > with and one without the SHF_ARM_PURECODE flag. > > When libgcc is compiled "normally", individual objects in libgcc.a are > compiled _without_ SHF_ARM_PURECODE. Note line 4 below, with flags > "AX" only (no "y"). > > `readelf -S arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a` > > File: arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a(_clzsi2.o) > There are 19 section headers, starting at offset 0x43c: > > Section Headers: > [Nr] Name Type Addr Off Size ES Flg Lk Inf Al > [ 0] NULL 00000000 000000 000000 00 0 0 0 > [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 2 > [ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1 > [ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1 > ==> [ 4] .text.sorted[...] PROGBITS 00000000 000034 000034 00 AX 0 0 4 > ... > Key to Flags: > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > L (link order), O (extra OS processing required), G (group), T (TLS), > C (compressed), x (unknown), o (OS specific), E (exclude), > y (purecode), p (processor specific) > > When this "normal" libgcc is linked into an -mpure-code program > (e.g. with RUNTESTFLAGS), the linker script flattens all of the > sections together into a single output. The relevant portion of the > linker script: > > .text : > { > *(.text.unlikely .text.*_unlikely .text.unlikely.*) > *(.text.exit .text.exit.*) > *(.text.startup .text.startup.*) > *(.text.hot .text.hot.*) > *(SORT(.text.sorted.*)) // _clzsi2.o matches here > *(.text .stub .text.* .gnu.linkonce.t.*) // main.o matches here > /* .gnu.warning sections are handled specially by elf.em. */ > *(.gnu.warning) > *(.glue_7t) *(.glue_7) *(.vfp11_veneer) *(.v4_bx) > } > > I can't pretend to know how the linker merges conflicting flags from the > various input sections, but the final binary has the attributes > "AXy" as expected from the top level compile (line 2): > > `readelf -Sl builtin-bitops-1.exe` > > There are 22 section headers, starting at offset 0x10934: > > Section Headers: > [Nr] Name Type Addr Off Size ES Flg Lk Inf Al > [ 0] NULL 00000000 000000 000000 00 0 0 0 > [ 1] .init PROGBITS 00008000 008000 00000c 00 AX 0 0 4 > ==> [ 2] .text PROGBITS 0000800c 00800c 00455c 00 AXy 0 0 4 > [ 3] .fini PROGBITS 0000c568 00c568 00000c 00 AX 0 0 4 > [ 4] .rodata PROGBITS 0000c574 00c574 000050 00 A 0 0 4 > [ 5] .ARM.exidx ARM_EXIDX 0000c5c4 00c5c4 000008 00 AL 2 0 4 > [ 6] .eh_frame PROGBITS 0000c5cc 00c5cc 000124 00 A 0 0 4 > [ 7] .init_array INIT_ARRAY 0001c6f0 00c6f0 000004 04 WA 0 0 4 > [ 8] .fini_array FINI_ARRAY 0001c6f4 00c6f4 000004 04 WA 0 0 4 > [ 9] .data PROGBITS 0001c6f8 00c6f8 000a30 00 WA 0 0 8 > [10] .bss NOBITS 0001d128 00d128 000114 00 WA 0 0 4 > ... > Key to Flags: > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > L (link order), O (extra OS processing required), G (group), T (TLS), > C (compressed), x (unknown), o (OS specific), E (exclude), > y (purecode), p (processor specific) > > There are 3 program headers, starting at offset 52 > > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > EXIDX 0x00c5c4 0x0000c5c4 0x0000c5c4 0x00008 0x00008 R 0x4 > LOAD 0x000000 0x00000000 0x00000000 0x0c6f0 0x0c6f0 R E 0x10000 > LOAD 0x00c6f0 0x0001c6f0 0x0001c6f0 0x00a38 0x00b4c RW 0x10000 > > Section to Segment mapping: > Segment Sections... > 00 .ARM.exidx > ==> 01 .init .text .fini .rodata .ARM.exidx .eh_frame > 02 .init_array .fini_array .data .bss > > A segment contains complete sections, not the other way around. As you > can see above, the first LOAD segment contains the entire ".text" plus > some other sections. Thus, SHF_ARM_PURECODE flag really appears > to apply to all of "text", even though the bits linked in from libgcc > weren't built or flagged this way. > > > > > > [2] The existing pure-code tests are compile-only and cover just the > > > disassembled 'main.o'. There is no test of a complete executable > > > and there is no execution/simulation. > > That's something I did manually: run the full gcc testsuite, forcing -mpure-code > > in RUNTESTFLAGS. This way, all execution tests are compiled with -mpure-code, > > and this is how I found several bugs. > > > > > [3] While other parts of binutils may understand SHF_ARM_PURECODE, I > > > don't think the simulator checks section flags or throws exceptions. > > Indeed, I know of no public simulator that honors this flag. I do have > > one though. > > > > > [4] builtin-bitops-1 modified this way will always fail due to the array > > > data definitions (longs, longlongs, etc). GCC can't translate those > > > to instructions. While the ".data" section would presumably be > > > readable, scan-assembler-not doesn't know the difference. > > Sure, adding such scan-assembler-not is not suitable for any existing testcase. > > That's why it is only in place for testcases dedicated to -mpure-code. > > > > > [5] Even if the simulator were modified to throw exceptions, this will > > > continue to fail because _mainCRTStartup uses a literal pool. > > Yes, in general, and that's why I mentioned problems with newlib > > earlier in this thread. > > > > However, the simulator I use only throws an exception for code executed with > > SHF_ARM_PURECODE. Code in the "regular" code segment is not checked. > > So this does not catch errors in hand-written assembly code using regular .text, > > but it enables to run larger validations (such as the whole GCC testsuite) > > without having to fix all of newlib. > > Not perfect, as it left the issues in libgcc we are discussing, but it > > helped me fix > > several bugs in -mpure-code. > > Yet again I suspect that you have a custom linker script, or there's > some other major difference. Using the public releases of binutils, > newlib, etc, my experiences just aren't lining up with yours. Yep, the linker script makes the difference. > > > Thanks, > > > > Christophe > > > > > > > > > Thanks, > > > > > > > > Christophe > > > > > > > > --snip-- > > If the test server farm is free at some point, would you mind running > another set of regression tests on my v5 patch series? Sure. Given the number of sub-patches, can you send it to me as a single patch file (git format) that I can directly apply to GCC trunk? My mailer does not want to help with saving each patch as a proper patch file :-( Thanks Christophe > > Regards, > Daniel