From: Richard Earnshaw
To: Gabriele Favalessa, gcc-help@gcc.gnu.org
Subject: Re: ARM: code size increase starting from gcc 10
Date: Fri, 11 Mar 2022 15:20:25 +0000
Message-ID: <43782800-80df-c4ce-9379-c4bb3c6c60a4@foss.arm.com>

On 11/03/2022 09:57, Gabriele Favalessa via Gcc-help wrote:
> Hi,
>
> up to gcc 9 this function
>
> #include <stdint.h>
> #include <stdbool.h>
>
> bool f() {
>     return *(volatile uint32_t*)0x42143fa8 == 0;
> }
>
> compiles (arm-none-eabi-gcc -mcpu=cortex-m4 -Os) to:
>
>    0: 4b02        ldr   r3, [pc, #8]    ; (c )
>    2: 6818        ldr   r0, [r3, #0]
>    4: fab0 f080   clz   r0, r0
>    8: 0940        lsrs  r0, r0, #5
>    a: 4770        bx    lr
>    c: 42143fa8    .word 0x42143fa8
>
> Starting with gcc 10 it compiles to:
>
>    0: 4b03        ldr   r3, [pc, #12]   ; (10 )
>    2: f8d3 0fa8   ldr.w r0, [r3, #4008] ; 0xfa8
>    6: fab0 f080   clz   r0, r0
>    a: 0940        lsrs  r0, r0, #5
>    c: 4770        bx    lr
>    e: bf00        nop
>   10: 42143000    .word 0x42143000
>
> Questions:
>
> 1) why newer gcc versions don't generate the smallest possible size in
> spite of -Os?

The compiler is trying to identify opportunities to generate even better
code for more common cases. For example, if your testcase is changed to:

int f()
{
  return (*(volatile unsigned*)0x42143fa8
          + *(volatile unsigned*)0x42143e00) == 0;
}

Then we see:

        ldr     r3, .L2
        ldr     r2, [r3, #4008]
        ldr     r3, [r3, #3584]
        cmn     r2, r3
        ite     eq
        moveq   r0, #1
        movne   r0, #0
        bx      lr
.L3:
        .align  2
.L2:
        .word   1108619264

being generated, which is clearly better than loading two completely
different constants from the literal pool to use as bases:

(gcc-9):

        ldr     r3, .L2
        ldr     r2, .L2+4
        ldr     r3, [r3]
        ldr     r2, [r2]
        cmn     r3, r2
        ite     eq
        moveq   r0, #1
        movne   r0, #0
        bx      lr
.L3:
        .align  2
.L2:
        .word   1108623272
        .word   1108622848

Unfortunately, the code that does this has limited visibility of what
other operations may be accessing nearby memory, so it is not able to
work out the optimal choice for every case.

> 2) is there a way to get the smaller code with newer gcc versions?

Unfortunately, no. At least not at present.

R.

>
> Thanks
>
> Gabriele
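
The base-plus-offset split in the listings above can be checked with a little arithmetic. The following is only a sketch: it assumes the split is a plain mask of the low 12 bits, which matches the constants in this thread (and the 0..4095 immediate range of the Thumb-2 ldr.w encoding) but is not confirmed here as the logic GCC actually uses. The helper names split_base, split_offset, share_base and check_thread_examples are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the immediate offset is 12 bits wide (0..4095), as in the
   ldr.w r0, [r3, #4008] instruction shown in the gcc 10 listing. */
#define OFFSET_BITS 12u
#define OFFSET_MASK ((UINT32_C(1) << OFFSET_BITS) - 1u)

/* Hypothetical helper: the part of the address kept in the literal pool. */
static uint32_t split_base(uint32_t addr)
{
    return addr & ~OFFSET_MASK;
}

/* Hypothetical helper: the part folded into the load's immediate field. */
static uint32_t split_offset(uint32_t addr)
{
    return addr & OFFSET_MASK;
}

/* Two accesses can reuse one base register when their bases are equal. */
static bool share_base(uint32_t a, uint32_t b)
{
    return split_base(a) == split_base(b);
}

/* Checks the constants quoted in the thread against the split above. */
static void check_thread_examples(void)
{
    assert(split_base(UINT32_C(0x42143fa8)) == UINT32_C(1108619264));
    assert(split_offset(UINT32_C(0x42143fa8)) == 4008u);
    assert(split_offset(UINT32_C(0x42143e00)) == 3584u);
    assert(share_base(UINT32_C(0x42143fa8), UINT32_C(0x42143e00)));
}
```

Calling check_thread_examples() from a main() passes silently. With two accesses the shared base saves a literal-pool word and a load; with a single access, as in the original testcase, the listings show the cost instead: the 32-bit ldr.w replaces a 16-bit ldr, and a nop is added to align the literal pool.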