From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Richard.Earnshaw@foss.arm.com>
Subject: Re: Fast multipliers on ARM
To: Michael Robins <mike.robins@talktalk.net>, gcc-help@gcc.gnu.org
References: <cc80a339-f9b6-1bcf-efc1-0495248cb812@talktalk.net>
From: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>
Message-ID: <e73641f2-77d5-97be-2a1d-628f74809502@foss.arm.com>
Date: Tue, 6 Apr 2021 12:23:23 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.7.1
MIME-Version: 1.0
In-Reply-To: <cc80a339-f9b6-1bcf-efc1-0495248cb812@talktalk.net>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit

On 02/04/2021 23:27, Michael Robins via Gcc-help wrote:
> I am cross-compiling using "arm-none-eabi-gcc -mcpu=cortex-m0plus -O3" 
> for a target architecture that performs a multiply in a single cycle, 
> using gcc version 10.2.0 on a PC running Fedora Linux.
> 
> Is there an option to persuade the compiler to use the multiply 
> instruction automatically instead of shifts and adds when multiplying by 
> a constant?
> 
> In the example code below, gcc uses the trick of multiplying by a big 
> number instead of dividing by a small one (12 in this case). For my 
> target, the code from "-O3" is both longer and slower than that for "-Os".
> 
> foobar.c:
> 
> typedef struct {int x[3];} threeInts;
> int foo(threeInts * p, threeInts * q)
> {
>      return p - q;
> }
> #pragma GCC push_options
> #pragma GCC optimize("-Os")
> int bar(threeInts * p, threeInts * q)
> {
>      return p - q;
> }
> #pragma GCC pop_options
> 
> 
> foobar.s:
> 
>      .cpu cortex-m0plus
>      .eabi_attribute 20, 1
>      .eabi_attribute 21, 1
>      .eabi_attribute 23, 3
>      .eabi_attribute 24, 1
>      .eabi_attribute 25, 1
>      .eabi_attribute 26, 1
>      .eabi_attribute 30, 2
>      .eabi_attribute 34, 0
>      .eabi_attribute 18, 4
>      .file    "foobar.c"
>      .text
>      .align    1
>      .p2align 2,,3
>      .global    foo
>      .arch armv6s-m
>      .syntax unified
>      .code    16
>      .thumb_func
>      .fpu softvfp
>      .type    foo, %function
> foo:
>      @ args = 0, pretend = 0, frame = 0
>      @ frame_needed = 0, uses_anonymous_args = 0
>      @ link register save eliminated.
>      subs    r1, r0, r1
>      asrs    r1, r1, #2
>      lsls    r3, r1, #2
>      adds    r3, r3, r1
>      lsls    r0, r3, #4
>      adds    r3, r3, r0
>      lsls    r0, r3, #8
>      adds    r3, r3, r0
>      lsls    r0, r3, #16
>      adds    r0, r3, r0
>      lsls    r0, r0, #1
>      adds    r0, r0, r1
>      @ sp needed
>      bx    lr
>      .size    foo, .-foo
>      .align    1
>      .global    bar
>      .syntax unified
>      .code    16
>      .thumb_func
>      .fpu softvfp
>      .type    bar, %function
> bar:
>      @ args = 0, pretend = 0, frame = 0
>      @ frame_needed = 0, uses_anonymous_args = 0
>      @ link register save eliminated.
>      subs    r0, r0, r1
>      ldr    r1, .L4
>      asrs    r0, r0, #2
>      muls    r0, r1
>      @ sp needed
>      bx    lr
> .L5:
>      .align    2
> .L4:
>      .word    -1431655765
>      .size    bar, .-bar
>      .ident    "GCC: (GNU) 10.2.0"
> 
> 
> 
> Kind regards
> 
> Mike Robins
> 
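
For anyone following the thread, here is a rough C sketch of what the two
listings above appear to compute, going by my reading of the assembly.
Because sizeof(threeInts) is 12 and the byte difference is an exact
multiple of 12, the division can be done as an arithmetic shift right by 2
followed by a multiply by 0xAAAAAAAB (the .word -1431655765 in bar), which
is the inverse of 3 modulo 2^32. The -O3 code in foo builds the same
constant multiply out of shifts and adds. The function names below are
made up for illustration. The sketch also assumes that >> on a negative
int32_t is an arithmetic shift and that unsigned-to-signed conversion
wraps, which is GCC's documented behaviour but not guaranteed by ISO C.

```c
#include <assert.h>
#include <stdint.h>

/* Inverse of 3 modulo 2^32: 3 * 0xAAAAAAAB == 0x200000001 == 1 (mod 2^32). */
#define INV3 0xAAAAAAABu

/* Mirrors bar (-Os): shift right by 2, then a single multiply. */
int32_t diff_mul(int32_t byte_diff)
{
    assert(byte_diff % 12 == 0);       /* only valid for exact multiples of 12 */
    int32_t q4 = byte_diff >> 2;       /* asrs: arithmetic shift assumed */
    return (int32_t)((uint32_t)q4 * INV3);
}

/* Mirrors foo (-O3): the same multiply, synthesised from shifts and adds. */
int32_t diff_shift_add(int32_t byte_diff)
{
    assert(byte_diff % 12 == 0);
    uint32_t x = (uint32_t)(byte_diff >> 2);
    uint32_t t = (x << 2) + x;         /* 5 * x          */
    t += t << 4;                       /* 0x55 * x       */
    t += t << 8;                       /* 0x5555 * x     */
    t += t << 16;                      /* 0x55555555 * x */
    return (int32_t)((t << 1) + x);    /* 0xAAAAAAAB * x */
}
```

Under those assumptions both functions give 2 for a byte difference of 24
and -3 for a byte difference of -36, so the two sequences differ only in
how the multiply by the constant is carried out.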

Could you raise a report in GCC bugzilla, with your testcase attached, 
please?

R.