From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-help-return-54509-listarch-gcc-help=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 28154 invoked by alias); 6 Dec 2013 13:52:24 -0000
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-help.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-help/>
List-Post: <mailto:gcc-help@gcc.gnu.org>
List-Help: <mailto:gcc-help-help@gcc.gnu.org>
Sender: gcc-help-owner@gcc.gnu.org
Received: (qmail 28134 invoked by uid 89); 6 Dec 2013 13:52:23 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-Spam-User: qpsmtpd, 2 recipients
X-HELO: mail-qa0-f53.google.com
Received: from Unknown (HELO mail-qa0-f53.google.com) (209.85.216.53) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Fri, 06 Dec 2013 13:52:22 +0000
Received: by mail-qa0-f53.google.com with SMTP id j5so556353qaq.5        for <multiple recipients>; Fri, 06 Dec 2013 05:52:14 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.49.48.69 with SMTP id j5mr58751951qen.71.1386337934009; Fri, 06 Dec 2013 05:52:14 -0800 (PST)
Received: by 10.229.183.6 with HTTP; Fri, 6 Dec 2013 05:52:13 -0800 (PST)
In-Reply-To: <CAFiYyc30jVTphXoOeSnf9DbqpMg=C3069ZGWaBFVj1yAzbLMqg@mail.gmail.com>
References: <CADn89gRZPDo1Z4gvime-PTC9aaeO6G5jgbN+0hOSZrnD8M1vtw@mail.gmail.com>	<CAFiYyc0kZGi5XgXikvhqH5kHCXgBc=HtrB5OXHrqv2Z+HdNmhg@mail.gmail.com>	<CADn89gSK0P4qXfGTNddW43XnP4KzThwNTQOBBNFBP9fO=raV3g@mail.gmail.com>	<CAFiYyc30jVTphXoOeSnf9DbqpMg=C3069ZGWaBFVj1yAzbLMqg@mail.gmail.com>
Date: Fri, 06 Dec 2013 13:52:00 -0000
Message-ID: <CADn89gRQmp_yTmW+8aWpWMxZckV0GAVUFgZZ8W1-RSki4QqiBA@mail.gmail.com>
Subject: Re: x86 gcc lacks simple optimization
From: Konstantin Vladimirov <konstantin.vladimirov@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Development <gcc@gcc.gnu.org>, GCC-help <gcc-help@gcc.gnu.org>
Content-Type: text/plain; charset=UTF-8
X-IsSubscribed: yes
X-SW-Source: 2013-12/txt/msg00046.txt.bz2

Hi,

Richard, I tried adding an LSHIFT_EXPR case to tree-scalar-evolution.c,
and it now yields code like this (x86 again):

.L5:
movzbl 4(%esi,%eax,4), %edx
movb %dl, 4(%ebx,%eax,4)
addl $1, %eax
cmpl %ecx, %eax
jne .L5

So the redundant lea is gone. That is great, thank you so much. But I
wonder what else I can do to move the add up, ahead of the memory
accesses, so that the addressing gets simpler (I am guessing this needs
some arithmetic re-association, but I am still not sure where to look).
This matters for the architecture I am working on. What would you advise?
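
For concreteness, the re-association in question can be sketched at the
source level. This is a hand-written illustration, not compiler output:
foo_shift_add, foo_reassoc and loops_agree are names invented for this
example, and the equivalence relies on unsigned arithmetic wrapping, so
that (i << 2) + 4 and (i + 1) << 2 always agree.

```c
#include <assert.h>
#include <string.h>

/* Loop as written in the thread: shift first, then add 4 in each address. */
void foo_shift_add(const unsigned char *t, unsigned char *v, unsigned w)
{
    unsigned i;

    for (i = 1; i != w; ++i) {
        unsigned x = i << 2;
        v[x + 4] = t[x + 4];
    }
}

/* Hand re-associated form: (i << 2) + 4 == (i + 1) << 2 for unsigned i,
   so counting j = i + 1 leaves a bare scaled index in each address. */
void foo_reassoc(const unsigned char *t, unsigned char *v, unsigned w)
{
    unsigned j;

    for (j = 2; j != w + 1; ++j)
        v[j << 2] = t[j << 2];
}

/* Hypothetical check: both loops must write the same bytes.
   Requires 1 <= w <= 16 so every index stays inside the buffers. */
int loops_agree(unsigned w)
{
    unsigned char src[4 * 16 + 8], a[sizeof src], b[sizeof src];
    unsigned k;

    assert(w >= 1 && w <= 16);
    for (k = 0; k < sizeof src; ++k)
        src[k] = (unsigned char)(k * 7 + 3);
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);

    foo_shift_add(src, a, w);
    foo_reassoc(src, b, w);
    return memcmp(a, b, sizeof a) == 0;
}
```

With w = 16 both variants touch offsets 8, 12, ..., 64. Whether the
compiler can be made to perform this rewrite by itself is the open
question here.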

---
With best regards, Konstantin

On Fri, Dec 6, 2013 at 2:25 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 11:19 AM, Konstantin Vladimirov
> <konstantin.vladimirov@gmail.com> wrote:
>> Hi,
>>
>> Nothing changes if everything is unsigned, so that overflow is
>> guaranteed not to raise UB:
>>
>> unsigned foo(unsigned char *t, unsigned char *v, unsigned w)
>> {
>>   unsigned i;
>>
>>   for (i = 1; i != w; ++i)
>>     {
>>       unsigned x = i << 2;
>>       v[x + 4] = t[x + 4];
>>     }
>>
>>   return 0;
>> }
>>
>> yields:
>>
>> .L5:
>> leal 0(,%eax,4), %edx
>> addl $1, %eax
>> movzbl 4(%edi,%edx), %ecx
>> cmpl %ebx, %eax
>> movb %cl, 4(%esi,%edx)
>> jne .L5
>>
>> What is the SCEV infrastructure (scalar evolutions, I guess?), and
>> which files/passes should I look in?
>
> tree-scalar-evolution.c, look at where it handles MULT_EXPR but
> lacks LSHIFT_EXPR support.
>
> Richard.
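
To illustrate the pointer above: for an affine induction variable, a
left shift by a constant behaves like a multiplication by a power of
two, so the evolution {base, +, step} shifted by k becomes
{base << k, +, step << k}. The sketch below is a toy model with invented
names (toy_chrec, toy_eval, toy_mult, toy_lshift); it is not GCC's
internal API and only assumes unsigned wrap-around arithmetic.

```c
#include <assert.h>
#include <limits.h>

/* Toy affine recurrence {base, +, step}: the value at iteration n is
   base + n * step, computed with unsigned wrap-around. */
struct toy_chrec {
    unsigned base;
    unsigned step;
};

/* Value of the recurrence at iteration n. */
unsigned toy_eval(struct toy_chrec c, unsigned n)
{
    return c.base + n * c.step;
}

/* Multiplying by a constant scales both base and step. */
struct toy_chrec toy_mult(struct toy_chrec c, unsigned factor)
{
    struct toy_chrec r = { c.base * factor, c.step * factor };
    return r;
}

/* A left shift by k is treated as a multiplication by 2^k.
   k must be smaller than the bit width of unsigned. */
struct toy_chrec toy_lshift(struct toy_chrec c, unsigned k)
{
    assert(k < sizeof(unsigned) * CHAR_BIT);
    return toy_mult(c, 1u << k);
}
```

For the loop in this thread, i is {1, +, 1}, so in this model i << 2
becomes {4, +, 4}.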
>
>> ---
>> With best regards, Konstantin
>>
>> On Fri, Dec 6, 2013 at 2:10 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Fri, Dec 6, 2013 at 9:30 AM, Konstantin Vladimirov
>>> <konstantin.vladimirov@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> Consider this code:
>>>>
>>>> int foo(char *t, char *v, int w)
>>>> {
>>>>   int i;
>>>>
>>>>   for (i = 1; i != w; ++i)
>>>>     {
>>>>       int x = i << 2;
>>>>       v[x + 4] = t[x + 4];
>>>>     }
>>>>
>>>>   return 0;
>>>> }
>>>>
>>>> Compile it for x86 (I used both gcc 4.7.2 and gcc 4.8.1) with these options:
>>>>
>>>> gcc -O2 -m32 -S test.c
>>>>
>>>> You will see a loop like this:
>>>>
>>>> .L5:
>>>> leal 0(,%eax,4), %edx
>>>> addl $1, %eax
>>>> movzbl 4(%edi,%edx), %ecx
>>>> cmpl %ebx, %eax
>>>> movb %cl, 4(%esi,%edx)
>>>> jne .L5
>>>>
>>>> But it can be easily simplified to something like this:
>>>>
>>>> .L5:
>>>> addl $1, %eax
>>>> movzbl (%esi,%eax,4), %edx
>>>> cmpl %ecx, %eax
>>>> movb %dl, (%ebx,%eax,4)
>>>> jne .L5
>>>>
>>>> (i.e. the left shift can be folded into the address).
>>>>
>>>> First, a question for the gcc-help mailing list: maybe there are some
>>>> options that I've missed, and there IS a way to tell gcc that this is
>>>> what I want?
>>>>
>>>> Second, a question for the gcc developers mailing list: I am working on
>>>> a private backend and want to add this optimization to it. What do you
>>>> advise me to do: a custom gimple pass, an rtl pass, modifying some
>>>> existing pass, etc.?
>>>
>>> This looks like a deficiency in induction variable optimization.  Note
>>> that i << 2 may overflow and this overflow does not invoke undefined
>>> behavior but is in the implementation defined behavior category.
>>>
>>> The issue in this case is likely that the SCEV infrastructure does not handle
>>> left-shifts.
>>>
>>> Richard.
>>>
>>>> ---
>>>> With best regards, Konstantin