From: "H.J. Lu"
To: Uros Bizjak
Cc: Ulrich Weigand, gcc-patches@gcc.gnu.org, GCC Development
Date: Mon, 08 Aug 2011 17:14:00 -0000
Subject: Re: [RFC PATCH, i386]: Allow zero_extended addresses (+ problems with reload and offsetable address, "o" constraint)

On Mon, Aug 8, 2011 at 10:11 AM, Uros Bizjak wrote:
> On Mon, Aug 8, 2011 at 5:30 PM, Ulrich Weigand wrote:
>> Uros Bizjak wrote:
>>
>>> Although it would be nice for reload to subsequently fix CSE'd
>>> non-offsettable memory by copying the address to a temporary reg (*as
>>> said in the documentation*), we could simply require an XMM temporary
>>> for TImode reloads to/from integer registers, and this fixes the ICE
>>> for x32.
>>
>> Moves are special as far as reload is concerned.  If there is already
>> a move instruction present *before* reload, it will get fixed up
>> according to its constraints like any other instruction.
>>
>> However, reload will *introduce* new moves as part of its operation,
>> and those will *not* themselves get reloaded.  Instead, reload simply
>> assumes that every plain move will just succeed without requiring
>> any reload; if this is not true, the target *must* provide a
>> secondary reload for this move.
>>
>> (Note that the secondary reload could also work by reloading the
>> target address into a temporary; that's up to the target to
>> implement.)
>
> Whoa, indeed.
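[For reference: the mechanism Ulrich describes is the documented
TARGET_SECONDARY_RELOAD target hook.  Below is a minimal sketch of the
XMM-temporary variant mentioned above, with an assumed, simplified
trigger condition; it is an illustration only, not the actual i386.c
implementation and not the patch attached to this thread.]

/* Sketch only: request an SSE register as intermediate when reload
   wants a TImode move between GENERAL_REGS and a memory operand whose
   address is not offsettable.  The condition is an assumed
   simplification of the real one.  */

static reg_class_t
ix86_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
                       reg_class_t rclass, enum machine_mode mode,
                       struct secondary_reload_info *sri ATTRIBUTE_UNUSED)
{
  if (mode == TImode
      && rclass == GENERAL_REGS
      && MEM_P (x)
      && !offsettable_memref_p (x))
    return SSE_REGS;

  return NO_REGS;
}

#undef TARGET_SECONDARY_RELOAD
#define TARGET_SECONDARY_RELOAD ix86_secondary_reload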
>
> Using the attached patch, which reloads the memory address instead of
> going through an XMM register, the code for the testcase improves from:
>
> test:
> .LFB0:
>         .cfi_startproc
>         pushq   %rbx
>         .cfi_def_cfa_offset 16
>         .cfi_offset 3, -16
>         sall    $4, %esi
>         addl    %edi, %esi
>         movdqa  (%esi), %xmm0
>         movdqa  %xmm0, -16(%rsp)
>         movq    -16(%rsp), %rcx
>         movq    -8(%rsp), %rbx
>         addq    $1, %rcx
>         adcq    $0, %rbx
>         movq    %rcx, -16(%rsp)
>         sall    $4, %edx
>         movq    %rbx, -8(%rsp)
>         movdqa  -16(%rsp), %xmm0
>         movdqa  %xmm0, (%esi)
>         pxor    %xmm0, %xmm0
>         movdqa  %xmm0, (%edx,%esi)
>         popq    %rbx
>         .cfi_def_cfa_offset 8
>         ret
>
> to:
>
> test:
> .LFB0:
>         .cfi_startproc
>         sall    $4, %esi
>         pushq   %rbx
>         .cfi_def_cfa_offset 16
>         .cfi_offset 3, -16
>         addl    %edi, %esi
>         pxor    %xmm0, %xmm0
>         mov     %esi, %eax
>         movq    (%rax), %rcx
>         movq    8(%rax), %rbx
>         addq    $1, %rcx
>         adcq    $0, %rbx
>         sall    $4, %edx
>         movq    %rcx, (%rax)
>         movq    %rbx, 8(%rax)
>         movdqa  %xmm0, (%edx,%esi)
>         popq    %rbx
>         .cfi_def_cfa_offset 8
>         ret
>
> H.J., can you please test the attached patch?  This optimization won't
> trigger on x86_64 anymore.
>

I will test it.  Thanks.

--
H.J.
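[The testcase and the patch were attachments to the original messages
and are not part of the archived text.  For illustration, a
hypothetical reduction of roughly the shape the assembly above implies
(all names here are invented; a 16-byte struct carrying a 128-bit
field, built with something like -O2 -msse2 for the x32 target) would
be:]

/* Hypothetical testcase, reconstructed from the assembly above; the
   real attachment is not reproduced here.  */

typedef unsigned int uv4si __attribute__ ((vector_size (16)));

struct s
{
  unsigned __int128 i;  /* 16 bytes: gives the sall $4 index scaling.  */
};

void
test (struct s *p, int x, int y)
{
  p[x].i += 1;                                    /* addq $1 / adcq $0 */
  *(uv4si *) &p[x + y] = (uv4si) { 0, 0, 0, 0 };  /* pxor + movdqa */
}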