From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-459503-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 115455 invoked by alias); 1 Aug 2017 11:47:18 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 115418 invoked by uid 89); 1 Aug 2017 11:47:16 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=no version=3.3.2 spammy=Ba, 1h, H*RU:209.85.213.67, b.a
X-HELO: mail-vk0-f67.google.com
Received: from mail-vk0-f67.google.com (HELO mail-vk0-f67.google.com) (209.85.213.67) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 01 Aug 2017 11:47:15 +0000
Received: by mail-vk0-f67.google.com with SMTP id d124so608248vkf.1        for <gcc-patches@gcc.gnu.org>; Tue, 01 Aug 2017 04:47:15 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=1e100.net; s=20161025;        h=x-gm-message-state:mime-version:in-reply-to:references:from:date         :message-id:subject:to:cc;        bh=zkWkQav2FkSETf1kBZ1PM4X3j27/jfDPlqvd/9+stuo=;        b=T7QnlPJLBMVEyXWgRnn1A0+ziGvTiqxe7ynGORRXH/VxuiEiAfIGSre0DwqpkERqsk         8ktznjWDztFFVf6Z/Avyn4hf9K+yu1TTHe/nud87TmVF49gF2opomTWYvqc2TtgmoqmP         tFIHtu4MSWMOB7wMZQdcSQEYCkMug4Kitvp/4kwtf1JD9zfWlBwl+td5ZEq/y2xsXRGI         gjtcGVVucZf/a8hj2v0FcnOyZgWlZsD4GBJXeU3F12gTLpve0Oe4CIzDYgPZmV5NahUF         7zkcVwwUCWv5S0Nt8+EuNbhKtB5M7wDPI3DunXzpYNujb5VeizjGSEXgPdtXO7khRWYj         XgBw==
X-Gm-Message-State: AIVw1131W4W+xXyGjhNN4yJ+PZ3fHFtuVdeXNgLZ49aKFXqTsVxY8x1Y	+frzbYW85Aag01LRAzD2phIhagyghSoC
X-Received: by 10.31.115.3 with SMTP id o3mr10206718vkc.87.1501588033546; Tue, 01 Aug 2017 04:47:13 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.103.34.5 with HTTP; Tue, 1 Aug 2017 04:47:13 -0700 (PDT)
In-Reply-To: <VI1PR0801MB20314F94AA32F6209F57D06CFFDF0@VI1PR0801MB2031.eurprd08.prod.outlook.com>
References: <VI1PR0801MB203117D353E8A26AA9909037FFC80@VI1PR0801MB2031.eurprd08.prod.outlook.com> <20170614084233.GA15599@arm.com> <VI1PR0801MB2031791FA5E8C0CF053CB699FFDA0@VI1PR0801MB2031.eurprd08.prod.outlook.com> <VI1PR0801MB20314F94AA32F6209F57D06CFFDF0@VI1PR0801MB2031.eurprd08.prod.outlook.com>
From: "Bin.Cheng" <amker.cheng@gmail.com>
Date: Tue, 01 Aug 2017 11:47:00 -0000
Message-ID: <CAHFci28Kxt=VeFSZHd896eAbp+mJMiN7fZcbDe_P8CCOfNeVfQ@mail.gmail.com>
Subject: Re: [PATCH][GCC][AArch64] optimize float immediate moves (2 /4) - HF/DF/SF mode.
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: James Greenhalgh <James.Greenhalgh@arm.com>, GCC Patches <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, 	Marcus Shawcroft <Marcus.Shawcroft@arm.com>, Richard Earnshaw <Richard.Earnshaw@arm.com>
Content-Type: text/plain; charset="UTF-8"
X-IsSubscribed: yes
X-SW-Source: 2017-08/txt/msg00066.txt.bz2

On Mon, Jun 26, 2017 at 11:50 AM, Tamar Christina
<Tamar.Christina@arm.com> wrote:
> Hi all,
>
> Here's the re-spun patch.
> Aside from the grouping of the split patterns it now also uses h register for the fmov for HF when available,
> otherwise it forces a literal load.
>
> Regression tested on aarch64-none-linux-gnu and no regressions.
Hi,
There are lots of test failures on aarch64_be-none-elf. I verified that two of them:
gcc.dg/vect/pr61680.c execution test
gcc.dg/vect/pr63148.c execution test

are caused by svn+ssh://gcc.gnu.org/svn/gcc/trunk@250673

Given that the review comments already pointed out a big-endian issue and
the patch was updated to address it, I would expect a regression test on a
big-endian target before applying the patch, right?

Thanks,
bin
>
> OK for trunk?
>
> Thanks,
> Tamar
>
>
> gcc/
> 2017-06-26  Tamar Christina  <tamar.christina@arm.com>
>             Richard Sandiford <richard.sandiford@linaro.org>
>
>         * config/aarch64/aarch64.md (mov<mode>): Generalize.
>         (*movhf_aarch64, *movsf_aarch64, *movdf_aarch64):
>         Add integer and movi cases.
>         (movi-split-hf-df-sf split, fp16): New.
>         (enabled): Added TARGET_FP_F16INST.
>         * config/aarch64/iterators.md (GPF_HF): New.
> ________________________________________
> From: Tamar Christina
> Sent: Wednesday, June 21, 2017 11:48:33 AM
> To: James Greenhalgh
> Cc: GCC Patches; nd; Marcus Shawcroft; Richard Earnshaw
> Subject: RE: [PATCH][GCC][AArch64] optimize float immediate moves (2 /4) - HF/DF/SF mode.
>
>> >     movi\\t%0.4h, #0
>> > -   mov\\t%0.h[0], %w1
>> > +   fmov\\t%s0, %w1
>>
>> Should this not be %h0?
>
> The problem is that H registers are only available in ARMv8.2+.
> I'm not sure what to do about ARMv8.1 given your other feedback
> pointing out that the bit patterns differ between how the value is stored
> in s vs h registers.
>
>>
>> >     umov\\t%w0, %1.h[0]
>> >     mov\\t%0.h[0], %1.h[0]
>> > +   fmov\\t%s0, %1
>>
>> Likewise, and much more important for correctness as it changes the way the
>> bit pattern ends up in the register (see table C2-1 in release B.a of the ARM
>> Architecture Reference Manual for ARMv8-A), here.
>>
>> > +   * return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);
>> >     ldr\\t%h0, %1
>> >     str\\t%h1, %0
>> >     ldrh\\t%w0, %1
>> >     strh\\t%w1, %0
>> >     mov\\t%w0, %w1"
>> > -  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
>> > -                     f_loads,f_stores,load1,store1,mov_reg")
>> > -   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")]
>> > +  "&& can_create_pseudo_p ()
>> > +   && !aarch64_can_const_movi_rtx_p (operands[1], HFmode)
>> > +   && !aarch64_float_const_representable_p (operands[1])
>> > +   &&  aarch64_float_const_rtx_p (operands[1])"
>> > +  [(const_int 0)]
>> > +  "{
>> > +    unsigned HOST_WIDE_INT ival;
>> > +    if (!aarch64_reinterpret_float_as_int (operands[1], &ival))
>> > +      FAIL;
>> > +
>> > +    rtx tmp = gen_reg_rtx (SImode);
>> > +    aarch64_expand_mov_immediate (tmp, GEN_INT (ival));
>> > +    tmp = simplify_gen_subreg (HImode, tmp, SImode, 0);
>> > +    emit_move_insn (operands[0], gen_lowpart (HFmode, tmp));
>> > +    DONE;
>> > +  }"
>> > +  [(set_attr "type" "neon_move,f_mcr,neon_to_gp,neon_move,fconsts, \
>> > +                neon_move,f_loads,f_stores,load1,store1,mov_reg")
>> > +   (set_attr "simd" "yes,*,yes,yes,*,yes,*,*,*,*,*")]
>> >  )
>>
>> Thanks,
>> James
>
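
Note on the quoted splitter: it moves an FP constant by reinterpreting it
as an integer (aarch64_reinterpret_float_as_int) and then taking an HImode
subreg of the SImode temporary at byte offset 0. The C sketch below is
illustrative only and is not GCC code. float_bits and lowpart_byte_offset
are hypothetical helpers, and the sketch assumes that float is a 32-bit
IEEE 754 type. It shows the bit-pattern reinterpretation and, assuming
subreg byte offsets count in memory order (as the GCC RTL documentation
describes), why a fixed offset of 0 is endian-sensitive. Whether this is
the cause of the failures reported above is not established here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: return the bit pattern of a
   single-precision value as an integer.  Assumes float is a 32-bit
   IEEE 754 type.  memcpy avoids the aliasing problems of a pointer cast.  */
static uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Hypothetical helper, not part of GCC: byte offset, in memory order, of
   the least significant OUTER_SIZE bytes inside an INNER_SIZE-byte value.
   On little-endian the low part starts at offset 0; on big-endian it is
   the last OUTER_SIZE bytes.  */
static unsigned
lowpart_byte_offset (unsigned outer_size, unsigned inner_size, int big_endian)
{
  return big_endian ? inner_size - outer_size : 0;
}
```

For example, float_bits (1.0f) yields 0x3f800000, and the low 2 bytes of a
4-byte value are at offset 0 on little-endian but at offset 2 on big-endian.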