From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-496456-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 72541 invoked by alias); 17 Feb 2019 15:57:34 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
MIME-Version: 1.0
References: <20190216224032.4889-1-hjl.tools@gmail.com> <CAFULd4btm_bc_u8b9zsf9rzr0odmXjXyR_6Oj07G2-T__EOuKQ@mail.gmail.com> <CAMe9rOpYwHBzzcxxrX+z=w+er-8M0sbhP8CuXtzCNx2Ct8S4sw@mail.gmail.com> <CAFULd4ZYuTSTV0+P7nYJPC782v=P41iLc7y918W2ZC6K_EnTyg@mail.gmail.com>
In-Reply-To: <CAFULd4ZYuTSTV0+P7nYJPC782v=P41iLc7y918W2ZC6K_EnTyg@mail.gmail.com>
From: Uros Bizjak <ubizjak@gmail.com>
Date: Sun, 17 Feb 2019 15:57:00 -0000
Message-ID: <CAFULd4ZMZFNDabxKw11Y4grMHRja5649G1oD4MvZfqhjQjW1-Q@mail.gmail.com>
Subject: Re: [PATCH 00/41] V8: Emulate MMX intrinsics with SSE
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset="UTF-8"
X-SW-Source: 2019-02/txt/msg01399.txt.bz2

On Sun, Feb 17, 2019 at 4:53 PM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > > > emulate MMX intrinsics with SSE instructions. To support it, we added
> > > >
> > > >  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> > > >
> > > > ;; Define instruction set of MMX instructions
> > > > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> > > >   (const_string "base"))
> > > >
> > > >          (eq_attr "mmx_isa" "native")
> > > >            (symbol_ref "!TARGET_MMX_WITH_SSE")
> > > >          (eq_attr "mmx_isa" "x64")
> > > >            (symbol_ref "TARGET_MMX_WITH_SSE")
> > > >          (eq_attr "mmx_isa" "x64_avx")
> > > >            (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> > > >          (eq_attr "mmx_isa" "x64_noavx")
> > > >            (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> > > >
> > > > We added SSE emulation to MMX patterns and disabled MMX alternatives
> > > > when TARGET_MMX_WITH_SSE is true.
> > > >
> > > > Most MMX instructions have equivalent SSE versions, and the results of
> > > > some SSE versions need to be reshuffled into the right order for MMX.
> > > > There are a couple of tricky cases:
> > > >
> > > > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> > > > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > > > mask operand and handle unmapped bits 64:127 at memory address by
> > > > adjusting source and mask operands together with memory address.
> > > >
> > > > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > > > in 64-bit mode.
> > > >
> > > > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > > > SSE emulation must clear bit 3 (the fourth index bit) of each byte in
> > > > the shuffle control mask.
> > > >
> > > > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > > > the upper 64 bits of the destination XMM register.
> > > >
> > > > Tests are also added to check each SSE emulation of MMX intrinsics.
> > > >
> > > > There are no regressions on i686 and x86-64.  For x86-64, GCC is also
> > > > tested with
> > > >
> > > > --with-arch=native --with-cpu=native
> > > >
> > > > on AVX2 and AVX512F machines.
> > >
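
To make the pshufb case (item 3 above) concrete, here is a minimal scalar
sketch in C. It is not code from the patch: pshufb_model, pshufb64_via_128
and emulation_matches are hypothetical names, and the 0xf7 byte mask is
simply my way of writing "clear bit 3 of every control byte". The model
assumes the usual pshufb semantics, where bit 7 of a control byte zeroes
the result byte and the low index bits select a source byte.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of pshufb on an n-byte vector (n = 8 models the MMX form,
   n = 16 the SSE form).  Bit 7 of a control byte zeroes the result byte;
   otherwise the low 3 (n = 8) or 4 (n = 16) bits select a source byte.  */
void
pshufb_model (uint8_t *dst, const uint8_t *src, const uint8_t *ctl, int n)
{
  uint8_t tmp[16];
  for (int i = 0; i < n; i++)
    tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & (n - 1)];
  memcpy (dst, tmp, n);
}

/* Emulate the 8-byte shuffle with the 16-byte one.  The upper half of the
   widened source holds junk, so any index that reaches it is visible.
   When clear_bit3 is nonzero, bit 3 of each control byte is cleared
   (mask 0xf7), which keeps bit 7 and the low 3 index bits intact.  */
void
pshufb64_via_128 (uint8_t dst[8], const uint8_t src[8],
                  const uint8_t ctl[8], int clear_bit3)
{
  uint8_t wide_src[16], wide_ctl[16] = { 0 }, wide_dst[16];
  memset (wide_src, 0xee, sizeof wide_src);
  memcpy (wide_src, src, 8);
  for (int i = 0; i < 8; i++)
    wide_ctl[i] = clear_bit3 ? (ctl[i] & 0xf7) : ctl[i];
  pshufb_model (wide_dst, wide_src, wide_ctl, 16);
  memcpy (dst, wide_dst, 8);
}

/* Return 1 if the emulation agrees with the 8-byte model for every
   possible control byte value, 0 otherwise.  */
int
emulation_matches (int clear_bit3)
{
  const uint8_t src[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };
  for (int c = 0; c < 256; c++)
    {
      uint8_t ctl[8], want[8], got[8];
      memset (ctl, c, sizeof ctl);
      pshufb_model (want, src, ctl, 8);
      pshufb64_via_128 (got, src, ctl, clear_bit3);
      if (memcmp (want, got, 8) != 0)
        return 0;
    }
  return 1;
}
```

In this model emulation_matches (1) holds, while emulation_matches (0)
fails as soon as a control byte has bit 3 set, because the 16-byte shuffle
then reads from the upper half of the source.
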
> > > An idea that would take the patch a step further, also on 32-bit targets:
> > >
> > > *Assuming* that operations on XMM registers are as fast as (or perhaps
> > > faster than) operations on MMX registers, we can change the mmx_isa
> > > attribute in e.g.
> > >
> > > +  "@
> > > +   p<logic>\t{%2, %0|%0, %2}
> > > +   p<logic>\t{%2, %0|%0, %2}
> > > +   vp<logic>\t{%2, %1, %0|%0, %1, %2}"
> > > +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > >
> > > to:
> > >
> > > [(set_attr "isa" "*,noavx,avx")
> > >  (set_attr "mmx_isa" "native,*,*")]
> > >
> > > So, for x86_64 everything stays the same, but for x86_32 we now allow
> > > intrinsics to use XMM registers in addition to MMX registers. We can't
> > > disable MMX for x86_32 anyway due to ISA constraints (and some tricky
> > > cases, e.g. movnti, which works only for 64-bit targets, and maskmovq
> > > & similar, which are more efficient with MMX regs), but RA has much
> > > more freedom to allocate the most effective register set even for
> > > 32-bit targets.
> > >
> > > WDYT?
> > >
> >
> > Since MMX registers are used to pass and return __m64 values,
> > we can't really get rid of MMX instructions in 32-bit mode.  If people
> > have to stay with 32-bit mode, they need MMX.  I don't think we should
> > extend TARGET_MMX_WITH_SSE to 32-bit mode.
>
> No, TARGET_MMX_WITH_SSE is still enabled only for 64bit targets. We
> should not *disable* SSE alternatives on 32bit targets.

The correct isa attribute definition would be:

[(set_attr "isa" "*,sse2_noavx,avx")
 (set_attr "mmx_isa" "native,*,*")]
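
As a rough illustration of what this changes, here is a small C model of
which of the three alternatives end up enabled. It is only a sketch under
assumptions: struct target and all function names are hypothetical, an
alternative is treated as enabled when both its "isa" and its "mmx_isa"
condition hold, and the "isa" conditions (sse2_noavx meaning TARGET_SSE2 &&
!TARGET_AVX, avx meaning TARGET_AVX) are my reading of i386.md rather than
something quoted in this thread. The mmx_isa conditions are the ones from
the patch description.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the relevant target flags.  */
struct target { bool is_64bit, sse2, avx; };

bool
target_mmx_with_sse (const struct target *t)
{
  return t->is_64bit && t->sse2;   /* TARGET_64BIT && TARGET_SSE2 */
}

/* mmx_isa conditions as listed in the patch description.  */
bool
mmx_isa_ok (const char *mmx_isa, const struct target *t)
{
  bool emul = target_mmx_with_sse (t);
  if (!strcmp (mmx_isa, "native"))    return !emul;
  if (!strcmp (mmx_isa, "x64"))       return emul;
  if (!strcmp (mmx_isa, "x64_avx"))   return emul && t->avx;
  if (!strcmp (mmx_isa, "x64_noavx")) return emul && !t->avx;
  return true;                        /* "base" or "*" */
}

/* Assumed isa conditions; anything else is treated as "*".  */
bool
isa_ok (const char *isa, const struct target *t)
{
  if (!strcmp (isa, "sse2_noavx")) return t->sse2 && !t->avx;
  if (!strcmp (isa, "avx"))        return t->avx;
  return true;
}

/* Bit i of the result is set when alternative i is enabled.  */
unsigned
enabled_mask (const char *const isa[3], const char *const mmx_isa[3],
              const struct target *t)
{
  unsigned mask = 0;
  for (int i = 0; i < 3; i++)
    if (isa_ok (isa[i], t) && mmx_isa_ok (mmx_isa[i], t))
      mask |= 1u << i;
  return mask;
}

/* Attributes as in the posted patch.  */
unsigned
enabled_original (const struct target *t)
{
  const char *const isa[3] = { "*", "*", "*" };
  const char *const mmx_isa[3] = { "native", "x64_noavx", "x64_avx" };
  return enabled_mask (isa, mmx_isa, t);
}

/* Attributes as proposed in this mail.  */
unsigned
enabled_proposed (const struct target *t)
{
  const char *const isa[3] = { "*", "sse2_noavx", "avx" };
  const char *const mmx_isa[3] = { "native", "*", "*" };
  return enabled_mask (isa, mmx_isa, t);
}
```

In this model both variants enable the same single alternative on 64-bit
targets, a 32-bit target with SSE2 additionally gets the XMM alternative
next to the native MMX one, and a 32-bit target without SSE2 keeps only
the MMX alternative.
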

Uros.