From: "H.J. Lu"
Date: Sun, 17 Feb 2019 13:42:00 -0000
Subject: Re: [PATCH 00/41] V8: Emulate MMX intrinsics with SSE
To: Uros Bizjak
Cc: GCC Patches
References: <20190216224032.4889-1-hjl.tools@gmail.com>

On Sun, Feb 17, 2019 at 2:33 AM Uros Bizjak wrote:
>
> On 2/16/19, H.J. Lu wrote:
> > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > emulate MMX intrinsics with SSE instructions.  To support it, we added
> >
> >   #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> >
> >   ;; Define instruction set of MMX instructions
> >   (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> >     (const_string "base"))
> >
> >     (eq_attr "mmx_isa" "native")
> >       (symbol_ref "!TARGET_MMX_WITH_SSE")
> >     (eq_attr "mmx_isa" "x64")
> >       (symbol_ref "TARGET_MMX_WITH_SSE")
> >     (eq_attr "mmx_isa" "x64_avx")
> >       (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> >     (eq_attr "mmx_isa" "x64_noavx")
> >       (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> >
> > We added SSE emulation to MMX patterns and disabled MMX alternatives with
> > TARGET_MMX_WITH_SSE.
> >
> > Most MMX instructions have equivalent SSE versions, and the results of some
> > SSE versions need to be reshuffled into the right order for MMX.  There are
> > a couple of tricky cases:
> >
> > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > mask operand and handle unmapped bits 64:127 at the memory address by
> > adjusting the source and mask operands together with the memory address.
> >
> > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > only in 64-bit mode.
> >
> > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > SSE emulation must clear bit 4 in the shuffle control mask.
> >
> > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > the upper 64 bits of the destination XMM register.
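
[Illustration of case 3, not the expansion the patch uses: a minimal C
sketch assuming the 64-bit value already sits in the low half of an XMM
register; the helper name is made up.]

    #include <immintrin.h>

    /* Clear the fourth control bit (value 0x08) of every byte of the
       shuffle mask so no index can reach the upper half of the XMM
       register, then use the 128-bit pshufb.  Bit 7, which requests a
       zeroed result byte, is left untouched.  */
    static __m128i
    mmx_pshufb_via_sse (__m128i val, __m128i mask)
    {
      __m128i fixed = _mm_andnot_si128 (_mm_set1_epi8 (0x08), mask);
      return _mm_shuffle_epi8 (val, fixed);
    }

With that bit forced to zero, each control byte can only index bytes 0-7,
which matches the 3-bit index of the MMX instruction.
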
> > Tests are also added to check each SSE emulation of MMX intrinsics.
> >
> > There are no regressions on i686 and x86-64.  For x86-64, GCC is also
> > tested with
> >
> >   --with-arch=native --with-cpu=native
> >
> > on AVX2 and AVX512F machines.
>
> An idea that would take the patch a step further also on 32-bit targets:
>
> *Assuming* that operations on XMM registers are as fast (or perhaps
> faster) than operations on MMX registers, we can change the mmx_isa
> attribute in e.g.
>
> +  "@
> +   p\t{%2, %0|%0, %2}
> +   p\t{%2, %0|%0, %2}
> +   vp\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
>
> to:
>
>   [(set_attr "isa" "*,noavx,avx")
>    (set_attr "mmx_isa" "native,*,*")]
>
> So, for x86_64 everything stays the same, but for x86_32 we now allow
> intrinsics to use xmm registers in addition to mmx registers.  We can't
> disable MMX for x86_32 anyway due to ISA constraints (and some tricky
> cases, e.g. movnti that works only for 64-bit targets, and e.g. maskmovq
> & similar, which are more efficient with MMX regs), but RA has much
> more freedom to allocate the most effective register set even for
> 32-bit targets.
>
> WDYT?

Since MMX registers are used to pass and return __m64 values, we can't
really get rid of MMX instructions in 32-bit mode.  If people have to stay
with 32-bit mode, they need MMX.  I don't think we should extend
TARGET_MMX_WITH_SSE to 32-bit mode.

-- 
H.J.
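
[Illustration of case 1, the maskmovq/maskmovdqu mismatch Uros also touches
on above, not the expansion the patch uses: a minimal 64-bit-mode C sketch;
the helper name is made up, and the adjustment of source, mask and address
for a possibly unmapped page (bits 64:127 of the reference) is deliberately
left out.]

    #include <emmintrin.h>

    /* Zero-extend the 8-byte data and mask into XMM registers so mask bits
       64:127 are clear and maskmovdqu never stores bytes 8..15.  The patch
       additionally shifts the operands and the address when the 16-byte
       reference could cross into an unmapped page.  */
    static void
    mmx_maskmovq_via_sse (long long data, long long mask, char *p)
    {
      __m128i d = _mm_cvtsi64_si128 (data);
      __m128i m = _mm_cvtsi64_si128 (mask);
      _mm_maskmoveu_si128 (d, m, p);
    }
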