From: Uros Bizjak
Date: Fri, 14 Jul 2023 09:36:20 +0200
Subject: Re: [x86_64 PATCH] Improved insv of DImode/DFmode {high,low}parts into TImode.
To: Roger Sayle
Cc: gcc-patches@gcc.gnu.org

On Thu, Jul 13, 2023 at 6:45 PM Roger Sayle wrote:
>
> This is the next piece towards a fix for (the x86_64 ABI issues affecting)
> PR 88873.  This patch generalizes the recent tweak to ix86_expand_move
> for setting the highpart of a TImode reg from a DImode source using
> *insvti_highpart_1, to handle both DImode and DFmode sources, and also
> use the recently added *insvti_lowpart_1 for setting the lowpart.
>
> Although this is another intermediate step (not yet a fix), towards
> enabling *insvti and *concat* patterns to be candidates for TImode STV
> (by using V2DI/V2DF instructions), it already improves things a little.
>
> For the test case from PR 88873
>
> typedef struct { double x, y; } s_t;
> typedef double v2df __attribute__ ((vector_size (2 * sizeof(double))));
>
> s_t foo (s_t a, s_t b, s_t c)
> {
>   return (s_t) { fma(a.x, b.x, c.x), fma (a.y, b.y, c.y) };
> }
>
>
> With -O2 -march=cascadelake, GCC currently generates:
>
> Before (29 instructions):
>         vmovq   %xmm2, -56(%rsp)
>         movq    -56(%rsp), %rdx
>         vmovq   %xmm4, -40(%rsp)
>         movq    $0, -48(%rsp)
>         movq    %rdx, -56(%rsp)
>         movq    -40(%rsp), %rdx
>         vmovq   %xmm0, -24(%rsp)
>         movq    %rdx, -40(%rsp)
>         movq    -24(%rsp), %rsi
>         movq    -56(%rsp), %rax
>         movq    $0, -32(%rsp)
>         vmovq   %xmm3, -48(%rsp)
>         movq    -48(%rsp), %rcx
>         vmovq   %xmm5, -32(%rsp)
>         vmovq   %rax, %xmm6
>         movq    -40(%rsp), %rax
>         movq    $0, -16(%rsp)
>         movq    %rsi, -24(%rsp)
>         movq    -32(%rsp), %rsi
>         vpinsrq $1, %rcx, %xmm6, %xmm6
>         vmovq   %rax, %xmm7
>         vmovq   %xmm1, -16(%rsp)
>         vmovapd %xmm6, %xmm3
>         vpinsrq $1, %rsi, %xmm7, %xmm7
>         vfmadd132pd     -24(%rsp), %xmm7, %xmm3
>         vmovapd %xmm3, -56(%rsp)
>         vmovsd  -48(%rsp), %xmm1
>         vmovsd  -56(%rsp), %xmm0
>         ret
>
> After (20 instructions):
>         vmovq   %xmm2, -56(%rsp)
>         movq    -56(%rsp), %rax
>         vmovq   %xmm3, -48(%rsp)
>         vmovq   %xmm4, -40(%rsp)
>         movq    -48(%rsp), %rcx
>         vmovq   %xmm5, -32(%rsp)
>         vmovq   %rax, %xmm6
>         movq    -40(%rsp), %rax
>         movq    -32(%rsp), %rsi
>         vpinsrq $1, %rcx, %xmm6, %xmm6
>         vmovq   %xmm0, -24(%rsp)
>         vmovq   %rax, %xmm7
>         vmovq   %xmm1, -16(%rsp)
>         vmovapd %xmm6, %xmm2
>         vpinsrq $1, %rsi, %xmm7, %xmm7
>         vfmadd132pd     -24(%rsp), %xmm7, %xmm2
>         vmovapd %xmm2, -56(%rsp)
>         vmovsd  -48(%rsp), %xmm1
>         vmovsd  -56(%rsp), %xmm0
>         ret
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  No testcase yet, as the above code will hopefully
> change dramatically with the next pieces.  Ok for mainline?
>
>
> 2023-07-13  Roger Sayle
>
> gcc/ChangeLog
>         * config/i386/i386-expand.cc (ix86_expand_move): Generalize special
>         case inserting of 64-bit values into a TImode register, to handle
>         both DImode and DFmode using either *insvti_lowpart_1
>         or *insvti_highpart_1.

LGTM, but please watch out for fallout.

Thanks,
Uros.
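
Note for readers unfamiliar with the modes named in the ChangeLog: the
following is a minimal C sketch of what "inserting a 64-bit value into the
low or high part of a TImode value" means at the value level. It models only
the arithmetic, not the RTL patterns or ix86_expand_move. All names here
(u128, insert_lowpart, insert_highpart, double_bits) are hypothetical and do
not come from the patch, and the code assumes GCC or Clang on a 64-bit
target, where unsigned __int128 is available.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: unsigned __int128 is a GCC/Clang extension on 64-bit
   targets.  It stands in for a 128-bit (TImode-sized) value.  */
typedef unsigned __int128 u128;

static_assert(sizeof(double) == sizeof(uint64_t),
              "this sketch assumes a 64-bit double");

/* Replace bits 0..63 of TI with LO, keeping bits 64..127 unchanged.  */
static u128 insert_lowpart(u128 ti, uint64_t lo)
{
    return (ti & ~(u128)UINT64_MAX) | lo;
}

/* Replace bits 64..127 of TI with HI, keeping bits 0..63 unchanged.  */
static u128 insert_highpart(u128 ti, uint64_t hi)
{
    return (ti & (u128)UINT64_MAX) | ((u128)hi << 64);
}

/* A double (DFmode-sized) source is inserted by its bit pattern, so
   reinterpret it as a 64-bit integer first.  memcpy avoids aliasing
   problems.  */
static uint64_t double_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}
```

For example, insert_highpart(insert_lowpart(0, double_bits(x)), double_bits(y))
builds a 128-bit value holding x in the low half and y in the high half,
which is the shape of the two-double struct in the PR 88873 test case.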