From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qv1-xf2d.google.com (mail-qv1-xf2d.google.com [IPv6:2607:f8b0:4864:20::f2d]) by sourceware.org (Postfix) with ESMTPS id 0EA55388981C for ; Thu, 17 Jun 2021 10:09:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0EA55388981C Received: by mail-qv1-xf2d.google.com with SMTP id dj3so1327473qvb.11 for ; Thu, 17 Jun 2021 03:09:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=kXsI+jNaS7FlgEhh5FV+zyVyn1ZbSNpwZdkd/QuW850=; b=dG8Ytb7EkQUwaIn7hnOMKjB1dYAj2GbWMmouwLKksFH0n+dryc26BdzTaKYA4zaxrE vcQ3xCLjRYeo/TT+SYTqiBWYt25zv1k5fNbxeFy7TYiNXROIO2TQZtqsGEA8jmlMG7Eu wspZnIHJpDKYPdVpLmPa/7cxcRTYB2AUJWypNqqTejjO2IUrlDVLzbq0EMW4Lv+LEX6L +4gFLT7Gp0L3sQ8Xcuwzt/to6bPTHkRCyTxRwTNSzL5R9A4ilDO7IzgOaD3IlH58QYkF dJyhHgupuQy6HMrOSounvzpasNc9/gBF/BY92BnEZpkYRZlorvNkdXgc/dk2tRiPJcba mj4g== X-Gm-Message-State: AOAM533wX2nZdCCHsCm2Nw3LRdo3Sm7FpS5mu0BbpyLKbd5sXD3F5m8O 7AaGx2g6pXQ9BUtHds4+O97z6jXN0+0fjqWfyKh2KS2YtHA= X-Google-Smtp-Source: ABdhPJzMqRaScLhiWE6lNfAnDp3Vwebe/RmGng39yAYkj659+W7UU2sDze18yLnITnQup6IvJRN5SQ2qEPhu4mlfBLM= X-Received: by 2002:a0c:c612:: with SMTP id v18mr5100731qvi.20.1623924587530; Thu, 17 Jun 2021 03:09:47 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Uros Bizjak Date: Thu, 17 Jun 2021 12:09:36 +0200 Message-ID: Subject: Re: [PATCH][RFC] Add x86 subadd SLP pattern To: Richard Biener Cc: "gcc-patches@gcc.gnu.org" , Hongtao Liu , Tamar Christina Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Jun 2021 10:09:49 -0000 On Thu, Jun 17, 2021 at 12:00 PM Richard Biener wrote: > > On Thu, 17 Jun 2021, Uros Bizjak wrote: > > > On Thu, Jun 17, 2021 at 11:44 AM Richard Biener wrote: > > > > > > This addds SLP pattern recognition for the SSE3/AVX [v]addsubp{ds} v0, v1 > > > instructions which compute { v0[0] - v1[0], v0[1], + v1[1], ... } > > > thus subtract, add alternating on lanes, starting with subtract. > > > > > > It adds a corresponding optab and direct internal function, > > > vec_subadd$a3 and at the moment to make the i386 backend changes > > > "obvious", duplicates the existing avx_addsubv4df3 pattern with > > > the new canonical name (CODE_FOR_* and gen_* are used throughout the > > > intrinsic code, so the actual change to rename all existing patterns > > > will be quite a bit bigger). I expect some bike-shedding on > > > subadd vs. addsub so I delay that change ;) > > > > > > ARC seems to have both addsub{V2HI,V2SI,V4HI} and > > > subadd{V2HI,V2SI,V4HI}. ARM iwmmxt has wsubaddx on V4HI but > > > I didn't dare to decipher whether it's -, + or +, -. bfin > > > has both variants as well plus a saturating variant of both (on v2hi). > > > But none of the major vector ISAs besides x86 seem to have it. > > > Still some targets having both and opting for the naming I choose > > > would make that a better fit IMHO. > > > > > > Uros/Hontao - would mass-renaming of the x86 addsub patterns > > > to subadd be OK? (not touching the FMA ones which have both) > > > > We can add define_expand (this is how we approached the problem in the > > past for x86). If there are not too many of them, then this is the > > preferred approach (see below) ... > > OK. > > > ... until someone comes along and mass-renames everything to a generic name. > > Though now that I tried it isn't so bad for the addsub case (see > patch below). Would that be prefered or would it be an incomplete > renaming and thus the define_expand prefered? The approach below is OK, too. If you rename the patterns to addsub (see my previous mail for reasoning), then the rename is fully preferred as far as x86 is concerned. Thanks, Uros. > Thanks, > Richard. > > diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def > index 80c2a2c0294..3222eaedbd0 100644 > --- a/gcc/config/i386/i386-builtin.def > +++ b/gcc/config/i386/i386-builtin.def > @@ -855,8 +855,8 @@ BDESC (OPTION_MASK_ISA_SSE2 | OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv1di3, "__ > BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_movshdup, "__builtin_ia32_movshdup", IX86_BUILTIN_MOVSHDUP, UNKNOWN, (int) V4SF_FTYPE_V4SF) > BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_movsldup, "__builtin_ia32_movsldup", IX86_BUILTIN_MOVSLDUP, UNKNOWN, (int) V4SF_FTYPE_V4SF) > > -BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_addsubv4sf3, "__builtin_ia32_addsubps", IX86_BUILTIN_ADDSUBPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) > -BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_addsubv2df3, "__builtin_ia32_addsubpd", IX86_BUILTIN_ADDSUBPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF) > +BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_vec_subaddv4sf3, "__builtin_ia32_addsubps", IX86_BUILTIN_ADDSUBPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) > +BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_vec_subaddv2df3, "__builtin_ia32_addsubpd", IX86_BUILTIN_ADDSUBPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF) > BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_haddv4sf3, "__builtin_ia32_haddps", IX86_BUILTIN_HADDPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) > BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_haddv2df3, "__builtin_ia32_haddpd", IX86_BUILTIN_HADDPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF) > BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_hsubv4sf3, "__builtin_ia32_hsubps", IX86_BUILTIN_HSUBPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) > @@ -996,8 +996,8 @@ BDESC (OPTION_MASK_ISA_SSE2, 0, CODE_FOR_pclmulqdq, 0, IX86_BUILTIN_PCLMULQDQ128 > /* AVX */ > BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_addv4df3, "__builtin_ia32_addpd256", IX86_BUILTIN_ADDPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) > BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_addv8sf3, "__builtin_ia32_addps256", IX86_BUILTIN_ADDPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) > -BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_addsubv4df3, "__builtin_ia32_addsubpd256", IX86_BUILTIN_ADDSUBPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) > -BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_addsubv8sf3, "__builtin_ia32_addsubps256", IX86_BUILTIN_ADDSUBPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) > +BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_vec_subaddv4df3, "__builtin_ia32_addsubpd256", IX86_BUILTIN_ADDSUBPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) > +BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_vec_subaddv8sf3, "__builtin_ia32_addsubps256", IX86_BUILTIN_ADDSUBPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) > BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_andv4df3, "__builtin_ia32_andpd256", IX86_BUILTIN_ANDPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) > BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_andv8sf3, "__builtin_ia32_andps256", IX86_BUILTIN_ANDPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) > BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_andnotv4df3, "__builtin_ia32_andnpd256", IX86_BUILTIN_ANDNPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 73238c9874a..48818dcf8cb 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -2396,8 +2396,6 @@ > (set_attr "prefix" "") > (set_attr "mode" "")]) > > -/* ??? Rename avx_addsubv4df3, but its name is used throughout the backed, > - so avoid this churn for the moment. */ > (define_insn "vec_subaddv4df3" > [(set (match_operand:V4DF 0 "register_operand" "=x") > (vec_merge:V4DF > @@ -2412,21 +2410,7 @@ > (set_attr "prefix" "vex") > (set_attr "mode" "V4DF")]) > > -(define_insn "avx_addsubv4df3" > - [(set (match_operand:V4DF 0 "register_operand" "=x") > - (vec_merge:V4DF > - (minus:V4DF > - (match_operand:V4DF 1 "register_operand" "x") > - (match_operand:V4DF 2 "nonimmediate_operand" "xm")) > - (plus:V4DF (match_dup 1) (match_dup 2)) > - (const_int 5)))] > - "TARGET_AVX" > - "vaddsubpd\t{%2, %1, %0|%0, %1, %2}" > - [(set_attr "type" "sseadd") > - (set_attr "prefix" "vex") > - (set_attr "mode" "V4DF")]) > - > -(define_insn "sse3_addsubv2df3" > +(define_insn "vec_subaddv2df3" > [(set (match_operand:V2DF 0 "register_operand" "=x,x") > (vec_merge:V2DF > (minus:V2DF > @@ -2444,7 +2428,7 @@ > (set_attr "prefix" "orig,vex") > (set_attr "mode" "V2DF")]) > > -(define_insn "avx_addsubv8sf3" > +(define_insn "vec_subaddv8sf3" > [(set (match_operand:V8SF 0 "register_operand" "=x") > (vec_merge:V8SF > (minus:V8SF > @@ -2458,7 +2442,7 @@ > (set_attr "prefix" "vex") > (set_attr "mode" "V8SF")]) > > -(define_insn "sse3_addsubv4sf3" > +(define_insn "vec_subaddv4sf3" > [(set (match_operand:V4SF 0 "register_operand" "=x,x") > (vec_merge:V4SF > (minus:V4SF