In-Reply-To: <006b01d88c70$0dbc7530$29355f90$@nextmovesoftware.com>
From: Uros Bizjak
Date: Thu, 30 Jun 2022 14:01:02 +0200
Subject: Re: [x86 PATCH take #2] Double word logical operation clean-ups in i386.md.
To: Roger Sayle
Cc: "gcc-patches@gcc.gnu.org"

On Thu, Jun 30, 2022 at 12:56 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Many thanks for your review of the "double word logical operation clean-up" patch.
> The revision below incorporates the majority of your feedback, but with one or two
> exceptions (required to allow the patch to bootstrap) that I thought I'd double check
> with you before pushing.
>
> Firstly, great catch that we no longer need to test rtx_equal (operands[0], operands[1])
> when moving a splitter from before reload to after reload, as this is guaranteed by the
> "0" constraints. I've cleaned this up in all the doubleword splitters (including the
> case that's now moved). Also, as you've suggested, this patch uses
> a pair of define_insn_and_split for ANDN, one for TARGET_BMI (split post-reload)
> and the other for !TARGET_BMI (that's lowered rather than split, pre-reload/post-STV).
>
> Unfortunately, the "force_reg of tricky immediate constants" checks really are
> required for these expanders.
> I agree normally the predicate is checked/guaranteed
> for a define_insn, but in this case the gen_iordi3 function and related expanders are
> frequently called directly by the middle-end or from i386-expand, which bypasses
> the checks made by the later RTL passes. When given arbitrary immediate constants,
> this results in ICEs from insns not matching their predicates soon after expand
> (breaking bootstrap with an ICE). It's only "standard name" expanders that require
> this treatment; define_insn{_and_split} templates do enforce their predicates.
>
> And finally, we can't/shouldn't use in the actual
> doubleword splitters, as the mode being iterated over is DWIH (not DWI),
> where we require the predicate for the corresponding mode. It turns
> out that it's always appropriate to use x86_64_hilo_general_operand wherever
> we use the "r" constraint, and that's used consistently in this patch.
>
> I hope these exceptions are acceptable. The attached revised patch has
> been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
> both with and without --target_board=unix{-m32} with no new failures.
> Are these revisions OK for mainline?

Thanks for your explanation of the particularities of the patch!

Yes, the patch is OK.

Thanks,
Uros.

>
> 2022-06-30  Roger Sayle
>             Uroš Bizjak
>
> gcc/ChangeLog
>         * config/i386/i386.md (general_szext_operand): Add TImode
>         support using x86_64_hilo_general_operand predicate.
>         (*cmp_doubleword): Use x86_64_hilo_general_operand predicate.
>         (*add3_doubleword): Improved optimization of zero addition.
>         (and3): Use SDWIM mode iterator to add support for double
>         word bit-wise AND in TImode. Use force_reg when double word
>         immediate operand isn't x86_64_hilo_general_operand.
>         (and3_doubleword): Generalized from anddi3_doubleword and
>         converted into a post-reload splitter.
>         (*andndi3_doubleword): Old define_insn deleted.
>         (*andn3_doubleword_bmi): New define_insn_and_split for
>         TARGET_BMI that splits post-reload.
>         (*andn3_doubleword): New define_insn_and_split for
>         !TARGET_BMI, that lowers/splits before reload.
>         (3): Use SDWIM mode iterator to add support for
>         double word bit-wise XOR and bit-wise IOR in TImode. Use
>         force_reg when double word immediate operand isn't
>         x86_64_hilo_general_operand.
>         (*di3_doubleword): Generalized from di3_doubleword.
>         (one_cmpl2): Use SDWIM mode iterator to add support for
>         double word bit-wise NOT in TImode.
>         (one_cmpl2_doubleword): Generalize from one_cmpldi2_doubleword
>         and converted into a post-reload splitter.
>
>
> Thanks again,
> Roger
> --
>
> > -----Original Message-----
> > From: Uros Bizjak
> > Sent: 28 June 2022 16:38
> > To: Roger Sayle
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [x86 PATCH] Double word logical operation clean-ups in i386.md.
> >
> > On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle
> > wrote:
> > >
> > >
> > > Hi Uros,
> > > As you've requested/suggested, here's a patch that tidies up and
> > > unifies doubleword handling in i386.md; converting all doubleword
> > > splitters for logic operations to post-reload form, generalizing their
> > > define_insn_and_split templates to form (supporting TARGET_64BIT
> > > ? TImode : DImode), and where required tweaking the corresponding
> > > expanders to use SDWIM to support TImode doubleword operations. These
> > > changes incorporate your feedback from
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596205.html
> > > where I included many/several of these clean-ups, in a patch to add a
> > > new optimization. I agree, it's better to split these out (this
> > > patch), and I'll resubmit the (smaller) optimization patch as a
> > > follow-up.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32},
> > > with no new failures. Ok for mainline?
> > >
> > >
> > > 2022-06-28  Roger Sayle
> > >
> > > gcc/ChangeLog
> > >         * config/i386/i386.md (general_szext_operand): Add TImode
> > >         support using x86_64_hilo_general_operand predicate.
> > >         (*cmp_doubleword): Use x86_64_hilo_general_operand predicate.
> > >         (*add3_doubleword): Improved optimization of zero addition.
> > >         (and3): Use SDWIM mode iterator to add support for double
> > >         word bit-wise AND in TImode. Use force_reg when double word
> > >         immediate operand isn't x86_64_hilo_general_operand.
> > >         (and3_doubleword): Generalized from anddi3_doubleword and
> > >         converted into a post-reload splitter.
> > >         (*andn3_doubleword): Generalized from *andndi3_doubleword.
> > >         (define_split): Generalize DImode splitters for andn to .
> > >         One splitter for TARGET_BMI, the other for !TARGET_BMI.
> > >         (3): Use SDWIM mode iterator to add support for
> > >         double word bit-wise XOR and bit-wise IOR in TImode. Use
> > >         force_reg when double word immediate operand isn't
> > >         x86_64_hilo_general_operand.
> > >         (*di3_doubleword): Generalized from di3_doubleword.
> > >         (one_cmpl2): Use SDWIM mode iterator to add support for
> > >         double word bit-wise NOT in TImode.
> > >         (one_cmpl2_doubleword): Generalize from one_cmpldi2_doubleword
> > >         and converted into a post-reload splitter.
> >
> >
> > (define_expand "and3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -        (and:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> > -                       (match_operand:SWIM1248x 2 "")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +        (and:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +                   (match_operand:SDWIM 2 "")))]
> >    ""
> >  {
> >    machine_mode mode = mode;
> >
> > -  if (mode == DImode && !TARGET_64BIT)
> > -    ;
> > -  else if (const_int_operand (operands[2], mode)
> > -           && register_operand (operands[0], mode)
> > -           && !(TARGET_ZERO_EXTEND_WITH_AND
> > -                && optimize_function_for_speed_p (cfun)))
> > +  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], mode))
> > +    operands[2] = force_reg (mode, operands[2]);
> >
> > You don't have to do that - when the predicate can't be satisfied, the middle-end
> > pushes the value to a register as a last resort by default.
> >
> > +  bool emit_insn_deleted_note_p = false;
> > +
> > +  split_double_mode (mode, &operands[0], 3, &operands[0],
> > +                     &operands[3]);
> >
> >    if (operands[2] == const0_rtx)
> >      emit_move_insn (operands[0], const0_rtx);
> >    else if (operands[2] == constm1_rtx)
> > -    emit_move_insn (operands[0], operands[1]);
> > +    {
> > +      if (!rtx_equal_p (operands[0], operands[1]))
> > +        emit_move_insn (operands[0], operands[1]);
> > +      else
> > +        emit_insn_deleted_note_p = true;
> > +    }
> >
> > Please note that when operands[2] is an immediate, constraints after reload
> > *guarantee* that operands[1] match operands[0]. So, the insn should always be
> > deleted (I think that this functionality was in your patch - it is
> > unneeded there, too).
> >
> > +(define_insn "*andn3_doubleword"
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +        (and:DWI
> > +          (not:DWI (match_operand:DWI 1 "register_operand"))
> > +          (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
> > -   && ix86_pre_reload_split ()"
> > +  "ix86_pre_reload_split ()"
> >    "#")
> >
> > Please introduce two ANDN double-word insn-and-split patterns, one for BMI
> > and one for !BMI. The one for BMI should be moved to a post-reload splitter,
> > too. As we figured out, *all* double-word patterns should either be of
> > pre-reload or of post-reload type.
> >
> > (define_split
> > -  [(set (match_operand:DI 0 "register_operand")
> > -        (and:DI
> > -          (not:DI (match_operand:DI 1 "register_operand"))
> > -          (match_operand:DI 2 "nonimmediate_operand")))
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +        (and:DWI
> > +          (not:DWI (match_operand:DWI 1 "register_operand"))
> > +          (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
> > +  "!TARGET_BMI
> >
> > Without BMI, the ANDN should be split to a double-word NOT + AND before
> > reload (and these two insns are split to single-word operations after reload).
> > This simplifies splitting logic quite a bit.
> >
> > (define_expand "3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -        (any_or:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> > -                          (match_operand:SWIM1248x 2 "")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +        (any_or:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +                      (match_operand:SDWIM 2 "")))]
> >
> > Use here ...
> >
> >    ""
> > -  "ix86_expand_binary_operator (, mode, operands); DONE;")
> > +{
> >
> > -(define_insn_and_split "*di3_doubleword"
> > -  [(set (match_operand:DI 0 "nonimmediate_operand" "=ro,r")
> > -        (any_or:DI
> > -          (match_operand:DI 1 "nonimmediate_operand" "0,0")
> > -          (match_operand:DI 2 "x86_64_szext_general_operand" "re,o")))
> > +  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], mode))
> > +    operands[2] = force_reg (mode, operands[2]);
> >
> > ... to avoid the above fixup.
> >
> > +(define_insn_and_split "*3_doubleword"
> > +  [(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
> > +        (any_or:
> > +          (match_operand: 1 "nonimmediate_operand" "%0,0")
> > +          (match_operand: 2 "x86_64_hilo_general_operand" "r,o")))
> >
> > for consistency.
> >
> > Otherwise OK.
> >
> > Uros.