From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x630.google.com (mail-ej1-x630.google.com [IPv6:2a00:1450:4864:20::630]) by sourceware.org (Postfix) with ESMTPS id 3270E3858D35 for ; Fri, 31 Jul 2020 12:08:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 3270E3858D35 Received: by mail-ej1-x630.google.com with SMTP id jp10so4622402ejb.0 for ; Fri, 31 Jul 2020 05:08:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=lPNOhpQqgsH39yBbdxoWXc+5l7TZ6/7mkrh/GvxsBIQ=; b=MJd+Q53OxwRAPpBStjHHm+Tqd/O86PksZjxpgmiBTTSFrMH2DRuqNwltsLRtCM82ds GsFQs8/V1Nhfg2/DOtWSB7jby44Zl1PwiWIF4HttHgcm502tp0pPXv8IASwA2u86hN29 aeqLHnNTGx4vNiV8zHV8O8CnkkDcQzpXyrQqpxGvmAOAicAZZUsmbp3N/19L99kIr7wZ 4dZCcWbbPvQLbPdXYl8ue8bPwn6mVo9diQ17lDOCr//NYERDU5NkUIdIXLudK4cMBFrv RPV9UZZvcRgPLW45sIMNbJkiTJL0J4UrihYA7jtQp+5VJxgWKOAWxIY9lPOnTUMFNBWu eySQ== X-Gm-Message-State: AOAM533UZ+7etaCKXgs76mF0IkniRud+nan/u1Xjt2x/LJMycrrW2YUV snuGaEPXmSrkvIIDTpIZyhyTtTjf6pa5hUpUD8o= X-Google-Smtp-Source: ABdhPJxrRQtq/vFEjAdiL6YfCQeamCElzvcDr95uLijnVqvwewFyv5Hft7qTimwbEutHah2tbtoDFdyopWJjDAmZRGU= X-Received: by 2002:a17:906:38c7:: with SMTP id r7mr3885105ejd.118.1596197312128; Fri, 31 Jul 2020 05:08:32 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Fri, 31 Jul 2020 14:08:20 +0200 Message-ID: Subject: Re: VEC_COND_EXPR optimizations To: Marc Glisse Cc: GCC Patches Content-Type: multipart/mixed; boundary="000000000000c82ee705abbba5ca" X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 31 Jul 2020 12:08:34 -0000

--000000000000c82ee705abbba5ca
Content-Type: text/plain; charset="UTF-8"

On Fri, Jul 31, 2020 at 1:47 PM Richard Biener wrote:
>
> On Fri, Jul 31, 2020 at 1:39 PM Richard Biener
> wrote:
> >
> > On Fri, Jul 31, 2020 at 1:35 PM Richard Biener
> > wrote:
> > >
> > > On Thu, Jul 30, 2020 at 9:49 AM Marc Glisse wrote:
> > > >
> > > > When vector comparisons were forced to use vec_cond_expr, we lost a number
> > > > of optimizations (my fault for not adding enough testcases to prevent
> > > > that). This patch tries to unwrap vec_cond_expr a bit so some
> > > > optimizations can still happen.
> > > >
> > > > I wasn't planning to add all those transformations together, but adding
> > > > one caused a regression, whose fix introduced a second regression, etc.
> > > >
> > > > Using a simple fold_binary internally looks like an ok compromise to me.
> > > > It remains cheap enough (not recursive, and vector instructions are not
> > > > that frequent), while still allowing more than const_binop (X|0 or X&X for
> > > > instance). The transformations are quite conservative with :s and folding
> > > > only if everything simplifies, we may want to relax this later. And of
> > > > course we are going to miss things like a?b:c + a?c:b -> b+c.
> > > >
> > > > In terms of number of operations, some transformations turning 2
> > > > VEC_COND_EXPR into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not
> > > > look like a gain... I expect the bit_not disappears in most cases, and
> > > > VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR.
> > > >
> > > > I am a bit confused that with avx512 we get types like "vector(4)
> > > > " with :2 and not :1 (is it a hack so true is 1 and not
> > > > -1?), but that doesn't matter for this patch.
> > > >
> > > > Regtest+bootstrap on x86_64-pc-linux-gnu
> > >
> > > +  (with
> > > +   {
> > > +     tree rhs1, rhs2 = NULL;
> > > +     rhs1 = fold_binary (op, type, @1, @3);
> > > +     if (rhs1 && is_gimple_val (rhs1))
> > > +       rhs2 = fold_binary (op, type, @2, @3);
> > >
> > > ICK. I guess a more match-and-simplify way would be
> > >
> > >   (with
> > >    {
> > >      tree rhs1, rhs2;
> > >      gimple_match_op op (gimple_match_cond::UNCOND, op,
> > >                          type, @1, @3);
> > >      if (op.resimplify (NULL, valueize)
> > >          && gimple_simplified_result_is_gimple_val (op))
> > >        {
> > >          rhs1 = op.ops[0];
> > >          ... other operand ...
> > >        }
> > >
> > > now in theory we could invent some new syntax for this, like
> > >
> > >  (simplify
> > >   (op (vec_cond:s @0 @1 @2) @3)
> > >   (vec_cond @0 (op:x @1 @3) (op:x @2 @3)))
> > >
> > > and pick something better instead of :x (:s is taken,
> > > would be 'simplified', :c is taken would be 'constexpr', ...).
> > >
> > > _Maybe_ just
> > >
> > >  (simplify
> > >   (op (vec_cond:s @0 @1 @2) @3)
> > >   (vec_cond:x @0 (op @1 @3) (op @2 @3)))
> > >
> > > which would have the same practical meaning as passing
> > > NULL for the seq argument to simplification - do not allow
> > > any intermediate stmt to be generated.
> >
> > Note I specifically do not like those if (it-simplifies) checks
> > because we already would code-generate those anyway.
> > For
> >
> > (simplify
> >  (plus (vec_cond:s @0 @1 @2) @3)
> >  (vec_cond @0 (plus @1 @3) (plus @2 @3)))
> >
> > we get
> >
> >   res_op->set_op (VEC_COND_EXPR, type, 3);
> >   res_op->ops[0] = captures[1];
> >   res_op->ops[0] = unshare_expr (res_op->ops[0]);
> >   {
> >     tree _o1[2], _r1;
> >     _o1[0] = captures[2];
> >     _o1[1] = captures[4];
> >     gimple_match_op tem_op (res_op->cond.any_else
> >       (), PLUS_EXPR, TREE_TYPE (_o1[0]), _o1[0], _o1[1]);
> >     tem_op.resimplify (lseq, valueize);
> >     _r1 = maybe_push_res_to_seq (&tem_op, lseq);  (***)
> >     if (!_r1) return false;
> >     res_op->ops[1] = _r1;
> >   }
> >   {
> >     tree _o1[2], _r1;
> >     _o1[0] = captures[3];
> >     _o1[1] = captures[4];
> >     gimple_match_op tem_op (res_op->cond.any_else
> >       (), PLUS_EXPR, TREE_TYPE (_o1[0]), _o1[0], _o1[1]);
> >     tem_op.resimplify (lseq, valueize);
> >     _r1 = maybe_push_res_to_seq (&tem_op, lseq);  (***)
> >     if (!_r1) return false;
> >     res_op->ops[2] = _r1;
> >   }
> >   res_op->resimplify (lseq, valueize);
> >   return true;
> >
> > and the only change required would be to pass NULL to maybe_push_res_to_seq
> > here instead of lseq at the (***) marked points.
>
> (simplify
>  (plus (vec_cond:s @0 @1 @2) @3)
>  (vec_cond:l @0 (plus @1 @3) (plus @2 @3)))
>
> 'l' for 'force leaf'. I'll see if I can quickly come up with a patch.

The attached prototype works for

(simplify
 (plus (vec_cond:s @0 @1 @2) @3)
 (vec_cond @0 (plus:l @1 @3) (plus:l @2 @3)))

but ':...' is already taken for an explicitly specified type
so I have to think about something better. As you see I've also
moved it to the actual ops that should simplify. It doesn't
work on the outermost expression but I guess it doesn't
make sense there (adding support would be possible).

Now I need some non-ambiguous syntax... it currently
is id[:type][@cid] so maybe id[!][:type][@cid]. I guess
non-ambiguous is good enough?

Richard.

> Richard.
>
> >
> > Richard.
> >
> > > The other "simple" patterns look good, you can commit
> > > them separately if you like.
> > > > > > Richard. > > > > > > > 2020-07-30 Marc Glisse > > > > > > > > PR tree-optimization/95906 > > > > PR target/70314 > > > > * match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e), > > > > (v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations. > > > > > > > > * gcc.dg/tree-ssa/andnot-2.c: New file. > > > > * gcc.dg/tree-ssa/pr95906.c: Likewise. > > > > * gcc.target/i386/pr70314.c: Likewise. > > > > > > > > -- > > > > Marc Glisse --000000000000c82ee705abbba5ca Content-Type: application/octet-stream; name=p Content-Disposition: attachment; filename=p Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_kda6j1zz0 ZGlmZiAtLWdpdCBhL2djYy9nZW5tYXRjaC5jIGIvZ2NjL2dlbm1hdGNoLmMKaW5kZXggMGE4Y2Jh NjJlMGMuLjlhMzRmZTcxZTc4IDEwMDY0NAotLS0gYS9nY2MvZ2VubWF0Y2guYworKysgYi9nY2Mv Z2VubWF0Y2guYwpAQCAtNjk3LDEyICs2OTcsMTMgQEAgcHVibGljOgogICBleHByIChpZF9iYXNl ICpvcGVyYXRpb25fLCBsb2NhdGlvbl90IGxvYywgYm9vbCBpc19jb21tdXRhdGl2ZV8gPSBmYWxz ZSkKICAgICA6IG9wZXJhbmQgKE9QX0VYUFIsIGxvYyksIG9wZXJhdGlvbiAob3BlcmF0aW9uXyks CiAgICAgICBvcHMgKHZOVUxMKSwgZXhwcl90eXBlIChOVUxMKSwgaXNfY29tbXV0YXRpdmUgKGlz X2NvbW11dGF0aXZlXyksCi0gICAgICBpc19nZW5lcmljIChmYWxzZSksIGZvcmNlX3NpbmdsZV91 c2UgKGZhbHNlKSwgb3B0X2dycCAoMCkge30KKyAgICAgIGlzX2dlbmVyaWMgKGZhbHNlKSwgZm9y Y2Vfc2luZ2xlX3VzZSAoZmFsc2UpLCBmb3JjZV9sZWFmIChmYWxzZSksCisgICAgICBvcHRfZ3Jw ICgwKSB7fQogICBleHByIChleHByICplKQogICAgIDogb3BlcmFuZCAoT1BfRVhQUiwgZS0+bG9j YXRpb24pLCBvcGVyYXRpb24gKGUtPm9wZXJhdGlvbiksCiAgICAgICBvcHMgKHZOVUxMKSwgZXhw cl90eXBlIChlLT5leHByX3R5cGUpLCBpc19jb21tdXRhdGl2ZSAoZS0+aXNfY29tbXV0YXRpdmUp LAogICAgICAgaXNfZ2VuZXJpYyAoZS0+aXNfZ2VuZXJpYyksIGZvcmNlX3NpbmdsZV91c2UgKGUt PmZvcmNlX3NpbmdsZV91c2UpLAotICAgICAgb3B0X2dycCAoZS0+b3B0X2dycCkge30KKyAgICAg IGZvcmNlX2xlYWYgKGUtPmZvcmNlX2xlYWYpLCBvcHRfZ3JwIChlLT5vcHRfZ3JwKSB7fQogICB2 b2lkIGFwcGVuZF9vcCAob3BlcmFuZCAqb3ApIHsgb3BzLnNhZmVfcHVzaCAob3ApOyB9CiAgIC8q IFRoZSBvcGVyYXRvciBhbmQgaXRzIG9wZXJhbmRzLiAgKi8KICAgaWRfYmFzZSAqb3BlcmF0aW9u 
OwpAQCAtNzE3LDYgKzcxOCw5IEBAIHB1YmxpYzoKICAgLyogV2hldGhlciBwdXNoaW5nIGFueSBz dG10IHRvIHRoZSBzZXF1ZW5jZSBzaG91bGQgYmUgY29uZGl0aW9uYWwKICAgICAgb24gdGhpcyBl eHByZXNzaW9uIGhhdmluZyBhIHNpbmdsZS11c2UuICAqLwogICBib29sIGZvcmNlX3NpbmdsZV91 c2U7CisgIC8qIFdoZXRoZXIgaW4gdGhlIHJlc3VsdCBleHByZXNzaW9uIHRoaXMgc2hvdWxkIGJl IGEgbGVhZiBub2RlCisgICAgIHdpdGggYW55IGNoaWxkcmVuIHNpbXBsaWZpZWQgZG93biB0byBz aW1wbGUgb3BlcmFuZHMuICAqLworICBib29sIGZvcmNlX2xlYWY7CiAgIC8qIElmIG5vbi16ZXJv LCB0aGUgZ3JvdXAgZm9yIG9wdGlvbmFsIGhhbmRsaW5nLiAgKi8KICAgdW5zaWduZWQgY2hhciBv cHRfZ3JwOwogICB2aXJ0dWFsIHZvaWQgZ2VuX3RyYW5zZm9ybSAoRklMRSAqZiwgaW50LCBjb25z dCBjaGFyICosIGJvb2wsIGludCwKQEAgLTI1MjAsNyArMjUyNCw4IEBAIGV4cHI6Omdlbl90cmFu c2Zvcm0gKEZJTEUgKmYsIGludCBpbmRlbnQsIGNvbnN0IGNoYXIgKmRlc3QsIGJvb2wgZ2ltcGxl LAogICAgICAgZnByaW50ZiAoZiwgIik7XG4iKTsKICAgICAgIGZwcmludGZfaW5kZW50IChmLCBp bmRlbnQsICJ0ZW1fb3AucmVzaW1wbGlmeSAobHNlcSwgdmFsdWVpemUpO1xuIik7CiAgICAgICBm cHJpbnRmX2luZGVudCAoZiwgaW5kZW50LAotCQkgICAgICAiX3IlZCA9IG1heWJlX3B1c2hfcmVz X3RvX3NlcSAoJnRlbV9vcCwgbHNlcSk7XG4iLCBkZXB0aCk7CisJCSAgICAgICJfciVkID0gbWF5 YmVfcHVzaF9yZXNfdG9fc2VxICgmdGVtX29wLCAlcyk7XG4iLCBkZXB0aCwKKwkJICAgICAgIWZv cmNlX2xlYWYgPyAibHNlcSIgOiAiTlVMTCIpOwogICAgICAgZnByaW50Zl9pbmRlbnQgKGYsIGlu ZGVudCwKIAkJICAgICAgImlmICghX3IlZCkgcmV0dXJuIGZhbHNlO1xuIiwKIAkJICAgICAgZGVw dGgpOwpAQCAtNDI1MCw3ICs0MjU1LDEyIEBAIHBhcnNlcjo6cGFyc2VfZXhwciAoKQogCXsKIAkg IGNvbnN0IGNoYXIgKnMgPSBnZXRfaWRlbnQgKCk7CiAJICBpZiAoIXBhcnNpbmdfbWF0Y2hfb3Bl cmFuZCkKLQkgICAgZXhwcl90eXBlID0gczsKKwkgICAgeworCSAgICAgIGlmICgqcyA9PSAnbCcp CisJCWUtPmZvcmNlX2xlYWYgPSB0cnVlOworCSAgICAgIGVsc2UKKwkJZXhwcl90eXBlID0gczsK KwkgICAgfQogCSAgZWxzZQogCSAgICB7CiAJICAgICAgY29uc3QgY2hhciAqc3AgPSBzOwo= --000000000000c82ee705abbba5ca--
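
For readers following the thread, the "force leaf" idea discussed in the message body (pass NULL instead of lseq so an inner operation must simplify to a plain operand) can be illustrated outside of GCC with a small toy model. This is only a sketch under assumptions, not genmatch output or GCC code: Operand, Seq, show, build_plus and distribute_plus are hypothetical names, and a null Seq pointer stands in for passing NULL as the seq argument to maybe_push_res_to_seq.

```cpp
// Toy model only: none of these names are GCC's.  They mimic the idea of
// passing a null statement sequence so an operand must simplify to a leaf.
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// A leaf operand: either an integer constant or a named value.
struct Operand {
  bool is_const;
  int value;         // used when is_const is true
  std::string name;  // used when is_const is false
};

// Statements emitted while building a result.
using Seq = std::vector<std::string>;

static std::string show(const Operand &o) {
  return o.is_const ? std::to_string(o.value) : o.name;
}

// Build "a + b" as a leaf operand.  Constant folding and "x + 0" need no
// statement.  Anything else needs a temporary, which is only permitted
// when seq is non-null; a null seq plays the role of "force leaf".
static std::optional<Operand> build_plus(const Operand &a, const Operand &b,
                                         Seq *seq) {
  if (a.is_const && b.is_const)
    return Operand{true, a.value + b.value, ""};
  if (a.is_const && a.value == 0)
    return b;
  if (b.is_const && b.value == 0)
    return a;
  if (!seq)
    return std::nullopt;  // would need an intermediate statement: give up
  std::string tmp = "_t" + std::to_string(seq->size() + 1);
  seq->push_back(tmp + " = " + show(a) + " + " + show(b));
  return Operand{false, 0, tmp};
}

// Model of (plus (vec_cond c a b) d) -> (vec_cond c (plus a d) (plus b d)).
// With force_leaf set, the rewrite is rejected unless both inner sums
// simplify without emitting a statement.
static bool distribute_plus(const Operand &a, const Operand &b,
                            const Operand &d, bool force_leaf, Seq &seq,
                            Operand &then_op, Operand &else_op) {
  Seq *inner = force_leaf ? nullptr : &seq;
  std::optional<Operand> r1 = build_plus(a, d, inner);
  if (!r1)
    return false;
  std::optional<Operand> r2 = build_plus(b, d, inner);
  if (!r2)
    return false;
  then_op = *r1;
  else_op = *r2;
  return true;
}
```

In this model, distributing over constant arms (say 1 and 2, plus 3) succeeds with force_leaf set because both sums fold, while a non-constant arm such as a value named x makes the rewrite fail unless force_leaf is cleared, in which case one temporary statement is recorded in the sequence. The all-or-nothing behaviour is the point: the rewrite only fires when it does not add new statements for the inner operations.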