From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153])
 by sourceware.org (Postfix) with ESMTPS id 1658A3858C56
 for ; Mon, 11 Apr 2022 11:19:31 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1658A3858C56
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com
X-IronPort-AV: E=Sophos;i="5.90,251,1643702400"; d="c'?scan'208";a="76905317"
Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165])
 by esa1.mentor.iphmx.com with ESMTP; 11 Apr 2022 03:19:29 -0800
IronPort-SDR: FZS8jmG5ZPAG68qJdWBwV+7+m6LdrpYkWiau+Swcx0+Byjw74e9r8pKZiT2dHfVjnB1cY/jYRs
 qXbqrATzCj7Nn/YVsFAurBELxtrZI6cyHPEiDqeY7EL3/qT6FZI60s5yhrBogk2sGJvTvPrZUR
 SvZPsSZlwrIoIjBtLD9ZtwSdTrTPtssV6Dz67jIW3LeYlDOFB0+w2BfR7QX2Y+TgdPS/IwTgcu
 AkLYGE/7QnNFSydAOvkmsSxPHSUeyL7ck2MDGa6h2BFlRLJZnW56Ex+f2bQDTBStgTqU5/GBX3
 osE=
Content-Type: multipart/mixed; boundary="------------GAwp2mQE2zzLnvZ7q03IGtjo"
Message-ID: <19966fd3-8bfe-5a1a-41cf-4d95d99e69fd@codesourcery.com>
Date: Mon, 11 Apr 2022 12:19:24 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0
X-Mozilla-News-Host: news://news.gmane.org:119
Content-Language: en-GB
To: GCC Development
From: Andrew Stubbs
Subject: Complex multiply optimization working?
X-Originating-IP: [137.202.0.90]
X-ClientProxiedBy: svr-ies-mbx-15.mgc.mentorg.com (139.181.222.15) To svr-ies-mbx-11.mgc.mentorg.com (139.181.222.11)
X-Spam-Status: No, score=-5.5 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: gcc@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 11 Apr 2022 11:19:33 -0000

--------------GAwp2mQE2zzLnvZ7q03IGtjo
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit

Hi all,

I've been looking at implementing the complex multiply patterns for the amdgcn port, but I'm not getting the code I was hoping for. When I try to use the patterns on x86_64 or AArch64 they don't seem to work there either, so is there something wrong with the middle-end? I've tried both current HEAD and GCC 11.

The example shown in the internals manual is a simple loop multiplying two arrays of complex numbers, and writing the results to a third. I had expected that it would use the largest vectorization factor available, with the real/imaginary numbers in even/odd lanes as described, but the vectorization factor is only 2 (so, a single complex number), and I have to set -fvect-cost-model=unlimited to get even that.

I tried another example with SLP and that too uses the cmul patterns only for a single real/imaginary pair.

Did proper vectorization of cmul ever really work? There is a case in the testsuite for the pattern match, but it isn't in a loop.

Thanks

Andrew

P.S. I attached my testcase, in case I'm doing something stupid.

P.P.S. The manual says the pattern is "cmulm4", etc., but it's actually "cmulm3" in the implementation.
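
For reference, the even/odd lane layout mentioned above can be sketched in plain C. This is only a scalar illustration of the arithmetic a complex multiply performs on interleaved real/imaginary data; it is not what GCC emits and not the testcase from this message. The names cmul_complex, cmul_lanes, cmul_forms_agree and the length LEN are hypothetical and chosen just for this sketch. It relies on the C guarantee that a _Complex double has the same representation as an array of two doubles (real part first).

```c
#include <assert.h>
#include <string.h>

typedef _Complex double complexT;  /* same typedef as the attached testcase */
#define LEN 4                      /* hypothetical length, kept small for illustration */

/* The loop as written in source: one complex multiply per element. */
static void cmul_complex(const complexT *a, const complexT *b, complexT *c, int n)
{
  for (int i = 0; i < n; i++)
    c[i] = a[i] * b[i];
}

/* The same computation on interleaved storage:
   even lanes hold real parts, odd lanes hold imaginary parts. */
static void cmul_lanes(const double *a, const double *b, double *c, int n)
{
  for (int i = 0; i < n; i++)
    {
      double ar = a[2 * i], ai = a[2 * i + 1];
      double br = b[2 * i], bi = b[2 * i + 1];
      c[2 * i]     = ar * br - ai * bi;  /* real part      */
      c[2 * i + 1] = ar * bi + ai * br;  /* imaginary part */
    }
}

/* Returns 1 if both forms give the same result on a small input. */
int cmul_forms_agree(void)
{
  double fa[2 * LEN], fb[2 * LEN], fc[2 * LEN], from_complex[2 * LEN];
  complexT a[LEN], b[LEN], c[LEN];

  /* A complex array and its interleaved double view have the same size. */
  assert(sizeof a == sizeof fa);

  /* Small values whose products are exactly representable. */
  for (int i = 0; i < 2 * LEN; i++)
    {
      fa[i] = i + 1;
      fb[i] = 0.5 * i - 2;
    }
  memcpy(a, fa, sizeof fa);
  memcpy(b, fb, sizeof fb);

  cmul_complex(a, b, c, LEN);
  cmul_lanes(fa, fb, fc, LEN);
  memcpy(from_complex, c, sizeof from_complex);

  for (int i = 0; i < 2 * LEN; i++)
    {
      double d = fc[i] - from_complex[i];
      if (d < 0)
        d = -d;
      if (d > 1e-12)
        return 0;
    }
  return 1;
}
```

With this layout, a vector of 2*k doubles would hold k complex numbers, which is the sense in which a vectorization factor of 2 covers only a single complex value.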
--------------GAwp2mQE2zzLnvZ7q03IGtjo
Content-Type: text/plain; charset="UTF-8"; name="t.c"
Content-Disposition: attachment; filename="t.c"
Content-Transfer-Encoding: 7bit

typedef _Complex double complexT;
#define arraysize 256

void f(
complexT a[restrict arraysize],
complexT b[restrict arraysize],
complexT c[restrict arraysize]
       )
{
#if defined(LOOP)
  for (int i = 0; i < arraysize; i++)
    c[i] = a[i] * b[i];
#else

    c[0] = a[0] * b[0];
    c[1] = a[1] * b[1];
    c[2] = a[2] * b[2];
    c[3] = a[3] * b[3];
    c[4] = a[4] * b[4];
    c[5] = a[5] * b[5];
    c[6] = a[6] * b[6];
    c[7] = a[7] * b[7];
    c[8] = a[8] * b[8];
    c[9] = a[9] * b[9];
    c[10] = a[10] * b[10];
    c[11] = a[11] * b[11];
    c[12] = a[12] * b[12];
    c[13] = a[13] * b[13];
    c[14] = a[14] * b[14];
    c[15] = a[15] * b[15];
    c[16] = a[16] * b[16];
    c[17] = a[17] * b[17];
    c[18] = a[18] * b[18];
    c[19] = a[19] * b[19];
    c[20] = a[20] * b[20];
    c[21] = a[21] * b[21];
    c[22] = a[22] * b[22];
    c[23] = a[23] * b[23];
    c[24] = a[24] * b[24];
    c[25] = a[25] * b[25];
    c[26] = a[26] * b[26];
    c[27] = a[27] * b[27];
    c[28] = a[28] * b[28];
    c[29] = a[29] * b[29];
    c[30] = a[30] * b[30];
    c[31] = a[31] * b[31];
    c[32] = a[32] * b[32];
#endif
}

--------------GAwp2mQE2zzLnvZ7q03IGtjo--