From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 28 Dec 2020 13:37:26 +0000
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz
Subject: [PATCH 5/8 v9]middle-end slp: support complex multiply and complex multiply conjugate
Message-ID: <20201228133722.GA27314@arm.com>
Content-Type: multipart/mixed; boundary="YZ5djTAD1cGYuMQK"
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.9.4 (2018-02-28)
MIME-Version: 1.0
List-Id: Gcc-patches mailing list

--YZ5djTAD1cGYuMQK
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Hi All,

This adds support for complex multiply and complex multiply by conjugate to
the vect pattern detector.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* internal-fn.def (COMPLEX_MUL, COMPLEX_MUL_CONJ): New.
	* optabs.def (cmul_optab, cmul_conj_optab): New.
	* doc/md.texi: Document them.
	* tree-vect-slp-patterns.c (vect_match_call_complex_mla,
	vect_normalize_conj_loc, is_eq_or_top, vect_validate_multiplication,
	vect_build_combine_node, class complex_mul_pattern,
	complex_mul_pattern::matches, complex_mul_pattern::recognize,
	complex_mul_pattern::build): New.

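For reference, the semantics that the md.texi text in this patch describes for
the two new patterns can be sketched in scalar C as below. This is only an
illustration under assumptions: the helper names cmul_ref and cmul_conj_ref
are hypothetical and not part of the patch, double stands in for TYPE, and
the arrays use the documented lane order (real part in even lanes, imaginary
part in odd lanes).

```c
#include <stddef.h>

/* Hypothetical reference helpers, not part of the patch.  Each array holds
   N complex values as 2*N scalars: real parts in the even lanes, imaginary
   parts in the odd lanes.  */

/* c = a * b:
   (ar + ai*i) * (br + bi*i) = (ar*br - ai*bi) + (ar*bi + ai*br)*i.  */
static void
cmul_ref (double *c, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < 2 * n; i += 2)
    {
      double ar = a[i], ai = a[i + 1];
      double br = b[i], bi = b[i + 1];
      c[i] = ar * br - ai * bi;
      c[i + 1] = ar * bi + ai * br;
    }
}

/* c = a * conj (b):
   (ar + ai*i) * (br - bi*i) = (ar*br + ai*bi) + (ai*br - ar*bi)*i.  */
static void
cmul_conj_ref (double *c, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < 2 * n; i += 2)
    {
      double ar = a[i], ai = a[i + 1];
      double br = b[i], bi = b[i + 1];
      c[i] = ar * br + ai * bi;
      c[i + 1] = ai * br - ar * bi;
    }
}
```

The only difference between the two is the sign of the terms involving the
imaginary part of the second operand, which is why the matcher has to look at
where the negation sits relative to the load permutes.
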
--- inline copy of patch -- diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index ec6ec180b91fcf9f481b6754c044483787fd923c..b8cc90e1a75e402abbf8a8cf2efefc1a333f8b3a 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -6202,6 +6202,50 @@ The operation is only supported for vector modes @var{m}. This pattern is not allowed to @code{FAIL}. +@cindex @code{cmul@var{m}4} instruction pattern +@item @samp{cmul@var{m}4} +Perform a vector multiply that is semantically the same as multiply of +complex numbers. + +@smallexample + complex TYPE c[N]; + complex TYPE a[N]; + complex TYPE b[N]; + for (int i = 0; i < N; i += 1) + @{ + c[i] = a[i] * b[i]; + @} +@end smallexample + +In GCC lane ordering the real part of the number must be in the even lanes with +the imaginary part in the odd lanes. + +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. + +@cindex @code{cmul_conj@var{m}4} instruction pattern +@item @samp{cmul_conj@var{m}4} +Perform a vector multiply by conjugate that is semantically the same as a +multiply of complex numbers where the second multiply arguments is conjugated. + +@smallexample + complex TYPE c[N]; + complex TYPE a[N]; + complex TYPE b[N]; + for (int i = 0; i < N; i += 1) + @{ + c[i] = a[i] * conj (b[i]); + @} +@end smallexample + +In GCC lane ordering the real part of the number must be in the even lanes with +the imaginary part in the odd lanes. + +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. 
+ @cindex @code{ffs@var{m}2} instruction pattern @item @samp{ffs@var{m}2} Store into operand 0 one plus the index of the least significant 1-bit diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 511fe70162b5d9db3a61a5285d31c008f6835487..5a0bbe3fe5dee591d54130e60f6996b28164ae38 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -279,6 +279,8 @@ DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary) DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) /* FP scales. */ diff --git a/gcc/optabs.def b/gcc/optabs.def index e9727def4dbf941bb9ac8b56f83f8ea0f52b262c..e82396bae1117c6de91304761a560b7fbcb69ce1 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -292,6 +292,8 @@ OPTAB_D (copysign_optab, "copysign$F$a3") OPTAB_D (xorsign_optab, "xorsign$F$a3") OPTAB_D (cadd90_optab, "cadd90$a3") OPTAB_D (cadd270_optab, "cadd270$a3") +OPTAB_D (cmul_optab, "cmul$a3") +OPTAB_D (cmul_conj_optab, "cmul_conj$a3") OPTAB_D (cos_optab, "cos$a2") OPTAB_D (cosh_optab, "cosh$a2") OPTAB_D (exp10_optab, "exp10$a2") diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c index dbc58f7c53868ed431fc67de1f0162eb0d3b2c24..82721acbab8cf81c4d6f9954c98fb913a7bb6282 100644 --- a/gcc/tree-vect-slp-patterns.c +++ b/gcc/tree-vect-slp-patterns.c @@ -719,6 +719,368 @@ complex_add_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache, return new complex_add_pattern (node, &ops, ifn); } +/******************************************************************************* + * complex_mul_pattern + ******************************************************************************/ + +/* Helper function of that looks for a match in the CHILDth child of NODE. The + child used is stored in RES. 
+ + If the match is successful then ARGS will contain the operands matched + and the complex_operation_t type is returned. If match is not successful + then CMPLX_NONE is returned and ARGS is left unmodified. */ + +static inline complex_operation_t +vect_match_call_complex_mla (slp_tree node, unsigned child, + vec *args = NULL, slp_tree *res = NULL) +{ + gcc_assert (child < SLP_TREE_CHILDREN (node).length ()); + + slp_tree data = SLP_TREE_CHILDREN (node)[child]; + + if (res) + *res = data; + + return vect_detect_pair_op (data, false, args); +} + +/* Check to see if either of the trees in ARGS are a NEGATE_EXPR. If the first + child (args[0]) is a NEGATE_EXPR then NEG_FIRST_P is set to TRUE. + + If a negate is found then the values in ARGS are reordered such that the + negate node is always the second one and the entry is replaced by the child + of the negate node. */ + +static inline bool +vect_normalize_conj_loc (vec args, bool *neg_first_p = NULL) +{ + gcc_assert (args.length () == 2); + bool neg_found = false; + + if (vect_match_expression_p (args[0], NEGATE_EXPR)) + { + std::swap (args[0], args[1]); + neg_found = true; + if (neg_first_p) + *neg_first_p = true; + } + else if (vect_match_expression_p (args[1], NEGATE_EXPR)) + { + neg_found = true; + if (neg_first_p) + *neg_first_p = false; + } + + if (neg_found) + args[1] = SLP_TREE_CHILDREN (args[1])[0]; + + return neg_found; +} + +/* Helper function to check if PERM is KIND or PERM_TOP. */ + +static inline bool +is_eq_or_top (complex_load_perm_t perm, complex_perm_kinds_t kind) +{ + return perm.first == kind || perm.first == PERM_TOP; +} + +/* Helper function that checks to see if LEFT_OP and RIGHT_OP are both MULT_EXPR + nodes but also that they represent an operation that is either a complex + multiplication or a complex multiplication by conjugated value. + + Of the negation is expected to be in the first half of the tree (As required + by an FMS pattern) then NEG_FIRST is true. 
If the operation is a conjugate + operation then CONJ_FIRST_OPERAND is set to indicate whether the first or + second operand contains the conjugate operation. */ + +static inline bool +vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache, + vec left_op, vec right_op, + bool neg_first, bool *conj_first_operand, + bool fms) +{ + /* The presence of a negation indicates that we have either a conjugate or a + rotation. We need to distinguish which one. */ + *conj_first_operand = false; + complex_perm_kinds_t kind; + + /* Complex conjugates have the negation on the imaginary part of the + number where rotations affect the real component. So check if the + negation is on a dup of lane 1. */ + if (fms) + { + /* Canonicalization for fms is not consistent. So have to test both + variants to be sure. This needs to be fixed in the mid-end so + this part can be simpler. */ + kind = linear_loads_p (perm_cache, right_op[0]).first; + if (!((kind == PERM_ODDODD + && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]), + PERM_ODDEVEN)) + || (kind == PERM_ODDEVEN + && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]), + PERM_ODDODD)))) + return false; + } + else + { + if (linear_loads_p (perm_cache, right_op[1]).first != PERM_ODDODD + && !is_eq_or_top (linear_loads_p (perm_cache, right_op[0]), + PERM_ODDEVEN)) + return false; + } + + /* Deal with differences in indexes. */ + int index1 = fms ? 1 : 0; + int index2 = fms ? 0 : 1; + + /* Check if the conjugate is on the second first or second operand. The + order of the node with the conjugate value determines this, and the dup + node must be one of lane 0 of the same DR as the neg node. 
*/ + kind = linear_loads_p (perm_cache, left_op[index1]).first; + if (kind == PERM_TOP) + { + if (linear_loads_p (perm_cache, left_op[index2]).first == PERM_EVENODD) + return true; + } + else if (kind == PERM_EVENODD) + { + if ((kind = linear_loads_p (perm_cache, left_op[index2]).first) == PERM_EVENODD) + return false; + } + else if (!neg_first) + *conj_first_operand = true; + else + return false; + + if (kind != PERM_EVENEVEN) + return false; + + return true; +} + +/* Helper function to help distinguish between a conjugate and a rotation in a + complex multiplication. The operations have similar shapes but the order of + the load permutes are different. This function returns TRUE when the order + is consistent with a multiplication or multiplication by conjugated + operand but returns FALSE if it's a multiplication by rotated operand. */ + +static inline bool +vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache, + vec op, complex_perm_kinds_t permKind) +{ + /* The left node is the more common case, test it first. */ + if (!is_eq_or_top (linear_loads_p (perm_cache, op[0]), permKind)) + { + if (!is_eq_or_top (linear_loads_p (perm_cache, op[1]), permKind)) + return false; + } + return true; +} + +/* This function combines two nodes containing only even and only odd lanes + together into a single node which contains the nodes in even/odd order + by using a lane permute. 
*/ + +static slp_tree +vect_build_combine_node (slp_tree even, slp_tree odd, slp_tree rep) +{ + auto_vec nodes; + nodes.create (2); + vec > perm; + perm.create (SLP_TREE_LANES (rep)); + + for (unsigned x = 0; x < SLP_TREE_LANES (rep); x+=2) + { + perm.quick_push (std::make_pair (0, x)); + perm.quick_push (std::make_pair (1, x)); + } + + nodes.quick_push (even); + nodes.quick_push (odd); + + SLP_TREE_REF_COUNT (even)++; + SLP_TREE_REF_COUNT (odd)++; + + slp_tree vnode = vect_create_new_slp_node (2, SLP_TREE_CODE (even)); + SLP_TREE_CODE (vnode) = VEC_PERM_EXPR; + SLP_TREE_LANE_PERMUTATION (vnode) = perm; + SLP_TREE_CHILDREN (vnode).safe_splice (nodes); + SLP_TREE_REF_COUNT (vnode) = 1; + SLP_TREE_LANES (vnode) = SLP_TREE_LANES (rep); + gcc_assert (perm.length () == SLP_TREE_LANES (vnode)); + /* Representation is set to that of the current node as the vectorizer + can't deal with VEC_PERMs with no representation, as would be the + case with invariants. */ + SLP_TREE_REPRESENTATIVE (vnode) = SLP_TREE_REPRESENTATIVE (rep); + SLP_TREE_VECTYPE (vnode) = SLP_TREE_VECTYPE (rep); + return vnode; +} + +class complex_mul_pattern : public complex_pattern +{ + protected: + complex_mul_pattern (slp_tree *node, vec *m_ops, internal_fn ifn) + : complex_pattern (node, m_ops, ifn) + { + this->m_num_args = 2; + } + + public: + void build (vec_info *); + static internal_fn + matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *, + vec *); + + static vect_pattern* + recognize (slp_tree_to_load_perm_map_t *, slp_tree *); + + static vect_pattern* + mkInstance (slp_tree *node, vec *m_ops, internal_fn ifn) + { + return new complex_mul_pattern (node, m_ops, ifn); + } + +}; + +/* Pattern matcher for trying to match complex multiply pattern in SLP tree + If the operation matches then IFN is set to the operation it matched + and the arguments to the two replacement statements are put in m_ops. + + If no match is found then IFN is set to IFN_LAST and m_ops is unchanged. 
+ + This function matches the patterns shaped as: + + double ax = (b[i+1] * a[i]); + double bx = (a[i+1] * b[i]); + + c[i] = c[i] - ax; + c[i+1] = c[i+1] + bx; + + If a match occurred then TRUE is returned, else FALSE. The initial match is + expected to be in OP1 and the initial match operands in args0. */ + +internal_fn +complex_mul_pattern::matches (complex_operation_t op, + slp_tree_to_load_perm_map_t *perm_cache, + slp_tree *node, vec *ops) +{ + internal_fn ifn = IFN_LAST; + + if (op != MINUS_PLUS) + return IFN_LAST; + + slp_tree root = *node; + /* First two nodes must be a multiply. */ + auto_vec muls; + if (vect_match_call_complex_mla (root, 0) != MULT_MULT + || vect_match_call_complex_mla (root, 1, &muls) != MULT_MULT) + return IFN_LAST; + + /* Now operand2+4 may lead to another expression. */ + auto_vec left_op, right_op; + left_op.safe_splice (SLP_TREE_CHILDREN (muls[0])); + right_op.safe_splice (SLP_TREE_CHILDREN (muls[1])); + + if (linear_loads_p (perm_cache, left_op[1]).first == PERM_ODDEVEN) + return IFN_LAST; + + bool neg_first; + bool is_neg = vect_normalize_conj_loc (right_op, &neg_first); + + if (!is_neg) + { + /* A multiplication needs to multiply agains the real pair, otherwise + the pattern matches that of FMS. 
*/ + if (!vect_validate_multiplication (perm_cache, left_op, PERM_EVENEVEN) + || vect_normalize_conj_loc (left_op)) + return IFN_LAST; + ifn = IFN_COMPLEX_MUL; + } + else if (is_neg) + { + bool conj_first_operand; + if (!vect_validate_multiplication (perm_cache, left_op, right_op, + neg_first, &conj_first_operand, + false)) + return IFN_LAST; + + ifn = IFN_COMPLEX_MUL_CONJ; + } + + if (!vect_pattern_validate_optab (ifn, *node)) + return IFN_LAST; + + ops->truncate (0); + ops->create (3); + + complex_perm_kinds_t kind = linear_loads_p (perm_cache, left_op[0]).first; + if (kind == PERM_EVENODD) + { + ops->quick_push (left_op[1]); + ops->quick_push (right_op[1]); + ops->quick_push (left_op[0]); + } + else if (kind == PERM_TOP) + { + ops->quick_push (left_op[1]); + ops->quick_push (right_op[1]); + ops->quick_push (left_op[0]); + } + else + { + ops->quick_push (left_op[0]); + ops->quick_push (right_op[0]); + ops->quick_push (left_op[1]); + } + + return ifn; +} + +/* Attempt to recognize a complex mul pattern. */ + +vect_pattern* +complex_mul_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache, + slp_tree *node) +{ + auto_vec ops; + complex_operation_t op + = vect_detect_pair_op (*node, true, &ops); + internal_fn ifn + = complex_mul_pattern::matches (op, perm_cache, node, &ops); + if (ifn == IFN_LAST) + return NULL; + + return new complex_mul_pattern (node, &ops, ifn); +} + +/* Perform a replacement of the detected complex mul pattern with the new + instruction sequences. */ + +void +complex_mul_pattern::build (vec_info *vinfo) +{ + auto_vec nodes; + + /* First re-arrange the children. 
*/ + nodes.create (2); + + nodes.quick_push (this->m_ops[2]); + nodes.quick_push ( + vect_build_combine_node (this->m_ops[0], this->m_ops[1], *this->m_node)); + SLP_TREE_REF_COUNT (this->m_ops[2])++; + + slp_tree node; + unsigned i; + FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node) + vect_free_slp_tree (node); + + SLP_TREE_CHILDREN (*this->m_node).truncate (0); + SLP_TREE_CHILDREN (*this->m_node).safe_splice (nodes); + + complex_pattern::build (vinfo); +} + /******************************************************************************* * Pattern matching definitions ******************************************************************************/ -- --YZ5djTAD1cGYuMQK Content-Type: text/x-diff; charset=utf-8 Content-Disposition: attachment; filename="rb13960.patch" diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index ec6ec180b91fcf9f481b6754c044483787fd923c..b8cc90e1a75e402abbf8a8cf2efefc1a333f8b3a 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -6202,6 +6202,50 @@ The operation is only supported for vector modes @var{m}. This pattern is not allowed to @code{FAIL}. +@cindex @code{cmul@var{m}4} instruction pattern +@item @samp{cmul@var{m}4} +Perform a vector multiply that is semantically the same as multiply of +complex numbers. + +@smallexample + complex TYPE c[N]; + complex TYPE a[N]; + complex TYPE b[N]; + for (int i = 0; i < N; i += 1) + @{ + c[i] = a[i] * b[i]; + @} +@end smallexample + +In GCC lane ordering the real part of the number must be in the even lanes with +the imaginary part in the odd lanes. + +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. + +@cindex @code{cmul_conj@var{m}4} instruction pattern +@item @samp{cmul_conj@var{m}4} +Perform a vector multiply by conjugate that is semantically the same as a +multiply of complex numbers where the second multiply arguments is conjugated. 
+ +@smallexample + complex TYPE c[N]; + complex TYPE a[N]; + complex TYPE b[N]; + for (int i = 0; i < N; i += 1) + @{ + c[i] = a[i] * conj (b[i]); + @} +@end smallexample + +In GCC lane ordering the real part of the number must be in the even lanes with +the imaginary part in the odd lanes. + +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. + @cindex @code{ffs@var{m}2} instruction pattern @item @samp{ffs@var{m}2} Store into operand 0 one plus the index of the least significant 1-bit diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 511fe70162b5d9db3a61a5285d31c008f6835487..5a0bbe3fe5dee591d54130e60f6996b28164ae38 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -279,6 +279,8 @@ DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary) DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) /* FP scales. 
*/ diff --git a/gcc/optabs.def b/gcc/optabs.def index e9727def4dbf941bb9ac8b56f83f8ea0f52b262c..e82396bae1117c6de91304761a560b7fbcb69ce1 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -292,6 +292,8 @@ OPTAB_D (copysign_optab, "copysign$F$a3") OPTAB_D (xorsign_optab, "xorsign$F$a3") OPTAB_D (cadd90_optab, "cadd90$a3") OPTAB_D (cadd270_optab, "cadd270$a3") +OPTAB_D (cmul_optab, "cmul$a3") +OPTAB_D (cmul_conj_optab, "cmul_conj$a3") OPTAB_D (cos_optab, "cos$a2") OPTAB_D (cosh_optab, "cosh$a2") OPTAB_D (exp10_optab, "exp10$a2") diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c index dbc58f7c53868ed431fc67de1f0162eb0d3b2c24..82721acbab8cf81c4d6f9954c98fb913a7bb6282 100644 --- a/gcc/tree-vect-slp-patterns.c +++ b/gcc/tree-vect-slp-patterns.c @@ -719,6 +719,368 @@ complex_add_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache, return new complex_add_pattern (node, &ops, ifn); } +/******************************************************************************* + * complex_mul_pattern + ******************************************************************************/ + +/* Helper function of that looks for a match in the CHILDth child of NODE. The + child used is stored in RES. + + If the match is successful then ARGS will contain the operands matched + and the complex_operation_t type is returned. If match is not successful + then CMPLX_NONE is returned and ARGS is left unmodified. */ + +static inline complex_operation_t +vect_match_call_complex_mla (slp_tree node, unsigned child, + vec *args = NULL, slp_tree *res = NULL) +{ + gcc_assert (child < SLP_TREE_CHILDREN (node).length ()); + + slp_tree data = SLP_TREE_CHILDREN (node)[child]; + + if (res) + *res = data; + + return vect_detect_pair_op (data, false, args); +} + +/* Check to see if either of the trees in ARGS are a NEGATE_EXPR. If the first + child (args[0]) is a NEGATE_EXPR then NEG_FIRST_P is set to TRUE. 
+ + If a negate is found then the values in ARGS are reordered such that the + negate node is always the second one and the entry is replaced by the child + of the negate node. */ + +static inline bool +vect_normalize_conj_loc (vec args, bool *neg_first_p = NULL) +{ + gcc_assert (args.length () == 2); + bool neg_found = false; + + if (vect_match_expression_p (args[0], NEGATE_EXPR)) + { + std::swap (args[0], args[1]); + neg_found = true; + if (neg_first_p) + *neg_first_p = true; + } + else if (vect_match_expression_p (args[1], NEGATE_EXPR)) + { + neg_found = true; + if (neg_first_p) + *neg_first_p = false; + } + + if (neg_found) + args[1] = SLP_TREE_CHILDREN (args[1])[0]; + + return neg_found; +} + +/* Helper function to check if PERM is KIND or PERM_TOP. */ + +static inline bool +is_eq_or_top (complex_load_perm_t perm, complex_perm_kinds_t kind) +{ + return perm.first == kind || perm.first == PERM_TOP; +} + +/* Helper function that checks to see if LEFT_OP and RIGHT_OP are both MULT_EXPR + nodes but also that they represent an operation that is either a complex + multiplication or a complex multiplication by conjugated value. + + Of the negation is expected to be in the first half of the tree (As required + by an FMS pattern) then NEG_FIRST is true. If the operation is a conjugate + operation then CONJ_FIRST_OPERAND is set to indicate whether the first or + second operand contains the conjugate operation. */ + +static inline bool +vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache, + vec left_op, vec right_op, + bool neg_first, bool *conj_first_operand, + bool fms) +{ + /* The presence of a negation indicates that we have either a conjugate or a + rotation. We need to distinguish which one. */ + *conj_first_operand = false; + complex_perm_kinds_t kind; + + /* Complex conjugates have the negation on the imaginary part of the + number where rotations affect the real component. So check if the + negation is on a dup of lane 1. 
*/
+  if (fms)
+    {
+      /* Canonicalization for fms is not consistent. So have to test both
+         variants to be sure. This needs to be fixed in the mid-end so
+         this part can be simpler. */
+      kind = linear_loads_p (perm_cache, right_op[0]).first;
+      if (!((kind == PERM_ODDODD
+             && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]),
+                              PERM_ODDEVEN))
+            || (kind == PERM_ODDEVEN
+                && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]),
+                                 PERM_ODDODD))))
+        return false;
+    }
+  else
+    {
+      if (linear_loads_p (perm_cache, right_op[1]).first != PERM_ODDODD
+          && !is_eq_or_top (linear_loads_p (perm_cache, right_op[0]),
+                            PERM_ODDEVEN))
+        return false;
+    }
+
+  /* Deal with differences in indexes. */
+  int index1 = fms ? 1 : 0;
+  int index2 = fms ? 0 : 1;
+
+  /* Check if the conjugate is on the first or second operand. The
+     order of the node with the conjugate value determines this, and the dup
+     node must be one of lane 0 of the same DR as the neg node. */
+  kind = linear_loads_p (perm_cache, left_op[index1]).first;
+  if (kind == PERM_TOP)
+    {
+      if (linear_loads_p (perm_cache, left_op[index2]).first == PERM_EVENODD)
+        return true;
+    }
+  else if (kind == PERM_EVENODD)
+    {
+      if ((kind = linear_loads_p (perm_cache, left_op[index2]).first) == PERM_EVENODD)
+        return false;
+    }
+  else if (!neg_first)
+    *conj_first_operand = true;
+  else
+    return false;
+
+  if (kind != PERM_EVENEVEN)
+    return false;
+
+  return true;
+}
+
+/* Helper function to help distinguish between a conjugate and a rotation in a
+   complex multiplication. The operations have similar shapes but the order of
+   the load permutes is different. This function returns TRUE when the order
+   is consistent with a multiplication or multiplication by conjugated
+   operand but returns FALSE if it's a multiplication by rotated operand.
*/
+
+static inline bool
+vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
+                              vec<slp_tree> op, complex_perm_kinds_t permKind)
+{
+  /* The left node is the more common case, test it first. */
+  if (!is_eq_or_top (linear_loads_p (perm_cache, op[0]), permKind))
+    {
+      if (!is_eq_or_top (linear_loads_p (perm_cache, op[1]), permKind))
+        return false;
+    }
+  return true;
+}
+
+/* This function combines two nodes containing only even and only odd lanes
+   together into a single node which contains the nodes in even/odd order
+   by using a lane permute. */
+
+static slp_tree
+vect_build_combine_node (slp_tree even, slp_tree odd, slp_tree rep)
+{
+  auto_vec<slp_tree> nodes;
+  nodes.create (2);
+  vec<std::pair<unsigned, unsigned> > perm;
+  perm.create (SLP_TREE_LANES (rep));
+
+  for (unsigned x = 0; x < SLP_TREE_LANES (rep); x+=2)
+    {
+      perm.quick_push (std::make_pair (0, x));
+      perm.quick_push (std::make_pair (1, x));
+    }
+
+  nodes.quick_push (even);
+  nodes.quick_push (odd);
+
+  SLP_TREE_REF_COUNT (even)++;
+  SLP_TREE_REF_COUNT (odd)++;
+
+  slp_tree vnode = vect_create_new_slp_node (2, SLP_TREE_CODE (even));
+  SLP_TREE_CODE (vnode) = VEC_PERM_EXPR;
+  SLP_TREE_LANE_PERMUTATION (vnode) = perm;
+  SLP_TREE_CHILDREN (vnode).safe_splice (nodes);
+  SLP_TREE_REF_COUNT (vnode) = 1;
+  SLP_TREE_LANES (vnode) = SLP_TREE_LANES (rep);
+  gcc_assert (perm.length () == SLP_TREE_LANES (vnode));
+  /* Representation is set to that of the current node as the vectorizer
+     can't deal with VEC_PERMs with no representation, as would be the
+     case with invariants.
*/
+  SLP_TREE_REPRESENTATIVE (vnode) = SLP_TREE_REPRESENTATIVE (rep);
+  SLP_TREE_VECTYPE (vnode) = SLP_TREE_VECTYPE (rep);
+  return vnode;
+}
+
+class complex_mul_pattern : public complex_pattern
+{
+  protected:
+    complex_mul_pattern (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
+      : complex_pattern (node, m_ops, ifn)
+    {
+      this->m_num_args = 2;
+    }
+
+  public:
+    void build (vec_info *);
+    static internal_fn
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
+             vec<slp_tree> *);
+
+    static vect_pattern*
+    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+
+    static vect_pattern*
+    mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
+    {
+      return new complex_mul_pattern (node, m_ops, ifn);
+    }
+
+};
+
+/* Pattern matcher for trying to match complex multiply pattern in SLP tree.
+   If the operation matches then IFN is set to the operation it matched
+   and the arguments to the two replacement statements are put in m_ops.
+
+   If no match is found then IFN is set to IFN_LAST and m_ops is unchanged.
+
+   This function matches the patterns shaped as:
+
+   double ax = (b[i+1] * a[i]);
+   double bx = (a[i+1] * b[i]);
+
+   c[i] = c[i] - ax;
+   c[i+1] = c[i+1] + bx;
+
+   If a match occurred then TRUE is returned, else FALSE. The initial match is
+   expected to be in OP1 and the initial match operands in args0. */
+
+internal_fn
+complex_mul_pattern::matches (complex_operation_t op,
+                              slp_tree_to_load_perm_map_t *perm_cache,
+                              slp_tree *node, vec<slp_tree> *ops)
+{
+  internal_fn ifn = IFN_LAST;
+
+  if (op != MINUS_PLUS)
+    return IFN_LAST;
+
+  slp_tree root = *node;
+  /* First two nodes must be a multiply. */
+  auto_vec<slp_tree> muls;
+  if (vect_match_call_complex_mla (root, 0) != MULT_MULT
+      || vect_match_call_complex_mla (root, 1, &muls) != MULT_MULT)
+    return IFN_LAST;
+
+  /* Now operand2+4 may lead to another expression.
*/
+  auto_vec<slp_tree> left_op, right_op;
+  left_op.safe_splice (SLP_TREE_CHILDREN (muls[0]));
+  right_op.safe_splice (SLP_TREE_CHILDREN (muls[1]));
+
+  if (linear_loads_p (perm_cache, left_op[1]).first == PERM_ODDEVEN)
+    return IFN_LAST;
+
+  bool neg_first;
+  bool is_neg = vect_normalize_conj_loc (right_op, &neg_first);
+
+  if (!is_neg)
+    {
+      /* A multiplication needs to multiply against the real pair, otherwise
+         the pattern matches that of FMS. */
+      if (!vect_validate_multiplication (perm_cache, left_op, PERM_EVENEVEN)
+          || vect_normalize_conj_loc (left_op))
+        return IFN_LAST;
+      ifn = IFN_COMPLEX_MUL;
+    }
+  else if (is_neg)
+    {
+      bool conj_first_operand;
+      if (!vect_validate_multiplication (perm_cache, left_op, right_op,
+                                         neg_first, &conj_first_operand,
+                                         false))
+        return IFN_LAST;
+
+      ifn = IFN_COMPLEX_MUL_CONJ;
+    }
+
+  if (!vect_pattern_validate_optab (ifn, *node))
+    return IFN_LAST;
+
+  ops->truncate (0);
+  ops->create (3);
+
+  complex_perm_kinds_t kind = linear_loads_p (perm_cache, left_op[0]).first;
+  if (kind == PERM_EVENODD)
+    {
+      ops->quick_push (left_op[1]);
+      ops->quick_push (right_op[1]);
+      ops->quick_push (left_op[0]);
+    }
+  else if (kind == PERM_TOP)
+    {
+      ops->quick_push (left_op[1]);
+      ops->quick_push (right_op[1]);
+      ops->quick_push (left_op[0]);
+    }
+  else
+    {
+      ops->quick_push (left_op[0]);
+      ops->quick_push (right_op[0]);
+      ops->quick_push (left_op[1]);
+    }
+
+  return ifn;
+}
+
+/* Attempt to recognize a complex mul pattern. */
+
+vect_pattern*
+complex_mul_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+                                slp_tree *node)
+{
+  auto_vec<slp_tree> ops;
+  complex_operation_t op
+    = vect_detect_pair_op (*node, true, &ops);
+  internal_fn ifn
+    = complex_mul_pattern::matches (op, perm_cache, node, &ops);
+  if (ifn == IFN_LAST)
+    return NULL;
+
+  return new complex_mul_pattern (node, &ops, ifn);
+}
+
+/* Perform a replacement of the detected complex mul pattern with the new
+   instruction sequences.
*/
+
+void
+complex_mul_pattern::build (vec_info *vinfo)
+{
+  auto_vec<slp_tree> nodes;
+
+  /* First re-arrange the children. */
+  nodes.create (2);
+
+  nodes.quick_push (this->m_ops[2]);
+  nodes.quick_push (
+    vect_build_combine_node (this->m_ops[0], this->m_ops[1], *this->m_node));
+  SLP_TREE_REF_COUNT (this->m_ops[2])++;
+
+  slp_tree node;
+  unsigned i;
+  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node)
+    vect_free_slp_tree (node);
+
+  SLP_TREE_CHILDREN (*this->m_node).truncate (0);
+  SLP_TREE_CHILDREN (*this->m_node).safe_splice (nodes);
+
+  complex_pattern::build (vinfo);
+}
+
 /*******************************************************************************
  * Pattern matching definitions
  ******************************************************************************/

--YZ5djTAD1cGYuMQK--